Press "Enter" to skip to content

Month: August 2022

Network Analysis in R via netUtils

David Schoch has an R package for us:

During the last 5 years, I have accumulated various scripts with (personal) convenience functions for network analysis and I also implemented new methods from time to time which I could not find in any other package in R. The package netUtils gathers all these functions and makes them available for anyone who may also needs to apply “non-standard” network analytic tools. In this post, I will briefly highlight some of the most prominent functions of the package. All available functions are listed in the README on github.

Click through to see what’s available in the package. H/T R-Bloggers.

Comments closed

Developing a Flask App with RStudio Connect

Parisa Gregg crosses the language barrier:

One of the Python applications you can deploy to RStudio Connect is Flask. Flask is a WSGI (Web Server Gateway Interface) web application framework and provides a Python interface to enable the building of web APIs. It is useful to data scientists, for example for building interactive web dashboards and visualisations of data, as well as APIs for machine learning models. Deploying a Flask app to a publishing platform such as RStudio Connect means it can then be used from anywhere and can be easily shared with clients.

This blog post focuses on how to deploy a Flask app to RStudio Connect. We will use a simple example but won’t go into detail on how to create Flask apps. If you are getting started in Flask you may find this tutorial useful.

Read on for a demo.

Comments closed

PolyBase in SQL Server 2022: Cosmos DB via MongoDB API

I have gotten back on the data virtualization wagon:

Back in the 2019 days, I noted a problem when CU2 of SQL Server 2019 came out. This is because the Cosmos DB collection I was using reported a wire version of 2 rather than the minimum version of 3. The official fix at that time was to create a new collection using the then-latest version of 3.6 but that didn’t work for me. My workaround was to use the old MongoDB drivers that shipped with SQL Server 2019 RTM.

Well, as of 2022, that solution won’t work anymore. The original MongoDB drivers don’t ship with SQL Server 2022, so we can’t use that workaround. I had a Cosmos DB account that was originally built on version 3.6. Even after upgrading to server version 4.2, it still reported wire version 2 when I connected to the endpoint that was relevant 3 years ago. Therein lies the solution to the problem.

It turns out there are two viable solutions now and I show both of them.

Comments closed

Power Automate and Dataset-Driven Power BI Subscriptions

Dan English follows up on a prior topic:

In the last post I went over using Power Automate to perform a data driven report subscription using a Paginated report referencing an AAS database. The flow referenced an Excel file with the information to make the process data driven and generate 2000 PDF files that could then be emailed to users. In the flow the PDF files were simply placed in a OneDrive folder for testing purposes to validate the flow would run as expected and to review the metrics after the fact to evaluate the impact of running the process.

For the follow up there were two items that I wanted to compare against the original flow

1. Moving the AAS database being referenced to a Power BI dataset hosted in the same capacity as the Paginated report

2. Using a Power BI report instead of a Paginated report

In this post I will cover the first comparison. 

Check out what changes and what stays the same between using Azure Analysis Services and Power BI-hosted datasets.

Comments closed

Data Modification with Synapse Link for SQL Server 2022

Kevin Chant changes some data:

In this post I want to cover some things that happen internally when you do updates and deletes with Azure Synapse Link for SQL Server 2022 whilst it is running.

Because recently somebody asked if Azure Synapse Link for SQL Server 2022 captures updates and deletes after they had read a previous post. Where I covered my initial tests for Azure Synapse Link for SQL Server 2022.

Anyway, short answer is that Azure Synapse Link for SQL Server 2022 captures updates and deletes. In this post I will go into more detail about some of the things that appear to happen along the way.

Click through for Kevin’s tests and what the results look like.

Comments closed

Azure Synapse Analytics August 2022 Updates

Ryan Majidimehr has a changelog for us:

Full support for MLflow

MLflow is a platform for managing the machine learning lifecycle and streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. We are very happy to announce that SynapseML models now integrates with MLflow with full support for saving, loading, deployment, and autologging!

To learn more, read MLflow in SynapseML getting started guide and SynapseML Autologging.

There are quite a few changes on this list, so they’ve definitely been busy.

Comments closed

The SQL ConstantCare Population Report

Brent Ozar surveys the SQL Server landscape:

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the summer 2022 version of our SQL ConstantCare® population report.

Out of the 3,151 servers sending in data recently, the most popular version of SQL Server is still 2016:

Every time I link to this, I say the same two things: one, that it’s important to keep in mind that this is necessarily a biased sample (in that it includes the subset of organizations which has people who are familiar with Brent’s work, spends enough money to get ConstantCare support, and has a need for this support); and two, that I appreciate this sample. Yes, it’s biased and not necessarily indicative of the broader market but it’s also useful information, especially seeing trends over time.

Comments closed

Understanding Write-Ahead Logging

Kevin Sookocheff explains how write-ahead logging protects data in databases:

A central tenet of databases is that any committed data survives a crash or a failure. Write-ahead logging is a fundamental primitive that ensures all changes to data are first written safely to stable storage before being applied. Coupling that with some careful use of sequence numbers and we can guarantee that changes made to a database can survive system crashes.

This is a core feature in pretty much every relational database and Kevin dives into how one of the key algorithms behind it works.

Comments closed

KQL BETWEEN

Robert Cain proves it’s not the end of the line in his KQL series:

It’s not uncommon to want to use a range of values when creating a Kusto query. This might be a range of numeric values, or perhaps a range of dates.

Kusto provides this ability using the between operator. In this post we’ll see how to use it when authoring your Kusto queries.

Click through to see how you can use between as well as logical alterations such as not between.

Comments closed