Curated SQL Posts

MLflow in Action and Responsible AI

Tomaz Kastrun continues an Advent of Azure ML series. Day 16 shows off MLflow:

Yesterday we looked into how to start the MLflow configurations and today, let’s put this to the test.

We will create a new notebook and use the Heart dataset (link to dataset) to toy around. We will also import the xgboost classifier to assess how accurately we can predict the presence of heart disease in a patient. We will be using a categorical (integer) variable with values from 0 (no presence) to 4 (strong presence) and attempt to classify based on 15+ attributes (out of more than 70 attributes).

Day 17 pivots to using the responsible AI dashboard:

Azure ML provides users with a collection of model and data exploration tools in the Studio user interface. It also provides compatible solutions through Azure ML and the Python package responsibleai. With the help of widgets, we will create a sample dashboard to explore the solution and assess responsible decisions and actions.

Building Retry Logic for Database Calls

Jose Manuel Jurado Diaz tries and tries again:

Today, I worked on a case in which our customer faced an execution command timeout: “Msg -2, Level 11, State 0, Line 0 – Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.”

As this business process runs overnight and they need to ensure that the execution completes, they asked if it is possible to implement an Execution Retry Logic.

In a similar way to the Retry-Logic we have for Transient Failure, we could implement a mechanism to retry the operation; the only thing we need to do is change the commandTimeout parameter, for example, in .NET.

Click through for an example of how you can implement this in code. I’d also recommend Polly, which is a library explicitly built for these sorts of issues.
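
To make the idea concrete, here is a minimal Python sketch of a retry loop that raises the command timeout after each timed-out attempt. This is not the code from the linked post (which targets .NET); the function name `run_with_retry`, its parameters, and the doubling strategy are all assumptions chosen for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    operation: Callable[[int], T],
    initial_timeout: int = 30,
    max_attempts: int = 3,
    timeout_multiplier: int = 2,
    delay_seconds: float = 1.0,
) -> T:
    """Call operation(timeout), retrying with a longer timeout on TimeoutError.

    `operation` is a hypothetical callable that runs the database command
    with the given command timeout (in seconds) and raises TimeoutError
    if the command does not finish in time.
    """
    timeout = initial_timeout
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(timeout)
        except TimeoutError:
            # Out of attempts: let the caller see the timeout.
            if attempt == max_attempts:
                raise
            # Pause briefly, then allow the next attempt more time.
            time.sleep(delay_seconds)
            timeout *= timeout_multiplier
    raise ValueError("max_attempts must be at least 1")
```

In a real job, `operation` would wrap the actual database call and translate the driver's timeout error into `TimeoutError`. Only timeouts are retried here; any other exception propagates immediately.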

Reverse File Order and Rename via PowerShell

Jana Sattainathan gets things backwards and then forwards:

In the case of this app, I just did “Select All” within the app and moved all the videos over to “Photos”. When I downloaded the content to my computer, I noticed that it downloaded the most recently downloaded video first and the oldest video last. This meant the file names given to the videos were in reverse chronological order.

Read on to see how you can use PowerShell to sort this all out.
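
The linked post uses PowerShell; as a rough sketch of the same idea in Python, the function below builds a rename plan that numbers files in the reverse of their current name order. It assumes the existing names sort in reverse chronological order, as described in the excerpt. The function name `plan_reverse_rename`, the `video_` prefix, and the zero-padding width are hypothetical choices for illustration.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_reverse_rename(
    names: Iterable[str],
    prefix: str = "video_",
    width: int = 3,
) -> List[Tuple[str, str]]:
    """Return (old_name, new_name) pairs that reverse the current name order.

    The file that sorts last today gets number 1, the one before it gets
    number 2, and so on. Each file keeps its original extension.
    """
    ordered = sorted(names)
    plan = []
    for index, old_name in enumerate(reversed(ordered), start=1):
        suffix = Path(old_name).suffix
        new_name = f"{prefix}{index:0{width}d}{suffix}"
        plan.append((old_name, new_name))
    return plan
```

The function only produces a plan and does not touch the file system, so the mapping can be reviewed before any files are actually renamed.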

Reducing Child Procedure Temp Table Plan Cache Pollution

Joe Obbish resolves a specific type of plan cache issue involving shared temp tables:

The problem can be observed with a simple repro script. The child stored procedure performs a series of updates on a temp table. The parent procedure creates the temp table, calls the child stored procedure, and returns data from the temp table.

Click through to see an example of the issue, as well as one technique to mitigate the problem.

Fixing Formula.Firewall Issues in Power Query

Imke Feldmann shorts out a firewall issue:

Formula.Firewall issues can hit you when designing your queries or even “out of the blue” when refreshes in the service suddenly start failing due to changes in the query evaluation.

You will find a lot of methods published on the internet which are good and cover different scenarios. But there is also a very quick fix method that I learned from Miguel Escobar that I want to demonstrate in this post. This will basically circumvent the data privacy level, so make sure that you understand the implications (risk of data leakage from one source to another). If not, please read Miguel’s article first!

After reading Miguel’s post, read on for a fix.

Certifying Content in Power BI

Soheil Bakhshi certifies the quality of this Power BI content:

In the previous post, we discussed that a Power BI administrator must enable certification and grant sufficient rights to the security groups. Therefore, all members of the specified security group are authorised to certify the content. If you are a Power BI administrator, follow these steps to do so:

This post is a step-by-step guide to enabling content certification and certifying specific types of content.

File Seeding with dbt

Ust Oldfield sneaks in a file:

File seeding is a useful way of maintaining and deploying static, or very slowly changing, reference data for a data model to use repeatably and reliably across environments, whilst benefitting from source control.

If you aren’t in the Databricks world, this also feels like a job for DVC.

Sharing Results between Notebooks with MSSparkUtils

Liliam Leme provides an answer to a common Synapse Spark pool question:

I’ve been reviewing customer questions centered around “Have I tried using MSSparkUtils to solve the problem?”

One of the questions asked was how to share results between notebooks. Every time you hit “run” in a notebook, it starts a new Spark cluster, which means that each notebook uses a different session, making it impossible to share results between notebook executions. MSSparkUtils offers a solution to handle this exact scenario.

Read on to see what MSSparkUtils is and how it helps in this case.

Becoming Familiar with MLflow

Tomaz Kastrun continues an advent series on Azure ML:

MLflow is an open-source framework for registering, managing and tracking machine learning models. It is multiplatform, bringing consistent model training and model consumption across different platforms. This means that you can train a model locally and upload it to Azure, or train a model on remote compute instances and download it, which is a great feature of MLflow.

You can use MLflow with Azure CLI, Azure Python SDK or in the studio and it will deliver a consistent experience (note that some functionality is limited by language).

Click through for a quick overview of MLflow.

Running Power BI Report Server

Reza Rad stays on-premises:

Power BI is not only a cloud-based reporting technology. Because some businesses need to keep their data and reporting solutions on-premises, Power BI also has the option to be deployed fully on-premises. Power BI on-premises hosting is called Power BI Report Server. This post concerns using Power BI in a fully on-premises solution with Power BI Report Server.

This post will teach you everything you need to know about the on-premises world of Power BI. You will learn how to install Power BI Report Server, learn all requirements and configurations for the Power BI Report Server to work correctly, and see all the pros and cons of this solution. At the end of this post, you will be able to decide if Power BI on-premises is the right choice for you, and if it is, then you will be able to get a Power BI on-premises solution up and running easily.

I used Power BI Report Server for a few years. My short version is that it’s really useful if you aren’t allowed to use Power BI Online (as was my case) but if you know what’s in the Online version, you’ll see just how much you’re missing out on.
