Month: April 2022

Rounding Differences in Power BI

Marco Russo explains the importance of data types for rounding in Power BI:

In one of the last classrooms I delivered, students were wondering why the results of their formulas were close but not identical to the proposed solution. We quickly identified the problem as an issue of data type conversion already covered in Understanding numeric data type conversions in DAX. However, the issue is interesting as a simpler example to show that different DAX calculations can produce different results because of a different way of rounding numbers!

Read on for Marco’s example.
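
To make the general idea concrete, here is a minimal Python sketch. It is not Marco's DAX example. It assumes a fixed-decimal type that keeps four decimal places, as the DAX Currency type does, and the helper names, the rounding mode, and the input values are all made up for illustration. It only shows that converting after every step and converting once at the end can disagree.

```python
from decimal import Decimal, ROUND_HALF_UP

# Four decimal places, mirroring a fixed-decimal (currency-style) type.
FOUR_PLACES = Decimal("0.0001")


def to_fixed(value: Decimal) -> Decimal:
    """Round to four decimal places (rounding mode chosen for illustration)."""
    return value.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)


def round_each_step(amount: Decimal, rate: Decimal, factor: Decimal) -> Decimal:
    """Convert to fixed decimal after every multiplication."""
    intermediate = to_fixed(amount * rate)
    return to_fixed(intermediate * factor)


def round_at_end(amount: Decimal, rate: Decimal, factor: Decimal) -> Decimal:
    """Keep full precision and convert to fixed decimal only once."""
    return to_fixed(amount * rate * factor)


# Made-up inputs, picked so that the two approaches disagree.
amount, rate, factor = Decimal("100"), Decimal("0.33333333"), Decimal("3")
print(round_each_step(amount, rate, factor))  # 99.9999
print(round_at_end(amount, rate, factor))     # 100.0000
```

Both functions perform the same arithmetic. The only difference is where the conversion to four decimal places happens, which is enough to produce results that are close but not identical.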

Keeping Secrets in Azure DevOps

Kevin Chant has a secret:

In this post I want to cover how you can keep your Azure Synapse secrets secret in Azure DevOps, because you need to do this if you are working with production deployments.

With this in mind, I want to raise more awareness about it and make sure others avoid putting secrets directly in their pipelines like in the below example.

Read on to understand what options are available to you. My preference involves Key Vault references but there are alternatives available.
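
As a minimal sketch of the "no secrets in the pipeline" idea, the Python below reads a secret from an environment variable at run time instead of embedding it in a script. It is not taken from Kevin's post. The helper name get_secret is hypothetical, and the sketch assumes the pipeline has mapped a secret variable (for example, one backed by Key Vault) into the step's environment.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from the environment, failing loudly if it is absent."""
    value = os.environ.get(name)
    if not value:
        # Never fall back to a hard-coded default for a secret.
        raise RuntimeError(f"Required secret {name!r} is not set")
    return value
```

A script would call something like get_secret("SYNAPSE_SQL_PASSWORD"), where the variable name is again an assumption, and should never print or log the returned value.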

Custom Model Evaluation Metrics with MLflow

Mark Zhang shows off a new bit of functionality in MLflow:

According to an internal customer survey, 75% of respondents say they frequently or always use specialized, business-focused metrics in addition to basic ones like accuracy and loss. Data scientists often utilize these custom metrics as they are more descriptive of business objectives (e.g. conversion rate), and contain additional heuristics not captured by the model prediction itself.

In this blog, we introduce an easy and convenient way of evaluating MLflow models on user-defined custom metrics. With this functionality, a data scientist can easily incorporate this logic at the model evaluation stage and quickly determine the best-performing model without further downstream analysis.

Click through to see how to use built-in metrics but also how to create your own.
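
To illustrate what a business-focused metric might look like, here is a small plain-Python sketch. It does not call MLflow, since the way custom metrics are registered is covered in Mark's post and may differ between MLflow versions. The function name, the cost values, and the idea of weighting missed conversions more heavily than false alarms are all assumptions made for the example.

```python
from typing import Sequence


def misclassification_cost(
    actual: Sequence[int],
    predicted: Sequence[int],
    false_negative_cost: float = 5.0,
    false_positive_cost: float = 1.0,
) -> float:
    """Average cost per prediction for a binary classifier (labels 0 or 1)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one prediction is required")

    total = 0.0
    for truth, guess in zip(actual, predicted):
        if truth == 1 and guess == 0:
            total += false_negative_cost  # missed a real conversion
        elif truth == 0 and guess == 1:
            total += false_positive_cost  # flagged a non-converter
    return total / len(actual)
```

Two models with the same accuracy can score differently here: one false negative costs more than one false positive, which is the kind of business heuristic that accuracy alone does not capture.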

String Concatenation in R

Benjamin Smith creates a function:

While it is possible to use paste() or paste0() for string concatenation, I do understand how it can be messy to deal with, especially when working with loops and/or nested functions. In this short blog I share a remedy for this by writing a special function which can make for cleaner code as opposed to using paste() or paste0().

It’s not quite as nice as a here string (e.g., @"{FirstName} just referenced the name here string at {UserTime}" user.FirstName DateTime.UtcNow) but this is a good reminder that operator creation in R is pretty easy. H/T R-Bloggers.

Azure Data Studio April 2022 Updates

Timi Oshin has some release notes for us:

We are excited to announce the general availability of the Azure SQL Migration extension for Azure Data Studio. Among many other capabilities, this extension can be used for migrating SQL Server databases to Azure for an enhanced user experience. With this extension, users can get right-sized Azure recommendations based on performance data collected from your source SQL Server databases to optimize for cost and scale. The migration experience is powered by the Azure Database Migration Service which provides a scalable, resilient, and secure way to meet the needs of your organization. See below for a snapshot UI of this extension.

Click through for more notes on Azure SQL migration, the table designer, and more.

Subscribing to Power BI Reports

Reza Rad looks at e-mail subscriptions of Power BI reports:

Have you ever wondered whether it is possible to have updates of a Power BI report emailed to you (or some other colleagues) on a daily basis? Fortunately, Power BI has this feature; it is called Subscription. Subscriptions are a helpful way to send an up-to-date version of the report and dashboard to the users’ email addresses on a scheduled basis. In this article and video, I’ll explain what a subscription is and how it works in Power BI.

Click through for the video and complete blog post.

Splitting Strings with Quoted Names

Daniel Hutmacher mixes separators with regular characters:

Suppose you have a delimited string input that you want to split into its parts. That’s what STRING_SPLIT() does:

DECLARE @source nvarchar(max) = 'Canada, Cape Verde, ' +
    'Central African Republic, Chad, Chile, China, Colombia, Comoros';

SELECT TRIM([value]) AS [Country]
FROM STRING_SPLIT(@source, ',');

Simple enough. But delimited lists are tricky, because the delimiter could exist in the name itself. Look for yourself what happens when we add the two Congos to the list:

Daniel has a clever solution to the problem.
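
The pitfall is easy to reproduce outside T-SQL. The Python sketch below is not Daniel's solution. It assumes that names containing a comma are wrapped in double quotes, as the post title suggests, and it uses made-up input. A naive split breaks the quoted names apart, while the standard library's csv reader respects the quotes.

```python
import csv

source = (
    'Comoros, "Congo, Democratic Republic of the", '
    '"Congo, Republic of the", Cook Islands'
)

# Naive approach: split on every comma, including those inside quotes.
naive = [part.strip() for part in source.split(",")]

# Quote-aware approach: csv.reader treats a quoted field as a single value.
# skipinitialspace=True ignores the space after each delimiter so that the
# opening quote is recognized at the start of the field.
quoted = next(csv.reader([source], skipinitialspace=True))

print(len(naive))   # 6 fragments, with both Congos cut in half
print(quoted)       # 4 names, commas inside the quotes preserved
```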

Optimizing Index Spools

Francisco looks at index spools:

When we are analyzing execution plans, we may come across different types of Spool operators – Table Spools, Row Count Spools, Window Spools or Index Spools – that the Query Optimizer chooses for specific purposes. In this post we are going to briefly look into the Index Spool, how it can sometimes lead to suboptimal query performance, and what can be done to easily fix it.

My favorite description of this is Erik Darling’s: spools are SQL Server’s passive-aggressive way of telling you “I’m not saying you need an index but you need an index.”

Logic Apps: Source Control and Deployment

Koen Verbeeck has a two-parter. First up is storing Logic App code in source control:

On a data warehouse project I’m using a couple of Logic Apps to do some lightweight data movements. For example: reading a SharePoint list and dumping the contents into a SQL Server table. Or reading CSV files from a OneDrive directory and putting them in Blob storage. Some of those things can be done in Azure Data Factory as well, but it’s easier and cheaper to do them with Logic Apps.

Logic Apps are essentially JSON code behind the scenes, so they should be included in the source control system of your choice (for the remainder of the blog post we’re going to assume this is git).

The second post covers deployment:

It’s easy to duplicate an Azure Logic App in a resource group, but unfortunately you cannot duplicate a Logic App between environments (you might try to copy-paste the JSON though). So unless you want to handcraft every Logic App yourself on each of your environments, you need a way to automatically deploy your Logic Apps. It’s easier, faster and less error-prone than any manual method.

Check out both posts.
