Press "Enter" to skip to content

Month: June 2022

Delta Live Tables and Power BI Data Modeling

Tahir Fayyaz goes from Delta Lake to Power BI:

To get the optimal performance from Power BI it is recommended to use a star schema data model and to make use of user-defined aggregated tables. However, as you build out your facts, dimensions, and aggregation tables and views in Delta Lake, ready to be used by the Power BI data model, it can become complicated to manage all the pipelines, dependencies, and data quality as you need to consider the following:

– How to easily develop and manage the data model’s transformation code.

– How to run and scale data pipelines for the model as data volumes grow.

– How to keep all the Delta Lake tables updated as new data arrives.

– How to view the lineage for all tables as the model gets more complex.

– How to actively stop data quality issues that result in incorrect reports.

Read on for recommendations, a couple architectural diagrams, and some sample code.

Comments closed

The top Operator in KQL

Robert Cain has top men working on this. Top. Men:

Top 10 lists are all the rage on the internet. Everywhere you look you see “Top 10 Cute Kitten Videos” or “Top 10 Pluralsight Videos by ArcaneCode”.

KQL includes a top operator so you can generate your own top lists. Even better, you are not limited to just ten items either.

Read on to see how you can use the top operator in KQL.

Comments closed

Constructing JSON Objects in SQL Server

Hasan Savran checks out a couple of functions new to SQL Server 2022:

JSON Functions are introduced to SQL Server in version 2016. Saving JSON documents and retrieving documents using JSON Functions brings many possibilities to SQL Server. It is great to see that Microsoft continues to add different functions to the original JSON functions set.

    Today, I will explain two new JSON functions which are available in SQL Server 2022 and Azure SQL Database. 

Read on to learn more about these functions.

Comments closed

Creating a Self-Signed Certificate in Powershell

Tom Collins says ‘trust me’:

Use the Powershell cmdlet New-SelfSignedCertificate

Warning: I only ever use this method for testing purposes – never use for Production environments. A typical test would be for  SQL Server Network Encryption SSL Certificate Management

Encryption strategy  should be part of a wider review of Database Security Countermeasures against hacker attacks

For more details check the Microsoft documentation but here are some notes on some of the parameters  attached to the Powershell cmdlet – New-SelfSignedCertificate.

Read on to see how you can create the certificate and use it in SQL Server. Connections will complain (and rightfully so) about the self-signed certificate but you’ll be able to try functionality which requires a certificate in an environment in which you don’t have one available for whatever reason.

Comments closed

Making DAX’s SWITCH Run Fastest

Marco Russo and Alberto Ferrari explain when SWITCH is fast and when it’s not so fast:

The SWITCH function in DAX is widely used to define different execution paths depending on the condition of an expression. A very common case is when we want to check the state of a slicer, so that the selection made by the report user directly affects the result of a DAX formula.

Read on for three separate cases and how they can affect the performance of the SWITCH function.

Comments closed

Converting between Decimal and Binary

Tomaz Kastrun has run out of useless functions and has to create useful ones:

How does the conversion between decimal to binary or from binary to decimal behave? With another useless function, I have plotted the points (x = decimal number, y = converted binary number) on a scatter plot. Just to find out that the graph shows the binomial distribution function.

Read on for the conversion process and a fun analysis.

Comments closed

Currency Conversion with priceR

Bryan Shalloway needs to make change for a trillion Zimbabwe dollars (prior to revaluation):

In this post I’ll walk through an example of how to convert between currencies. A challenge is that the conversion rate is constantly changing. If you have historical data you’ll want the conversion to be based on what the exchange rate was at the time. Hence the fields you need when doing currency conversion are:

1. Date of transaction

2. Start currency (what you’ll be converting from)

3. End currency (what you’ll be converting to)

4. Price (in units of starting currency)

Bryan also makes the smart move by memoizing the data first, as those API calls can get expensive otherwise.

Comments closed

Power BI Field Parameters and Type 2 SCDs with Bonus Fields

Koen Verbeeck extends the type 2 slowly changing dimension:

Power BI field parameters are a new feature in Power BI Desktop, and it’s one of the best of the past months. In short, Power BI field parameters allow you to easily switch between dimensions attributes or measures in a filter. Previously, you had to do all sorts of DAX wizardry to make this happen, but now it’s just a couple of clicks.

The goal of this blog post is not to tell you exactly how they work, but rather showcase an interesting use case. You can find more info about Power BI field parameters in the official blog post, but also herehere and here. The use case I’m talking about is slowly changing dimensions of Type 2, you know, the one where we insert a record for every change. Often, I also include an extra column for each column of which we’re tracking history: the “current value column”. For example, if we keep history of the department for an employee, I have a column “CurrentDepartment”. If a type 2 change occurs, the values of this columns are updated to the last known value for this dimension member. This allows to answer different types of questions, because sometimes users are interested in the historical values, but sometimes they just want to know the current value.

Read on for the use case as well as how you might combine field parameters with the idea of current values on type-2 slowly changing dimensions.

Comments closed

Sending E-mails on Database Blocking

Thomas Williams combines a few tools:

If your SQL Servers are under pressure, you want to know. Blocking and blocked processes impact end-users, and if not addressed can slow or even stop a database. In this post, I’ll outline a method I use to get timely notifications of blocking processes that you can use too.

I adapted my approach from Tom Collins’s excellent – and still relevant – 2017 article “How to monitor Blocked Processes with SQL Alert and email sp_whoisActive report” at https://www.sqlserver-dba.com/2017/01/how-to-monitor-blocked-processes-with-sql-alert-and-email-sp_whoisactive-report.html. You could implement exactly what Tom covers in his post and come out on top. I’ve gone one small step further to send a formatted HTML e-mail with a table of blocking & blocked processes; like Tom, I generate the table using Adam Machanic’s fantastic sp_WhoIsActive stored procedure, which I’ve assumed is present in the master system database. The complete solution is the sp_WhoIsActive stored procedure (which you’ll need to download and create, see http://whoisactive.com/downloads/), a SQL Agent job with a job step that runs sp_WhoIsActive and sends the e-mail, and an alert that calls the SQL Agent job when there’s blocked processes.

Read on for the script, as well as some important notes.

Comments closed

Building a Rank in Power Query

Reza Rad ranks things:

Although, calculating rank dynamically is often useful, It is not rare to need to do ranking on a pre-calculated basis. Let’s say, for example, you want to create an aggregated table by customers and year. and you want to rank customers based on their sales amount in that table. the period and other factors are remaining static. This pre-calculated aggregated table can speed up the performance of your report significantly in the future. In a case like this, the rank calculation can be done as a pre-calculation. And when you do this as a pre-calculation, Power Query is a useful option. Because then you can do that even in a Dataflow, and the result can be used in multiple Power BI datasets.

Click through to see how you can do it, as well as what Power Query calls its equivalents to ROW_NUMBER(), RANK(), and DENSE_RANK().

Comments closed