Press "Enter" to skip to content

Author: Kevin Feasel

Linear Regression in Power BI

Joseph Yeates shows how to implement linear regression in Power BI:

The goal of a simple linear model is to fit a line onto this plot to summarize the shape of the data using the equation above.

The “a” value is the slope of the fitted line (rise over run) and the “b” value is the intercept on the y-axis (when x is equal to zero).

In the gapminder example, the life expectancy column was assigned as the “y” variable, as it is the outcome that we are interested in predicting or understanding. The year1950 column was assigned as the “x” variable, as it is what we are using to try and measure the change in life expectancy.

This is a little more complicated than adding a regression line to a scatterplot (the “normal” way to do linear regression with Power BI) but this method lets you work with the outputs in a way that the normal method doesn’t.

Comments closed

Secret Sharing and Neural Networks

Adrian Colyer reviews an interesting paper:

Take a system trained to make predictions on a language (word or character) model – an example you’re probably familiar with is Google Smart Compose. Now feed it a prefix such as “My social security number is “. Can you guess what happens next?

Read the whole thing. There’s a bit of discussion at the end around how you can stop this learning of secrets.

Comments closed

Multi-Table, Multi-Column Keys in Power BI

Alexander Arvidsson has a riff on a common Power BI question:

The keys we need for connecting the two tables are “region” and “certification”, respectively. A key on just one of these columns won’t ensure uniqueness and here is hurdle number one: how do we create a relationship in Power BI that is based on more than one key? Simple answer is that we can’t. But what we CAN do is create a concatenated column with the data we need to create a unique key, and then do our relationships based on that. 

Click through for the process.

Comments closed

Forced Parameterization and Filtered Indexes

Aaron Bertrand walks us through a case where filtered indexes become unhelpful:

Again, focusing on the areas highlighted in orange: the statement has a parameter @0 (previously it had @1) but, more importantly, the clustered index is scanned now instead of the filtered index. This has impacts throughout the plan, including how many rows are both estimated to be read and actually read in order to return those 11 rows. You can see a much higher I/O cost (about 22X), the predicate is now listed explicitly in the tooltip, and you can see warnings about residual I/O (which just means a lot more rows were read than necessary). The root operator still has the warning about the unmatched index, so at least the plan gives you some clue that a filtered index exists that might be useful if you change the parameterization setting for the database (or add OPTION (RECOMPILE) to the statement):

There are still ways to make filtered indexes work with forced parameterization, such as index hints, but Aaron does a great job explaining why something which seems like it should just work doesn’t always.

Comments closed

Azure Kubernetes Service Max Volume Count

Chris Taylor explains an error message in Azure Kubernetes Service:

Whilst playing around with my session for Techorama.nl I encountered an error I hadn’t seen previously whilst deploying SQL Server on Linux in Azure Kubernetes Service (AKS)

0/1 nodes are available: 1 node(s) exceed max volume count

The yaml I used was only slightly modified (mainly names) from scripts used on minikube and docker-desktop so I was a little confused as to why I was getting this in AKS.

Read on to understand what’s happening here and how you can fix it.

Comments closed

Building Graph Queries with SQL Server Graph

Mala Mahadevan takes us through a few examples of queries in SQL Server’s built-in graph engine:

The main goal behind a graph design is to help you answer queries – so what are the questions you’d ask of a movie database, if you had one? Mine would typically be like below.

1 Who are the actors in this movie?
2 Who is this movie directed by?
3 Who is the most prolific actor, according this dataset?
4 How many actors are also directors?
..and so on.

Read on to see how you can write these queries.

Comments closed

Read-Only versus Read-Write and SQL Server

Jack Vamvas takes us through what it takes to turn a read-write database in SQL Server read-only and vice versa:

There are some considerations for deciding if a Developer should be able to include as part of an ETL process , the capacity to change the READ STATE of a SQL database

1) Requires ALTER permission on the database. This is an elevated privilege – and may break the organisations sql server security policy

2) Is the developer on the hook for maintaining Production data? What is the developer’s role in supporting production data ?

Read on for the commands as well as additional considerations before you grant these permissions.

Comments closed

Calculating YARN Utilization Metrics

Dmitry Tolpeko shows how you can calculate per-second cluster utilization measures from YARN’s resource manager logs:

But even if you query YARN REST API every second it still can only provide a snapshot of the used YARN resources. It does not show which application allocates or releases containers, their memory and CPU capacity, in which order these events occur, what is their exact timestamp and so on.

For this reason I prefer a different approach that is based on using the YARN Resource Manager logs to calculate the exact per second utilization metrics of a Hadoop cluster.

This is a bit more complicated than hitting the REST API, but Dmitry shows some of the benefits of doing so.

Comments closed

Registering SignalR to the Cosmos DB Change Feed

Hasan Savran shows us how we can hook up SignalR to view the Cosmos DB Change Feed:

SignalR allows server code to send asynchronous notifications to client-side web applications. By using it, Azure Functions can send real-time messages to your web applications. Prices can get change whenever data changes in database. Notices can be sent if user needs to be notified. Numbers in dashboard can change dynamically when data changes in Cosmos DB. You can do all those with Azure Cosmos DB + Azure Functions and SignalR. This combination works like David Copperfield magic.

There’s a bit of work involved but Hasan shows us how to get it done.

Comments closed

Custom Power BI Date and Time Formats

Chris Webb continues a series on Power BI custom formats:

In my last post I showed lots of examples of how Power BI’s new custom format string feature can be used to format numbers. This post, looking at dates and times, will be a bit different for two reasons: there are a lot more useful examples of custom date and time formats built into Power BI Desktop, and some of the format placeholders listed in the VBA documentation aren’t supported in Power BI. As a result I’m going to concentrate on some useful formats that aren’t covered well by the examples and highlight a few things that aren’t possible right now.

Read on for a slew of demos.

Comments closed