Power BI Pareto Charts

Matt Allington shows how you can build up a Pareto chart in Power BI:

Pareto Analysis is a statistical technique that applies the Pareto Principle to data. This is more commonly known as the 80:20 Rule. The Pareto Principle is based on the presumption that a relatively small number of inputs (20%)  have most impact on the results/output (80%).  The 80:20 rule can be applied to a wide variety of data in most businesses. 

Examples include:
– Which 20% of products make up 80% of sales
– Which 20% of customers make up 80% of profit.

Pareto analysis is a rule-of-thumb technique but it does provide reasonably useful results much of the time.

Finding Gaps in Dates

Jason Brimhall shows how you can find gaps in your data:

This method is the much maligned recursive CTE method. In my testing it runs consistently faster with a lower memory grant but does cause a bit more IO to be performed. Some trade-off to be considered there. Both queries are returning the desired data-set which happens to be my missing question days. Only, I have added an extra output in the second query to let me know the day of the week that the missing question occurred on. Maybe I forgot to enter it because it was a weekend day or maybe I opted to not create one at all because the day lands on a Holiday. Let’s take a small peek at the results.

This is a good use for tally tables (or for a calendar table, which is basically a date dimension called something else so you can feel comfortable dropping in a non-warehouse system).

Azure SQL Database Hyperscale

Jeroen ter Heerdt explains the basics behind Azure SQL Database Hyperscale:

Connecting to your Hyperscale database is exactly the same as any other Azure SQL or SQL Server database – for example, you can use SQL Server Management Studio or Azure Data Studio. That is the exactly point. Hyperscale provides capabilities not found in other cloud databases such as scale and query performance, but it has no impact on your applications. If you do an in-place migration of a SQL database that is already on another service tier to Hyperscale, the connection string would not even change. Your application does not have to be changed and it will benefit from bigger scale and improved query performance. And who does not like the extra scale and query performance? 

This post covers the basics of database creation but does not go much further into it than that. Jeroen does promise a multi-part series, however.

When Values Disappear in Power BI

Chris Webb explains what happens when a selected value on a slicer disappears in the underlying data set in Power BI:

There is a slicer on the left with five items in it, a table showing the actual contents of the table (I’ve disabled visual interactions so the slicer doesn’t slice the table) with five rows and a card showing the output of the following measure:

Selected Number = SELECTEDVALUE(MyNumbers[Column1], "Nothing Selected")

In the screenshot above you can see I have selected the value 78 in the slicer and the measure – as you would expect – displays that value.

Now what happens when you refresh the dataset and the table contains a different set of numbers? 

Read on for the full explanation.

Biases in Tree-Based Models

Nina Zumel looks at tree-based ensembling models like random forest and gradient boost and shows that they can be biased:

In our previous article , we showed that generalized linear models are unbiased, or calibrated: they preserve the conditional expectations and rollups of the training data. A calibrated model is important in many applications, particularly when financial data is involved.

However, when making predictions on individuals, a biased model may be preferable; biased models may be more accurate, or make predictions with lower relative error than an unbiased model. For example, tree-based ensemble models tend to be highly accurate, and are often the modeling approach of choice for many machine learning applications. In this note, we will show that tree-based models are biased, or uncalibrated. This means they may not always represent the best bias/variance trade-off.

Read on for an example.

Cloudera and 100% Open Source Software

Alex Woodie notes a change at Cloudera:

The old Cloudera developed and distributed its Hadoop stack using a mix of open source and proprietary methods and licenses. But the new Cloudera will be 100% open source, just like Hortonworks, its one-time Hadoop rival that it acquired in January. But will developing its data platform completely in the open differentiate it from cloud competitors?

In a blog post published yesterday under the title “Our Commitment to Open Source Software,” Cloudera executives Charles Zedlewski and Arun Murthy laid out the company’s new plan to develop and distribute everything in the open.

This was one of the big reasons I preferred Hortonworks over Cloudera when they were separate companies: Hortonworks had this model. Hopefully it leads Cloudera to success.

July Azure Data Studio Update

Alan Yu announces some great things in the July update to Azure Data Studio:

One of the most requested features from customers around the world is enhanced execution plan support. Although we have basic query plan support in Azure Data Studio, it’s not as robust as similar functionality built into SQL Server Management Studio and what other vendors provide.

Today, we’re pleased to announce that one of our valued Microsoft partners, SentryOne is shipping their SentryOne Plan Explorer extension for Azure Data Studio. This is a free extension that provides enhanced plan diagrams for queries that are run in Azure Data Studio, with optimized layout algorithms and intuitive color-coding to help quickly identify the most expensive operators affecting query performance.

The other big thing I like is that notebooks have keyboard shortcuts. These were two of the things keeping me from using ADS as much as I’d wanted. Now I’m that much closer to full-on migration.

Notifications when Admins Connect to a SQL Server

Jon Shaulis builds a logon trigger to notify when sysadmins log into his systems:

I was helping someone set up some monitoring in their database and they were asking about being notified when someone with administrative privileges logs into SQL Server. While many of you know that I try to use the right tool for the right job, some may cringe when I say triggers can help out in this scenario.

Now, there are other methods here but the goal was to be notified when someone logs into your SQL Server with administrative privileges. Triggers will consistently fire when this event occurs and allows you to perform an action based on this event.

Just make sure you get the trigger right and don’t block everybody from logging in. That’s an awkward situation.

Adaptive Joins and Index Width

Erik Darling wants to pump your indexes up:

Now, there’s an Extended Event that… Used to work.

These days it just stares blankly at me. But since I’ve worked with this before, I know the problem.

It’s that Key Lookup — I’ll explain more in a minute.

Adaptive joins won’t do all the work for you, so Erik explains how you can set yourself up for success.

Adding Line Breaks on DAX Measures

Gilbert Quevauvilliers shows how you can add line breaks to DAX measures for formatting results in Power BI:

I was at a customer and they were drilling through between pages, and I wanted to put in what the filter selections were when they drilled through to the new page.

This would allow them to easily see what had been on the previous page, and to avoid going backwards and forwards.

I quickly created the DAX measure to show the filters, but one thing that bugged me was that it looked a bit clunky and not clear in terms of what the filters were.

I knew that I wanted to use a Line Break to put each filter on a new line. 

Read on to see how you can do this.

Categories

July 2019
MTWTFSS
« Jun  
1234567
891011121314
15161718192021
22232425262728
293031