Curated SQL Posts

Showing Ranges of Data with a Single Slicer

Marco Russo and Alberto Ferrari have another good use case for calculation groups:

Because the axis of the visual must show months outside the slicer selection, we cannot use the usual Date[Calendar Year Month] column. Indeed, if we put the Date[Calendar Year Month] column on the X-axis, the only visible month would be the selected month. It is worth remembering that the selection of the values to show on the axis is independent of the measure. If a slicer is filtering one month, there is no way to show additional months from the same table on either the rows or the columns (or the X-axis, as in the example).

Therefore, we must create a separate table that is not subject to filtering from the slicer. This way, columns from that table show all the rows, and we can control their visibility through a measure. Once the new table is in place, we write a measure that produces a value for only the last six months out of all the months visible, and leaves the remaining months blank in order to hide them.

Read on to see how.
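
As a rough illustration of the logic (not the DAX itself), here is a minimal pandas sketch: the axis is built from a table the selection cannot filter, and a measure-like filter returns values only for the six months ending at the selected month. All data and names below are made up.

```python
import pandas as pd

# Made-up monthly sales standing in for the model's fact table.
sales = pd.DataFrame({
    "month": pd.period_range("2023-01", "2024-06", freq="M"),
    "amount": range(100, 118),
})

# The single month picked in the slicer.
selected = pd.Period("2024-06", freq="M")

# The disconnected axis: every month, unaffected by the slicer selection.
axis = pd.period_range(sales["month"].min(), sales["month"].max(), freq="M")

# Measure-like logic: keep only the six months ending at the selection;
# the rest stay blank and disappear from the visual.
visible = [m for m in axis if selected - 5 <= m <= selected]
print(sales[sales["month"].isin(visible)])
```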

Data Hubs, Warehouses, and Lakes

Trevor Legg compares and contrasts data hubs, data warehouses, and data lakes:

Data hubs, data warehouses, and data lakes are significant investment areas for data and analytics leaders and are vital to support increasingly complex, distributed, and varied data workloads.

Gartner finds that 57% of data and analytics leaders are investing in data warehouses, 46% are using data hubs, and 39% are using data lakes. However, they also found that these same data and analytics leaders don’t necessarily understand the difference between the three…

To best support specific business requirements, it’s vital to understand the difference and purpose of each type of structure, and the role it can play in modern data management infrastructure.

Click through for the definitions and comparisons.

Ending the Number Series Generator Challenge

Itzik Ben-Gan wraps up a great series:

This is the fifth and last part in the series covering solutions to the number series generator challenge. In Part 1, Part 2, Part 3, and Part 4 I covered pure T-SQL solutions. Early on when I posted the puzzle, several people commented that the best performing solution would likely be a CLR-based one. In this article we’ll put this intuitive assumption to the test. Specifically, I’ll cover CLR-based solutions posted by Kamil Kosno and Adam Machanic.

Many thanks to Alan Burstein, Joe Obbish, Adam Machanic, Christopher Ford, Jeff Moden, Charlie, NoamGr, Kamil Kosno, Dave Mason, John Nelson #2, Ed Wagner, Michael Burbea, and Paul White for sharing your ideas and comments.

Read on for the conclusion.

From Azure Analysis Services to Power BI Premium Per User

Gilbert Quevauvilliers kicks off a new series:

Welcome to the first in my blog post series on evaluating the different aspects when looking to migrate from Azure Analysis Services (AAS) to Power BI Premium Per User (PPU).

Apologies for this taking a few extra weeks to get started; life has been super busy, but as they say, “Better late than never”.

In this post I am going to compare the query performance of an AAS cube against that of a PPU cube.

Click through to see how Power BI Premium Per User stacks up against Azure Analysis Services.

Checking Out Power BI Goals

Treb Gatte takes a step back:

In part 1, I gave you an introduction to Power BI Goals and provided some context on what information could be tracked with the feature. I received some feedback that it would be great to do an “unboxing” post to get a look at the feature, so this post will be exactly that. Part 3 will be a Power BI Goals 101 post where I step you through the process.

Power BI Goals requires you to either have a Power BI Premium per user license or to have access to Power BI Premium and have a Power BI Pro license assigned to you. If you do, then you will notice a new trophy icon when you go to PowerBI.com.

Click through for the overview.

Fitting Excel Macros into Data Science Pipelines

Bryan Shalloway has a process for us:

While I no longer use it regularly for the purposes of analysis, I will always have a soft spot in my heart for Excel. Furthermore, using a “correct” set of data science tools often requires a bridge. Integrating a rigorous component into a messy spreadsheet-based pipeline can be an initial step toward the pipeline or team or organization starting on a path of continuous improvement in their processes. Also, spreadsheets are foundational to many (probably most) BizOps teams and therefore are sometimes unavoidable…

In this post I will walk through a short example and some considerations for when you might decide (perhaps against your preferences) to integrate your work with extant spreadsheets or shadow “pipelines” within your organization.

Click through for Bryan’s thoughts on the topic.
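
As a sketch of what such a bridge can look like, here is a minimal Python/pandas example of the general pattern: read the team’s workbook, validate rigorously at the boundary, do the real work in code, and write results back in a format the spreadsheet’s consumers already use. The file, sheet, and column names are all hypothetical.

```python
import pandas as pd  # reading/writing Excel also requires openpyxl

# The workbook, sheet, and column names here are all hypothetical.
raw = pd.read_excel("monthly_forecast.xlsx", sheet_name="inputs")

# Put the rigor at the boundary: validate the spreadsheet's contents
# before anything downstream consumes them.
assert raw["amount"].notna().all(), "blank forecast amounts in the workbook"
assert (raw["amount"] >= 0).all(), "negative forecast amounts in the workbook"

# Do the real work in code...
summary = raw.groupby("region", as_index=False)["amount"].sum()

# ...and hand results back in the format the spreadsheet consumers expect.
summary.to_excel("monthly_forecast_summary.xlsx", index=False)
```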

Building a Kafka Test Environment with Kafdrop

Diogo Souza walks us through an interesting project:

In day-to-day use, it’s challenging to manage Kafka brokers, partitions, topics, producers, and consumers all via the command line. An interface would be quite helpful.

There are a ton of options available for managing your Kafka brokers through web UI applications. Confluent’s is perhaps one of the most complete, although it is part of a paid bundle aimed mostly at enterprise use.

Amongst the myriad of open-source options, Kafdrop stands out for being simple, fast, and easy to use. It is an open-source web project that allows you to view information from Kafka brokers, such as existing topics, consumers, and even the content of messages sent.

This article explores creating a more flexible test environment to work alongside the .NET app built in the previous article. This way, you’ll have more powerful tools to understand what’s happening with your topics.

Read on to learn how you can install and use Kafdrop.
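
Once Kafdrop is up (it ships as a Docker image and serves its web UI on port 9000), you need some traffic to look at. Here is a minimal producer sketch using the kafka-python package, assuming a broker on localhost:9092; the topic name is made up.

```python
from kafka import KafkaProducer  # pip install kafka-python

# Assumes a broker on localhost:9092; the topic name is made up.
producer = KafkaProducer(bootstrap_servers="localhost:9092")

for i in range(10):
    producer.send("demo-orders", key=str(i).encode(), value=f"order {i}".encode())

# Ensure the messages reach the broker before browsing them in Kafdrop.
producer.flush()
```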

Tracking Query Compile Time

Grant Fritchey doesn’t have time to wait:

A question that came up recently was how to track the query compile time. It’s actually a pretty interesting question because there aren’t that many ways to tell how long it took to compile the query, and they don’t necessarily agree. For most of us, most of the time, compile time for a given query doesn’t matter. However, I love telling the story of the query I had on an old system that could run in 90ms but took 5 minutes to compile. In short, sometimes compile time matters.

Read on to learn several ways to determine how long it took that query to compile.
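
Grant walks through the options in the post. For reference, one place compile statistics surface is Query Store (SQL Server 2016 and later); here is a minimal sketch that pulls the highest average compile durations via pyodbc, assuming Query Store is enabled on the database. Connection details are placeholders.

```python
import pyodbc

# Connection details are placeholders; point this at your own server.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=YourDatabase;Trusted_Connection=yes;"
)

# Query Store records durations in microseconds; convert to milliseconds.
rows = conn.execute("""
    SELECT TOP (10)
           q.query_id,
           qt.query_sql_text,
           q.avg_compile_duration / 1000.0 AS avg_compile_ms
    FROM sys.query_store_query AS q
    JOIN sys.query_store_query_text AS qt
        ON q.query_text_id = qt.query_text_id
    ORDER BY q.avg_compile_duration DESC;
""").fetchall()

for query_id, sql_text, compile_ms in rows:
    print(f"{query_id}: {compile_ms:.1f} ms  {sql_text[:60]}")
```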

Power Query Data Profiling

Ed Hansberry takes us through the data profiler in Power Query:

A solid green bar is usually best. It means there are no issues in that column, as shown in the Discounts field. On the far right in the COGS field, there is a tiny bit of gray. That means there is no data, or a null, in one or more records. The bigger the gray bar, the higher the percentage of nulls. In the middle, though, we have a problem. Anything in red is an error and must be fixed. These errors may cause more errors in transformations further downstream and will usually cause refresh errors.

Before we address this, let’s get a bit more information about our data. Go to the View ribbon, and there are 3 settings we want to look at.

I really like what the data profiler provides us. If you’re a regular Power BI user, I highly recommend checking it out if you haven’t already.
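
For readers who think in code, the valid/empty portion of those quality bars maps onto a simple null profile. Here is an illustrative pandas analogue (the data frame is made up, and Power Query’s separate error category has no direct equivalent here):

```python
import pandas as pd

# A made-up frame standing in for the query preview.
df = pd.DataFrame({
    "Discounts": [0.10, 0.20, 0.00, 0.15],
    "Sales": [100.0, None, 250.0, 80.0],
    "COGS": [60.0, 90.0, None, None],
})

# Share of valid vs. empty values per column, as percentages.
profile = pd.DataFrame({
    "valid_pct": df.notna().mean() * 100,
    "empty_pct": df.isna().mean() * 100,
})
print(profile)
```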
