Press "Enter" to skip to content

Curated SQL Posts

Non-Kimball Relationships in Power BI

Paul Turley continues a series on relationship modeling in Power BI:

So far, you’ve seen that the essential components of a data model include tables related to each other. Filtering records in one table prorogates the filter to related tables(s), causing records in the related table to be filtered. This dimensional model schema is the foundation of the reporting solution. Once the fact and dimension tables are in-place, you can create more advanced solutions by working outside of this standard relationship pattern.

Read on for the full story.

Comments closed

Incremental Backoff with Powershell

Shane O’Neill implements a linear backoff strategy:

Pushups are hard! Even when I’m not trying to rep out as many as I can, they still take a toll on the body. Soon a five-minute break is not enough, and I’m taking longer and longer rests.

Fine, if that’s the way we’re going to do this, then I’m going to go with the flow.

I can confirm that pushups are hard. Also, click through for a link to the backoff script.

Comments closed

The Complexity of Adding Simple Features

Chris Webb answers a timeless question:

One question I get asked all the time is this:

Why don’t you add [insert feature idea here] to Power BI?

It’s sometimes followed up by one or more of the following comments:

It would be so easy for you to do
I can’t believe you haven’t done it already
Power BI is unusable without it
[insert competitor name here] has had this feature for years

…and a real or virtual exasperated sigh.

Read on for the answer. This isn’t special to Power BI or even Microsoft—once you start to have customers with competing interests, these decisions get a lot harder.

Comments closed

Powershell Quick Hits

Jess Pomfret shares a few Powershell tips:

My goal was to gain more stars than last year, which I succeeded at. I only got 6 total stars last year. Now my goal for next year will be to beat this year’s performance.  I did learn several neat things while working on these puzzles and those I thought were worth sharing.

Read on for those things.

Comments closed

Bayesian Modeling of Holiday Behavior

Daniel Marthaler and Brian Coffey have an interesting post:

As the year unfolds, our demand fluctuates. Two big drivers of that fluctuation are seasonality and holidays. With the holiday season upon us, it’s a great time to describe how both seasonality and holiday effects can be estimated, and how you can use this formulation in a predictive time series model.

In this post, we describe the difference between seasonality and holiday effects, posit a general Bayesian Holiday Model, and show how that model performs on some Google Trends data.

Read the whole thing.

Comments closed

Deleting Messages and Topics in Kafka

The Hadoop in Real World team has a pair of related posts. The first is on how to remove messages in a Kafka topic:

The easiest way to purge or delete messages in a Kafka topic is by setting the retention.ms to a low value. retention.ms configuration controls how long messages should be kept in a topic. Once the age of the message in a topic hits the retention time the message will be removed from the topic.

Note the below steps delete or purge messages in your topic. Use precaution when executing the below.

Because Kafka is an immutable log rather than “final” storage, the ideal scenario has you never deleting data. But sometimes you just run low on disk space. You can also set the max retention size as another option. But note that these aren’t going to let you delete a single message—that’s not a good thing to do with a log; rather, you offset or cancel out the message and submit a new one.

The second post covers deletion of a Kafka topic:

In this post we will see how to delete a Kafka topic and get the details of the topic before deleting it.

Comments closed

Chaining with DirectQuery for Power BI Datasets

Wolfgang Strasser explains the notion of chaining when working with Power BI datasets:

In my last blog post I introduced the new concept of DirectQuery for Power BI datasets. This feature allows you to extend and modify a (remote) published Power BI dataset with the help of a local model.

The local model does not contain a copy of the remote dataset but a reference to it. You, as Power BI developer, are able to extend the referenced model with new data sources (like the Excel file I used in my previous example) and/or extend the model with new measures, columns and so on. For a new data model, relationships between the two data islands can be created.

Read on for examples of how this can be useful and what the current limitations look like.

Comments closed

Using the Open Source R or Python Runtime with Machine Learning Services

Niels Berglund walks us through using the open source extensibility framework to install R or Python:

When Java became a supported language in SQL Server 2019, Microsoft mentioned that communication between ExternalHost and the language extension should be based on an API, regardless of the external language. The API is the Extensibility Framework API for SQL Server. Having an API ensures simplicity and ease of use for the extension developer.

From the paragraph above, one can assume that Microsoft would like to see 3rd party development of language extensions. That assumption turned out to be accurate as, mentioned above, Microsoft open-sourced the Java language extension, together with the include files for the extension API, in September 2020! This means that anyone interested can now create a language extension for their own favorite language!

However, open sourcing the Java extension was not the only thing Microsoft did. They also created and open-sourced language extensions for R and Python!

Click through for more detail and a walkthrough on installation of Python.

Comments closed

External Table Not Accessible because Content of Directory Cannot be Listed

Liliam Leme troubleshoots an error when working with a serverless SQL pool in Azure Synapse Analytics:

Following this lab: Lab: Serverless Synapse – From Spark to SQL On Demand – Microsoft Tech Community

You may experience this message: 

Failed to execute the query because content of directory cannot be listed) 

This is due to an extra step required to enable the AAD to pass through the firewall on the storage.

Click through for the solution.

Comments closed