Press "Enter" to skip to content

Author: Kevin Feasel

Refreshing a Single Table in Power BI

Marc Lelijveld doesn’t want to wait for everything to reload:

If you want to refresh a Power BI dataset, we all know where to find the refresh button in Power BI Desktop as well as in the Power BI Service. By clicking it, you will trigger the entire dataset to refresh. But sometimes it is more convenient to trigger a single table to refresh. If you want to do this, you can do a simple right-click on a table in Power BI Desktop, but how does this work in the Power BI Service? In this blogpost I will describe how you can trigger a single table refresh in the Power BI Service over XMLA endpoints. Please know, this does require Power BI Premium (either Premium per User or Premium Capacity is fine).

Click through to see how.

Comments closed

Multi-Pathed Queries

Guy Glanster needs a multi-tool procedure:

This stored procedure, which I created in the AdventureWorks2017 database, has two parameters: @CustomerID and @SortOrder. The first parameter, @CustomerID, affects the rows to be returned. If a specific customer ID is passed to the stored procedure, then it returns all the orders (top 10) for this customer. Otherwise, if it’s NULL, then the stored procedure returns all orders (top 10), regardless of the customer. The second parameter, @SortOrder, determines how the data will be sorted—by OrderDate or by SalesOrderID. Notice that only the first 10 rows will be returned according to the sort order.

So, users can affect the behavior of the query in two ways—which rows to return and how to sort them. To be more precise, there are 4 different behaviors for this query:

1. Return the top 10 rows for all customers sorted by OrderDate (the default behavior)
2. Return the top 10 rows for a specific customer sorted by OrderDate
3. Return the top 10 rows for all customers sorted by SalesOrderID
4. Return the top 10 rows for a specific customer sorted by SalesOrderID

Let’s test the stored procedure with all 4 options and examine the execution plan and the statistics IO.

This is quite common for reporting procedures and Guy shares several patterns, some of which work better than others.

Comments closed

So You Want to Index

Erik Darling has an indexing strategy for querulous normies:

Most queries will have a where clause. I’ve seen plenty that don’t. Some of’em have surprised the people who developed them far more than they surprised me.

But let’s start there, because it’s a pretty important factor in how you design your indexes. There are all sorts of things that indexes can help, but the first thing we want indexes to do in general is help us locate data.

None of this is groundbreaking but Erik does a really good job of laying out the order in which you want to consider specific factors.

Comments closed

Polling Loops in Powershell

Aaron Nelson has one method for creating a polling loop in Powershell:

Originally I had used the Start-Sleep command to wait 3 seconds ( Start-Sleep 3
). That worked fine on my machine, but when I deployed it to the server, I found I needed to bump it up to 6 seconds. At first that worked, but then a week later I needed to bump it up to 9 seconds. The problem here is obvious, if we force it to wait 9 seconds every time, even if the task was updated after 4 second, we’re wasting extra time. And those seconds are going to add up.

Read on for a smarter approach. Ideally we’d be able to use asynchronous event handling with awaits for all of this, but the real world is not always so nice.

Comments closed

DAX and Case Sensitivity

Marco Russo and Alberto Ferrari talk about case sensitivity:

Every new language defines its own rules of case-sensitivity. R and Python are case-sensitive, DAX is not. It is not that one is right and the others are not; it is really a matter of personal taste of the author of the language. We would say that there is an equal number of pros and cons in both choices. Therefore, there is no definitive choice. That said, a choice needs to be made on two aspects: the language itself and the way it considers strings. Pascal, for example, is case-insensitive as a language, but string comparison is case-sensitive. The M language, in Power Query, is case-sensitive despite living in the same environment as DAX. DAX is case-insensitive as a formula language. 

Maybe it’s because I like living in the SQL world so much, but I highly prefer case-insensitivity as the default and case-sensitivity only when necessary.

Comments closed

The Data Mesh

James Serra explains what a data mesh is:

Its goal is to treat data as a product, with each source having its own data product manager/owner (who are part of a cross-functional team of data engineers) and being its own clearly-focused domain that has an autonomous offering, becoming the fundamental building blocks of a mesh, leading to a domain-driven distributed architecture. Note that for performance reasons, you could have a domain that aggregates data from multiple sources. Each domain should be discoverable, addressable, self-describing, secure (governed by global access control), trustworthy, and interoperable (governed by an open standard). Each domain will store its data in a data lake and in many cases will also have a copy of some of the data in a relational database (see Data Lakehouse defined for why you still want a relational database in most cases).

I’ll have to think more about this before I’m convinced. I’ll also need to think about the Aristotelian opposite of the data mesh.

Comments closed

Kafka in .NET

Diogo Souza walks us through building an application which produces and consumes messages using Apache Kafka:

Kafka is just the broker, the stage in which all the action takes place. The producers send messages to the world while the consumers read specific chunks of data. How do you differentiate one specific portion of data from the others? How do consumers know what data to consume? To understand this, you need a new actor in the play: the topics.

Kafka topics are the channels, the carriage that transport messages around. Kafka records produced by producers are organized and stored into topics.

This is a nice overview of Kafka followed by the basics of building a consumer and a producer in C#. I just wish that there was more community usage of Kafka so that the Confluent .NET driver would include some of the really cool stuff they’ve added to Kafka over the past couple of years.

Comments closed

How to Update Statistics Manually

Matthew McGiffen takes us through the process of updating statistics:

At the heart of all the methods we’ll look at is the UPDATE STATISTICS command. There are a lot of options for using this command, but we’ll just focus on the ones you’re most likely to use. For full documentation here is the official reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql

Even if you have an automated system, knowing how to update statistics is a great thing because you might need to run a one-off update to help a poorly-performing query. Or you’re using PolyBase, which doesn’t have the capability to perform statistics updates automatically because the data isn’t actually in SQL Server.

Comments closed

Building a Function to Get the Next Date by Date Name or Offset

Louis Davidson has a function for us:

As I have been building my Twitter management software, I have been doing a lot more ad-hoc, repetitive coding using T-SQL directly. When I was generating new tweets for upcoming days, a bit of the process that got old quick was getting the date for an upcoming day (the primary key for my tweet table is the date, the type of the tweet, and a sequence number). After having to pick the date of next Tuesday… I had to write some more code (because a true programmer doesn’t do repetitive work when code can be written… even if sometimes the code doesn’t save you time for days or weeks.

So this following function was born from that need, and it is something I could imagine most anyone using semi-regularly, especially when testing software. 

This is definitely fancy. My inclination would be to create a calendar table, as that’ll solve this particular issue as well as other complex variants (like, I want the next Tuesday which doesn’t fall on a holiday).

Comments closed