Press "Enter" to skip to content

Author: Kevin Feasel

Pre-Calculating Semi- and Non-Additive Measures in Power BI

Gerhard Brueckl shows us how to work with semi-additive and non-additive measures in Power BI:

In one of my recent projects we wanted to visualize data from the customer's analytical platform, based on Azure Databricks, in Power BI. The connection between those two tools works pretty flawlessly, which I also described in my previous post, but the challenge was the use case and the calculations. We wanted to display the number of distinct customers across various aggregation levels over a billion-row fact table. We came up with different potential solutions, all having their pros and cons:

1. load all data into Power BI (import mode) and do the aggregations there

2. use Power BI with direct query and let the back-end do the heavy lifting

3. load only necessary pre-aggregated data into Power BI (import mode)

Please keep in mind that we are dealing with a distinct count measure here. Semi- and non-additive measures like this cannot easily be aggregated from lower levels to higher levels without having all the detail data available!
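To make that concrete, here is a minimal SQL sketch of option 3 (the fact and dimension names are illustrative, not from Gerhard's post). A distinct count computed at the month grain cannot simply be summed to get the year grain, because the same customer can appear in several months, so each grain the reports need gets its own pre-aggregation:

```sql
-- Distinct customers per month (illustrative schema).
SELECT d.CalendarYear,
       d.CalendarMonth,
       COUNT(DISTINCT f.CustomerID) AS DistinctCustomers
FROM fact.Sales AS f
JOIN dim.Date AS d
    ON d.DateKey = f.DateKey
GROUP BY d.CalendarYear, d.CalendarMonth;

-- Distinct customers per year: recomputed from the detail rows.
-- Summing the twelve monthly numbers would double-count anyone
-- who bought in more than one month.
SELECT d.CalendarYear,
       COUNT(DISTINCT f.CustomerID) AS DistinctCustomers
FROM fact.Sales AS f
JOIN dim.Date AS d
    ON d.DateKey = f.DateKey
GROUP BY d.CalendarYear;
```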

Read on for Gerhard’s thoughts on each as well as the decision and process.

Comments closed

The Performance Overhead for Query Store

Erin Stellato updates a prior post:

I wrote the original Query Store performance overhead post just over two years ago, and just like the data in your database keeps changing, so does SQL Server.  However, the question, “What is the performance overhead of enabling Query Store?” is still the most frequent question I am asked. 

So why am I writing this post?  Because there have been many improvements specific to Query Store that have taken the feature to the point where it can support all workloads, including those that are ad-hoc.  This is a big deal.
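For reference, the capture-policy options that make ad hoc workloads manageable arrived in SQL Server 2019. A sketch of enabling them looks like this (the database name and thresholds are illustrative, not recommendations):

```sql
-- Capture only queries that cross certain thresholds within the
-- stale-capture window, so one-off ad hoc statements do not bloat
-- the Query Store.
ALTER DATABASE [YourDatabase]
SET QUERY_STORE = ON
(
    OPERATION_MODE = READ_WRITE,
    QUERY_CAPTURE_MODE = CUSTOM,
    QUERY_CAPTURE_POLICY =
    (
        STALE_CAPTURE_POLICY_THRESHOLD = 24 HOURS,
        EXECUTION_COUNT = 30,
        TOTAL_COMPILE_CPU_TIME_MS = 1000,
        TOTAL_EXECUTION_CPU_TIME_MS = 100
    )
);
```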

Read on to learn more.

Comments closed

Monitoring Kafka Metrics with Prometheus and Grafana

Murat Derman shows how you can use Prometheus and Grafana to track vital measures on an Apache Kafka cluster:

You can add the scrape_interval parameter in your configuration; by default it is every 1 minute, e.g. scrape_interval: 5s

Prometheus has its own query language called PromQL. You can learn more about this language here: https://prometheus.io/docs/prometheus/latest/querying/basics/

There are a lot of metrics you can define for Kafka. I will mention a few of them in this article.
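To ground the excerpt, a minimal prometheus.yml sketch might look like the following; the job name, broker host, and exporter port are assumptions (a common setup exposes broker JMX metrics through the Prometheus JMX exporter), not values from Murat's article:

```yaml
global:
  scrape_interval: 5s   # default is 1m; overridden here as in the excerpt

scrape_configs:
  - job_name: 'kafka'
    static_configs:
      - targets: ['kafka-broker-01:7071']   # hypothetical JMX exporter endpoint
```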

Read on for a breakdown of some of these measures.

Comments closed

Using Hints in SQL Server

Jared Poche is flirting with the dark side:

I work on hundreds of databases with the same schema. They have different data sets and distributions, different sizes, and their statistics are going to update at different times. But if one of them chooses a bad plan, I have to push aside whatever else I'm working on to research the high CPU on database xyz.

Consistency is really valuable to me. And in this case, the answer is simple. Yes, I want to scan the fast, small memory-optimized table variable first, and use it to filter the larger, slower table. Adding a join hint or a force order to this query should keep its plan and performance consistent.
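As a rough sketch of the shape Jared describes (all object names here are hypothetical), FORCE ORDER keeps the written join order so the small table variable always drives the join:

```sql
DECLARE @SmallIds dbo.IdList;   -- memory-optimized table type (hypothetical)

SELECT b.Id, b.Payload
FROM @SmallIds AS s
INNER JOIN dbo.BigTable AS b
    ON b.Id = s.Id
OPTION (FORCE ORDER, LOOP JOIN);   -- scan @SmallIds first, nested loops into BigTable
```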

Click through for a few examples of where query hints can be useful, but also where they can fail you in unexpected ways.

Comments closed

Using the Vertipaq Analyzer with DAX Studio

Gilbert Quevauvilliers walks us through the VertiPaq Analyzer in DAX Studio:

If you are looking to better understand your Power BI model (how big your tables are, which column is taking up the most space), then you can use VertiPaq Analyzer, which is built directly into the amazing DAX Studio.

If you are looking for a video on how to do this, there are some awesome videos found here.

SQLBI.COM – Introducing VertiPaq Analyzer in DAX Studio

Guy In a Cube – How do you even use VertiPaq Analyzer with Power BI???

The reason for my blog post is that I find sometimes I want to watch a video to better understand the content, and other times I want to follow a step-by-step process on how to complete the analysis.

Click through for a walkthrough of the process.

Comments closed

Deploy Reporting Services Projects with PowerShell

Aaron Nelson has a pair of new PowerShell cmdlets:

I built two new PowerShell commands to deploy SSRS projects, and they have finally been merged into the ReportingServicesTools module. The commands are Get-RsDeploymentConfig & Publish-RsProject. While the Write-RsFolderContent command did already exist, and is very useful, it does not support deploying the objects in your SSRS Project to multiple different folders in your report server. These two new commands can handle deployment to multiple folders.
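A hedged sketch of how the two might chain together (the project path, configuration name, and portal URL are my assumptions; check the module's help for the exact parameters):

```powershell
Import-Module ReportingServicesTools

# Read the target folder and data source settings stored in the SSRS
# project file for a given build configuration.
$config = Get-RsDeploymentConfig -RsProjectFile 'C:\Projects\Sales\Sales.rptproj' `
                                 -ConfigurationToUse 'Release'

# Point the deployment at a report portal and publish the project's
# reports, data sources, and datasets to the configured folders.
$config |
    Add-Member -PassThru -MemberType NoteProperty -Name ReportPortal -Value 'http://localhost/reports/' |
    Publish-RsProject
```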

Click through for details on each.

Comments closed

Fun with Function Rewrites

Erik Darling reminds me why I hate user-defined functions in SQL Server:

At 23 seconds, this is probably unacceptable. And this is on SQL Server 2019, too. The function inlining thing doesn’t quite help us, here.

One feature restriction is this, so we uh… Yeah.

The UDF does not contain aggregate functions being passed as parameters to a scalar UDF

But we’re probably good query tuners, and we know we can write inline functions.
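A minimal sketch of that rewrite pattern (the function and table names are invented for illustration): compute the aggregate in a derived table first, then apply an inline table-valued function, so no aggregate is ever passed as a parameter to a scalar UDF:

```sql
-- Inline table-valued function: the optimizer expands this into the
-- calling query instead of invoking it once per row.
CREATE OR ALTER FUNCTION dbo.FormatTotal_Inline (@Total BIGINT)
RETURNS TABLE
AS
RETURN
    SELECT FormattedTotal = CONCAT(N'Total: ', @Total);
GO

-- Aggregate first, then CROSS APPLY the inline function.
SELECT f.FormattedTotal
FROM
(
    SELECT SUM(oi.Quantity) AS Total
    FROM dbo.OrderItems AS oi
) AS s
CROSS APPLY dbo.FormatTotal_Inline(s.Total) AS f;
```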

Read the whole thing, as this is not always straightforward.

Comments closed

Migrating DATETIME Columns to DATETIMEOFFSET

Josh Darnell makes a change:

We have an application that uses datetime columns in a number of places. All of the users have always been in Eastern Time, but now we have a request to introduce users from a different time zone (Central Time) into the system. The lack of time zone information in our dates and times now presents a problem.

The system needs to communicate to users how long ago something occurred, or a time in the future that something needs to be done. If an Eastern Time user enters in a “follow up time” of today at 2:00 pm, a Central Time user could log in, see that, and end up being an hour late following up with their customer.
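A hedged sketch of the column migration itself (table and column names are invented; 'Eastern Standard Time' is the Windows time zone name, which accounts for daylight saving):

```sql
-- Add the new column alongside the existing datetime column.
ALTER TABLE dbo.FollowUps
    ADD FollowUpAt DATETIMEOFFSET(3) NULL;

-- AT TIME ZONE on a datetime interprets it as local time in that zone
-- and returns a datetimeoffset with the appropriate offset attached.
UPDATE dbo.FollowUps
SET FollowUpAt = FollowUpTime AT TIME ZONE 'Eastern Standard Time';
```

Times that fall inside a daylight saving transition are a known wrinkle with this kind of backfill.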

Click through for the process and several bugaboos you might run into. What we've done is to force all application times into UTC (as DATETIME or DATETIME2) and store each user's time zone preference in the application, translating from UTC to the relevant time zone at that level.
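In T-SQL terms, that UTC-first approach looks roughly like this (the table and the preference value are illustrative):

```sql
DECLARE @UserTimeZone SYSNAME = N'Central Standard Time';  -- from user preferences

SELECT e.EventTimeUtc,
       -- Stamp the stored value as UTC, then shift it into the user's zone.
       e.EventTimeUtc AT TIME ZONE 'UTC' AT TIME ZONE @UserTimeZone AS EventTimeLocal
FROM dbo.Events AS e;
```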

Comments closed

Optimizing Multiple CTEs

Itzik Ben-Gan continues a series on table expressions:

Last month I explained and demonstrated that CTEs get unnested, whereas temporary tables and table variables actually persist data. I provided recommendations in terms of when it makes sense to use CTEs versus when it makes sense to use temporary objects from a query performance standpoint. But there’s another important aspect of CTE optimization, or physical processing, to consider beyond the solution’s performance—how multiple references to the CTE from an outer query are handled. It’s important to realize that if you have an outer query with multiple references to the same CTE, each gets unnested separately. If you have nondeterministic calculations in the CTE’s inner query, those calculations can have different results in the different references.

Say for instance that you invoke the SYSDATETIME function in a CTE’s inner query, creating a result column called dt. Generally, assuming no change in the inputs, a built-in function is evaluated once per query and reference, irrespective of the number of rows involved. If you refer to the CTE only once from an outer query, but interact with the dt column multiple times, all references are supposed to represent the same function evaluation and return the same values. However, if you refer to the CTE multiple times in the outer query, be it with multiple subqueries referring to the CTE or a join between multiple instances of the same CTE (say aliased as C1 and C2), the references to C1.dt and C2.dt represent different evaluations of the underlying expression and could result in different values.
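A compact repro of that second case (nothing here beyond SYSDATETIME itself):

```sql
-- Each reference to C is unnested separately, so C1.dt and C2.dt are
-- independent evaluations of SYSDATETIME() and can differ.
WITH C AS
(
    SELECT SYSDATETIME() AS dt
)
SELECT C1.dt AS dt1, C2.dt AS dt2
FROM C AS C1
CROSS JOIN C AS C2;

-- Persisting the inner query evaluates it exactly once, so both
-- references see the same value.
SELECT SYSDATETIME() AS dt INTO #C;

SELECT C1.dt AS dt1, C2.dt AS dt2
FROM #C AS C1
CROSS JOIN #C AS C2;

DROP TABLE #C;
```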

Definitely worth the read.

Comments closed