Press "Enter" to skip to content

Category: Performance Tuning

Measuring Query Times in Power BI DirectQuery Mode

Chris Webb breaks out the stopwatch:

If you’re tuning a DirectQuery semantic model in Power BI one of the most important things you need to measure is the total amount of time spent querying your data source(s). Now that the queries Power BI generates to get data from your source can be run in parallel it means you can’t just sum up the durations of the individual queries sent to get the end-to-end duration. The good news is that there are new traces event available in Log Analytics (though not in Profiler at the time of writing) which solves this problem.

Read on to learn more about this event.

Comments closed

Measuring Write Speeds in SQL Server

Vlad Drumea performs a test:

In this post I cover a script I’ve put together for measuring storage write speeds in SQL Server, namely against database data files.

This is meant to help get an idea of how the underlying storage performs when SQL Server is writing 1GB of data to a database.

At this point, you might be asking yourself: “Why not use CrystalDiskMark instead?”.
The answer is simple: you might not always be able to install/run additional software in an environment. Even more so if you work with external customers or you’re a consultant. It’s a lot simpler to ask a customer to run a script and send you the output, than it is to ask them to install and run some 3rd party software.

Click through for the script, what it does, and how to run it, as well as a note on limitations and example based on three drives.

Comments closed

dataConvergeDefinition and DirectQuery Partitions

Chris Webb talks about hybrid tables:

Hybrid tables – tables which contain both Import mode and DirectQuery mode partitions to hold data from different time periods – have been around for a while. They are useful in cases where your historic data doesn’t change but your most recent data changes very frequently and you need to reflect those changes in your reports; you can also have “reverse hybrid tables” where the latest data is in Import mode but your historic data (which may not be queried often but still needs to be available) is in DirectQuery mode. Up to now they had a problem though: even when you were querying data that was in the Import mode partition, Power BI still sent a SQL query to the DirectQuery partition and that could hurt performance. That problem is now solved with the new dataCoverageDefinition property on the DirectQuery partition.

Read on to see what dataCoverageDefinition does.

Comments closed

Improving Aurora Postgres Performance by Lowering random_page_cost

Shayon Mukherjee does some performance tuning:

Recently I have been working with some queries in Postgres where I noticed either it has decided not to use an index and perform a sequential scan, or it decided to use an alternative index over a composite partial index. This was quite puzzling, especially when you know there are indexes in the system that can perform these queries faster.

So what gives?

After some research, I stumbled upon random_page_cost (ref).

Read on to learn more about what this is, how you can change it, and a warning before you go wild in changing it.

Comments closed

Performance Costs of using Calculated Columns in Power BI Composite Models

Chris Webb share a warning:

I don’t have anything against the use of calculated columns in Power BI semantic models in general but you do need to be careful using them with DirectQuery mode. In particular when you have a DirectQuery connection to another Power BI semantic model – also known as a composite model on a Power BI semantic model – it’s very easy to cause serious performance problems with calculated columns. Let’s see a simple example of why this is.

Read on for Chris’s example.

Comments closed

Blocking and Waiting in Power BI Import Mode Refreshes

Chris Webb has some ‘splainin to do:

Following on from my previous post showing how you can visualise the job graph for a Power BI Import mode semantic model refresh, I this post I will look at how you can interpret what the job graph tells you – specifically, explaining the concepts of blocking and waiting. As always, simple examples are the best way of doing this.

Click through for the explanation using a job graph.

Comments closed

Thinking about Scale Up-Front

Andy Brownsword shares a warning:

A point of sale system being rolled out across hundreds of physical locations. Transaction data collected each night to be batch processed into a warehouse for usual types of analysis. Our integration preference was SSIS internally. A solution was deployed in preparation.

Rolling out of the new system started with a handful of locations which steadily increased as confidence grew. On the back of this the data hitting our solution was increasing too. With a trickle of data early on there were no issues as expected. A small volume of data from a small number of stores. The process flew. We left it doing it’s thing.

Read on to see the story take a darker turn and the importance of planning for scale.

Comments closed

Many-to-Many Power BI Relationships and Table Refreshes

Dany Hoter gives us a reason to minimize use of many-to-many relationships in Power BI:

I must admit that in the last two years I’ve told many Power BI/Kusto customers not to worry about relationships that are created as M:M.

I was pretty sure that with Direct Query, such relationships are fine,

Indeed, the generated queries looked fine and performed as expected.

I recently became aware that the number of queries generated for some visuals e.g. Matrix and tables can be affected by the type of relationships between the participating tables.

Read on for a description of why you shouldn’t load your Power BI semantic models with many-to-many relationships, especially once Kusto is involved.

Comments closed