Press "Enter" to skip to content

Category: Performance Tuning

dataConvergeDefinition and DirectQuery Partitions

Chris Webb talks about hybrid tables:

Hybrid tables – tables which contain both Import mode and DirectQuery mode partitions to hold data from different time periods – have been around for a while. They are useful in cases where your historic data doesn’t change but your most recent data changes very frequently and you need to reflect those changes in your reports; you can also have “reverse hybrid tables” where the latest data is in Import mode but your historic data (which may not be queried often but still needs to be available) is in DirectQuery mode. Up to now they had a problem though: even when you were querying data that was in the Import mode partition, Power BI still sent a SQL query to the DirectQuery partition and that could hurt performance. That problem is now solved with the new dataCoverageDefinition property on the DirectQuery partition.

Read on to see what dataCoverageDefinition does.

Comments closed

Improving Aurora Postgres Performance by Lowering random_page_cost

Shayon Mukherjee does some performance tuning:

Recently I have been working with some queries in Postgres where I noticed either it has decided not to use an index and perform a sequential scan, or it decided to use an alternative index over a composite partial index. This was quite puzzling, especially when you know there are indexes in the system that can perform these queries faster.

So what gives?

After some research, I stumbled upon random_page_cost (ref).

Read on to learn more about what this is, how you can change it, and a warning before you go wild in changing it.

Comments closed

Performance Costs of using Calculated Columns in Power BI Composite Models

Chris Webb share a warning:

I don’t have anything against the use of calculated columns in Power BI semantic models in general but you do need to be careful using them with DirectQuery mode. In particular when you have a DirectQuery connection to another Power BI semantic model – also known as a composite model on a Power BI semantic model – it’s very easy to cause serious performance problems with calculated columns. Let’s see a simple example of why this is.

Read on for Chris’s example.

Comments closed

Blocking and Waiting in Power BI Import Mode Refreshes

Chris Webb has some ‘splainin to do:

Following on from my previous post showing how you can visualise the job graph for a Power BI Import mode semantic model refresh, I this post I will look at how you can interpret what the job graph tells you – specifically, explaining the concepts of blocking and waiting. As always, simple examples are the best way of doing this.

Click through for the explanation using a job graph.

Comments closed

Thinking about Scale Up-Front

Andy Brownsword shares a warning:

A point of sale system being rolled out across hundreds of physical locations. Transaction data collected each night to be batch processed into a warehouse for usual types of analysis. Our integration preference was SSIS internally. A solution was deployed in preparation.

Rolling out of the new system started with a handful of locations which steadily increased as confidence grew. On the back of this the data hitting our solution was increasing too. With a trickle of data early on there were no issues as expected. A small volume of data from a small number of stores. The process flew. We left it doing it’s thing.

Read on to see the story take a darker turn and the importance of planning for scale.

Comments closed

Many-to-Many Power BI Relationships and Table Refreshes

Dany Hoter gives us a reason to minimize use of many-to-many relationships in Power BI:

I must admit that in the last two years I’ve told many Power BI/Kusto customers not to worry about relationships that are created as M:M.

I was pretty sure that with Direct Query, such relationships are fine,

Indeed, the generated queries looked fine and performed as expected.

I recently became aware that the number of queries generated for some visuals e.g. Matrix and tables can be affected by the type of relationships between the participating tables.

Read on for a description of why you shouldn’t load your Power BI semantic models with many-to-many relationships, especially once Kusto is involved.

Comments closed

Differentiating Physical and Logical Reads in SQL Server

Jose Manuel Jurado Diaz explains a concept:

In the realm of Azure SQL Database, query performance is a paramount concern for database administrators and developers alike. A critical aspect of this performance is understanding how SQL Server interacts with data, particularly through physical and logical reads. This article delves into these two fundamental concepts, providing insights into their impact on database performance and a practical lab to observe these metrics in action.

Read on for the difference, as well as a demonstration. With slow disks and insufficient RAM, it’s really important to know this difference. But as you have more RAM and move to formats like NVMe for storage, I’d argue that it’s less of an issue. The additional RAM, in particular, is important because the idea is that data access frequently will remain in the buffer pool for longer, so you’re more likely to see logical reads in action. Of course, poor indexing and bad decisions can ruin that idea, so don’t do that, okay?

Comments closed

Optimizing Shared Buffers in Postgres

Salman Ahmed explains how shared buffers work in Postgres:

PostgreSQL is known for its robustness and performance right out of the box. However, different applications and different scales of data demand fine-tuning of various parameters to achieve optimum performance. One such significant parameter is shared_buffers, which, when configured correctly, can notably enhance the performance of your PostgreSQL database.

Read on to see what shared buffers are, why they are so important to Postgres, and how to figure out the right value to use.

Comments closed