Press "Enter" to skip to content

Category: Performance Tuning

Many-to-Many Power BI Relationships and Table Refreshes

Dany Hoter gives us a reason to minimize use of many-to-many relationships in Power BI:

I must admit that in the last two years I’ve told many Power BI/Kusto customers not to worry about relationships that are created as M:M.

I was pretty sure that with Direct Query, such relationships are fine,

Indeed, the generated queries looked fine and performed as expected.

I recently became aware that the number of queries generated for some visuals e.g. Matrix and tables can be affected by the type of relationships between the participating tables.

Read on for a description of why you shouldn’t load your Power BI semantic models with many-to-many relationships, especially once Kusto is involved.

Leave a Comment

Differentiating Physical and Logical Reads in SQL Server

Jose Manuel Jurado Diaz explains a concept:

In the realm of Azure SQL Database, query performance is a paramount concern for database administrators and developers alike. A critical aspect of this performance is understanding how SQL Server interacts with data, particularly through physical and logical reads. This article delves into these two fundamental concepts, providing insights into their impact on database performance and a practical lab to observe these metrics in action.

Read on for the difference, as well as a demonstration. With slow disks and insufficient RAM, it’s really important to know this difference. But as you have more RAM and move to formats like NVMe for storage, I’d argue that it’s less of an issue. The additional RAM, in particular, is important because the idea is that data access frequently will remain in the buffer pool for longer, so you’re more likely to see logical reads in action. Of course, poor indexing and bad decisions can ruin that idea, so don’t do that, okay?

Comments closed

Optimizing Shared Buffers in Postgres

Salman Ahmed explains how shared buffers work in Postgres:

PostgreSQL is known for its robustness and performance right out of the box. However, different applications and different scales of data demand fine-tuning of various parameters to achieve optimum performance. One such significant parameter is shared_buffers, which, when configured correctly, can notably enhance the performance of your PostgreSQL database.

Read on to see what shared buffers are, why they are so important to Postgres, and how to figure out the right value to use.

Comments closed

Postgres Performance Tuning via work_mem

Salman Ahmed explains what working memory is in Postgres and the effects of changing the work_mem value:

PostgreSQL, by default, is configured to run everywhere with minimum resource utilization. To achieve maximum performance under specific scenarios, PostgreSQL’s parameters can be tuned to enhance performance. One such parameter that can impact performance in PostgreSQL is work_mem.

In this blog we will discuss how work_mem can be used to optimize performance in PostgreSQL.

Click through for that discussion.

Comments closed

Caching: In-Database and External

Adron Hall talks caches:

All aboard the Data Express! Let’s imagine our database as this massive train station. The trains are packed with information – from passengers’ details to the schedules. Every time you want to know when the next train to DevOps Land is, you have to ask the station master (the database). If too many folks keep asking the same question, the station master will get tired, slowing down the whole operation. So, what do we do? Enter: Caching!

Read on for different caching mechanisms in several major relational databases, various reasons for external caches (like Redis and memcached) to exist, and four patterns for external caching. I’ve found that database people tend not to care much about external caches, leaving that to application developers. But there can be good reasons to store high-read, low-write data in caches, reducing some of the strain on those expensive database servers.

Comments closed

Reasons Your SQL Server Query Performance Fluctuates

Aaron Bertrand starts the count:

Query performance can fluctuate over time, and it is not necessarily due to a change to the query itself (or to the application code that calls it). Users often ask why a query suddenly got slower even though they haven’t published any changes to the application and the underlying data hasn’t changed drastically. This article points out some other reasons – and there are many – that a query might be slower today than it was 10 minutes ago, two weeks ago, or last summer.

Read on for a bulleted list of reasons. Of course, it would be extremely challenging to create a comprehensive list—for example, in the Same Plan section, in addition to there being more data, changes in the statistical distribution of data can cause performance profiles to change over time. But this is a really good starting point.

Comments closed

Cache Management and Semantic Link in Fabric Notebooks

Marc Lelijveld warms up the cache:

In the previous blog, I wrote about data temperature as part of Fabric when you’re using Direct Lake storage mode. In that blog, I explained how you can get insights in the temperature of a column, what that temperature means and what the impact of the temperature is on columns that are queried more often.

In this blog, I will continue this story by elaborating on a process called framing and how you can influence data eviction to drop data from memory. Finally, this blog goes into more details on how you could use Semantic Link in Fabric Notebooks to warm up the data for most optimal end-user performance.

The SQL Server analog here is having some automated queries which keep specific pages in the buffer pool, like a warm-up script for an instance with plenty of memory but slow disks.

Comments closed

The Risk of Changing MaxDOP

Erik Darling recommends caution:

Like in yesterday’s post about Cost Threshold For Parallelism, changing MAXDOP settings will have a universal effect on the workload.

This is true whether you change it at the server level for all databases, or at the database level using a database scoped configuration for a single database.

It is a guardrail to prevent unwanted conditions as a whole, like excessive concurrent parallel queries causing worker thread starvation (THREADPOOL waits), or just pushing CPU to 100% for extended periods of time.

Read on to see what Erik recommends you think about after any MaxDOP change.

Comments closed

Thoughts on Cost Threshold for Parallelism

Erik Darling has some thoughts:

First, I’m not suggesting that anyone should be using the default value for Cost Threshold For Parallelism. It’s old and moldy and not a good fit for most workloads functioning on modern hardware.

My apologies to Azure SQLDB users who can’t change this setting and leave it up to Microsoft to maybe manage it for them based on ???

Some people out there really like fiddling with settings in a usually ill-informed reaction to Some Script They Found On The Internet, without reading the fine print.

Erik’s thoughts are reasonable overall. My recommendation is to use Michael J. Swart’s technique for tuning cost threshold for parallelism as a starting point, as it gives you a basis for what the net effect of your changes are.

Comments closed