Press "Enter" to skip to content

Category: Performance Tuning

Result Set Caching in Microsoft Fabric Data Warehouse

Emily Tehrani makes an announcement:

Result Set Caching is now available in preview for Microsoft Fabric Data Warehouse and Lakehouse SQL analytics endpoint. This performance optimization works transparently to cache the results of eligible T-SQL queries. When the same query is issued again, it directly retrieves the stored result instead of recompiling and recomputing the original query. This operation drastically cuts execution time for complex queries. The cache is then automatically managed on the user's behalf. This lightweight performance boost is most beneficial for workloads, like reports, that issue many repetitive T-SQL queries to the DW and SQL analytics endpoint.

This is something I've wished we had on-premises for years and years, especially for data warehouses where you know the data only changes once every x hours or days. You can, of course, do this yourself with the cache-aside pattern and some caching solution, but that assumes you fully control a layer sitting between your end users and the data source.
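
If the Fabric preview follows the same convention as Synapse dedicated SQL pools, enabling the feature is a single database-scoped option. This is assumed syntax with a hypothetical warehouse name, so verify it against the current Fabric documentation:

    -- Assumed syntax, modeled on Synapse dedicated SQL pools; the warehouse
    -- name is hypothetical. Verify against the current Fabric docs.
    ALTER DATABASE [MyWarehouse] SET RESULT_SET_CACHING ON;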


SQL Server Performance Troubleshooting Script Updates

Erik Darling has been busy (plus raking in changes from contributors like he’s farming in an AFK game):

It’s been a busy few months working on scripts in my GitHub repo, including adding two new members to the family.

  • sp_IndexCleanup: Easily deduplicate indexes — scripts all the changes out for you!
  • sp_PerfCheck: A high-level review of performance-related settings and configurations!

Read on to see which other scripts have updates and where you can download all of Erik’s scripts.


Tuning SSIS Data Flow Buffers

Andy Brownsword speeds things up:

When using data flows in SSIS packages, we want the flow to be fast and fluid. What improves performance will vary from package to package, but one element can consistently help – tuning memory utilisation.

In this post we'll look at tuning memory utilisation by altering the number of rows passing through the flow concurrently. Specifically, we're looking at the following properties:

  • DefaultBufferSize
  • DefaultBufferMaxRows

Finding the right size, one that keeps data flowing smoothly without waiting around for buffers to fill, is a pretty big deal. The defaults are aggressively low in most cases. And good on Andy for providing a series of tests to give practical numbers.
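
As a back-of-the-envelope illustration with hypothetical numbers (not Andy's test values): the rows that fit in one buffer come out to the buffer size divided by the estimated row width, and the engine takes the lower of that figure and DefaultBufferMaxRows:

    -- Hypothetical sizing math: the default DefaultBufferSize is 10 MB and
    -- the default DefaultBufferMaxRows is 10,000. With a 400-byte row,
    -- roughly 26,214 rows fit in a buffer, so the 10,000-row cap binds first.
    SELECT (10 * 1024 * 1024) / 400 AS rows_per_buffer;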


The Challenges of Benchmarking

Tomas Vondra digs into some metrics:

I do a fair number of benchmarks, not only to validate patches, but also to find interesting (suspicious) stuff to improve. It’s an important part of my development workflow. And it’s fun 😉 But we’re dealing with complex systems (hardware, OS, DB, application), and that brings challenges. Every now and then I run into something that I don’t quite understand.

Consider a read-only pgbench, the simplest workload there is, with a single SELECT doing lookup by PK. If you do this with a small data set on any machine, the expectation is near linear scaling up to the number of cores. It's not perfect, as CPUs have frequency scaling and power management, but it should be close.

Click through for a quick look at some odd behavior, followed by a lot of interesting digging into the weeds trying to find answers.
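
For reference, the built-in select-only workload (pgbench -S) that Tomas describes boils down to a single primary key lookup:

    -- pgbench's select-only script: one PK lookup per transaction.
    -- :scale and :aid are pgbench client-side variables.
    \set aid random(1, 100000 * :scale)
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;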


Pain Points around tempdb

Kevin Hill has a list:

TempDB is the SQL Server equivalent of a junk drawer – everyone uses it, nobody monitors it, and eventually it becomes a bottleneck you can’t ignore.

Whether it’s poorly configured from the start or getting hammered by bad execution plans, TempDB often becomes the silent killer of performance. The good news? A few targeted changes can make a big impact.

Read on for some of tempdb’s greatest hits. Kevin also has a few quick tips for tempdb.
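
If you want to see how your own junk drawer is laid out before diving in, a quick query against tempdb's file metadata shows file counts, sizes, and growth settings (evenly sized data files being the usual guidance):

    -- Quick tempdb layout check. size is in 8 KB pages; growth is either
    -- a percentage or a page count, depending on is_percent_growth.
    SELECT name,
           type_desc,
           size * 8 / 1024 AS size_mb,
           growth,
           is_percent_growth
    FROM tempdb.sys.database_files;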


Estimating Query Percentiles in PostgreSQL

Michael Christofides makes an assertion:

I recently saw a feature request for pg_stat_statements to be able to track percentile performance of queries, for example the p95 (95th percentile) or p99 (99th percentile).

That would be fantastic, but isn’t yet possible. In the meantime, there is a statistically-dodgy-but-practically-useful (my speciality) way to approximate them using the mean and standard deviation columns in pg_stat_statements.

Click through for the code. Michael even covers the immediate objection I have (that the data isn't normally distributed, so you shouldn't use the Z score estimators that exist for the normal distribution). That said, if you're interested in "p99…ish" then this is a clever approach to take.
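
A minimal version of the idea, assuming PostgreSQL 13+ column names and (dubiously) normally distributed latencies, looks like this:

    -- Approximate p95 per statement; exec times are in milliseconds.
    -- 1.645 is the one-sided z-score for the 95th percentile, which only
    -- holds for a normal distribution, hence "p95-ish" rather than p95.
    SELECT queryid,
           calls,
           mean_exec_time,
           stddev_exec_time,
           mean_exec_time + 1.645 * stddev_exec_time AS approx_p95_ms
    FROM pg_stat_statements
    ORDER BY approx_p95_ms DESC
    LIMIT 20;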


SQL Server Performance Office Hours

Erik Darling is back with a new episode of office hours:

Do you know of any disadvantages of using a filtered index to filter out NULL values? We have a very heavy transactional table, like 10k trans/sec, with a clustered index and one non-clustered index. We don't have any queries that select rows with NULL values from this table. The DBA team said we should avoid using a filtered index, but without offering any proof. What do you think?

Click through for Erik's answers in video form. I was workshopping a joke about how all of the evidence Erik has of me being mean to him is fabricated, but couldn't make it work without riding the line of "Wait…is he serious?"
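
For context, the kind of index the questioner describes is simple enough to sketch, with hypothetical table and column names:

    -- Hypothetical example: index only the non-NULL rows, since no query
    -- asks for the NULLs. The WHERE clause keeps the index smaller and
    -- cheaper to maintain than indexing the full column.
    CREATE NONCLUSTERED INDEX ix_Orders_ShippedDate
        ON dbo.Orders (ShippedDate)
        WHERE ShippedDate IS NOT NULL;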


Working with Memory-Optimized tempdb Metadata

Haripriya Naidu deals with metadata contention:

This feature is specifically designed to reduce metadata contention. Note that adding data files will not resolve metadata contention, as that addresses a different type of contention.

You can learn more about enabling this feature and its benefits here.
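
The feature itself is one server-level switch (SQL Server 2019 and later), though it only takes effect after a restart:

    -- Enable memory-optimized tempdb metadata; requires a service restart.
    ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;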

A company I used to work for was a perfect candidate for this, except that the limitations meant we couldn't actually use it. We ended up switching some of our most frequently recurring temp tables and table-valued parameters to memory-optimized user-defined table types, which got us out of our metadata contention mess without using this feature.
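
For the curious, a memory-optimized user-defined table type looks like the following sketch. The names are hypothetical, the database needs a MEMORY_OPTIMIZED_DATA filegroup, and every memory-optimized type must declare at least one index:

    -- Hypothetical memory-optimized table type for passing lists of keys.
    CREATE TYPE dbo.OrderIdList AS TABLE
    (
        OrderId int NOT NULL,
        INDEX ix_OrderId NONCLUSTERED HASH (OrderId) WITH (BUCKET_COUNT = 1024)
    )
    WITH (MEMORY_OPTIMIZED = ON);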


Explaining Cost in Oracle

Kellyn Gorman knows the cost:

When you first begin your journey into Oracle performance tuning, you quickly discover that you're not just working with databases; you're deciphering a complex system of rules, statistics, and estimations. A post Connor McDonald made on LinkedIn reminded me of one of the most influential turning points in my career as a database professional: encountering Wolfgang Breitling's 2003 paper, A Look Under the Hood of the CBO: the 10053 Event. At a time when I was focused on tuning queries for speed, Breitling's work redefined how I thought about cost, and more importantly, how Oracle thinks about cost.

Click through for more information. Although SQL Server’s cost estimation process is different, the underlying reasoning and many of the mechanisms involved are the same.
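
If you want to watch the CBO show its work, the 10053 trace Breitling documented is still enabled the classic way:

    -- Dump the optimizer's costing decisions for statements hard-parsed in
    -- this session to a trace file, then turn the event back off.
    ALTER SESSION SET EVENTS '10053 trace name context forever, level 1';
    -- ... run (hard-parse) the statement of interest ...
    ALTER SESSION SET EVENTS '10053 trace name context off';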
