Replaying Workloads with WorkloadTools

Gianluca Sartori shows an example of using the WorkloadTools application to replay a workload, including where the analytics server cannot directly access the production database:

Regardless of the method that you decided to use, at the end of the replays, you will have two distinct sets of tables containing the workload analysis data, sitting in different schemas in the same database or in completely different databases.

WorkloadViewer will let you visualize performance over time, as we have seen for a single workload analysis, but this time it will be able to show you data from both workloads, so that you can compare them.

This sort of production load testing is both important and difficult; WorkloadTools makes it easier.

Troubleshooting Query Performance Changes

Erin Stellato walks us through a troubleshooting guide when users complain about poorly-performing queries:

This is tale of troubleshooting…

When you unexpectedly or intermittently encounter a change in query performance, it can be extremely frustrating and troublesome for a DBA or developer. If you’re not using Query Store, a third-party application, or your own method to capture query data, then you probably don’t have information about what query performance looked like when things were good…you just know how it’s running now. I was working with a customer of Tim’s last week that had been chasing this exact problem for, in their words, years. They had recently upgraded to SQL Server 2016, and the problem was still occurring.

Strangely, “blame the network” didn’t appear in Erin’s post, so I don’t know if it’s comprehensive.

What Compatibility Level 150 Gets You

Erik Darling explains the upsides and downsides of moving to SQL Server 2019 and compatibility level 150:

In those versions, flipping compatibility level uses the new Cardinality Estimator (CE). That new Cardinality Estimator is real hit or miss.

The worst part is that there’s practically no gain to be realized for using higher compatibility levels — that changes with SQL Server 2019.

Read on to see what those new features are. As far as the compatibility level switch goes, there comes a time when you just need to bite the bullet and use the new cardinality estimator. Erik has a few tips to help with that too.

Antivirus and SQL Server

Randolph West proffers advice should your IT team require installing antivirus software on a server with SQL Server running:

This is why it is documented that we should exclude SQL Server from any AV (anti-malware) detection products, so that it can get on with doing what it does best.

Yes, it’s formally documented. This is why we should read documentation when installing things. While it’s super-easy to click “Next,” “Next,” “Next,” that should not be the case with a complex product like SQL Server.

Read on for the list of exceptions you should add and processes to avoid scanning.

Diving Into Index Scans

Hugo Kornelis explains how index scans work in SQL Server:

The logic of the Index Scan operator itself is fairly simple, but the actual actions carried out can vary hugely depending on the type of index being scanned (as defined in the Storage and IndexKind properties). Most of this logic is carried out at the level of the storage engine. Since an understanding of this is important to get a proper understanding of the performance of this operator, the actual actions carried out at the level of the storage engine will be described on this page as well.

The current version of SQL Server (2017) supports four types of index storage. The Storage property distinguishes between RowStore, ColumnStore, and MemoryOptimized; for the latter type only IndexKind further differentiates this into NonClustered and NonClusteredHash.

Scans are an important part of the database engine and knowing how they work helps us understand when they’re the right choice for the job.

Parameter Sniffing in the Wild

Erik Darling is a parameter sniffing anthropologist:

A while back, I put together a pretty good rundown of this on the DBA Stack Exchange site.

In the plan cache, it’s really hard to tell if a query is suffering from parameter sniffing in isolation.

By that I mean, if someone sends you a cached plan that’s slow, how can you tell if it’s because of parameter sniffing?

Read on to see what Erik does to discover parameter sniffing problems.

Power BI Performance Analyzer

Marco Russo takes us through the Power BI Performance Analyzer:

The Power BI Performance Analyzer is a feature included in the May 2019 release of Power BI Desktop that simplifies the way you can collect the DAX queries generated by Power BI. You can use DAX Studio to capture them (as described in Capturing Power BI queries using DAX Studio), but the Performance Analyzer integrated in Power BI is simpler and provides a few insights about the time consumed in other activities, such as the rendering time of any visuals.

You can enable the Power BI Performance Analyzer by clicking the Performance Analyzer checkbox in the View ribbon of Power BI Desktop.

Read the whole thing.

The Cost of Ad Hoc Queries

Erin Stellato gives us a simple demonstration of why parameterization is important for performance:

From this screenshot you can see that we have about 3GB total dedicated to the plan cache, and of that 1.7GB is for the plans of over 158,000 adhoc queries. Of that 1.7GB, approximately 500MB is used for 125,000 plans that execute ONE time only. About 1GB of the plan cache is for prepared and procedure plans, and they only take up about 300MB worth of space. But note the average use count – well over 1 million for procedures. In looking at this output, I would categorize this workload as mixed – some parameterized queries, some adhoc.

Kimberly’s blog post discusses options for managing a plan cache filled with a lot of adhoc queries. Plan cache bloat is just one problem you have to contend with when you have an adhoc workload, and in this post I want to explore the effect it can have on CPU as a result of all the compilations that have to occur. When a query executes in SQL Server, it goes through compilation and optimization, and there is overhead associated with this process, which frequently manifests as CPU cost. Once a query plan is in cache, it can be re-used. Queries that are parameterized can end up re-using a plan that’s already in cache, because the query text is exactly the same. When an adhoc query executes it will only re-use the plan in cache if it has the exact same text and input value(s).

Read on to see an example of how long it takes a set of ad hoc queries to finish versus their parameterized equivalents. Erin’s test is at the behavioral extreme (100% parameterized versus 100% ad hoc) so real-world results won’t be quite this good.

Minimal Logging into Empty Clustered Indexes

Paul White explains how to perform minimal logging when using the INSERT..SELECT pattern to insert into an empty table with a clustered index:

The summary top row suggests that all inserts to an empty clustered index will be minimally logged as long as TABLOCK and ORDER hints are specified. The TABLOCK hint is required to enable the RowSetBulk facility as used for heap table bulk loads. An ORDER hint is required to ensure rows arrive at the Clustered Index Insert plan operator in target index key order. Without this guarantee, SQL Server might add index rows that are not sorted correctly, which would not be good.

Unlike other bulk loading methods, it is not possible to specify the required ORDER hint on an INSERT...SELECT statement. This hint is not the same as using an ORDER BY clause on the INSERT...SELECT statement. An ORDER BY clause on an INSERTonly guarantees the way any identity values are assigned, not row insert order.

Read on to see what you can do.

Blocked Queries and the Plan Cache

Erik Darling lays out an argument why execution time isn’t always a great measure for finding bad queries:

Most scripts (even ones I’ve worked on!) that look at the plan cache, have had the ability to sort it by different metrics: CPU, reads, writes, duration, etc.

A lot of people are very interested in long running queries — and I am too!

Heck, they’re how I make money. Blogging pays like crap.

But there’s a slight problem with only looking at query duration.

Read on for an example. Erik loses money on every blog post but he makes up for it in volume.


July 2019
« Jun