Press "Enter" to skip to content

Category: Performance Tuning

SQL Server Health Checks with SQLMonitor

Ajay Dwivedi performs a server health check:

Working as a DBA, I often get pulled into issues where application teams complain that “the database server is slow”.

This general statement “server is slow” requires an equally robust approach that can help us figure out if there is an issue with CPU, memory, I/O, or a regressed query. At this point, I prefer to use my SQLMonitor dashboard. This tool is entirely free and open source, and can be deployed on SQL Server 2014+ irrespective of environment or edition.

Read on to see how it works and check out the GitHub repo as well as a one-off script you can run if you don’t have SQLMonitor set up.
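If you don’t have SQLMonitor deployed yet, a quick first pass at the “is it CPU, memory, or I/O?” question can come from the instance’s cumulative wait statistics. This is just a minimal triage sketch, not the SQLMonitor script itself, and the excluded wait types are only a small, commonly-ignored subset:

```sql
-- Minimal triage sketch (not SQLMonitor): top cumulative waits since the last restart.
-- High signal_wait_sec relative to wait_time_sec hints at CPU pressure;
-- PAGEIOLATCH_* waits point at I/O; RESOURCE_SEMAPHORE points at memory grants.
SELECT TOP (10)
       wait_type,
       wait_time_ms / 1000.0        AS wait_time_sec,
       signal_wait_time_ms / 1000.0 AS signal_wait_sec,
       waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
                        N'XE_TIMER_EVENT', N'REQUEST_FOR_DEADLOCK_SEARCH',
                        N'CHECKPOINT_QUEUE', N'BROKER_TO_FLUSH', N'WAITFOR')
ORDER BY wait_time_ms DESC;
```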


Updates to Intelligent Query Processing in SQL Server 2022

Derek Wilson and Kate Smith dive into IQP updates:

SQL Server 2022 introduces a handful of new members to the growing Intelligent Query Processing (IQP) family. These additions range from expanding and improving various query performance feedback mechanisms to adding two new features—parameter sensitive plans (PSPs) and optimized plan forcing. In this blog, we give a general overview of these additions, with detailed feature-specific blogs to follow.

Read on to see what they have in store.
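Both of the new features ride on database compatibility level 160 and can be toggled per database. A hedged sketch (the database name is a placeholder; PSP optimization is on by default at compat level 160):

```sql
-- Assumes SQL Server 2022; both features require compatibility level 160.
ALTER DATABASE [YourDatabase] SET COMPATIBILITY_LEVEL = 160;
GO
USE [YourDatabase];
GO
-- Parameter Sensitive Plan optimization: on by default at compat 160,
-- but it can be switched off per database if it causes regressions.
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = OFF;
ALTER DATABASE SCOPED CONFIGURATION SET PARAMETER_SENSITIVE_PLAN_OPTIMIZATION = ON;

-- Optimized plan forcing speeds up compilation of Query Store-forced plans.
ALTER DATABASE SCOPED CONFIGURATION SET OPTIMIZED_PLAN_FORCING = ON;
```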


Inverted Indexes for Full-Text Search

Maria Zakourdaev twists some text inside-out:

Sometimes there are properties in the document with unstructured text, like newspaper articles, blog posts, or book abstracts. The inverted index is easy to build and is similar to data structures search engines use.

Such document structures can help in various complex search patterns, like common word detection, full-text searches, or document similarity searches, using Hamming distance or L2 distance algorithms. Inverted indexes are useful when the number of keywords is not too large and when the existing data is either totally immutable or rarely changed, but frequently searched.

This post and Maria’s MSSQLTips post both cover the high-level concept, focusing on tradeoffs between different data models. I like this sort of idea a lot and like telling people that sometimes, the right answer in a relational database involves thinking backwards.
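At its simplest, an inverted index is just a keyword-to-document mapping, which translates naturally to a relational model. A minimal sketch with made-up table names (the linked posts discuss other data models as well):

```sql
-- A minimal inverted index in relational form: one row per (keyword, document) pair.
CREATE TABLE dbo.Document
(
    DocumentId int IDENTITY PRIMARY KEY,
    Title      nvarchar(200) NOT NULL,
    Body       nvarchar(max) NOT NULL
);

CREATE TABLE dbo.InvertedIndex
(
    Keyword    nvarchar(100) NOT NULL,
    DocumentId int NOT NULL REFERENCES dbo.Document (DocumentId),
    CONSTRAINT PK_InvertedIndex PRIMARY KEY (Keyword, DocumentId)
);

-- Full-text-style search: documents containing both search terms.
SELECT d.DocumentId, d.Title
FROM dbo.Document AS d
JOIN dbo.InvertedIndex AS ii ON ii.DocumentId = d.DocumentId
WHERE ii.Keyword IN (N'sql', N'index')
GROUP BY d.DocumentId, d.Title
HAVING COUNT(DISTINCT ii.Keyword) = 2;  -- must match all search terms
```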


Sources of Query Plan Duplication

Eitan Blumin has a script for us:

Running the above script with the parameter @RCA set to 0 will output a list of the top query hashes (sorted based on their size in bytes so that you’d see the most impactful queries first) that have multiple different query plan hashes cached for them.

Running the script with the parameter @RCA set to 1 will output the same list, with additional columns that will help you do deeper root cause analysis.

Unlike the similar script provided by Brent, my script counts the number of query plan hashes rather than the query plan handles for each query hash. I found that this results in far fewer “false positives”. But you can still set the parameter @CountByPlanHandleInsteadOfPlanHash to 1 if you want it to count based on query plan handles instead.

Click through for the script, details on how it works, and various classes of reason why you might get to this outcome.
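To get a rough feel for what the script surfaces before you click through, you can group the plan cache by query hash and count distinct plan hashes. This is only a bare-bones approximation; Eitan’s script adds the size-based ordering, the @RCA columns, and the plan-handle counting option:

```sql
-- Rough sketch of the idea: query hashes with more than one distinct plan hash cached.
SELECT qs.query_hash,
       COUNT(DISTINCT qs.query_plan_hash) AS distinct_plan_hashes,
       COUNT(DISTINCT qs.plan_handle)     AS distinct_plan_handles,
       SUM(qs.execution_count)            AS total_executions
FROM sys.dm_exec_query_stats AS qs
GROUP BY qs.query_hash
HAVING COUNT(DISTINCT qs.query_plan_hash) > 1
ORDER BY distinct_plan_hashes DESC;
```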


A Critique of Parameter Sensitive Plan Optimization

Brent Ozar is not amused:

Parameter sniffing is a notorious problem for Microsoft SQL Server because it tries to reuse execution plans, which doesn’t work out well for widely varying parameters. Here’s a primer on the basics of how it happens.

SQL Server 2022 introduces a new feature called Parameter Sensitive Plan optimization. I’m not really sure why Microsoft capitalized the first three letters sometimes and not others, or why they only abbreviate it as “PSP optimization” – as long as we’re abbreviating, why not call it PSPO? Optimization is the word in there with the most syllables – why the hell would we leave that one as the one that isn’t abbreviated?

Because I make things go faster for a living, we’re gonna call it PSPO. You’ll see how I pronounce it later, and why.

Erik Darling has also spent a lot of time talking about this. And at this point, I don’t know how many (if any) of Brent’s critiques will get fixed before RTM.
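If you haven’t hit the underlying problem yourself, here is a minimal parameter sniffing sketch on a hypothetical table with skewed data; PSP optimization exists to let the optimizer keep separate plan variants for exactly this situation:

```sql
-- Hypothetical repro of parameter sniffing: the plan compiled for one parameter
-- value gets reused for a very different one. Assumes dbo.Orders exists and
-- CustomerId is heavily skewed.
CREATE OR ALTER PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId int
AS
BEGIN
    SELECT OrderId, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
END;
GO

EXEC dbo.GetOrdersByCustomer @CustomerId = 1;   -- e.g. 5 rows: a seek plan gets cached
EXEC dbo.GetOrdersByCustomer @CustomerId = 42;  -- e.g. 5 million rows: reuses the seek plan
-- Under PSP optimization (compat level 160), the optimizer can cache plan variants
-- for different parameter cardinality "buckets" instead of one shared plan.
```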


Z-Ordering with Apache Impala

Zoltan Borok-Nagy and Norbert Luksa show off a performance improvement in Apache Impala:

So we’ll have great search capabilities against the partition columns plus one data column (which drives the ordering in the data files). With our sample schema above, this means we could specify a SORT BY “platform” to enable fast analysis of all Android or iOS users. But what if we wanted to understand how well version 5.16 of our app is doing across platforms and countries?

Can we do more? It turns out that we can. There are exotic orderings out there that can also sort data by multiple columns. In this post, we will describe how Z-order allows ordering of multidimensional data (multiple columns) with the help of a space-filling curve. This ordering enables us to efficiently search against more columns. More on that later.

It looks like a really good technique for nearly-static data, sort of like you’d see with a data warehouse which refreshes once a day.
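In Impala this shows up as a SORT BY ZORDER clause on the table definition. A hedged sketch along the lines of the post’s app-analytics example (column names are invented):

```sql
-- Impala DDL sketch: Z-order the data files by several columns so that searches
-- on any of them can skip files effectively, not just the first sort column.
CREATE TABLE app_events (
  user_id     BIGINT,
  app_version STRING,
  country     STRING,
  event_ts    TIMESTAMP
)
PARTITIONED BY (platform STRING)
SORT BY ZORDER (app_version, country)
STORED AS PARQUET;
```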


Reducing GAM Contention with tempdb in SQL Server 2022

David Pless explains a performance benefit:

Over the past several SQL Server releases, Microsoft has improved the concurrency and performance of the tempdb database. In SQL Server 2022 we are addressing one of the last areas of contention by introducing concurrent global allocation map (GAM) and shared global allocation map (SGAM) updates, which will give SQL Server 2022 a big improvement in scalability, as tempdb is arguably the most important database in your environment.

Click through for the details. I can see this being a big performance improvement on really busy environments which make heavy use of temp tables.
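If you want to check whether an existing instance is actually hitting allocation-page contention in tempdb today, one hedged approach (SQL Server 2019 and later, where page_resource and these functions exist) is to crack the page behind current PAGELATCH waits:

```sql
-- Diagnostic sketch (SQL Server 2019+): sessions currently waiting on tempdb
-- allocation pages. GAM_PAGE / SGAM_PAGE / PFS_PAGE indicate allocation contention.
SELECT r.session_id,
       r.wait_type,
       pi.database_id,
       pi.file_id,
       pi.page_id,
       pi.page_type_desc
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.fn_PageResCracker(r.page_resource) AS prc
CROSS APPLY sys.dm_db_page_info(prc.db_id, prc.file_id, prc.page_id, 'DETAILED') AS pi
WHERE r.wait_type LIKE N'PAGELATCH%'
  AND prc.db_id = 2;  -- tempdb
```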


Don’t Store Files in the Database

Josh Darnell provides timeless advice:

As Deborah’s invite post suggests, this is a “that one time at that client” story. I was working at a consulting firm, and we had written an app for a particular client. Part of this application’s workflow involved users uploading images alongside some other information. These were not particularly large images in the grand scheme of things – they were taken by a microscope, and were a few kilobytes each, maybe.

However, this app had been in use for a long time. And as you might have guessed from the title of this post, each of these images was stored in a single table in the database that backed this application.

Yeah, that’ll be a problem… Read on for some recommendations on how to avoid the issue. One thing I would add is FileTable, which came out in SQL Server 2012. In that case, the files are actually stored on disk but are queryable via T-SQL. It introduces its own set of problems, but I do have some fond feelings about having used FileTable in the past.
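If FileTable sounds like the right tradeoff, the setup is roughly as follows. This is a hedged sketch: it assumes FILESTREAM is already enabled at the instance level, and the database, filegroup, and path names are placeholders.

```sql
-- Hedged FileTable sketch (SQL Server 2012+). Database, filegroup, and paths are
-- placeholders; FILESTREAM must already be enabled on the instance.
ALTER DATABASE AppDb
ADD FILEGROUP FileStreamFG CONTAINS FILESTREAM;

ALTER DATABASE AppDb
ADD FILE (NAME = N'AppDb_FS', FILENAME = N'D:\Data\AppDb_FS')
TO FILEGROUP FileStreamFG;

ALTER DATABASE AppDb
SET FILESTREAM (NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'AppDb');

-- The table itself: files live on disk under the FILESTREAM share but their
-- names, paths, sizes, and contents are queryable through T-SQL.
CREATE TABLE dbo.MicroscopeImages AS FILETABLE
WITH (FILETABLE_DIRECTORY = 'MicroscopeImages',
      FILETABLE_COLLATE_FILENAME = database_default);
```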


Filtered Statistics and Table Performance

Guy Glantser provides a use case for filtered statistics:

Let’s say you have a very large table on a SQL Server 2012 Standard Edition instance. This means: old cardinality estimator and no partitioning. The table has a DATETIME column, which is ever-increasing, and it contains 5 years of data (it has to, due to regulations). The auto-update statistics kicks in only every 4 days, more or less, even when trace flag 2371 is enabled. The problem is that users usually query the table for the last day, and only rarely need to access older data. Since auto-update statistics uses a very small sample rate for very large tables, the result is not very accurate. The bottom line of all this is that most of the time you get a poor execution plan, because the optimizer estimates very few rows, while in fact there are many rows. What can you do?

I’m not sure I’ve ever used filtered statistics, but it is good to know such a thing exists.
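For reference, the syntax is the standard CREATE STATISTICS with a WHERE clause. A hedged sketch on a hypothetical orders table, covering only the hot recent range so the histogram over “yesterday” stays accurate:

```sql
-- Hypothetical table and cutoff date: filtered statistics over just the recent
-- slice, so estimates for "last day" queries don't rely on a 5-year sample.
CREATE STATISTICS st_Orders_Recent
ON dbo.Orders (OrderDate)
WHERE OrderDate >= '20240101'
WITH FULLSCAN;

-- The filter predicate is static, so update (or periodically drop and re-create
-- with a newer cutoff) on a schedule.
UPDATE STATISTICS dbo.Orders st_Orders_Recent WITH FULLSCAN;
```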
