Press "Enter" to skip to content

Category: Performance Tuning

The Story Behind Grant’s Execution Plan Book

Grant Fritchey shares a funny (in a Schadenfreude sense) look at what it took to get SQL Server Execution Plans, 3rd Edition out:

Four years ago, after a bunch of dithering and some negotiations with Tony Davis, my editor, I started to update my book, SQL Server Execution Plans. We managed to convince Hugo Kornelis to be the tech editor. I started to do the real writing in early 2015.

I was most of the way through a first draft and no one liked it. Tony was unhappy. Hugo was unhappy. I was unhappy. I was just trying to update the existing book, SQL Server Execution Plans. It wasn’t working.

We all came to the conclusion that the old book was wrong. Not simply in a technical sense, although there was a lot of that, but in a structural sense. So we started rearranging things. SQL Server 2014 came out, but I was ready for it, having been on a preview, so it was no big deal. We started a second draft.

The book Grant mentions is free in PDF format thanks to Red Gate, so go give it a download and enjoy the fruits of Grant’s labor.


A Look At Automatic Plan Tuning In SQL Server 2017

John Sterrett shows us how to review automatic plan tuning suggestions:

SQL Server 2017 Automatic Tuning looks for queries where execution plans change and performance regresses. This feature depends on Query Store being enabled. Note: even if you don’t turn on Automatic Tuning, you still get the benefit of having access to the data. That is right. Automatic Tuning will tell you what it would do if it were enabled. Think of this as free performance tuning training. Go look at the DMVs and try to understand why the optimizer would want to lock in an execution plan. We will actually go through a real-world example:

Click through for the example.  I wouldn’t automatically trust these automatic results, but my experience has been generally positive.
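If you want to see what the feature would recommend on your own system, a minimal sketch against sys.dm_db_tuning_recommendations (SQL Server 2017 or later, with Query Store enabled on the database) might look like this:

SELECT name,
	reason,                               -- why the recommendation was generated
	score,                                -- estimated impact, on a scale of 0 to 100
	JSON_VALUE(state, '$.currentValue')   AS current_state,
	JSON_VALUE(details, '$.implementationDetails.script') AS force_plan_script
FROM sys.dm_db_tuning_recommendations;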


Mining The Plan Cache, Query Store, And More

Erin Stellato shows the benefit of digging through the plan cache, Query Store, and third-party performance tool databases (using SentryOne’s SQL Sentry as an example):

As much as I love all this extra data, it’s important to note that some information is more relevant for an actual execution plan, versus an estimated one (e.g. tempdb spill information). Some days we can capture and use the actual plan for troubleshooting, other times we have to use the estimated plan. Very often we get that estimated plan – the plan that has potentially been used for problematic executions – from SQL Server’s plan cache. And pulling individual plans is appropriate when tuning a specific query or set of queries. But what about when you want ideas on where to focus your tuning efforts in terms of patterns?

The SQL Server plan cache is a prodigious source of information when it comes to performance tuning, and I don’t simply mean troubleshooting and trying to understand what’s been running in a system. In this case, I’m talking about mining information from the plans themselves, which are found in sys.dm_exec_query_plan, stored as XML in the query_plan column.

When you combine this data with information from sys.dm_exec_sql_text (so you can easily view the text of the query) and sys.dm_exec_query_stats (execution statistics), you can suddenly start to look for not just those queries that are the heavy hitters or execute most frequently, but those plans that contain a particular join type, or index scan, or those that have the highest cost. This is commonly referred to as mining the plan cache, and there are several blog posts that talk about how to do this. My colleague, Jonathan Kehayias, says he hates to write XML yet he has several posts with queries for mining the plan cache:

It’s a good article with a lot of useful information.
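To give a flavor of the technique, here is a minimal plan cache mining sketch that hunts for cached plans containing a clustered index scan; swap the operator name to search for other patterns:

WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (25)
	st.text AS query_text,
	qs.execution_count,
	qs.total_worker_time,
	qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('//RelOp[@PhysicalOp="Clustered Index Scan"]') = 1
ORDER BY qs.total_worker_time DESC;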


Power BI August Release And SSAS Performance Improvements

Chris Webb points out something new in the Power BI August 2018 release:

While I was playing around with the new release (August 2018) of Power BI Desktop, I noticed there was an undocumented change: similar to the OData improvements I blogged about here, there is a new option in the AnalysisServices.Database() and AnalysisServices.Databases() M functions that turns on a newer version of the MDX generation layer used by the Power Query engine. Like the OData improvements it is an option called Implementation="2.0", used like this:

AnalysisServices.Databases(
	"localhost", 
	[
		TypedMeasureColumns=true, 
		Implementation="2.0"
	]
)

…and also, as with the OData improvements, you will need to manually edit any existing M queries to take advantage of this.

Read on for Chris’s test and analysis of the resulting MDX output.


In Defense Of Inline Table-Valued Functions

Riley Major defends the honor of inline table-valued functions:

So no, user-defined functions are not the devil. Scalar user-defined functions can cause big problems if misused, but generally inline user-defined functions do not cause problems.

The real rule of thumb is not to avoid functions, but rather to avoid adorning your index fields with logic or functions. Because when you hide your intentions from the optimizer with complex syntax, you risk not getting the better performing index seek.

Riley shows an example where his inline table-valued UDF produced an execution plan just as efficient as the one for the query without the UDF.
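For illustration, here is a minimal sketch of the inline pattern, with dbo.Orders as a hypothetical table. Because the body is a single SELECT, the optimizer expands the function into the calling query's plan and can still seek on an index over OrderDate:

CREATE FUNCTION dbo.GetOrdersSince (@Since date)
RETURNS TABLE
AS
RETURN
(
	SELECT OrderID, CustomerID, OrderDate
	FROM dbo.Orders
	WHERE OrderDate >= @Since -- sargable: no function wrapped around the indexed column
);
GO

-- Called like a parameterized view; the plan matches the equivalent inline query.
SELECT o.OrderID, o.CustomerID
FROM dbo.GetOrdersSince('20180101') AS o;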


CXCONSUMER Not Entirely Harmless

Erik Darling points out a case where CXCONSUMER waits are important:

In this sample there are absolutely no waits whatsoever on CXPACKET.

They are nonexistent.

If you were hoping to find out that they were way crazy out of control (call a priest, ditch your bar tab, we’re driving on the train tracks), you’ll be awfully disappointed.

There just aren’t any.

There’s only one core in use for nearly the entire duration, aside from some blips.

That’s disappointing.  I was hoping to be able to ignore this wait altogether.
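If you want to see how the two waits stack up on your own instance, a quick check against sys.dm_os_wait_stats:

SELECT wait_type, waiting_tasks_count, wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'CXPACKET', N'CXCONSUMER')
ORDER BY wait_time_ms DESC;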


TANSTAAQRC (Query Result Cache)

Andy Mallon explains that a query result cache does not exist in SQL Server:

I was recently doing a training session when a developer commented that it was OK to run an expensive query twice because on the second execution, SQL Server would use the “results cache” and be “practically free”. It’s not the first time I’ve heard someone refer to a “results cache” in SQL Server. This is one of those myths that is almost true, which makes it that much more believable. If you don’t know better, you might think SQL Server has a “results cache” because the second execution of a query is often faster.

SQL Server does not have a “results cache” and the second execution is not “practically free.”
SQL Server does have a “buffer cache” and the second execution is “faster, but not free.”

The SQL Server buffer cache holds data pages in memory, in the exact form that they reside on disk. The second execution will not have to perform physical I/O operations to satisfy the query, because it can use the buffer cache. However, it does have to perform all other operations. Think of it like this: the second execution still executes the entire execution plan, including all the expensive operations. It is faster, but not “practically free.”

Read the comments for Erik Darling’s plot twist.
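If you want to see the buffer cache effect for yourself, here is a minimal sketch, with dbo.SomeBigTable standing in for any sufficiently large table of yours (DBCC DROPCLEANBUFFERS belongs on test systems only):

DBCC DROPCLEANBUFFERS; -- test systems only: empty the buffer cache
GO
SET STATISTICS IO, TIME ON;

SELECT COUNT(*) FROM dbo.SomeBigTable; -- physical reads plus logical reads
GO
SELECT COUNT(*) FROM dbo.SomeBigTable; -- logical reads only: faster, but the full plan still runs
GO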


When Wait Stats Aren’t Enough

Joe Obbish has an example of diagnosing performance problems when wait stats don’t indicate any problems:

In summary, page allocations and page free events rapidly occur, sometimes in an alternating pattern. SQL Server will often free a number of pages just to immediately request allocations for a similar number of pages. If all of the free page events result in returned memory to the OS, then the reason for the scalability bottleneck becomes clear. When running the full workload with 96 concurrent sessions, a total of 341965 page free operations were performed. Those events freed about 71.3 million pages in total. That amounts to about 584 GB of memory returned to the OS in total, based on the previous assumptions.

This is a great investigation into the depths of debugging in SQL Server.  Joe wasn’t able to get a definitive solution to his problem, but he showed us a lot along the way.
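If you want to watch the engine's memory from the OS perspective while a workload runs, one simple starting point is sys.dm_os_process_memory:

SELECT physical_memory_in_use_kb,
	locked_page_allocations_kb,
	memory_utilization_percentage
FROM sys.dm_os_process_memory;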


Nested Loops, Hash, Or Merge: Which Is Best?

Grant Fritchey dodges the important questions:

First response, also a joke, was the question at the title of this post:

What is the preferred operator when joining tables: Hash Match, Nested Loops or Merge?

While my immediate response to this question is, yes. Meaning, they’re all preferred, situationally. I decided to expand on that a bit.

I completely agree with Grant: there is no single best operator. If there were, database companies wouldn’t offer multiple options. That said, in an ideal world, all joins would be merge joins; in our fallen world, nested loops and hash matches are often the better alternatives.
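If you want to compare all three operators against the same query, join hints make for a quick experiment. Here is a sketch against hypothetical Orders and Customers tables; hints override the optimizer, so treat this as a learning exercise rather than production code:

SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER LOOP JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID; -- nested loops

SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER HASH JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID; -- hash match

SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
INNER MERGE JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID; -- merge join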


Parallelism Strategies For Grouping Operations

Itzik Ben-Gan continues his series on grouping data in SQL Server by looking at how these operations can go parallel:

Besides needing to choose between various grouping and aggregation strategies (preordered Stream Aggregate, Sort + Stream Aggregate, Hash Aggregate), SQL Server also needs to choose whether to go with a serial or a parallel plan. In fact, it can choose between multiple different parallelism strategies. SQL Server uses costing logic that results in optimization thresholds that under different conditions make one strategy preferred to the others. We’ve already discussed in depth the costing logic that SQL Server uses in serial plans in the previous parts of the series. In this section I’ll introduce a number of parallelism strategies that SQL Server can use for handling grouping and aggregation. Initially, I won’t get into the details of the costing logic, rather just describe the available options. Later in the article I’ll explain how the costing formulas work, and an important factor in those formulas called DOP for costing.

As you will later learn, SQL Server takes into account the number of logical CPUs in the machine in its costing formulas for parallel plans. In my examples, unless I say otherwise, I assume the target system has 8 logical CPUs. If you want to try out the examples that I’ll provide, in order to get the same plans and costing values as I do, you need to run the code on a machine with 8 logical CPUs as well. If your machine happens to have a different number of CPUs, you can emulate a machine with 8 CPUs, for costing purposes, like so:

DBCC OPTIMIZER_WHATIF(CPUs, 8);

Even though this tool is not officially documented and supported, it’s quite convenient for research and learning purposes.

This is a fairly long article, but a great one.
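If you want to experiment along with the article, here is a sketch of a simple grouped aggregation against a hypothetical dbo.Orders table, with the CPU emulation applied first (both the DBCC command and the hint are undocumented, so keep them on test systems):

DBCC OPTIMIZER_WHATIF(CPUs, 8); -- emulate 8 logical CPUs for costing

SELECT CustomerID, COUNT(*) AS order_count
FROM dbo.Orders
GROUP BY CustomerID
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE')); -- nudge the optimizer toward a parallel plan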
