Category: Query Tuning

How MAXDOP Works

Published 2020-07-07 by Kevin Feasel

The nuance to the question is in the phrase “running concurrently”. There is a limit to the number of characters one can use in a Twitter poll. My intention was for people to focus on the number of threads that could have a status of RUNNING at any one moment.
There are various definitions for that status in different DMVs, but essentially RUNNING means actively executing on a processor — not waiting for its turn on a CPU or for some other resource to become available.
My expectation was that most people would think that MAXDOP 4 would limit a parallel query to using 4 CPU cores at any one moment in time. Indeed, that was the second most popular answer to the poll.

But read on to understand why 4 isn’t the correct answer.

1 Comment

Key Lookups and Self-Joins

Published 2020-07-06 by Kevin Feasel

Erik Darling has an interesting method for eliminating key lookups:

This post isn’t going to go terribly deep into anything, but I do want to make a few things about them more clear, because I don’t usually see them mentioned anywhere.
1. Lookups are joins between two indexes on the same table
2. Lookups can only be done via nested loops joins
3. Lookups can’t be moved around in the execution plan
I don’t want you to think that every lookup is bad and needs to be fixed, but I do want you to understand some of the limitations around optimizing them.

Definitely worth the read.

Comments closed

Delayed Prefetch and Hidden Reads

Published 2020-06-29 by Kevin Feasel

Hugo Kornelis looks at when worlds collide:

So let’s check. The picture above shows, side by side, the properties of the Index Seek and the Key Lookup operator. They show that the Index Seek did 3 logical reads only, while Key Lookup did 650 logical reads. A clear indication where the majority of the work is done.
But wait. Aren’t we missing something?
The SET STATISTICS IO ON output indicates a total of 722 logical reads. The two screenshots above add up to 653 logical reads. Where are the other 69 logical reads?

Read on for the answer.

Comments closed

Optimizing Read Performance of Heaps

Published 2020-06-24 by Kevin Feasel

Uwe Ricken continues a series on heaps in SQL Server:

Heaps are not necessarily the developer’s favourite child, as they are not very performant, especially when it comes to selecting data (most people think so!). Certainly, there is something true about this opinion, but in the end, it is always the workload that decides it. In this article, I describe how a Heap works when data are selected. If you understand the process in SQL Server when reading data from a Heap, you can easily decide if a Heap is the best solution for your workload.

Uwe hits on a couple of the (few) use cases where heap performance can match and sometimes surpass clustered index performance.

Comments closed

Calculating Partitions for Processing Data Files in Apache Spark

Published 2020-06-23 by Kevin Feasel

Ajay Gupta digs into how to calculate the number of partitions the different Spark APIs use when reading from files:

Until recently, the process of picking up a certain number of partitions against a set of data files, always looked mysterious to me. However, recently, during an optimization routine, I wanted to change the default number of partitions picked by Spark for processing a set of data files, and that is when I started to decode this process comprehensively along with proofs. Hopefully, the description of this decoded process would also help the readers to understand Spark a bit deeper and would enable them to design an efficient and optimized Spark routine.

This is important information if you’re tuning Spark cluster performance.

Comments closed

Fun with Query Tuning in SQL Server 2019

Published 2020-06-22 by Kevin Feasel

Erik Darling has just wrapped up a nice series on tackling a problem which looks like parameter sniffing but isn’t. Part 2 covers the issue:

This isn’t always the exact case, but generally speaking you’ll observe something along these lines.
It’s definitely not the case for what we’re going to be looking at this week.
This week is far more interesting.
That’s why it’s a monstrosity.

Part three digs in:

It’s not parameter sniffing, but it sure could feel like it.
– When the procedure compiles and runs with VoteTypeId 5, it runs for 12 minutes
– Other VoteTypeIds run well with the same plan that VoteTypeId 5 gets
– When VoteTypeId 5 runs with a “small” plan, it does okay at 10 seconds

Part four gives us a solution without using OPTIMIZE FOR MEDIOCRE:

This is what happens when we optimize for unknown. The density vector guess is 13,049,400.
That guess for Vote Types with very few rows ends up with a plan that has very high startup costs.
This version of the query will run for 13-17 seconds for any given parameter. That sucks in zero gravity.

Part 5 looks into something which occasionally pops up with this query:

You see, there’s a mystery plan.
It only shows up once in a while, like Planet X. And when it does, we get bombarded by asteroids.
Just like when Planet X shows up.
I wouldn’t call it a good all-around plan, but it does something that we would want to happen when we run this proc for VoteTypeId 5.

Read on for an educational romp through the SQL Server 2019 optimizer.

Comments closed

Window Functions in Spark SQL

Published 2020-06-18 by Kevin Feasel

Juoko Virtanen walks us through window functions in Spark SQL:

When you think of windows in Spark you might think of Spark Streaming, but windows can be used on regular DataFrames. Window functions calculate an output value for every row of a DataFrame based on a group of rows. I have been working on optimizing some Spark code and have noticed a few places where the use of a window function eliminates the need for a join and speeds up the code. A common pattern where a window can be used to replace a join is when an aggregation is performed on a DataFrame and then the DataFrame resulting from the aggregation is joined to the original DataFrame. Let’s take a look at an example.

Read on for a few examples using the Scala flavor of Spark SQL.

Comments closed

Choosing an Algorithm for Table.Join in Power Query

Published 2020-06-15 by Kevin Feasel

Chris Webb continues a series on optimizing merge performance in Power Query:

The first thing to say is that if you don’t specify a join algorithm in the sixth parameter of Table.Join (it’s an optional parameter), Power Query will try to decide which algorithm to use based on some undocumented heuristics. The same thing also happens if you use JoinAlgorithm.Dynamic in the sixth parameter of Table.Join, or if you use the Table.NestedJoin function instead, which doesn’t allow you to explicitly specify an algorithm.
There are going to be some cases where you can get better performance by explicitly specifying a join algorithm instead of relying on JoinAlgorithm.Dynamic but you’ll have to do some thorough testing to prove it. From what I’ve seen there are lots of cases where explicitly setting the algorithm will result in worse performance, although there are enough cases where doing so results in better performance to make all that testing worthwhile.

That behavior is the same as if you decided to start writing INNER LOOP JOIN or INNER HASH JOIN for your queries. In the right spot, you may have knowledge the optimizer doesn’t have and can make performance faster, but a lot of the time the approach will be a bit too heavy-handed and end up a net degradation of performance.

Comments closed

One-Column Fusion with DAX

Published 2020-06-12 by Kevin Feasel

Phil Seamark has a performance tuning tip for DAX:

I wrote an article about an optimisation called DAX Fusion that attempts to fuse similar SE calls when it can. This article highlights an elegant DAX-based trick that works for a specific scenario by reducing the number of SE calls that doesn’t rely on DAX fusion. The difference between 1-Column fusion and the other is:
– DAX fusion in the engine works across multiple columns that have the same effective WHERE clause
– 1-Column fusion works by fusing multiple measures that all reference a single column, but with different WHERE clauses

Read on to learn how to switch around a bit of DAX to reduce the number of storage engine calls, as well as an example of one scenario in which it can come in handy.

Comments closed

The SQL Server Optimizer and Data Types

Published 2020-06-12 by Kevin Feasel

Erik Darling turns on the logic machine:

Let me ask you a question: If I told you that all the numbers in an integer column were either:
1. > 0
2. >= 1
You’d probably agree that the lower number you could possible see is 1.
And that’s exactly the case with the Reputation column in Stack Overflow.

But as Erik points out, the optimizer doesn’t always see those as the same. Read on for an example.

Comments closed