Category: Query Tuning

Local Variables with TOP and ORDER BY

Published 2020-07-08 by Kevin Feasel

Erik Darling points out issues with using local variables. First up is with TOP:

In case you missed it for some reason, check out this post of mine about local variables. Though it’s hard to imagine how you missed it, since it’s the single most important blog post ever written, even outside of SQL Server. It might even be more important than SQL Server. Time will tell.
While live streaming recently about paging queries, I thought that it might make an interesting post to see what happens when you use variables in places other than the where clause.
After several seconds of thinking about it, I decided that TOP would be a good enough place to muck around.

After that is ORDER BY:

I see this kind of pattern a lot in paging queries where people are doing everything in their power to avoid writing dynamic SQL for some reason.
It’s almost as if an entire internet work of SQL Server knowledge and advice doesn’t exist when they’re writing these queries.
Quite something. Quite something indeed.

I’d call out Erik’s ORDER BY examples by saying “C’mon, nobody does that!” if I hadn’t actually seen people do that…

Comments closed

What MAXDOP Controls

Published 2020-07-08 by Kevin Feasel

Pedro Lopes gives us an explanation of what MAXDOP really does for us:

There are plenty of blogs on these topics, and the official documentation does a good job of explaining these (in my opinion). If you want to know more about the guidelines and ways to override for specific queries, refer to the Recommendations section in the Configure the max degree of parallelism Server Configuration Option documentation page.
But what does MAXDOP control? A common understanding is that it controls the number of CPUs that can be used by a query – previous revisions of the documentation used this abstraction. And while that is a correct abstraction, it’s not exactly accurate.

This is definitely a nice companion piece to Paul White’s article on how MAXDOP works.

Comments closed

How MAXDOP Works

Published 2020-07-07 by Kevin Feasel

Paul White lets us in on a secret:

The nuance to the question is in the phrase “running concurrently”. There is a limit to the number of characters one can use in a Twitter poll. My intention was for people to focus on the number of threads that could have a status of RUNNING at any one moment.
There are various definitions for that status in different DMVs, but essentially RUNNING means actively executing on a processor — not waiting for its turn on a CPU or for some other resource to become available.
My expectation was that most people would think that MAXDOP 4 would limit a parallel query to using 4 CPU cores at any one moment in time. Indeed, that was the second most popular answer to the poll.

But read on to understand why 4 isn’t the correct answer.

1 Comment

Key Lookups and Self-Joins

Published 2020-07-06 by Kevin Feasel

Erik Darling has an interesting method for eliminating key lookups:

This post isn’t going to go terribly deep into anything, but I do want to make a few things about them more clear, because I don’t usually see them mentioned anywhere.
1. Lookups are joins between two indexes on the same table
2. Lookups can only be done via nested loops joins
3. Lookups can’t be moved around in the execution plan
I don’t want you to think that every lookup is bad and needs to be fixed, but I do want you to understand some of the limitations around optimizing them.

Definitely worth the read.

Comments closed

Delayed Prefetch and Hidden Reads

Published 2020-06-29 by Kevin Feasel

Hugo Kornelis looks at when worlds collide:

So let’s check. The picture above shows, side by side, the properties of the Index Seek and the Key Lookup operator. They show that the Index Seek did 3 logical reads only, while Key Lookup did 650 logical reads. A clear indication where the majority of the work is done.
But wait. Aren’t we missing something?
The SET STATISTICS IO ON output indicates a total of 722 logical reads. The two screenshots above add up to 653 logical reads. Where are the other 69 logical reads?

Read on for the answer.

Comments closed

Optimizing Read Performance of Heaps

Published 2020-06-24 by Kevin Feasel

Uwe Ricken continues a series on heaps in SQL Server:

Heaps are not necessarily the developer’s favourite child, as they are not very performant, especially when it comes to selecting data (most people think so!). Certainly, there is something true about this opinion, but in the end, it is always the workload that decides it. In this article, I describe how a Heap works when data are selected. If you understand the process in SQL Server when reading data from a Heap, you can easily decide if a Heap is the best solution for your workload.

Uwe hits on a couple of the (few) use cases where heap performance can match and sometimes surpass clustered index performance.

Comments closed

Calculating Partitions for Processing Data Files in Apache Spark

Published 2020-06-23 by Kevin Feasel

Ajay Gupta digs into how to calculate the number of partitions the different Spark APIs use when reading from files:

Until recently, the process of picking up a certain number of partitions against a set of data files, always looked mysterious to me. However, recently, during an optimization routine, I wanted to change the default number of partitions picked by Spark for processing a set of data files, and that is when I started to decode this process comprehensively along with proofs. Hopefully, the description of this decoded process would also help the readers to understand Spark a bit deeper and would enable them to design an efficient and optimized Spark routine.

This is important information if you’re tuning Spark cluster performance.

Comments closed

Fun with Query Tuning in SQL Server 2019

Published 2020-06-22 by Kevin Feasel

Erik Darling has just wrapped up a nice series on tackling a problem which looks like parameter sniffing but isn’t. Part 2 covers the issue:

This isn’t always the exact case, but generally speaking you’ll observe something along these lines.
It’s definitely not the case for what we’re going to be looking at this week.
This week is far more interesting.
That’s why it’s a monstrosity.

Part three digs in:

It’s not parameter sniffing, but it sure could feel like it.
– When the procedure compiles and runs with VoteTypeId 5, it runs for 12 minutes
– Other VoteTypeIds run well with the same plan that VoteTypeId 5 gets
– When VoteTypeId 5 runs with a “small” plan, it does okay at 10 seconds

Part four gives us a solution without using OPTIMIZE FOR MEDIOCRE:

This is what happens when we optimize for unknown. The density vector guess is 13,049,400.
That guess for Vote Types with very few rows ends up with a plan that has very high startup costs.
This version of the query will run for 13-17 seconds for any given parameter. That sucks in zero gravity.

Part 5 looks into something which occasionally pops up with this query:

You see, there’s a mystery plan.
It only shows up once in a while, like Planet X. And when it does, we get bombarded by asteroids.
Just like when Planet X shows up.
I wouldn’t call it a good all-around plan, but it does something that we would want to happen when we run this proc for VoteTypeId 5.

Read on for an educational romp through the SQL Server 2019 optimizer.

Comments closed

Window Functions in Spark SQL

Published 2020-06-18 by Kevin Feasel

Juoko Virtanen walks us through window functions in Spark SQL:

When you think of windows in Spark you might think of Spark Streaming, but windows can be used on regular DataFrames. Window functions calculate an output value for every row of a DataFrame based on a group of rows. I have been working on optimizing some Spark code and have noticed a few places where the use of a window function eliminates the need for a join and speeds up the code. A common pattern where a window can be used to replace a join is when an aggregation is performed on a DataFrame and then the DataFrame resulting from the aggregation is joined to the original DataFrame. Let’s take a look at an example.

Read on for a few examples using the Scala flavor of Spark SQL.

Comments closed

Choosing an Algorithm for Table.Join in Power Query

Published 2020-06-15 by Kevin Feasel

Chris Webb continues a series on optimizing merge performance in Power Query:

The first thing to say is that if you don’t specify a join algorithm in the sixth parameter of Table.Join (it’s an optional parameter), Power Query will try to decide which algorithm to use based on some undocumented heuristics. The same thing also happens if you use JoinAlgorithm.Dynamic in the sixth parameter of Table.Join, or if you use the Table.NestedJoin function instead, which doesn’t allow you to explicitly specify an algorithm.
There are going to be some cases where you can get better performance by explicitly specifying a join algorithm instead of relying on JoinAlgorithm.Dynamic but you’ll have to do some thorough testing to prove it. From what I’ve seen there are lots of cases where explicitly setting the algorithm will result in worse performance, although there are enough cases where doing so results in better performance to make all that testing worthwhile.

That behavior is the same as if you decided to start writing INNER LOOP JOIN or INNER HASH JOIN for your queries. In the right spot, you may have knowledge the optimizer doesn’t have and can make performance faster, but a lot of the time the approach will be a bit too heavy-handed and end up a net degradation of performance.

Comments closed