Press "Enter" to skip to content

Category: Query Tuning

Fun with Multiple Indexes

Erik Darling makes a fairly rare multi-index sighting:

Notice! Both of our nonclustered indexes get used, and without a lookup are joined together.

Because the predicates are of the inequality variety, the join columns are not ordered after each seek. That makes a hash join the most likely join type.

I’ve always had this belief that there are probably more cases in which multi-index solutions are useful than the SQL Server optimizer gives us. This belief may be irrational.

Comments closed

Common Query Plan Patterns for Joins

Erik Darling has a new mini-series. First up is degrees of parallelism and bitmaps:

But wait! This query runs at DOP 4. You can tell by looking at the number of executions.

Who runs queries at DOP 4?

Fools.

This DOP goes up to 11.

Next, Erik versus equality predicates:

But not every query has an available equality join predicate. For various reasons, like poor normalization choices, or just the requirements of the query, you may run into things like I’m about to show you.

If there’s a headline point to this post, it’s that joins that don’t have an equality predicate in them only have one join choice: Nested Loops. You cannot have a Hash or Merge join with this type of query. Even with the best possible indexes, some of these query patterns just never seem to pan out performance-wise.

Read on for the implications of this, as well as techniques to improve performance.

1 Comment

Semi-Join Plan Weirdness

Erik Darling has an interesting scenario for us:

This post isn’t meant to dissuade you from using EXISTS or NOT EXISTS when writing queries. In fact, most of the time I think they make a lot of sense.

But weird things can happen along the way, especially if you don’t have supporting indexes, or if supporting indexes aren’t chosen by the optimizer for various reasons.

In this post, I’m going to show you a query plan pattern that can occur in semi-join plans, and what you can do about it.

Click through for the problem and the solution. Me? I don’t like semi-joins on principle. Either join or don’t join; give me none of these cowardly half-measures. I’m not sure what to think about anti-semi-joins because I’m apparently anti semi-join for the purposes of this belabored joke, but I’m a bit suspicious of them as well.

Comments closed

The Pain of OR Clauses

Erik Darling wants you to embrace the healing power of AND:

This is one of my least favorite query patterns, because even with appropriate indexes, performance often isn’t very good without additional interventions.

Without indexes in place, or when “indexes aren’t used”, then the query plans will often look like one of these.

Maybe not always, but there are pretty common.

It’s something that I do wish the optimizer could be smarter about. One important thing to note in Erik’s demo: the OR clause is on two different columns, so SELECT x.Col1 FROM dbo.TblX x WHERE x.ID = 8 OR x.ID = 7 works fine, but WHERE x.ID = 8 OR x.SomethingElse = 14 is liable to cause performance issues on a large enough table.

Comments closed

Temporal Table Data Retrieval Execution Plans

Hugo Kornelis digs into execution plans when retrieving data from temporal tables:

The query above will simply grab the requested data from the Products table as if it were a normal table. The corresponding history table is not used at all. To understand why this happens it is important to recall that system-versioned temporal tables always store the currently valid version of the row in the table itself; all older, no longer valid versions are stored in the accompanying history table. So the query above, which does not use any specific temporal logic, means “I don’t care about the history, I want the data as it is now”. Due to how temporal tables are designed, SQL Server only has to query the actual table for this.

Since there is nothing special or interesting about this, let’s move on to queries that do use special logic in the query to retrieve older versions of the data.

Click through for quite a bit more detail on what it looks like for less-regular queries.

Comments closed

Indexing for Physical Join Operators

Deepthi Goguri continues a series on physical join operators:

In the Part1 of decoding the physical join operators, we learned about the different types of physical operators: Nested loops, Merge joins and Hash joins. We have seen when they are useful and how to take advantage of each for the performance of our queries. We have also seen when they are useful and when they needs to be avoided.

In this part, we will know more about these operators and how the indexes really help these operator to perform better so the queries can execute faster.

Read on to see how to define indexes for each of the three physical operators.

Comments closed

Columnstore, Strings, and Windowing Functions

Erik Darling has a tale to tell:

The only columns that we were really selecting from the Comments table were UserId and CreationDate, which are an integer and a datetime.

Those are relatively easy columns to deal with, both from the perspective of reading and sorting.

In order to show you how column selection can muck things up, we need to create a more appropriate column store index, add columns to the select list, and use a where clause to  restrict the number of rows we’re sorting. Otherwise, we’ll get a 16GB memory grant for every query.

Read on to see how one little (or, well, big) string column can foul up the whole works.

Comments closed

Query Plans and Window Functions

Erik Darling has a two-fer here. First, window functions and parallelism:

When windowing functions don’t have a Partition By, the parallel zone ends much earlier on than it does with one.

That doesn’t mean it’s always slower, though. My general experience is the opposite, unless you have a good supporting index.

But “good supporting index” is for tomorrow. You’re just going to have to deal with that.

Second, columnstore behavior with respect to window functions:

Not only is the parallel version of the row mode plan a full second slower, but… look at that batch mode plan.

Look at it real close. There’s a sort before the Window Aggregate, despite reading from the same nonclustered index that the row mode plan uses.

But the row mode plan doesn’t have a Sort in it. Why?

Check out both posts.

Comments closed

Parameterization with TOP PERCENT

Erik Darling has a bone to pick with TOP PERCENT:

There was a three-part series of posts where I talked about a weird performance issue you can hit with parameterized top. While doing some query tuning for a client recently, I ran across a funny scenario where they were using TOP PERCENT to control the number of rows coming back from queries.

With a parameter.

So uh. Let’s talk about that.

Read on to see the mess of an execution plan TOP PERCENT creates and a more complex query which performs considerably better under the circumstances.

Comments closed

Performance Gains with APPLY

Erik Darling gives us a scenario where OUTER APPLY is quite useful:

In all, this query will run for about 18 seconds. The majority of it is spent in a bad neighborhood.

Why does this suck? Boy oh boy. Where do we start?

– Sorting the Votes table to support a Merge Join?
– Choosing Parallel Merge Joins ever?
– Choosing a Many To Many Merge Join ever?
– All of the above?

I voted “all of the above.” Click through to see how Erik turns a bad query plan into a much less bad query plan.

Comments closed