Press "Enter" to skip to content

Category: Indexing

The Pain of OR Clauses

Erik Darling wants you to embrace the healing power of AND:

This is one of my least favorite query patterns, because even with appropriate indexes, performance often isn’t very good without additional interventions.

Without indexes in place, or when “indexes aren’t used”, then the query plans will often look like one of these.

Maybe not always, but there are pretty common.

It’s something that I do wish the optimizer could be smarter about. One important thing to note in Erik’s demo: the OR clause is on two different columns, so SELECT x.Col1 FROM dbo.TblX x WHERE x.ID = 8 OR x.ID = 7 works fine, but WHERE x.ID = 8 OR x.SomethingElse = 14 is liable to cause performance issues on a large enough table.

Comments closed

Indexing for Physical Join Operators

Deepthi Goguri continues a series on physical join operators:

In the Part1 of decoding the physical join operators, we learned about the different types of physical operators: Nested loops, Merge joins and Hash joins. We have seen when they are useful and how to take advantage of each for the performance of our queries. We have also seen when they are useful and when they needs to be avoided.

In this part, we will know more about these operators and how the indexes really help these operator to perform better so the queries can execute faster.

Read on to see how to define indexes for each of the three physical operators.

Comments closed

Query Plans and Window Functions

Erik Darling has a two-fer here. First, window functions and parallelism:

When windowing functions don’t have a Partition By, the parallel zone ends much earlier on than it does with one.

That doesn’t mean it’s always slower, though. My general experience is the opposite, unless you have a good supporting index.

But “good supporting index” is for tomorrow. You’re just going to have to deal with that.

Second, columnstore behavior with respect to window functions:

Not only is the parallel version of the row mode plan a full second slower, but… look at that batch mode plan.

Look at it real close. There’s a sort before the Window Aggregate, despite reading from the same nonclustered index that the row mode plan uses.

But the row mode plan doesn’t have a Sort in it. Why?

Check out both posts.

Comments closed

So You Want to Index

Erik Darling has an indexing strategy for querulous normies:

Most queries will have a where clause. I’ve seen plenty that don’t. Some of’em have surprised the people who developed them far more than they surprised me.

But let’s start there, because it’s a pretty important factor in how you design your indexes. There are all sorts of things that indexes can help, but the first thing we want indexes to do in general is help us locate data.

None of this is groundbreaking but Erik does a really good job of laying out the order in which you want to consider specific factors.

Comments closed

Index Creation with DROP_EXISTING

Monica Rathbun takes us through the DROP_EXISTING option when modifying an index:

When you are making changes to an existing Non-Clustered index SQL Server provides a wide variety of options. One of the more frequently used methods is DROP EXISTING; in this post you will learn all about that option. This option automatically drops an existing index after recreating it, without the index being explicitly dropped. Let us take a moment understand the behavior of this choice.

What I really want is DROP_IF_EXISTS. I want idempotent commands: if I run it once or a thousand times, I end up in the same state whether there was an index there at the start or not (or if attempt #793 failed due to running out of sort space in tempdb or something, leaving me with no index). DROP_EXISTING is only idempotent if the index already existed, but then you have to ask, why is it important if an index of that name is already there? The important part of the statement is that I want an end state which includes this index in this form.

Comments closed

The Limits of Filtered Indexes

Erik Darling lays out the pros and cons of filtered indexes:

Filtered indexes are really interesting things. Just slap a where clause on your index definition, and you can do all sorts of helpful stuff:

– Isolate hot data
– Make soft delete queries faster
– Get a histogram specific to the span of data you care about

Among other things, of course. There are some annoying things about them though.

– They only work with specific ANSI options
– If you don’t include the filter definition columns in the index, it might not get used
– They only work when queries use literals, not parameters or variables

Click through for examples of them in action. I would definitely like to see improvements to filtered indexes along the lines that Erik mentions. They have so much potential, but are really held back by those limitations.

Comments closed

Deduplicating Missing Index Requests

Erik Darling trains us in a skill:

Going back to our queries and our index requests, all the queries have two things in common:

– They filter on OwnerUserId
– They order by Score

There are of course other elements in the where clause to attend to, but our job is to come up with one index that helps all of our queries.

This gets a lot easier if you know the queries. If all you have is the set of index details, you can make a sane guess but the chance of being wrong is much greater.

Comments closed

PFS Contention and Heaps

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.

This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

Comments closed