Press "Enter" to skip to content

Category: Indexing

Choosing between Duplicate Indexes

David Alcock wants to know what choice you make when all choices lead to the same conclusion:

As there is an index that has the LastName and FirstName columns the optimiser has opted for an index seek operator using the IX_Person_LastName_FirstName_MiddleName index, and if I look into the Plan XML I can see that it’s using a trivial plan: StatementOptmLevel=”TRIVIAL”.
This basically means there’s one obvious way to return the query results so the optimiser has avoided the cost of going through full optimisation and has elected to use this plan straightaway.
So what happens if I create an identical copy of that particular index, in fact let’s create five indexes that are exactly the same:

There’s a Mass Effect 3 joke in my intro line. But read on for the answer and also check out Barney Lawrence’s comment for a bit more elucidation.

Comments closed

Wide Non-Clustered Indexes as Kinda-Sorta Clustered Indexes

Grant Fritchey gets our hopes up:

Everyone knows that you only get a single clustered index, right? Wouldn’t it be great though if you could have two clustered indexes?

Well, you can. Sort of. Let’s talk about it.

Click through to see what Grant means. This is a thing that I’ve done occasionally, though much more often, I’ve ripped it out because Database Tuning Advisor suggested it and a credulous user took DTA at its word.

Comments closed

Preconceived Notions with Filtered Indexes

Aaron Bertrand has learned a thing or two about filtered indexes:

Confession time. For filtered indexes, I have long held the following impressions:

1. That the definition of the filter had to exactly match the predicate in the query.

2. That col IN (1, 2) was not supported in a filter definition, since it’s the same as col = 1 OR col = 2, which is definitely not supported.

If I were to take a wild guess, I’d think impression 1 was probably influenced by the extreme limitations filtered indexes have with parameterized queries. Anyhow, read the whole thing and learn why both of these are wrong.

Comments closed

Hypothetical Indexes in SQL Server

Eitan Blumin explains what hypothetical indexes are and why they might be useful:

Using Hypothetical Indexes, you can generate an estimated execution plan for a given query, that would essentially assume the existence of a “hypothetical” index as if it actually exists as a real index. Compare that estimated execution plan to its counterpart without the hypothetical index, and you’ll be able to determine whether creating this index for real is worth the time and effort.

Hypothetical Indexes are actually nothing new in SQL Server. It existed since SQL Server version 2005. However, its use is still not widespread to this day. Most likely because it’s not very easy to use and the relevant commands are undocumented.

Click through to see how to use them and an important warning if you try it in production.

Comments closed

Indexing and Window Functions

I continue a series on window functions in SQL Server:

If you’ve been around the block with window functions, you’ve probably heard of the POC indexing strategy: Partition by, Order by, Covering. In other words, with a query, focus on the columns in the PARTITION BY clause (in order!), then the ORDER BY clause (again, in order!), and finally other columns in the SELECT clause to make the index covering (not in order! though it doesn’t hurt!).

But do read on to understand why this is not sufficient.

Comments closed

Why the Optimizer Doesn’t Look at Buffer Pool Data

Paul Randal has an explanation for us:

SQL Server has a cost-based optimizer that uses knowledge about the various tables involved in a query to produce what it decides is the most optimal plan in the time available to it during compilation. This knowledge includes whatever indexes exist and their sizes and whatever column statistics exist. Part of what goes into finding the optimal query plan is trying to minimize the number of physical reads needed during plan execution.

One thing I’ve been asked a few times is why the optimizer doesn’t consider what’s in the SQL Server buffer pool when compiling a query plan, as surely that could make a query execute faster. In this post, I’ll explain why.

This is an interesting post because it explains why the developers of the database engine would purposefully ignore something that could make things faster, but at a potentially devastating cost.

Comments closed

A Heap of Pain

Chad Callihan explains the dislike for heaps in SQL Server:

A table is considered a heap when it is created without a clustered index. Data isn’t in any type of ordered state. Some data is over here, some data is over there.

When you are inserting data into a heap, that data is tossed in wherever. Think of it like your junk drawer. It’s not organized into its own little sections. What do you do when you have something to add such as a pair of scissors or an old pen? You open the drawer, toss it in, and close it up without giving it a second thought.

Like Chad mentions, there are uses for heaps. And when you move to Azure Synapse Analytics, there are more uses for heaps. But with on-premises SQL Server, a heap is usually a mistake.

Comments closed

Index Usage across Replicas

Jess Pomfret does the math:

Last week, I was working on a project to analyse indexes on a database that was part of an availability group. The main goal was to find unused indexes that could be removed, but I was also interested in gaining an overall understanding of how the system was indexed.

Unused indexes not only take up disk space, but they also add overhead to write operations and require maintenance which can add additional load on your system.  We can also use this analysis to look for a high number of lookups which could indicate we need to adjust indexes slightly.

Click through to see how you can connect together index usage stats from the primary and secondary replicas of an availability group.

Comments closed

Fill Factor and When It Matters

Raul Gonzalez has a confession to make:

I love SQL Server internals, I do and I just said it.

Why? because thanks to all the tools, documentation and community members that share their knowledge, folks like me can understand how a super complex piece of software like a relational database engine works (or at least a small part of it).

Click through for a discussion of fill factor and one area where Raul thinks it falls short. I’m not sure that I agree but would need to think about it to give a clear explanation as to why.

Comments closed