Press "Enter" to skip to content

Category: Indexing

Limiting Index Sizes in Cosmos DB

Hasan Savran explains why you might want to exclude columns from Cosmos DB indexes:

If everything is indexed already; Why do we want to exclude some of indexes? Indexes are saved on disk, you pay for the storage in Azure. If you keep indexing everything, your index file gets larger and you pay more for storage.

     Also; write operations to index file takes longer if index file is larger. By keeping only what you need in index file will improve the latency of write operations. If you will need to change your indexing policies, Rebuilding indexes will take less time.

This behavior is quite different from the way SQL Server behaves, where indexing is more of an opt-in philosophy.

Comments closed

Forced Parameterization and Filtered Indexes

Aaron Bertrand walks us through a case where filtered indexes become unhelpful:

Again, focusing on the areas highlighted in orange: the statement has a parameter @0 (previously it had @1) but, more importantly, the clustered index is scanned now instead of the filtered index. This has impacts throughout the plan, including how many rows are both estimated to be read and actually read in order to return those 11 rows. You can see a much higher I/O cost (about 22X), the predicate is now listed explicitly in the tooltip, and you can see warnings about residual I/O (which just means a lot more rows were read than necessary). The root operator still has the warning about the unmatched index, so at least the plan gives you some clue that a filtered index exists that might be useful if you change the parameterization setting for the database (or add OPTION (RECOMPILE) to the statement):

There are still ways to make filtered indexes work with forced parameterization, such as index hints, but Aaron does a great job explaining why something which seems like it should just work doesn’t always.

Comments closed

When Indexes Collide

Andy Mallon gives us a case where it makes sense to have a non-clustered index which shares the same columns as your clustered index columns:

First off, let’s remember the difference between clustered & nonclustered indexes

The clustered index is organized by the key columns. It also includes every other column as part of the row structure (ie, it has the entire row).

The nonclustered index is also organized by the key columns. It implicitly includes the clustering key columns (if the table is clustered), or a pointer to the row (if the table’s a heap). If any INCLUDE columns are explicitly specified, they will also be included in the index structure (but these included columns don’t affect order).

I’ve seen other cases where it made sense on sufficiently large and wide tables even for seeks (where the page density difference is large enough that you have a 4-level clustered index but a 3-level non-clustered index), so I think there’s more than Andy’s one corner case. But I do agree that it generally doesn’t help.

Comments closed

Finding Unused Indexes in SQL Server

Monica Rathbun shows us how we can find and remove unused indexes in SQL Server:

Indexes can be incredibly beneficial to your database performance; however, they do come with a cost—indexes both consume storage space and affect insert performance. Therefore, it is important as part of your index maintenance procedures that you periodically check to see if your indexes are being used. Many times, indexes are created in the belief they are needed but in fact they are never used. You can reduce that IO overhead on inserts when you remove unnecessary indexes.

I’ll use the same script. Typically, I won’t drop unless total reads is 0 or at least two or three orders of magnitude smaller than writes. Sometimes you have indexes which don’t get used frequently but support very expensive or time-sensitive reports, and you don’t want those getting caught up in your dragnet.

Comments closed

When Adding Indexes Hurts Performance

Jeffry Schwartz takes us through an odd case:

Recently, a customer requested that we tune a query that took 13 seconds to return 11 rows.  SQL Server 2017 suggested an index to improve performance, so we added it in a development environment.  The improvement made the query run 647 seconds, almost 50 TIMES longer than the original!  This naturally caused much consternation, so we decided to determine what and why it happened as well as how we could still achieve the original objective, i.e., make the query run faster.  This article discusses what caused the original performance problem in addition to the new one that was caused by the introduction of an index, and illustrates how we were able to make the query run significantly faster than it did originally.  We will cover reading query plans, examining the specific details of query plan operators, the effects of index statistics on missing index recommendations, using query plan XML to enable simpler query plan comparison, and the effects of using functions in where clauses. 

Click through to understand how this could be the case.

Comments closed

Optimize For Sequential Key

Pam Lahoud explains the context behind a new option you can add to indexes in SQL Server 2019 CTP 3.1 and later:

With last page insert contention, as the number of insert threads increases, the queue for the page latch increases which in turn increases latency. Throughput will also decrease, but if something slows down one of the threads that is holding the latch, this can trigger a convoy and throughput suddenly falls off a cliff. This typically happens when a page fills up and a new page must be added to the index (also known as a page split). The insert that triggers the new page will naturally have to hold the latch for longer than normal while the new page operation completes. This causes the queue to build up behind the latch. Adding a new page also requires an exclusive latch on the parent page, which can cause latch requests to queue at that level as well. At this point, throughput falls off a cliff.

OPTIMIZE_FOR_SEQUENTIAL_KEY aims to do two things – control the rate at which new threads are allowed to request the latch, and favor threads that are likely to keep the throughput high. These techniques will not prevent the contention or reduce latency, but they will help keep throughput consistent as concurrency increases.

Read on to learn more about the process and by implication some places where this won’t work well at all for you.

Comments closed

Parameter Sniffing and Multiple Indexes

Erik Darling looks at how available indexes may contribute to parameter sniffing problems:

When you’re troubleshooting parameter sniffing, the plans might not be totally different.

Sometimes a subtle change of index usage can really throw gas on things.

It’s also a good example of how Key Lookups aren’t always a huge problem.

Both plans had them, just in different places.

The plans had a small change, but that made a big difference.

Comments closed

Statistics and Multiple Single-Column Indexes

Erik Darling is fusing together queries like Dr. Frankenstein in his lab:

You may have noticed that both queries get pretty bad estimates. You might even be thinking about leaving me a comment to update stats.

The thing is that I created these indexes, which means they get stats built with a full scan, and it’s a demo database where nothing changes.

We just get unfortunate histograms, in this case. If I create very specific filtered statistics, both plans perform a key lookup.

I snipped this part because it’s so ingrained in DBAs that when performance is bad, updating statistics is a panacea.

Comments closed

Relating Nonclustered Indexes to the Clustered Index

Erin Stellato takes us through a bit of indexing strategy:

In the sp_SQLskills_helpindex output you’ll notice that OrderLineID is the only column in columns_in_tree, and both OrderLineID and OrderID_and_OrderLineID are in columns_in_leaf.

Remember that when you identify a nonclustered index as unique, it will not add the clustering key to the tree level, it just adds it to the leaf level. But if the nonclustered index is not declared as unique, then SQL Server adds the clustering key to the tree level to uniquely identify the row.

This post deserves a careful reading.

Comments closed

Index Column Order and Selectivity

Erik Darling gives us multiverse indexing:

Missing index request column order is pretty basic.

Instead, we’re gonna add these:

CREATE INDEX ix_spaces
ON dbo.Posts(ParentId, Score);

 
CREATE INDEX ix_tabs
ON dbo.Posts(Score, ParentId);

SQL Server is capable of using both intelligently. This is something I generally don’t like to do, but if you have queries which absolutely need differently-ordered index columns, it can make sense to do this. Just don’t expect SQL Server’s missing index DMV to tell you which order they should be in.

Comments closed