Category: Indexing

I should point out that online rebuilds generally tend to take longer, mostly because behind the scenes SQL Server builds a copy of your index and then swaps over to the new index once the build has completed.
However, there is another key point I should mention here.

Kevin also points out a sub-item for online rebuilds which could fit just as well in offline rebuilds: if long-running transactions block SQL Server from taking the schema modification lock, you’ll be sitting there until the transactions ahead of you finish.
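
Before kicking off a rebuild, it can help to check for transactions that have been open for a while. The query below is a minimal sketch rather than anything from the linked post; it assumes you have VIEW SERVER STATE permission, and what counts as "long-running" is left to you.

```sql
-- List open transactions that belong to a session, oldest first.
-- Any of these could delay the schema modification lock a rebuild needs.
SELECT
    st.session_id,
    tx.transaction_id,
    tx.name AS transaction_name,
    tx.transaction_begin_time
FROM sys.dm_tran_active_transactions AS tx
JOIN sys.dm_tran_session_transactions AS st
    ON st.transaction_id = tx.transaction_id
ORDER BY tx.transaction_begin_time;
```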

Online and Resumable Operations in SQL Server

Published 2019-12-10 by Kevin Feasel

Kendra Little summarizes which operations in SQL Server have the ability to be run online, which are resumable, and which support the WAIT_AT_LOW_PRIORITY flag:

ONLINE operations in SQL Server were simple to understand for years — we got ONLINE index rebuilds in SQL Server 2005. That was it for a while. Then, things got more complicated: we got more types of indexes. We got ONLINE options for schema changes that don’t involve indexes. We got more options for managing things like blocking, because online operations are really only mostly online — generally there’s going to be at least a short period where an exclusive lock is needed to update metadata. We now have some RESUMABLE operations coming in, too, for those big operations that are tough to handle.
Along the way, I fell behind. Because these features have steadily come out over a period of time, my brain simply didn’t register them all, or possibly I missed seeing them amid other announcements.

It’s not a comprehensive list, but it’s a good starting point for understanding the options you have available.
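
As an illustration of the blocking-management options Kendra mentions, here is a sketch of an online rebuild that waits at low priority for its locks. The index and table names are hypothetical, and the five-minute wait is an arbitrary choice.

```sql
-- Hypothetical index and table names.
-- Wait up to 5 minutes at low priority for the locks the rebuild needs;
-- if they still cannot be acquired, give up (SELF) rather than kill blockers.
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders
REBUILD WITH (
    ONLINE = ON (
        WAIT_AT_LOW_PRIORITY (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = SELF)
    )
);
```

ABORT_AFTER_WAIT = BLOCKERS would instead kill the blocking sessions, and NONE would fall back to waiting at normal priority.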

Resuming Index Operations but Using Different Options

Published 2019-11-27 by Kevin Feasel

John Morehouse has an interesting use case for resumable index operations:

The documentation on ALTER INDEX lists the options we can set when resuming a rebuild or creation operation:
<resumable_index_option> ::=
{
    MAXDOP = max_degree_of_parallelism
    | MAX_DURATION = <time> [ MINUTES ]
    | <low_priority_lock_wait>
}

<low_priority_lock_wait>::=
{
    WAIT_AT_LOW_PRIORITY ( MAX_DURATION = <time> [ MINUTES ] ,
                          ABORT_AFTER_WAIT = { NONE | SELF | BLOCKERS } )
}
This means that we can change the MAXDOP, MAX_DURATION, and WAIT_AT_LOW_PRIORITY.

I gather that this was not necessarily the original intent, but it’s pretty nice, as it means that you can resume with fewer cores and lower priority during the day, but more cores and higher priority after hours.
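
A sketch of that day-versus-night pattern might look like the following. The index and table names, MAXDOP values, and durations are all hypothetical; only the option names come from the grammar above.

```sql
-- Hypothetical names and values throughout.
-- Daytime: start a resumable online rebuild with few cores and a time cap.
ALTER INDEX IX_Sales_OrderDate ON dbo.Sales
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAXDOP = 2, MAX_DURATION = 60 MINUTES);

-- Pause manually if the operation gets in the way before the cap is reached.
ALTER INDEX IX_Sales_OrderDate ON dbo.Sales PAUSE;

-- Check how far the paused operation got.
SELECT name, state_desc, percent_complete, last_max_dop_used
FROM sys.index_resumable_operations;

-- After hours: resume the same operation with more cores and a longer window.
ALTER INDEX IX_Sales_OrderDate ON dbo.Sales
RESUME WITH (MAXDOP = 8, MAX_DURATION = 240 MINUTES);
```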

Problems with SQL Server Index Recommendations

Published 2019-11-15 by Kevin Feasel

Brent Ozar has some grievances to air:

And if you don’t have time to review one query at a time, SQL Server makes wide-ranging analysis easy too, letting you query dynamic management views like sys.dm_db_missing_index_details to get index recommendations for the entire database. You can even use tools like sp_BlitzIndex to analyze and report on ’em.
Except…
Both of these – the index recommendations in the query plan and the ones in the DMVs – suffer from some pretty serious drawbacks.

Click through for the list. There are some doozies in there.
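
For reference, a common way to pull the database-wide recommendations out of the DMVs looks roughly like this. It is a sketch, not Brent's query, and the ranking expression is just one rough heuristic. Given the drawbacks Brent lists, treat the output as hints to investigate rather than indexes to create as-is.

```sql
-- Missing index suggestions for the current database, roughly ranked.
-- These stats are reset when the instance restarts.
SELECT TOP (25)
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    s.user_seeks,
    s.avg_total_user_cost,
    s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.user_seeks * s.avg_total_user_cost * s.avg_user_impact DESC;
```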

Explaining Duplicate Indexes

Published 2019-11-14 by Kevin Feasel

Kevin Hill will be shocked and amazed that I finally linked to him again:

Duplicate indexes are those that exactly match the Key and Included columns. That’s easy.
Possible duplicate indexes are those that very closely match Key/Included columns.
Why do you care?
Indexes have to be maintained. When I say that, most people immediately think of reorganizing, rebuilding, and updating statistics, and they are not wrong.

Click through for a great explanation of what “duplicate” indexes are, as well as ways to find them. If you’re searching for dupes, I’d recommend a couple of blog posts from Kim Tripp as well on whether an index is really a duplicate and how to remove duplicate indexes.
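
To make the "exact match" definition concrete, here is a rough sketch of a query that pairs up nonclustered indexes with identical key and included column lists. It is not Kevin Hill's or Kim Tripp's script. It assumes SQL Server 2017 or later (for STRING_AGG), skips filtered indexes, and ignores uniqueness and the clustering key, so every hit still needs a manual look before anything gets dropped.

```sql
-- Build one row per nonclustered index with its key and included column lists.
WITH index_shape AS (
    SELECT
        i.object_id,
        i.index_id,
        i.name AS index_name,
        -- Key columns in key order, with sort direction. STRING_AGG skips NULLs.
        STRING_AGG(
            CASE WHEN ic.is_included_column = 0
                 THEN CONCAT(c.name,
                             CASE WHEN ic.is_descending_key = 1 THEN N' DESC' ELSE N'' END)
            END, N', ')
            WITHIN GROUP (ORDER BY ic.key_ordinal, ic.column_id) AS key_columns,
        -- Included columns have key_ordinal = 0, so they sort by column_id.
        STRING_AGG(
            CASE WHEN ic.is_included_column = 1 THEN c.name END, N', ')
            WITHIN GROUP (ORDER BY ic.key_ordinal, ic.column_id) AS included_columns
    FROM sys.indexes AS i
    JOIN sys.index_columns AS ic
        ON ic.object_id = i.object_id AND ic.index_id = i.index_id
    JOIN sys.columns AS c
        ON c.object_id = ic.object_id AND c.column_id = ic.column_id
    WHERE i.type = 2                -- nonclustered rowstore only
      AND i.has_filter = 0          -- filtered indexes need separate review
      AND i.is_hypothetical = 0
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    GROUP BY i.object_id, i.index_id, i.name
)
-- Pair indexes on the same table whose column lists match exactly.
SELECT
    OBJECT_SCHEMA_NAME(a.object_id) AS table_schema,
    OBJECT_NAME(a.object_id) AS table_name,
    a.index_name AS index_a,
    b.index_name AS index_b,
    a.key_columns,
    a.included_columns
FROM index_shape AS a
JOIN index_shape AS b
    ON b.object_id = a.object_id
   AND b.index_id > a.index_id
   AND b.key_columns = a.key_columns
   AND ISNULL(b.included_columns, N'') = ISNULL(a.included_columns, N'')
ORDER BY table_schema, table_name, index_a;
```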

Creating a Better Index Maintenance Script

Published 2019-10-30 by Kevin Feasel

Erik Darling, despite being on Team Profiler, has something important to say:

If you’re the kind of person who cares about various caches on your server, like the buffer pool or the plan cache, then you’d wanna measure something totally different. You’d wanna measure how much free space you have on each page, because having a bunch of empty space on each page means your data will take up more space in memory when you read it in there from disk.
You could do that with the column avg_page_space_used_in_percent.
BUT…

Read the whole thing.
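
If you want to look at the column Erik mentions, a minimal sketch follows. It is not from his post. The SAMPLED mode matters because avg_page_space_used_in_percent comes back NULL in the default LIMITED mode, and the 1,000-page cutoff is an arbitrary assumption. Scanning a large database this way can generate a lot of I/O, so pick your moment.

```sql
-- Leaf-level page density for indexes in the current database.
SELECT
    OBJECT_NAME(ps.object_id) AS table_name,
    i.name AS index_name,
    ps.page_count,
    ps.avg_page_space_used_in_percent,
    ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.index_level = 0       -- leaf level
  AND ps.page_count >= 1000    -- arbitrary cutoff to skip tiny indexes
ORDER BY ps.avg_page_space_used_in_percent ASC;
```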

Limiting Index Sizes in Cosmos DB

Published 2019-10-11 by Kevin Feasel

Hasan Savran explains why you might want to exclude columns from Cosmos DB indexes:

If everything is indexed already, why would we want to exclude some paths from the index? Indexes are saved on disk, and you pay for the storage in Azure. If you keep indexing everything, your index file gets larger and you pay more for storage.

Also, write operations to the index file take longer if the index file is larger. Keeping only what you need in the index file will improve the latency of write operations. If you need to change your indexing policies, rebuilding indexes will take less time.

This behavior is quite different from the way SQL Server behaves, where indexing is more of an opt-in philosophy.

Forced Parameterization and Filtered Indexes

Published 2019-09-23 by Kevin Feasel

Aaron Bertrand walks us through a case where filtered indexes become unhelpful:

Again, focusing on the areas highlighted in orange: the statement has a parameter @0 (previously it had @1) but, more importantly, the clustered index is scanned now instead of the filtered index. This has impacts throughout the plan, including how many rows are both estimated to be read and actually read in order to return those 11 rows. You can see a much higher I/O cost (about 22X), the predicate is now listed explicitly in the tooltip, and you can see warnings about residual I/O (which just means a lot more rows were read than necessary). The root operator still has the warning about the unmatched index, so at least the plan gives you some clue that a filtered index exists that might be useful if you change the parameterization setting for the database (or add OPTION (RECOMPILE) to the statement):

There are still ways to make filtered indexes work with forced parameterization, such as index hints, but Aaron does a great job explaining why something which seems like it should just work doesn’t always.
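
The two workarounds named in the excerpt can be sketched as follows. The table, column, and index names are hypothetical and this is not Aaron's demo. Changing the parameterization setting affects every query in the database, so it is a bigger decision than the one-line statement suggests.

```sql
-- Hypothetical table and filtered index.
CREATE NONCLUSTERED INDEX IX_Orders_Open
    ON dbo.Orders (OrderDate)
    WHERE IsOpen = 1;

-- Under forced parameterization, the literal 1 is replaced by a parameter,
-- so the optimizer cannot prove the filtered index covers the predicate.
-- Option 1: request a statement-level recompile, as the excerpt suggests.
SELECT OrderDate
FROM dbo.Orders
WHERE IsOpen = 1
OPTION (RECOMPILE);

-- Option 2: switch the database back to simple parameterization.
ALTER DATABASE CURRENT SET PARAMETERIZATION SIMPLE;
```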

When Indexes Collide

Published 2019-09-04 by Kevin Feasel

Andy Mallon gives us a case where it makes sense to have a non-clustered index with the same key columns as your clustered index:

First off, let’s remember the difference between clustered & nonclustered indexes:
The clustered index is organized by the key columns. It also includes every other column as part of the row structure (ie, it has the entire row).
The nonclustered index is also organized by the key columns. It implicitly includes the clustering key columns (if the table is clustered), or a pointer to the row (if the table’s a heap). If any INCLUDE columns are explicitly specified, they will also be included in the index structure (but these included columns don’t affect order).

I’ve seen other cases where it made sense on sufficiently large and wide tables even for seeks (where the page density difference is large enough that you have a 4-level clustered index but a 3-level non-clustered index), so I think there’s more than Andy’s one corner case. But I do agree that it generally doesn’t help.

Finding Unused Indexes in SQL Server

Published 2019-08-29 by Kevin Feasel

Monica Rathbun shows us how we can find and remove unused indexes in SQL Server:

Indexes can be incredibly beneficial to your database performance; however, they do come with a cost—indexes both consume storage space and affect insert performance. Therefore, it is important as part of your index maintenance procedures that you periodically check to see if your indexes are being used. Many times, indexes are created in the belief they are needed but in fact they are never used. You can reduce that IO overhead on inserts when you remove unnecessary indexes.

I’ll use the same script. Typically, I won’t drop unless total reads is 0 or at least two or three orders of magnitude smaller than writes. Sometimes you have indexes which don’t get used frequently but support very expensive or time-sensitive reports, and you don’t want those getting caught up in your dragnet.
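
For readers who want a starting point, here is a minimal sketch of a reads-versus-writes check. It is not Monica's script. The usage stats are reset when the instance restarts, so make sure the server has been up long enough to cover infrequent workloads such as month-end reports.

```sql
-- Reads vs. writes per nonclustered index in the current database.
-- Indexes with no row in the usage DMV have not been touched since startup.
SELECT
    OBJECT_SCHEMA_NAME(i.object_id) AS table_schema,
    OBJECT_NAME(i.object_id) AS table_name,
    i.name AS index_name,
    ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0) AS total_reads,
    ISNULL(us.user_updates, 0) AS total_writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
    ON us.object_id = i.object_id
   AND us.index_id = i.index_id
   AND us.database_id = DB_ID()
WHERE OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND i.type = 2                   -- nonclustered only
  AND i.is_primary_key = 0         -- leave constraint-backing indexes alone
  AND i.is_unique_constraint = 0
ORDER BY total_reads ASC, total_writes DESC;
```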
