Press "Enter" to skip to content

Category: Indexing

Partitioning on Columnstore Table Loading

Aaron Bertrand continues a series around learning about columnstore indexes:

In part 1, I showed how both page and columnstore compression could reduce the size of a 1TB table by 80% or more. While I was impressed I could shrink a table from 1TB to 50GB, I wasn’t very happy with the amount of time it took (anywhere from 2 to 14 hours). With some tips graciously borrowed from folks like Joe ObbishLonny NiederstadtNiko Neugebauer, and others, in this post I will try to make some changes to my original attempt to get better load performance. Since the regular columnstore index didn’t compress better than page compression on this data set, and took 13 hours longer to get there, I’ll focus solely on the more advanced solution using COLUMNSTORE_ARCHIVE compression.

Click through for part 2.

Comments closed

Auditing Index Changes

Jason Brimhall gives us a solution to track index changes:

In the previous article, I showed a quick and dirty method to try and capture the indexes that may have been just created. We all know that in the hectic life of the DBA, it is easy enough to forget the indexes created more than just a few hours in the past.

In that article, I mentioned that the solution provided was far from a complete solution to the entire problem. A bigger part of the picture involves audits to know when and by whom indexes were created, altered or dropped.

Keeping track of this kind of thing is important, particularly in an environment where performance suddenly changes even though the devs totally didn’t touch anything, promise.

Comments closed

When Missing Index Requests Go Missing

Erik Darling doesn’t want you to put missing index requests functionality on the back of a milk carton:

Rebuilding indexes will clear missing index requests. So before you go Hi-Ho-Silver-Away rebuilding every index the second an iota of fragmentation sneaks in, think about the information you’re clearing out every time you do that.

You may also want to think about Why Defragmenting Your Indexes Isn’t Helping, anyway. Unless you’re using Columnstore.

There are several reasons why you might not get a missing index request and Erik enumerates them for us.

Comments closed

Why Disabling the Clustered Index is a Bad Idea

Kenneth Fisher has an experiment in mind:

You are probably already aware that you can disable an index. This can be handy when you have a large load and the load + re-enabling the indexes (you have to completely rebuild them) is faster than leaving the indexes in place. I’ve had pretty limited occasions where this has helped but it can be a handy trick at times. That said, this is only true for non-clustered indexes. What happens when you disable the clustered index?

Nothing good, that’s what.

Comments closed

What Slows Down Clustered Index Rebuilds

Kevin Chant has a few reasons why you might see slow clustered index rebuilds in your environment:

I better point out that online rebuilds in general tend to take longer. Mostly because behind the scene’s it’s making a rebuilt copy of your index and then it swaps around to the new index once it has completed.

However, there is another key point I should mention here.

Kevin also points out a sub-item for online rebuilds which could fit just as well in offline rebuilds: if there’s a long-running transaction which blocks SQL Server from taking the schema modification lock, you’ll be sitting there until those long-running transactions ahead of you finish.

Comments closed

Online and Resumable Operations in SQL Server

Kendra Little summarizes which operations in SQL Server have the ability to be run online, which are resumable, and which support the WAIT_AT_LOW_PRIORITY flag:

ONLINE operations in SQL Server were simple to understand for years — we got ONLINE index rebuilds in SQL Server 2005. That was it for a while. Then, things got more complicated: we got more types of indexes. We got ONLINE options for schema changes that don’t involve indexes. We got more options for managing things like blocking, because online operations are really only mostly online — generally there’s going to be at least a short period where an exclusive lock is needed to update metadata. We now have some RESUMABLE operations coming in, too, for those big operations that are tough to handle.

Along the way, I fell behind. Because these features have steadily come out over a period of time, my brain simply didn’t register them all, or possibly I missed seeing them amid other announcements.

It’s not a comprehensive list, but it’s a good starting point for understanding the options you have available.

Comments closed

Resuming Index Operations but Using Different Options

John Morehouse has an interesting use case for resumable indexes:

Documentation on ALTER INDEX provides which options we can set when resuming a rebuild or creation operation:

<resumable_index_option> ::=
 {
    MAXDOP = max_degree_of_parallelism
    | MAX_DURATION =<time> [MINUTES]
    | <low_priority_lock_wait> 
 }

<low_priority_lock_wait>::= 

    WAIT_AT_LOW_PRIORITY ( MAX_DURATION = <time> [ MINUTES ] ,  
                          ABORT_AFTER_WAIT = { NONE | SELF | BLOCKERS } ) 

This means that we can change the MAXDOP, MAX_DURATION, and WAIT_AT_LOW_PRIORITY.

I’m going to gather that this was not necessarily the original intent, but it’s pretty nice, as it means that you can resume with fewer cores and lower priority during the day, but more cores and higher priority after hours.

Comments closed

Problems with SQL Server Index Recommendations

Brent Ozar has some grievances to air:

And if you don’t have time to review one query at a time, SQL Server makes wide-ranging analysis easy too, letting you query dynamic management views like sys.dm_db_missing_index_details to get index recommendations for the entire database. You can even use tools like sp_BlitzIndex to analyze and report on ’em.

Except…

Both of these – the index recommendations in the query plan and the ones in the DMVs – suffer from some pretty serious drawbacks.

Click through for the list. There are some doozies in there.

Comments closed

Explaining Duplicate Indexes

Kevin Hill will be shocked and amazed that I finally linked to him again:

Duplicate indexes are those that exactly match the Key and Included columns.  That’s easy.

Possible duplicate indexes are those that very closely match Key/Included columns.

Why do you care?
Indexes have to be maintained. When I say that, most people immediately think of Reorganizing, rebuilding and updating statistics, and they are not wrong.

Click through for a great explanation of what “duplicate” indexes are, as well as ways to find them. If you’re searching for dupes, I’d recommend a couple blog posts from Kim Tripp as well on whether an index is really a duplicate and how to remove duplicate indexes.

Comments closed

Creating a Better Index Maintenance Script

Erik Darling, despite being on Team Profiler, has something important to say:

If you’re the kind of person who cares about various caches on your server, like the buffer pool or the plan cache, then you’d wanna measure something totally different. You’d wanna measure how much free space you have on each page, because having a bunch of empty space on each page means your data will take up more space in memory when you read it in there from disk.

You could do that with the column avg_page_space_used_in_percent.

BUT…

Read the whole thing.

Comments closed