Indexing – Page 20 – Curated SQL

The Importance of Cardinality

Published 2019-01-30 by Kevin Feasel

Bert Wagner shows us why cardinality is important to understand when indexing data:

When building indexes for your queries, the order of your index key columns matters. SQL Server can make the most effective use of an index if the data in that index is stored in the same order as what your query requires for a join, where predicate, grouping, or order by clause.
But if your query requires multiple key columns because of multiple predicates (eg. WHERE Color = ‘Red’ AND Size= ‘Medium’), what order should you define the columns in your index key column definition?

One of my favorite books for query tuning is a bit long in the tooth at this point but remains quite relevant, and a key point there is to look for ways to drop the largest percent of rows as soon as possible. This applies for good indexes as well: they’ll let you ignore as large a percentage of your irrelevant data as you can, as soon as possible.

Comments closed

On Disabling Indexes

Published 2019-01-29 by Kevin Feasel

Kenneth Fisher has a few notes on disabling indexes:

Indexes are probably the number one tool we have to improve performance. That said, there are times when we want to put that index on hold. While indexes dramatically improve read performance they do cause a slight dip in write performance. This isn’t significant most of the time but when doing a large load it can frequently be faster to get rid of the existing indexes and then put them back when you are done.

I don’t think that I’ve ever regularly disabled indexes, even during bulk loading. It’s good to know that the option exists, however.

Comments closed

The Importance Of Index Column Order

Published 2019-01-29 by Kevin Feasel

Erik Darling shows us how arranging columns in an index can make a huge difference in query performance:

A while back I promised I’d write about what allows SQL Server to perform two seeks rather than a seek with a residual predicate.
More recently, a post touched a bit on predicate selectivity in index design, and how missing index requests don’t factor that in when requesting indexes.
This post should tie the two together a bit. Maybe. Hopefully. We’ll see where it goes, eh?

Also apropos: missing index hints return results in alphabetical order, not in selectivity order or what would be best for queries. In other words, just because the green text in SSMS says it’s the index you want doesn’t mean it’s the index you need.

Comments closed

Filtered Index Trickiness

Published 2019-01-16 by Kevin Feasel

Greg Low explains some of the tricky bits behind using filtered indexes:

If you think about it, if all we’re ever going to use is one part of the index, i.e. just the unfinalized rows, having an entry in there for every single row is quite wasteful, as although the vast majority of the index will never be used, it still has to be maintained.
So in SQL Server 2008, we got the ability to create a filtered index. Now these were actually added to support sparse columns. But on their own, they’re incredibly useful anyway.

I use these on occasion but less than I want to, and a big part of the reason why is in this post, particularly around parameters.

Comments closed

Creating Cosmos DB Indexes

Published 2019-01-10 by Kevin Feasel

Hasan Savran explains indexing in Cosmos DB:

In SQL Server you need to pick which columns you like to index, In CosmosDB you need to pick which columns not to index. It’s kind of same thing at the end. You might ask “If everything is indexed and working fine, why do you want me to poke the well running system?” When we compare SQL Server indexes to CosmosDB Indexes, one thing works exactly same. That is the index file size. CosmosDB holds the indexes in a separate file like SQL Server and if we want to index everything, index file size is going to get large. Since we need to pay for the file space in CosmosDB, you might need to pay extra for indexes that you might never use. Also, your updates, inserts and deletes might cost you more Request Units since CosmosDB needs to maintain all the indexes in the background.

There’s just enough difference to make you pay the price if you assume Cosmos DB works just like SQL Server.

Comments closed

What Happens With Multiple Missing Indexes

Published 2018-12-21 by Kevin Feasel

Arthur Daniels shows us what happens when there are multiple missing indexes in an execution plan:

This is missing index request #1, and by default, this is the only missing index we’ll see by looking at the graphical execution plan. There’s actually a missing index request #2, which we can find in the XML (I know, it’s a little ugly to read. Bear with me).

I am of two minds on this. It probably should be easier to see multiple index candidates, but there’s already so much risk of people just copy-pastaing missing index recommendations that adding more seems like a bad idea.

Comments closed

Fill Factor And The Performance Tradeoff

Published 2018-12-14 by Kevin Feasel

Tara Kizer explains the performance tradeoff when setting fill factor for an index:

There are workloads where frequent page splits are a problem. I thought I had a system like this many years ago, so I tested various fill factor settings for the culprit table’s clustered index. While insert performance improved by lowering the fill factor, read performance drastically got worse. Read performance was deemed much more critical than write performance on this system. I abandoned that change and instead recommended a table design change since it made sense for that particular table.

Click through for a demo.

Comments closed

Resumable Online Index Creation In SQL Server 2019

Published 2018-11-29 by Kevin Feasel

Monica Rathbun tries out resumable online index creation in SQL Server 2019:

SQL Server 2019 brings a very exciting new feature that, is long overdue. Resumable online index create is one of my favorite new things. This paired with the Resumable Index Rebuilds introduced with SQL Server 2017 really gives database administrators much more control over index processes.
Have you ever started to build a new index on very large table only to have users call and complain their process is hung, not completing, or system is slow? That’s when you realize you’re the cause because you tried to sneak in a new index. I have many times, because creating a new index can impact performance and can be a problematic process for users when you have no or little downtime windows available. When you kill the create process it rolls back requiring you to start from the beginning the next time. With resumable online index creation you now have the ability to pause and restart the build at the point it was paused. You can see where this can be very handy.

Click through for a demo and discussion on what options are available.

Comments closed

Fixing Issues Related To Filtered Indexes

Published 2018-11-28 by Kevin Feasel

Kevin Chant looks at a few problems that can pop up when using filtered indexes:

In a past post here I did an overview of different index types. I said in that post that I think filtered indexes could be more popular. In this post I will cover fixing some of the problems caused when you first introduce rowstore filtered indexes to a SQL Server database.
Some of you have probably been there already. You’ve put in your first filtered index on a database only to find an issue has happened. I’ve witnessed these issues at a few places. This will hopefully reduce the pain.

I’ve definitely experienced the third issue (which also pops up when using parameterized queries, so the optimizer doesn’t know that it can use the filtered index), but never the first two.

Comments closed

Avoid Key Lookups On Clustered Columnstore Indexes

Published 2018-11-16 by Kevin Feasel

Joey D’Antoni points out a potential big performance problem with clustered columnstore indexes:

In the last year or so, with a large customer who makes fairly heavy use of this pattern, I’ve noticed another concern. Sometimes, and I can’t figure out what exactly triggers it, the execution plan generated, will do a seek against the nonclustered index and then do a key lookup against the columnstore as seen below. This is bad for two reasons–first the key lookup is super expensive, and generally columnstores are very large, secondly this key lookup is in row execution mode rather than batch and drops the rest of the execution plan into row mode, thus slowing the query down even further.

Joey also has a UserVoice item as well, so check it out.

Comments closed

Category: Indexing