Category: Indexing

Included Columns on Filtered Indexes

Published 2019-04-01 by Kevin Feasel

Rob Farley take a look at included columns on filtered indexes:

First let’s think a little about indexes in general.
An index provides an ordered structure to a set of data. (I could be pedantic and point out that reading through the data in an index from start to end might jump you from page to page in a seeming haphazard way, but still as you’re reading through pages, following the pointers from one page to the next you can be confident the data is ordered. Within each page you might even jump around to read the data in order, but there is a list showing you which parts (slots) of the page should be read in which order. There really is no point in my pedantry except to answer those equally pedantic who will comment if I don’t.)
And this order is according to the key columns – that’s the easy bit that everyone gets. It’s useful not only for being able to avoid re-ordering the data later, but also for being able to quickly locate any particular row or range of rows by those columns.

Rob does a great job of covering some of the nuances of filtered indexes.

Comments closed

Cancelling Resumable Index Creation

Published 2019-03-27 by Kevin Feasel

Brent Ozar takes us through a couple considerations when using online, resumable index creation:

In SSMS, you’re used to being able to click the “Cancel” button on your query, and having your work rolled back.
You’re also used to being able to kill a query, and have it automatically roll back.
Neither of those are true with resumable index creations. In both cases, whether you kill the index creation statement or just hit the Cancel button in SSMS to abort your request, your index creation statement is simply paused until you’re ready to come back to it. (Or, it’s ready to come back to haunt you, as we saw above.)

There are some good things to keep in mind here.

Comments closed

Finding Missing Index Hints in Query Store

Published 2019-03-13 by Kevin Feasel

Grant Fritchey shows us another place where we can find missing index hints:

A couple of notes on the query. I cast the query_plan as xml so that I can use the XQuery to pull out the information. It is possible that the plan might be so large that you get an error because of the limit on nesting levels within XML. Also, I aggregate the information from the sys.query_store_runttime_stats. You may want to modify this to only look at limited ranges. I’ll leave that to you as an exercise.

Do read Grant’s warning in the conclusion.

Comments closed

Benefits of Inline Indexes

Published 2019-03-01 by Kevin Feasel

Phil Factor covers a topic fairly close to my heart:

Usually, the added features of the CREATE TABLE syntax in new releases of SQL Server are esoteric, and unless you are dealing with memory-optimized tables or other esoteric stuff, they aren’t of great interest. However, the Inline INDEX for both a table and column index has just crept in quietly with SQL Server 2014 (12.x). This was interesting because the SQL Server team back-fitted it to all tables rather than just in-memory OLTP tables for which it was, at the time, found necessary. The new syntax was introduced which allows you to create certain index types inline with the table definition. These could be at column level, concerning just that column, or at the table level, with indexes containing several columns.
Why interesting? This affects multi-statement table functions, user-defined table types, table-valued parameters as well as table variables. It was considered a game-change for table variables because, for a start, it allowed non-unique indexes or explicit clustered indexes to be declared on columns for the first time because you can create indexes on table variables as part of the table definition. Of more significance were the table-level indexes that allowed you to specify multi-column indexes. Previous releases had allowed multi-column primary or unique constraints, but not explicitly named indexes. You still cannot declare an index after the table is created, which is a shame as there are good reasons for being able to do so after a table is stocked with data. Any sort of large import of data into a table that is over-indexed or prematurely-indexed is doomed to crawl rather than to run. I’ll show this later on in this article.

Click through for an analysis of inline indexes themselves as well as how they fit on table variables—something I tend not to do much.

Comments closed

The Costs Of Rebuilding Indexes Online

Published 2019-01-31 by Kevin Feasel

Tibor Karaszi explains that TANSTAAFL (There A’int No Such Thing As A Free Lunch) applies to online index rebuilds:

The time it takes to rebuild the index can be substantially longer for ONLINE. Many of us has other things to do with the database during night-time and/or weekends. Yes, these are the typical window in time where we try to find things such as index rebuilds. Say that you do it night-time and it currently take 4 hours. Wouldn’t it be nice if you could cut that time down to 1.5 hours? That would leave more time for imports, massaging of data, CHECKDB and other things you want to do. Sure, you can do it ONLINE, but it will slow down access during the rebuild. Also the more data you modify during the rebuild, the more space you need in tempdb.

Betteridge’s Law of Headlines applies too, so that’s two important principles in one post.

As far as the post goes, Tibor makes a fair point: there is a trade-off between availability and efficiency with index rebuilds. But having worked with clustered columnstore indexes in 2014, you’ll pry the online operations in subsequent versions out of my cold, dead hands.

Comments closed

The Importance of Cardinality

Published 2019-01-30 by Kevin Feasel

Bert Wagner shows us why cardinality is important to understand when indexing data:

When building indexes for your queries, the order of your index key columns matters. SQL Server can make the most effective use of an index if the data in that index is stored in the same order as what your query requires for a join, where predicate, grouping, or order by clause.
But if your query requires multiple key columns because of multiple predicates (eg. WHERE Color = ‘Red’ AND Size= ‘Medium’), what order should you define the columns in your index key column definition?

One of my favorite books for query tuning is a bit long in the tooth at this point but remains quite relevant, and a key point there is to look for ways to drop the largest percent of rows as soon as possible. This applies for good indexes as well: they’ll let you ignore as large a percentage of your irrelevant data as you can, as soon as possible.

Comments closed

On Disabling Indexes

Published 2019-01-29 by Kevin Feasel

Kenneth Fisher has a few notes on disabling indexes:

Indexes are probably the number one tool we have to improve performance. That said, there are times when we want to put that index on hold. While indexes dramatically improve read performance they do cause a slight dip in write performance. This isn’t significant most of the time but when doing a large load it can frequently be faster to get rid of the existing indexes and then put them back when you are done.

I don’t think that I’ve ever regularly disabled indexes, even during bulk loading. It’s good to know that the option exists, however.

Comments closed

The Importance Of Index Column Order

Published 2019-01-29 by Kevin Feasel

Erik Darling shows us how arranging columns in an index can make a huge difference in query performance:

A while back I promised I’d write about what allows SQL Server to perform two seeks rather than a seek with a residual predicate.
More recently, a post touched a bit on predicate selectivity in index design, and how missing index requests don’t factor that in when requesting indexes.
This post should tie the two together a bit. Maybe. Hopefully. We’ll see where it goes, eh?

Also apropos: missing index hints return results in alphabetical order, not in selectivity order or what would be best for queries. In other words, just because the green text in SSMS says it’s the index you want doesn’t mean it’s the index you need.

Comments closed

Filtered Index Trickiness

Published 2019-01-16 by Kevin Feasel

Greg Low explains some of the tricky bits behind using filtered indexes:

If you think about it, if all we’re ever going to use is one part of the index, i.e. just the unfinalized rows, having an entry in there for every single row is quite wasteful, as although the vast majority of the index will never be used, it still has to be maintained.
So in SQL Server 2008, we got the ability to create a filtered index. Now these were actually added to support sparse columns. But on their own, they’re incredibly useful anyway.

I use these on occasion but less than I want to, and a big part of the reason why is in this post, particularly around parameters.

Comments closed

Creating Cosmos DB Indexes

Published 2019-01-10 by Kevin Feasel

Hasan Savran explains indexing in Cosmos DB:

In SQL Server you need to pick which columns you like to index, In CosmosDB you need to pick which columns not to index. It’s kind of same thing at the end. You might ask “If everything is indexed and working fine, why do you want me to poke the well running system?” When we compare SQL Server indexes to CosmosDB Indexes, one thing works exactly same. That is the index file size. CosmosDB holds the indexes in a separate file like SQL Server and if we want to index everything, index file size is going to get large. Since we need to pay for the file space in CosmosDB, you might need to pay extra for indexes that you might never use. Also, your updates, inserts and deletes might cost you more Request Units since CosmosDB needs to maintain all the indexes in the background.

There’s just enough difference to make you pay the price if you assume Cosmos DB works just like SQL Server.

Comments closed