Press "Enter" to skip to content

Category: Indexing

The Limits of Filtered Indexes

Erik Darling lays out the pros and cons of filtered indexes:

Filtered indexes are really interesting things. Just slap a where clause on your index definition, and you can do all sorts of helpful stuff:

– Isolate hot data
– Make soft delete queries faster
– Get a histogram specific to the span of data you care about

Among other things, of course. There are some annoying things about them though.

– They only work with specific ANSI options
– If you don’t include the filter definition columns in the index, it might not get used
– They only work when queries use literals, not parameters or variables

Click through for examples of them in action. I would definitely like to see improvements to filtered indexes along the lines that Erik mentions. They have so much potential, but are really held back by those limitations.

Comments closed

Deduplicating Missing Index Requests

Erik Darling trains us in a skill:

Going back to our queries and our index requests, all the queries have two things in common:

– They filter on OwnerUserId
– They order by Score

There are of course other elements in the where clause to attend to, but our job is to come up with one index that helps all of our queries.

This gets a lot easier if you know the queries. If all you have is the set of index details, you can make a sane guess but the chance of being wrong is much greater.

Comments closed

PFS Contention and Heaps

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.

This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

Comments closed

Pruning Indexes on a Table

John McCormack takes us through an index pruning exercise:

Your SQL Server indexes can’t always be perfect. What I mean by that is you can’t index for every query on a busy SQL server. In this post I’m going to describe just one table from a server I have been tuning recently. It had 26 indexes, and that was not unusual for this database. This most likely happened over time as a result of developers creating an index per query using the table, as well as following some perceived best practices (with the best of intentions).

John got it down from 26 to 11 and shares thoughts on how. If you have that many separate indexes, it most likely stems from a failure in normalization—it’s uncommon that a properly-described table has that many unique access patterns. And that will often lead you to a hard floor which includes more indexes than you’d like without reimagining the table and going through a lot of developer pain.

Comments closed

Tracking Cosmos DB Re-Indexing Progress

Hasan Savran wants information:

Indexes let your queries run faster. When you need to adjust your indexing policies, database engines re-indexes your data respecting to your changes. In Cosmos DB, when you change your indexing policies, database engine truncates all your indexes and starts to reindex all your indexes from scratch. You do not want to change your indexing policies when your application is busy. Because your queries can not use the dropped indexes, queries will take longer, and they will cost more Request Units. Also, your queries might not return all the data they supposed to. You can read me my older post about indexes in Cosmos DB.

     You may want to monitor re-indexing progress; you may want to disable your application until indexing is completed or warn your team about the re-indexing progress. You can check the re-indexing progress only from SDK, that means you need to write your own code to accomplish this. I have the following code which checks the progress every second. If progress is at %100 then it quits, otherwise it continues to check progress every second until it receives 100 as result.

Hasan has provided us with a script, so check that out.

Comments closed

Choosing the Right Index and Partition in Dedicated SQL Pools

Tsuyoshi Matsuzaki gives us some advice on indexing and partitioning data in Azure Synapse Analytics dedicated SQL pools:

Designing index for a table is so primitive and important for better performance.
There’s no “one answer for any case”. You should choose right index for a table depending on the size, usage, query patterns, and cardinality.

In order to help you understand pros/cons in each indexes, I’ll show you each pictures illustrating intuitive structures of indexes available in Synapse Analytics.

Because dedicated SQL pools aren’t the same as the SQL Server box product, it’s important to go in with the understanding that indexing won’t be exactly the same as on-premises or in Azure SQL Database.

Comments closed

Finding Indexes Not in Use

Dennes Torres takes us through a few iterations of a query to find indexes not in use:

It doesn’t matter if you are trying to remove indexes for good reasons or just to work around a bad environment, let’s see in more details how to find which indexes doesn’t have enough usage to justify their existance.

First, some basic definitions, without going into many details:

Index Seek: That’s the best and desirable use of the index. It means the index tree is being used to go directly to the records we need.

Index Scan: Not so good as an index seek, so it could be better. However, sometimes even an index scan is good, a non clustered index scan means the pages of that index are smaller an better for a scan than the pages of the clustered index. There are many variations that makes an index scan good, but most times you don’t need to reach this level of analysis, you may reach your objective only analysing index seeks.

Update: When the fields are updated (update/insert/delete) all indexes which contain those fields need to be updated as well. Indexes are a balance: We increase performance on reading and suffer a bit more when writting. The problem is when the writting happens more than the reading.

Read on to see Dennes’s query evolve and bring important information to the table. For example, it’s not just how often a particular index gets used; it’s also how important the queries are which use this index. An index may only run once a month, but if it turns the most important report the CEO cares about from running in 4 hours to running in 4 seconds, you bet that index is staying.

Comments closed

Eager Spools in SQL Server

Erik Darling describes the concept of eager spooling:

Not only does SQL Server create an index for you, it doesn’t really tell you about it. There’s no loud warning here.

It also throws that index away when the query is done executing. It only exists in tempdb while the query executes, and it’s only available to the query that builds it.

And boy, they sure can take a long time to build.

I enjoy and frequently use Erik’s depiction that an eager spool is SQL Server’s passive-aggressive way of telling you that you need an index.

Comments closed

Disabled Indexes Tell No Compression Tales

Eric Cobb gives us a warning around disabling indexes:

Here at work we have a very large, very intensive data load that disables and rebuilds indexes as part of the process. We recently added compression to many of the tables and indexes in the database because it was growing quite large (around 28TB at the time). After adding compression, we got the database size down to somewhere around 17TB.

So you can imagine our surprise when the DB size jumped back up to over 30TB after the last data load! In trying to figure out what happened I discovered that most of the data compression was gone.

That’s…not great. Eric shows us a demo as well and notes that it still applies to SQL Server 2019. I’d be apt to call it a bug, myself.

Comments closed