
Category: Indexing

Finding Heaps in a Database

David Fowler has a quick one for us:

This is just going to be a quick post, but I was asked a question the other day: how can I find all the tables in the database that don’t have a clustered index?

Tables without clustered indexes, known as heaps, can give you quite a few different performance problems for various reasons. I don’t really want to go into those here (that could be a subject for another post), but finding these heaps is actually very simple.

And David has a short script to show just how simple it is.
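
David's actual script is in the linked post. As a rough, independent sketch of the idea: SQL Server's `sys.indexes` catalog view records a heap as a row with `type = 0`, so joining it to `sys.tables` lists every table without a clustered index. The Python wrapper here is an assumption for illustration only. `find_heaps` is a hypothetical helper, and it expects a DB-API cursor (for example, one from pyodbc) that is already connected to the database you want to check.

```python
# T-SQL: a heap appears in sys.indexes as a row with type = 0 (type_desc = 'HEAP').
HEAP_QUERY = """
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.indexes AS i ON i.object_id = t.object_id
WHERE i.type = 0
ORDER BY s.name, t.name;
"""


def find_heaps(cursor):
    """Return 'schema.table' names for every heap in the current database.

    `cursor` is assumed to be a DB-API 2.0 cursor on a SQL Server connection;
    this helper is hypothetical and not taken from the linked post.
    """
    cursor.execute(HEAP_QUERY)
    return [f"{schema}.{table}" for schema, table in cursor.fetchall()]
```

The query can also be pasted straight into a query window; the wrapper only matters if you want to run the check across several databases from a script.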


Dropping Unused Indexes in Azure SQL DB

Monica Rathbun gives an important lesson around tracking index utilization in Azure SQL Database:

If the index has not shown any utilization, I investigate to determine if it is one that can be removed. However, this week something caught my attention. I was looking at a client’s indexes and noted the values for these were not as high as I would have expected. I know that these index statistics are reset upon every SQL Server service restart, but in this case I was working on an Azure SQL Database, which got me wondering exactly how that worked. With an Azure Virtual Machine or an on-prem SQL Server instance, this is easy to figure out. But with an Azure SQL Database we do not have control over when restarts are done. And what about the Serverless offering (which pauses unutilized databases to reduce costs)? How does it behave? I really want to make sure, before I remove any indexes from a database, that I am examining the best data possible to make that decision. So, I did some digging.

Read on to see what Monica discovered.


Columnstore Index Maintenance

Ed Pollack continues a series on columnstore indexes:

Like with standard B-tree indexes, a columnstore index may be the target of a rebuild or reorganize operation. The similarities end here, as the function of each is significantly different and worth considering carefully prior to using either.

There are two challenges addressed by columnstore index maintenance:

1. Residual open rowgroups or open deltastores after write operations complete.
2. An abundance of undersized rowgroups that accumulate over time.

Read on for the full story.
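
To make the two challenges concrete, here is a small sketch (not from Ed's article) that tallies open and undersized rowgroups. It assumes you have already pulled `state_desc` and `total_rows` from `sys.dm_db_column_store_row_group_physical_stats`. The function name `summarize_rowgroups` and the `undersized_below` cutoff are hypothetical, chosen only for illustration.

```python
def summarize_rowgroups(rowgroups, undersized_below=100_000):
    """Count open, undersized, and healthy rowgroups.

    `rowgroups` is an iterable of (state_desc, total_rows) pairs, shaped like
    columns of sys.dm_db_column_store_row_group_physical_stats.
    `undersized_below` is an arbitrary illustrative cutoff, not a documented one.
    Rowgroups in other states (for example CLOSED or TOMBSTONE) are not counted.
    """
    summary = {"open": 0, "undersized": 0, "healthy": 0}
    for state_desc, total_rows in rowgroups:
        if state_desc == "OPEN":
            # Still a deltastore: rows have not been compressed yet.
            summary["open"] += 1
        elif state_desc == "COMPRESSED" and total_rows < undersized_below:
            summary["undersized"] += 1
        elif state_desc == "COMPRESSED":
            summary["healthy"] += 1
    return summary
```

High counts in either of the first two buckets would suggest that maintenance is worth a look; what counts as "high" depends on the table and workload.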


Making a Heap Molehill out of a Heap Mountain

Bert Wagner needed to move a large subset of a heap into its own heap:

Recently I had to filter out 1.2 billion records from a 3.5 billion row heap. Don’t ask me why this 3.5 billion row table is stored as a heap.

If the lack of a clustered index wasn’t bad enough, I also had some other restrictions:

1. I couldn’t add a clustered index (or any index for that matter) sorted on the key I needed to filter on. It wasn’t my system, and I needed to access the 1.2 billion records sooner than it would take to get a clustered index approved and added. Even then, maybe the lack of a clustered index on this table is a feature and my request would be denied. Who knows.
2. I didn’t have a server that could store all of the data. At first I thought of copying all 3.5 billion rows to my own server and indexing it how I needed, but I didn’t have enough storage space anywhere to do that.
3. My connection to the server had a relatively short timeout set on it. This also couldn’t be changed. If I couldn’t copy all 3.5 billion rows because of storage, I also couldn’t copy all 1.2 billion records in one fell swoop because the connection would time out.

I’m also glad to see that Bert is back in action and look forward to seeing these posts and videos pop up again.


Missing Indexes Don’t Tell the Whole Story

Erik Darling explains some of the shortcomings of the missing indexes DMV:

The problem with relying on any data point is that when it’s not there, it can look like there’s nothing to see.

Missing indexes requests are one of those data points. Even though there are many reasons why they might not be there, sometimes it’s not terribly clear why one might not surface.

That can be annoying if you’re trying to do a general round of tuning on a server, because you can miss some easy opportunities to make improvements.

Read on for a few examples of where the results can betray you.


Learning About Index Utilization with dbatools

Ben Miller takes us through a way to know your data:

You have many tables in your databases and you want to know how they are used. There are DMVs for index usage stats, like sys.dm_db_index_usage_stats, and querying them is insightful, but how do the stats change over time? These stats are reset when the instance is restarted. It is good to know that you have 2000 seeks and 500 scans of the index, but when did they happen? Was it on a common day? Common hour?

Ben has a way to help you figure that out.
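
Ben's approach uses dbatools and is described in the linked post. Separately, as a minimal sketch of why periodic snapshots help: the counters in `sys.dm_db_index_usage_stats` are cumulative, so activity per interval is the difference between consecutive snapshots, and a counter that goes down signals a reset. `UsageSnapshot` and `usage_deltas` are hypothetical names introduced for this example, and the sketch assumes you capture `user_seeks` and `user_scans` on a schedule.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UsageSnapshot:
    captured_at: datetime
    user_seeks: int  # cumulative since the last counter reset
    user_scans: int  # cumulative since the last counter reset


def usage_deltas(snapshots):
    """Return (captured_at, seeks, scans) activity for each snapshot interval."""
    ordered = sorted(snapshots, key=lambda s: s.captured_at)
    deltas = []
    for prev, curr in zip(ordered, ordered[1:]):
        if curr.user_seeks < prev.user_seeks or curr.user_scans < prev.user_scans:
            # Counters went backwards, so they were reset in this interval.
            # The current values are the activity recorded since the reset.
            seeks, scans = curr.user_seeks, curr.user_scans
        else:
            seeks = curr.user_seeks - prev.user_seeks
            scans = curr.user_scans - prev.user_scans
        deltas.append((curr.captured_at, seeks, scans))
    return deltas
```

Grouping the resulting deltas by day of week or hour is then one way to answer the "common day, common hour" questions. Activity between the last snapshot before a reset and the reset itself is lost, so shorter intervals lose less.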


Automating Columnstore Index Partition Rebuilds

Brett Powell has a procedure for us:

This post provides an example of a stored procedure which A) identifies the last two partitions of an Azure Synapse Analytics SQL pool table (which uses a columnstore index by default) and B) rebuilds the index for these two partitions. Additionally, a sample PowerShell-based Azure Automation runbook is included for scheduling the execution of this procedure.

This post follows up on the previous post regarding a Power BI template to be used to analyze the health or quality of a columnstore index. For example, the template shared may help you find that the last one or two partitions, such as partition numbers 39 and 40 out of 40 partitions, have many open (uncompressed) and/or not-optimized rowgroups. The cause of these low-quality partitions could be that recent and ongoing data processing events (inserts, updates) are impacting these partitions. Perhaps partitions 39 and 40 refer to the current and prior month.

Read on for the link to the script, as well as details on how to use it.


Key Lookups and Self-Joins

Erik Darling has an interesting method for eliminating key lookups:

This post isn’t going to go terribly deep into anything, but I do want to make a few things about lookups more clear, because I don’t usually see them mentioned anywhere.

1. Lookups are joins between two indexes on the same table
2. Lookups can only be done via nested loops joins
3. Lookups can’t be moved around in the execution plan

I don’t want you to think that every lookup is bad and needs to be fixed, but I do want you to understand some of the limitations around optimizing them.

Definitely worth the read.


Optimizing Read Performance of Heaps

Uwe Ricken continues a series on heaps in SQL Server:

Heaps are not necessarily the developer’s favourite child, as they are not very performant, especially when it comes to selecting data (most people think so!). Certainly, there is something true about this opinion, but in the end, it is always the workload that decides it. In this article, I describe how a Heap works when data are selected. If you understand the process in SQL Server when reading data from a Heap, you can easily decide if a Heap is the best solution for your workload.

Uwe hits on a couple of the (few) use cases where heap performance can match and sometimes surpass clustered index performance.


Finding Index Usage Stats in Query Store

Grant Fritchey gives us another option for determining whether an index is in use:

One of the most frequent questions you’ll hear online is how to determine if a particular index is in use. There is no perfect answer to this question. You can look at sys.dm_db_index_usage_stats to get a pretty good picture of whether or not an index is in use. However, this DMV has a few holes through which you could be misled.

I thought of another way to get an idea of how and where an index is being used. This is also a flawed solution, but, still, an interesting one.

What if we queried the information in Query Store?

Be sure to read Grant’s warning before jumping into this, but at least it gives us another option, as well as a better understanding of which queries are using particular indexes.
