Press "Enter" to skip to content

Category: Indexing

A Use for Secondary XML Indexes

Hugo Kornelis follows up from an earlier post:

Welcome back to my plansplaining blog series, where I dive deep into the details of non-obvious execution plans. This part is also a sort of follow up on my post from two weeks ago, when I wrote about the structure and usage of XML indexes, and had to admit that I had not been able to come up with good use cases for all types of secondary XML index.

That very same day, I received an email from Mikael (Micke) Eriksson, who pointed me to this question and answer on Stack Exchange for Database Administrators. I then modified that example a bit, to come up with an execution plan that I consider interesting enough that I want to describe it here.

Click through for the example and a dive into what the plan does.

Leave a Comment

Moving Away from Average Fragmentation for Index Maintenance

Tim Radney proffers some advice:

As a SQL Server DBA with years of experience tuning production environments, I’ve seen the same patterns repeat: nightly index maintenance jobs running for hours, consuming massive CPU and I/O, only for performance to remain inconsistent or even degrade in some cases. Many of us (myself included, early in my career) relied heavily on avg_fragmentation_in_percent from sys.dm_db_index_physical_stats to decide when to reorganize or rebuild indexes. It’s time we move past that outdated approach and stop just doing what we’ve always done.

One thing I would add on to Tim’s advice is, determine whether that index even needs rebuilt or if you’re burning resources for no practical benefit. If there’s no practical performance benefit from the operation—and with all-flash arrays that are within an order of magnitude of RAM speeds, you might already be close to that point—then the index rebuild is for naught. The index maintenance strategy that we all learned back in the day was to minimize the time we spent waiting for spinning disks to reach relevant data. When random access lookups are approximately the same speed as sequential lookups, position on the disk doesn’t matter all that much.

Leave a Comment

Storage of XML Indexes in SQL Server

Hugo Kornelis continues a series on storage structures:

After covering on-disk rowstorecolumnstore indexesmemory-optimized storage, and memory-optimized columnstores, it is now time to turn our attention to storage structures that are used for specific datatypes only. The first “victim” will be the XML index.

When you need to store XML data in a SQL Server database, you can choose between two data types, each with their pros and cons. You can choose to store the XML data as nvarchar(max). This preserves the exact content of the XML as you received it, which might be required for legal reasons. However, any query that cares about the specific content of the XML data will have to resort to very complicated string expressions.

The xml data type shreds the XML and stores the content in an internal format, that allows SQL Server to work with for instance XQuery or XPath expressions. This format also saves space as compared to the nvarchar(max) alternative. However, when you query the data, the content will be the same, but formatting and whitespace might be different.

The latter also allow you to build indexes on top of them, and those indexes are what Hugo covers.

Leave a Comment

Matching Queries to Indexed Views

Erik Darling has a new video:

Erik Darling here with Darling Data, and in today’s video we’re going to continue on the Learn T-SQL voyage that we have started, and I’m going to talk today a little bit about indexed view matching, because SQL Server is, let’s just call it a mature, or an experienced database engine, and is quite capable, at least in Enterprise Edition, Standard Edition, you do not pay the Microsoft Friendship Tax, so you will be taxed performance-wise, but is quite capable of matching base queries to an indexed view where the syntax matches in some way between them. So, usually exactly between them, not in some way, usually pretty close to just about what you would ask for.

Click through for several tips and, as you experience the frustration of consistently trying to make best use of the view’s index, be glad you’re not trying to get filtered views to work.

Leave a Comment

Bloom Filters with Valkey

Jay Miller checks for a record:

A bloom filter is a small, probabilistic data structure designed to answer one question: “Have I seen this item before?” It provides two potential answers: Absolutely Not, and Probably.

You may think that 100% Yes or No would be better but here’s the thing, probably is really fast and you’re really concerned about the Absolutely Not’s taking up unnecessary connections.

The article speaks to Aiven’s implementation of a Bloom filter in Valkey, but does get into some neat details on bloom filters in general. And if you want to go further down that route, Paul White explains how SQL Server uses Bloom filters.

Comments closed

The Pain of Moving Indexes between Filegroups

Erik Darling explains a process:

At some point you’re going to want to move some indexes to a new filegroup. Maybe you’re separating data across storage, maybe you’re cleaning up after someone who put everything on PRIMARY and walked away, maybe you’ve got your reasons and they’re none of my business.

Whatever the cause, you’d think this would be a solved problem in a database that’s been around since the Clinton administration.

It is not.

Some days, I’m convinced that the only way to win is not to play at all. Erik explaining how to migrate LOB data across filegroups fits that bill perfectly.

Comments closed

Working with Indexed Views

Erik Darling talks indexed views:

Anyway. T-SQL Server Management Studio. When most people think about index views, they rightfully think about all the stuff they can’t do with them.

And I sympathize with that because, man, so many times they’ve been like, oh, if only you could do this, if only you could do that, it sure would be nice. And I realize that all the air has gone out of the room as far as making index views more powerful because everyone’s like, well, you could just use batch mode.

Click through for some tips around update operations when dealing with indexed views, as well as a side rant about merge joins.

Comments closed

Filtered Indexes in SQL Server

Erik Darling has a new video:

Now, you just can’t talk about indexing in SQL Server really without talking about filtered indexes. They are a very, very important thing. Conceptually, they are just not that hard to figure out.

It’s an index with a where clause. It only indexes some of the data. It qualifies for the where clause. I don’t know. Like the benefits of that just seem rather apparent to me.

Benjamin Franklin highly encourages you to watch this video, even though filtered indexes are one of the most frustrating things in SQL Server. There are so many cases where I think they should work, and they actually work in approximately a third of those cases.

Comments closed

Automatic Index Compaction in Azure SQL

Chad Callihan takes a look at a preview feature:

There isn’t one set way to manage indexes. Maybe you use Ola Hallengren scripts. Maybe it’s something you put together yourself. Either way, there might be a big shift coming for SQL Server database administrators and how index management is handled.

Last month, Microsoft announced Automatic Index Compaction, which is in preview for Azure SQL Database, Azure SQL Managed Instance, and SQL Database in Fabric. Instead of utilizing something like Ola Hallengren scripts or your own homegrown setup to monitor and rebuild indexes, the database engine will continuously run in the background and handle indexes for you, hence the “automatic” in the name.

Read on to see how it works, as well as a note around page density and index fragmentation. But Jeff Moden makes a good point in the comments, so check that out.

Comments closed

Storage of Memory-Optimized Columnstore Indexes

Hugo Kornelis joins a pair of technologies:

Time for the next part in my series on storage structures. The previous parts covered on-disk rowstorecolumnstore indexes, and memory-optimized storage. In this part, I will look at the combination of the latter two: memory-optimized columnstore indexes.

Memory-optimized columnstore indexes were introduced in SQL Server 2016. I’ve seen some slick Microsoft marketing sessions in that time that were big on “real-time operational analytics”. A new trend where analytical processing would no longer be done on a stale copy of the data in a separate data warehouse, but directly on the OLTP database. Reports would always be fully current, there would be no more need for an ETL pipeline, and due to the combination of memory-optimized for OLTP workloads and columnstore for analytical processing, everything would always be fast. In theory.

Yeah, this one kinda fizzled out quickly. It was in line with the HTAP craze from about the same time period. And Hugo shows in this post part of why very few companies ever adopted it.

Comments closed