Press "Enter" to skip to content

Category: Indexing

Another Look at Cosmos DB Indexing

Hasan Savran revises some indexing recommendations based on changes to Cosmos DB:

Lazy indexing used to be an option. It’s not in any CosmosDB documentation anymore. By using Lazy indexing, you could save 20 to 30 percent for Request Units. Just like anything else in life, you get what you pay for when it comes to Lazy indexing. By selecting Lazy indexing, you are saying that eventually Indexes will be updated. If Indexes are not updated, that means your queries might not return all the data since all data might not be indexed yet. Lazy indexing is still an option, nobody talks about it for a good reason. In my opinion, it should be listed as obsolete feature or it should have a better documentation about how it works or why it might not be a good option for your solutions.

     If you use Lazy Indexing to reduce Request Units in your solution, change it to consistent now unless you have a really good reason!

Read on for more advice in this vein.

Comments closed

Making Non-SARGable Queries SARGable with an Index

Denis Gobo violates Betteridge’s Law of Headlines:

This question came up the other day from a co-worker, he said he couldn’t change a query but was there a way of making the same query produce a better plan by doing something else perhaps (magic?)

He said his query had a WHERE clause that looked like the following

WHERE RIGHT(SomeColumn,3) = '333'

I then asked if he could change the table, his answer was that he couldn’t mess around with the current columns but he could add a column

Click through to see how Denis was able to solve this problem.

Comments closed

A Sizing Problem with Heaps

Slava Murygin shows an example where adding a clustered index to an existing table can reduce its size:

Tables were very narrow with just a few columns and my expectations for data growth were very modest. However, after just a little while I was very surprised when my database showed huge unexpected growth and size of the data became multiple times higher than I’ve expected.

After very little research I’ve found and fixed the problem. In this post I’ll describe how I’ve done it.

Read on to learn how Slava figured this out and how a clustered index fixed the problem.

Comments closed

Choosing Clustered Index Columns

Ed Elliott wades into the clustered index debate:

I have seen this debated in forums spread over the internet for decades, and the advice that we gave ten years ago isn’t as valid today as it was then. Ten years ago, memory was considerably less, and disks were spinning rust. The advent of SSD’s and the ability to get servers with more memory than data, even on large systems have changed how we should think about designing and maintaining databases.

I generally subscribe to the NUSE philosophy: Narrow, Unique, Static, Ever-Increasing. That generally leads me to selecting identity integers or longs. For junction tables (whose entire purpose is to join two tables together and which never get referenced outside of that), I use the primary key as the clustered index.

In extreme insert scenarios, I can see wanting to maximize fragmentation in order to insert into more pages in the B-tree and avoid hot spot pages.

Comments closed

More on Index Fragmentation

Tibor Karaszi revises and extends some remarks on index fragmentation:

In my last blog post, I wanted to focus on the sequential vs random I/O aspect and how that part should be more or less irrelevant with modern hardware. So I did a test that did a full scan (following the linked list of an index) and see if I could notice any performance difference on my SSD. I couldn’t.

That isn’t the end of the story, it turns out. Another aspect is how the data is brought into memory. You might know that SQL server can do “read ahead”, meaning it does larger reads per I/O instead if single-page I/O. I.e., fewer but larger I/O operations. This sounds fine, but what happens when we have fragmentation?

Read on for a situation in which fragmentation does matter.

Comments closed

Using Filtered Indexes

Monica Rathbun fills us in on filtered indexes:

What is a filtered index?
Simply it’s an index with a where clause. It is an optimized non clustered index that can be narrowed down in scope to better fit a subset of data. Example being date ranges, years, non NULLs or specific product types.

I wish filtered indexes were better than they are because they can solve some interesting problems but get stuck on parameterized queries.

Comments closed

Foreign Key Checks and Index Usage Stats

Marek Masko walks through an attendee question:

Last week I was speaking at SQLDay 2019 conference in Wroclaw, Poland. My session was about things you should focus on during work with Legacy Databases. One of the topics I discussed was concerning the database usage statistics collection and aggregation (mainly indexes and stored procedures).

After the session, one of the attendees came to me and ask me if I know that actions causing check of Foreign Keys and that use indexes underneath, don’t update index usage stats DMV. I was very surprised because, in my opinion, such behavior would be a huge SQL Server defect. So I decided to check it out…

Read on for Marek’s explanation and demo.

Comments closed

Minimal Logging into Empty Clustered Indexes

Paul White explains how to perform minimal logging when using the INSERT..SELECT pattern to insert into an empty table with a clustered index:

The summary top row suggests that all inserts to an empty clustered index will be minimally logged as long as TABLOCK and ORDER hints are specified. The TABLOCK hint is required to enable the RowSetBulk facility as used for heap table bulk loads. An ORDER hint is required to ensure rows arrive at the Clustered Index Insert plan operator in target index key order. Without this guarantee, SQL Server might add index rows that are not sorted correctly, which would not be good.

Unlike other bulk loading methods, it is not possible to specify the required ORDER hint on an INSERT...SELECT statement. This hint is not the same as using an ORDER BY clause on the INSERT...SELECT statement. An ORDER BY clause on an INSERTonly guarantees the way any identity values are assigned, not row insert order.

Read on to see what you can do.

Comments closed

Compress Those Indexes

Pamela Mooney reminds us that if you’re going to compress your heap or clustered index, remember those non-clustered indexes as well:

We compress to gain space, correct? So why sacrifice it to our indexes?

Here is a script (complete with demo databases and tables) that will find the culprits and fix them for you.  Put it (minus the demo databases and tables) in a job, and quit worrying about this.  Your tables (and indexes) will be sparkling clean. You’re welcome.

Click through for the script. It’s a good reminder that compressing a clustered index does not automatically compress non-clustered indexes—nor should it, as these won’t necessarily compress at the same ratio because data type makeup can differ.

Comments closed