Press "Enter" to skip to content

Category: Indexing

Indexing for Deletion

Jared Westover needs to delete some rows:

In this article, we’ll examine the importance of indexes in improving DELETE statements. Also, many developers assume that adding a foreign key creates an index, which means they’re typically missing. How can you identify which foreign keys are missing indexes? Don’t worry. A handy DMV helps find them. Starting today, you’ll have the skills to make your DELETE statements faster.

Click through for the full post.

Comments closed

Performance Tuning XML Operations in SQL Server

Ed Pollack does a bit of tuning:

SQL Server provides a variety of ways to tune XML so that it provides consistent performance, consumes less space, all while ensuring efficient access to critical data.

At its core, the metadata-styled XML format runs counter to the data that SQL Server is optimized to manage. Therefore, additional features were added to SQL Server over time that allowed for XML data to be indexed and compressed.

While these features are critical for managing XML data as it becomes large, it is important to remember what XML is intended for and why it is (loosely) structured as it is. Many data professionals have used shortcuts when XML was small, such as storing and analyzing it in string format, only to be forced to reckon with performance challenges when scanning large strings become agonizingly slow.

Read on for the full article.

Comments closed

Parallel Index Builds in SQL Server

Paul White delves into history:

SQL Server doesn’t support parallel modifications to a b-tree index.

That might sound surprising. After all, you can certainly write to the same b-tree index from multiple sessions concurrently. For example, two sessions can happily write alternating odd and even numbers to the same integer b-tree index. So long as both sessions execute on different schedulers and take row locks, there will be no blocking and you’ll get true concurrency.

No, what I mean is: A single session can’t write to a b-tree index using more than one thread. No parallel plan modifications of a b-tree index, in other words. It’s a bit like the lack of parallel backward ordered scans. There’s no reason it couldn’t be implemented, but it hasn’t been so far.

Click through for a link to the full article. Or click the link I just added, your choice.

Comments closed

Using a Filtered Index to Prevent Duplicate Rows

Jared Westover solves a problem:

Have you ever needed to prevent duplicate rows from popping up in a table, but the built-in unique constraint in SQL Server falls short? I ran into a table with duplicate rows, but we couldn’t delete them, at least for now. Ideally, you would delete all the duplicates and call it a day. If life were this simple, it would be boring. We needed to prevent new ones from showing up and keep the existing ones. The problem with a unique constraint is that it applies to all the rows in a column.

You can also use a filtered index to prevent the problem in the first place. This was actually an interview question I liked to give: with filtered indexes, you can ensure each non-NULL value was unique, but that you could have as many rows with NULL as your dataset required.

Comments closed

Optimized Column Order for Indexes

Eitan Blumin talks indexing:

SQL Server performance optimization is not a simple topic, and index design plays a pivotal role in it, determining the efficiency of database queries.

One key aspect that often influences performance is the order of columns in an index.

In this guide, I’ll use my real-world experience from our consulting jobs to examine the thinking process behind selecting the best column sequence for an index, the logic behind the decisions, and offer some practical insights for optimal database performance.

Read on to learn more about indexing strategy. And this is probably a good time to remind people that the missing index DMV’s column order has as its sole basis the column’s ordinal value. You can, and often will, do better than the missing index DMV recommendation by going through Eitan’s exercise.

Comments closed

Indexing for Read-Scale Databases

Jose Manuel Jurado Diaz shares a customer case:

Today, I worked on a service request that our customer has a Business Critical database with 4 vCores and Read-Scale Out enabled. Our customer noticed several performance issues using Read-Scale Out database and I would like to explain several lessons learned found during the troubleshooting steps.

Click through for notes on troubleshooting and improving performance.

Comments closed

A Deep Dive into Covering and Non-Covering Indexes

Etienne Lopes goes deeper than I thought:

A few days ago a client brought me a question, he had two tables with the same data, one in each database (same SQL version and compatibility level in different environments with similar infrastructure and configurations). Both tables had an extremely selective column and both had indexes in that column. The thing is that when running a particular query on one Database, let’s call it DB1 here, it was really fast whereas running the same query on the other database (DB2) it was a lot slower. The subset of rows that matched the WHERE clause was the same and it was really small in both cases (since the tables had the same data).

“Ok, let’s have a look at that”, I said, and when doing so, the query running on DB1 was doing a Clustered Index Seek whereas the same query in DB2 was doing a Table Scan. Didn’t I say that both tables had indexes in that column?

Read on for the post. Etienne warns that it’s a big boy of a post, and it covers a lot, including “seek” operations that are scans in disguise, index levels, and more.

Comments closed

Writing Conditional JOIN and WHERE Clauses

Erik Darling has Blondie stuck in my head:

The OR operator is a perfectly valid one to use in SQL statements. If you use an IN clause, there’s a reasonable chance that the optimizer will convert it to a series of OR statements.

For example, IN(1, 2, 3) could end up being = 1 OR = 2 OR = 3 without you doing a darn thing. Optimizers are funny like that. Funny little bunnies.

The problem generally isn’t when asking for IN or OR for a single column, with a list of literal values, the problem is usually when you:

  • Use OR across multiple where clause columns
  • Use OR in a join clause of any variety
  • Use OR to handle NULL parameters or variables

This is an excellent, detailed article and well worth the read.

Comments closed