Heap Pages Can Be Linked

Paul Randal gives a very special case in which heap pages can be (temporarily) linked together:

Now the pages are linked together!

Note that this is an OFFLINE rebuild, which is the default. What happened is that the offline ALTER TABLE … REBUILD operation uses the part of the underlying functionality for an offline ALTER INDEX … REBUILD operation that builds the leaf level of the index. As that functionality builds a doubly-linked list of pages, the newly rebuilt heap initially has a doubly-linked list of pages! This doesn’t happen for an ONLINE rebuild of the heap, which uses a totally different mechanism.

Although the pages appear doubly-linked, that’s just an artifact of the mechanism used to build the new heap – the linkages aren’t used or maintained.

In other words, you get zero benefit from this.  Click through to see Paul’s test script.

SARGable Predicates

Gail Shaw discusses what makes a particular predicate SARGable:

Any1 function on a column will prevent an index seek from happening, even if the function would not change the column’s value or the way the operator is applied, as seen in the above case. Zero added to an integer doesn’t change the value of the column, but is still sufficient to prevent an index seek operation from happening.

While I haven’t yet found any production code where the predicate is of the form ‘Column + 0’ = @Value’, I have seen many cases where there are less obvious cases of functions on columns that do nothing other than to prevent index seeks.

UPPER(Column) = UPPER(@Variable) in a case-insensitive database is one of them, RTRIM(COLUMN) = @Variable is another. SQL ignores trailing spaces when comparing strings.

This is a straightforward concept with significant performance implications.

The Importance Of Clustered Indexes

Grant Fritchey has an intro-level article on clustered indexes:

You get one clustered index per table.

That bears repeating, you get one clustered index per table.

Grant follows up with good advice.

Non-Clustered Index Design

Derik Hammer has an article on non-clustered index design:

In general, non-clustered indexes are a positive force in your database. Indexes are a form of data duplication, however, which costs disk space. In addition, non-clustered indexes need to be maintained. This can increase the number of writes which occur during INSERT or UPDATE operations and increase the number of index rebuilds or reorganizations that need to be performed.

Create non-clustered indexes to support all of your queries but be careful not to create duplicates and regularly purge indexes which are no longer necessary.

Worth reading the whole thing.

Constraints Or Unique Indexes

Kendra Little discusses unique constraints versus unique indexes:

Unique constraints and unique nonclustered indexes have a lot in common: unique constraints are implemented with a unique index behind the scenes.

While unique nonclustered indexes have a couple of additional features, I can still think of a scenario where it might be best to use unique constraints.

The value I see in unique constraints over unique indexes is semantic:  it says to everyone, “I’m supposed to be unique by virtue of my existence.”  They’re part of the data model, whereas indexes are performance optimizations to handle common data paths.

Deleting From Heaps

Uwe Ricken discusses a downside to leaving an active table as a heap:

The simplest way of removing empty data pages from a heap is via the exclusive locking of the table. But keep in mind that there is no such thing as a free lunch. Blocking the table from other processes will lead to a system that will not scale. As long as the DELETE process locks the table, no other process can access the table. The only exception is the “SNAPSHOT ISOLATION” but explaining the pros and cons of this technique is outside the scope of this article.

The next example (with the new created table) demonstrates the behavior.

The section on lock escalation was rather interesting.

Why Heaps?

Daniel Hutmacher makes an argument for when to use a heap instead of a clustered index:

I’m an outspoken advocate of always using a clustered index on each and every table you create as a matter of best practice. But even I will agree that there’s a case for using the odd heap now and then.

Read on for indicators that you might be better served with a heap.

Index Row Sizes

Kendra Little explains the rules behind how large a non-clustered index row can be:

So make sure you really need all that junk in your nonclustered index trunk. Er, key.

But even with the expanded size of key columns, sometimes I get asked a question: do columns that “secretly” get added to the key of a nonclustered index count against the maximum allowed nonclustered index key length?

You can read the short answer, but I recommend reading the full post.

Surrogate Versus Natural Keys

Kenneth Fisher digs into the debate on surrogate keys versus natural keys:

A natural key is one constructed of data that already exists in the table. For example using latitude and longitude in a table of addresses. Or the social security number in a table of employees. (Before you say anything, yes, the social security number is a horrible primary key. Be patient.)

My personal preference is to use surrogate keys most of the time and put unique constraints (or unique indexes) on the natural key.  There are some occasions in which I’d deviate, but ceteris paribus I’d pick this strategy..

Dynamic Index Generation

Brent Ozar generates 999 indexes:

The CHARACTER_MAXIMUM_LENGTH <> -1 OR IS NULL stuff is because I don’t want to try to index NVARCHAR(MAX) fields, but I do want to index NVARCHAR(50) fields.

The ORDER BY is because I wanted to get a wide variety of leading fields. If I just ordered by c1.COLUMN_NAME, I wouldn’t get all of the first fields covered in the first 999 indexes. (And I’m not even using a large table.)

Sometimes I think I’ve worked on systems which used this script to build indexes.  But then I read the index names:  “dta.”  And it all makes sense.


September 2017
« Aug