Performance Impact With Nullable Columns

Daniel Hutmacher shows us situations in which nullable columns can lead to worse performance:

We expected the Semi Join to turn into an Anti Semi Join, but the plan now also contains a Nested Loop branch with a Row Count Spool – what’s that about? Turns out the Row Count Spool, along with its index seek, has to do with the NOT IN() and the fact that we’re looking at a nullable column. Remember that…
x NOT IN (y, z, NULL)
… always returns false, because the NULL value could represent essentially anything, including the x. And so it is with the inner table, if there happens to be a NULL value among those rows.

Definitely worth the read. There are ways around this performance hit but design decisions have consequences.

Related Posts

Minimal Logging into Empty Clustered Indexes

Paul White explains how to perform minimal logging when using the INSERT..SELECT pattern to insert into an empty table with a clustered index: The summary top row suggests that all inserts to an empty clustered index will be minimally logged as long as TABLOCK and ORDER hints are specified. The TABLOCK hint is required to enable the RowSetBulk facility as used for heap table bulk loads. An ORDER hint […]

Read More

Blocked Queries and the Plan Cache

Erik Darling lays out an argument why execution time isn’t always a great measure for finding bad queries: Most scripts (even ones I’ve worked on!) that look at the plan cache, have had the ability to sort it by different metrics: CPU, reads, writes, duration, etc. A lot of people are very interested in long […]

Read More

Categories

January 2019
MTWTFSS
« Dec Feb »
 123456
78910111213
14151617181920
21222324252627
28293031