Hash Join Performance

Paul White wrote something.  That’s good enough to get tagged here:

It might be tempting to ask why the optimizer does not routinely add null-rejecting filters prior to equality joins. One can only suppose that this would not be beneficial in enough common cases. Most joins are not expected to encounter many null = null rejections, and adding predicates routinely could quickly become counter-productive, particularly if many join columns are present. For most joins, rejecting nulls inside the join operator is probably a better option (from a cost model perspective) than introducing an explicit Filter.

It does seem that there is an effort to prevent the very worst cases from manifesting through the post-optimization rewrite designed to reject null join rows before they reach the build input of a hash join. It seems that an unfortunate interaction exists between the effect of optimized bitmap filters and the application of this rewrite. It is also unfortunate that when this performance problem does occur, it is very difficult to diagnose from the execution plan alone.

I’m only on reading #2 of the post, so I’ve got a few more things yet to learn from it.

Related Posts

Table Variables And Parallelism

Erik Darling shows your brain on table variables: Inserts and other modifications to table variables can’t be parallelized. This is a product limitation, and the XML warns us about it. The select could go parallel if the cardinality estimate were more accurate. This could potentially be addressed with a recompile hint, or with Trace Flag […]

Read More

Non-Blocking Aggregations

Daniel Hutmacher tilts at windmills: It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match […]

Read More


November 2015
« Jan Dec »