Adaptive Join Internals

Dmitry Pilugin digs deep into how adaptive joins work in SQL Server 2017:

We have three types of physical join algorithms in SQL Server: hash, nested loops and merge. Adaptive join allows SQL Server automatically choose an actual physical algorithm on the fly between the first two – hash (HM) and nested loops (NL).

NL has two join strategiesnaive nested loops join (inner loop scans the whole inner table or index) and index nested loops join (index on the join column of the inner table is used to find necessary rows and then those rows are applied to the outer row, also called Nested Loops Apply). Typically, the second one performs very well if you have rather small input on the outer side and indexed rather big input on the inner side.

HM is more universal and uses hash algorithms to match rows, so no indexes are necessary. You may refer to my blog post Hash Join Execution Internals for more details.

Adaptive Join starts execution as a Hash Join. It consumes all the input of the build phase and looks at the adaptive join threshold, if the number of rows is more or equal this threshold it will continue as a hash join. However, if the number of rows is less than this threshold, it will switch to a NL.

If you want to get a better understanding of how adaptive joins works, Dmitry’s post is a great start.

Related Posts

Unboxing ISPACs

It’s an early Christmas for Richie Lee: The first file that we’re going to look at is the [Content_Types].xml file, and this is the file that confirms that the ZipPackage class is used. There’s an article here that is ten years old but is still valid (scroll down to the System.IO.Packaging INcludes Zip Support to read up on this.) […]

Read More

Diving Into Spark’s Cost-Based Optimizer

Ron Hu, et al, explain how Spark’s cost-based optimizer works: At its core, Spark’s Catalyst optimizer is a general library for representing query plans as trees and sequentially applying a number of optimization rules to manipulate them. A majority of these optimization rules are based on heuristics, i.e., they only account for a query’s structure and ignore […]

Read More