Base Versus Simple Containment

Joe Obbish takes a crack at explaining the difference between base containment and simple containment for cardinality estimation:

We know that the first query will return 500k rows and the second query will return 0 rows. However, can SQL Server know that? Each statistics object only contains information about its own column. There’s no correlation between the UNIQUE_ID and MOD_FILTER columns, so there isn’t a way for SQL Server to know that the queries will return different estimates. The query optimizer can create an estimate based on the filters on the WHERE clause and on the histograms of the join columns, but there’s no foolproof way to do that calculation. The presence of the filters introduces uncertainty into the estimate, even with statistics that perfectly describe the data for each column. The containment assumption is all about the modeling assumption that SQL Server has to make to resolve that uncertainty.

It’s an interesting post aimed at trying to get you to think like a simplified cardinality estimator.  SQL Server doesn’t behave exactly like this, but it’s a good mental reference point.

Related Posts

The Bitmap Operator

Hugo Kornelis describes a new operator: The Bitmap operator is used to build a bitmap that, based on a hash, represents which values may be present in a data flow. Due to the chance of hash collisions in the hash function used, the Bitmap process can produce false positives but not false negatives – so a match based on […]

Read More

Visualizing Merge Joins

Bert Wagner continues his series on physical join operators: The base algorithm works as follows: SQL Server compares the first rows from both sorted inputs.  It then continues comparing the next rows from the second input as long as the values match the first input’s value.Once the values no longer match, SQL Server increments the […]

Read More

Categories

November 2017
MTWTFSS
« Oct Dec »
 12345
6789101112
13141516171819
20212223242526
27282930