Understanding The Cardinality Estimator

SQL Scotsman is working on a very interesting series on statistics and the different cardinality estimators.  So far, this is a three-part series.  Part one is an overview:

A few of those assumptions changed in the new  SQL Server 2014/2016 CE, namely:

  • Independence becomes Correlation: In absence of existing multi-column statistics, the legacy CE views the distribution of data contained across different columns as uncorrelated with one another. This assumption of independence often does not reflect the reality of a typical SQL Server database schema, where implied correlations do actually exist. The new CE uses an increased correlation assumption for multiple predicates and an exponential back off algorithm to derive cardinality estimates.

  • Simple Join Containment becomes Base Join Containment: Under the legacy CE, the assumption is that non-join predicates are somehow correlated which is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), and so this can translate into a reduced row estimate for the join. At a high level, the new CE derives the join selectivity from base-table histograms without scaling down using the associated filter predicates. Instead the new CE computes join selectivity using base-table histograms before applying the selectivity of non-join filters.

Part two looks at trace flag 9481:

When To Use Trace Flag 9481

Query Scope:  You’ve moved (migrated/upgraded) to SQL Server 2014 / 2016, your databases are at compatibility level 120 / 130 and using the new CE, your workload is performing well overall but there are a few regressions where a small number of queries actually perform worse.  Use Trace Flag 9481 on a per query basis as a temporary measure until you can tune / rewrite the query so it performs well without the hint.

Part three discusses database scoped configurations in SQL Server 2016:

The problem with lowering the database compatibility level is that you can’t leverage the new engine functionality available under the latest compatibility level.

This problem was solved in SQL Server 2016 with the introduction of Database Scoped Configurations which gives you the ability to make several database-level configuration changes for properties that were previously configured at the instance-level.  In particular, the LEGACY_CARDINALITY_ESTIMATION database scoped configuration allows you to set the cardinality estimation model independent of the database compatibility level. This option allows you to leverage all new functionality provided with compatibility level 130 but still use the legacy CE in the odd chance that the latest CE casuses severe query regressions across your workload.

The article on statistics is quite long for a blog post and a great read.  I’m looking forward to reading more.

Related Posts

What’s Special About stats_id = 1

Wayne Sheffield explains what makes stats_id = 1 special, as well as a relationship between stats_id and index_id in SQL Server: If you were to look at sys.indexes, you would see that these two indexes use index_id values of 1 and 3. The value 2 is skipped. It’s not because there used to be an […]

Read More

Retrieving Statistic Use From Query Plan XML

Lonny Niederstadt shows us how to retrieve stats usage details from SQL Server query plans if trace flag 8666 is enabled: Years ago someone said “Hey – why not drop auto-created stats, since the stats you need will just get created again and you’ll end up getting rid of those you no longer need.”   That […]

Read More

Categories

November 2016
MTWTFSS
« Oct Dec »
 123456
78910111213
14151617181920
21222324252627
282930