Stats Terminology

Erik Darling fills in gaps on statistics terminology in his unique style:

SELECTIVITY

This tells you how special your snowflakes are. When a column is called “highly selective” that usually means values aren’t repeating all that often, if at all. Think about order numbers, identity or sequence values, GUIDs, etc.

DENSITY

This is sort of the anti-matter to selectivity. Highly dense columns aren’t very unique. They’ll return a lot of rows for a given value. Think about Zip Codes, Gender, Marital Status, etc. If you were to select all the people in 10002, a densely (there’s that word again) populated zip code in Chinatown, you’d probably wait a while, kill the query, and add another filter.

Combine that with Kendra Little’s statistics FAQ for additional learning.

Related Posts

Asynchronous Stats Updates

Monica Rathbun explains why you might want to turn on asynchronous statistics updates in your OLTP environment: By default, when Auto Update Statistics is set to True, the SQL Server Query Optimizer will automatically update statistics when data has met a threshold of changes (insert, update, delete, or merge) and the estimated rows are now […]

Read More

How Statistics In SQL Server Have Changed Over The Years

Erin Stellato gives us a version-based timeline of how SQL Server has handled statistics over the years: SQL Server 2008 Filtered statistics are introduced, and these can be created separately from a filtered index. There are some limitations around filtered indexes with regard to the Query Optimizer (see Tim Chapman’s post The Pains of Filtered Indexes and Paul […]

Read More

Categories

April 2016
MTWTFSS
« Mar May »
 123
45678910
11121314151617
18192021222324
252627282930