Stats Terminology

Erik Darling fills in gaps on statistics terminology in his unique style:

SELECTIVITY

This tells you how special your snowflakes are. When a column is called “highly selective” that usually means values aren’t repeating all that often, if at all. Think about order numbers, identity or sequence values, GUIDs, etc.

DENSITY

This is sort of the anti-matter to selectivity. Highly dense columns aren’t very unique. They’ll return a lot of rows for a given value. Think about Zip Codes, Gender, Marital Status, etc. If you were to select all the people in 10002, a densely (there’s that word again) populated zip code in Chinatown, you’d probably wait a while, kill the query, and add another filter.

Combine that with Kendra Little’s statistics FAQ for additional learning.

Related Posts

Troubleshooting Query Performance Changes

Erin Stellato walks us through a troubleshooting guide when users complain about poorly-performing queries: This is tale of troubleshooting… When you unexpectedly or intermittently encounter a change in query performance, it can be extremely frustrating and troublesome for a DBA or developer. If you’re not using Query Store, a third-party application, or your own method […]

Read More

Workarounds for Updating Stats on Secondaries

Niko Neugebauer wants statistics updates on tables running on readable Availability Group secondary nodes: Let’s list the basic known details for the possible solution(for the Enterprise Edition of the Sql Server that is):– We can make the secondary replica readable and read the same data on it. (Not that you should do that by default, […]

Read More

Categories

April 2016
MTWTFSS
« Mar May »
 123
45678910
11121314151617
18192021222324
252627282930