Press "Enter" to skip to content

Explaining Column Statistics

Bert Wagner takes us through column statistics in SQL Server:

Statistics are the primary meta data used by the query optimizer to help estimate the costs of retrieving data for a specific query plan.

The reason SQL Server uses statistics is to avoid having to calculate information about the data during query plan generation. For example, you don’t want to have the optimizer scan a billion row table to learn information about it, only to then scan it again when executing the actual query.

Instead, it’s preferable to have those summary statistics pre-calculated ahead of time. This allows the query optimizer to quickly generate and compare multiple candidate plans before choosing one to actually execute.

These statistics aren’t perfect, but life is almost always better when you have accurate, up-to-date statistics on relevant columns.