Category: Statistics

Stats Q&A

Erin Stellato has a two-parter on statistics in SQL Server. Part 1 deals with questions on stats creation:

Last week I presented a session, Demystifying Statistics in SQL Server, at the PASS Community Summit, and I had a lot of great questions; so many that I’m creating multiple posts to answer them. This first post is dedicated to questions specific to creating statistics in SQL Server.

Part 2 deals with stats updates:

Last week I presented a session, Demystifying Statistics in SQL Server, at the PASS Community Summit, and I had a lot of great questions; so many that I’m creating multiple posts to answer them. This second post is dedicated to questions specific to updating statistics in SQL Server. Of note…I have a couple previous posts which also include helpful information:

Click through for lots of questions and lots of good answers.

All about Synchronous Stats Updates

Paul Randal shares some thoughts about synchronous stats updates:

The SQL Server query optimizer makes use of statistics during query compilation to help determine the optimal query plan. By default, if the optimizer notices a statistic is out-of-date because of too many changes to a table, it will update the statistic immediately before query compilation can continue (only the statistics it needs, not all the statistics for the table).

Note that “too many” is non-specific because it varies by version and whether trace flag 2371 is enabled – see the AUTO_UPDATE_STATISTICS section of this page for details.
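
To make "too many" a little more concrete, here is a minimal Python sketch of the two modification thresholds as I understand them from Microsoft's statistics documentation: the older fixed rule (500 changes for tables of up to 500 rows, otherwise 500 plus 20% of the row count) and the dynamic rule that applies with trace flag 2371 or under database compatibility level 130 and later, MIN(500 + 0.20 * n, SQRT(1000 * n)). The function names are made up for illustration, and the sketch only evaluates the formulas; it does not claim to predict exactly when SQL Server will trigger an update.

```python
import math


def legacy_threshold(rows: int) -> float:
    """Older fixed rule: 500 changes for small tables, else 500 + 20% of rows."""
    if rows <= 500:
        return 500.0
    return 500 + 0.20 * rows


def dynamic_threshold(rows: int) -> float:
    """Dynamic rule: MIN(500 + 0.20 * n, SQRT(1000 * n)).

    Assumption: the formula is applied as written for every n; any special
    handling SQL Server has for very small tables is not modeled here.
    """
    return min(500 + 0.20 * rows, math.sqrt(1000 * rows))


if __name__ == "__main__":
    # Compare how many modifications each rule tolerates at a few table sizes.
    for rows in (10_000, 1_000_000, 100_000_000):
        print(f"{rows:>11,} rows: legacy {legacy_threshold(rows):>12,.0f}, "
              f"dynamic {dynamic_threshold(rows):>10,.0f}")
```

For a million-row table the fixed rule waits for roughly 200,500 changes, while the dynamic rule waits for about 31,623, which is why large tables tend to see automatic updates sooner under the newer behavior.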

Read on to learn more, including the problems that synchronous stats updates can cause, what you can do to avoid them, and ways you can tell that synchronous stats updates are a problem in your environment.

Enabling Statistics Auto-Creation

Chad Callihan checks the stats:

When we query for data, we don’t always think about the magic that goes into efficiently returning results. One vital piece to this magic is statistics. Statistics in SQL Server are histograms that are used by the query optimizer to determine an optimal execution plan when executing a query. Let’s take a look at the different ways to check your statistics settings and make sure statistics are being automatically created.

Click through to see how.

Ignoring Updates to Some Statistics

Raul Gonzalez gives some tips on optimizing statistics updates:

For now, everything described might not be such a horrible thing. It’s clear that SQL Server will not take full advantage of the stats on the column [Body] if the queries we are running use wildcards (especially leading ones), but why so much fuss? Well, now is when things start making sense (or not).

Running stats maintenance on these kinds of columns every night can become really expensive, and this is what I’ve found more than once when using the Query Store to look for queries that have a high number of reads.

Read the whole thing.

Viewing Stats Used in Creating Execution Plans

Matthew McGiffen shows us how to find the statistics used when generating an execution plan:

Statistics are vital in allowing SQL Server to execute your queries in the most performant manner. Having a deep understanding of how the SQL Server Optimizer interacts with Statistics really helps when you are performance tuning.

One thing that can be useful when looking at an execution plan is to understand what statistics objects the optimizer used to come up with the plan. In this post we look at how that can be achieved using the undocumented trace flag 8666, which can be used to save internal debugging information into the plan XML, including details of the Statistics objects used.

Click through for a couple of caveats about this, as well as a primer on how to see those precious statistics.

Number of Rows Automatically Sampled versus Table Size

Matthew McGiffen does the math:

I mentioned in my previous post about manually updating statistics that you can specify whether they’re updated using a full scan, or you can specify an amount of data to sample, either a percentage of the table size, or a fixed number of rows. You can also choose not to specify this, and SQL Server will decide for you whether to do a full scan, or to sample a certain amount of data.

I thought it would be interesting to look at what the sample sizes are that SQL will choose to use, depending on the amount of data in your table. 

Click through for the result of Matthew’s analysis.

How to Update Statistics Manually

Matthew McGiffen takes us through the process of updating statistics:

At the heart of all the methods we’ll look at is the UPDATE STATISTICS command. There are a lot of options for using this command, but we’ll just focus on the ones you’re most likely to use. For full documentation here is the official reference:
https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql

Even if you have an automated system, knowing how to update statistics is a great thing because you might need to run a one-off update to help a poorly-performing query. Or you might be using PolyBase, which cannot update statistics automatically because the data isn’t actually in SQL Server.

Statistics and Ascending Keys

Matthew McGiffen looks at a common problem with statistics:

The Ascending Key Problem relates to the most recently inserted data in your table, which is therefore also the data that may not have been sampled and included in the statistics histograms. This sort of issue is one of the reasons it can be critical to update your statistics more regularly than the built-in automatic thresholds.

We’ll look at the problem itself, but also some of the mitigations that you can take to deal with it within SQL Server.

Click through for more detail.

Estimating Row Counts without Statistics

Matthew McGiffen dives into rules of thumb:

I find this is a question that comes up again and again. What estimate for the number of rows returned does SQL Server use if you’re selecting from a column where there are no statistics available?

There are a few different algorithms used depending on how you’re querying the table. In this post we’ll look at where we have a predicate looking for a fixed value.

Read on for a few examples, noting that this specifically relates to tables and not things like table-valued parameters.
