Category: Statistics

Understanding The Cardinality Estimator

SQL Scotsman is working on a very interesting series on statistics and the different cardinality estimators; so far, it runs to three parts. Part one is an overview:

A few of those assumptions changed in the new SQL Server 2014/2016 CE, namely:

  • Independence becomes Correlation: In the absence of existing multi-column statistics, the legacy CE views the distribution of data contained across different columns as uncorrelated with one another. This assumption of independence often does not reflect the reality of a typical SQL Server database schema, where implied correlations do actually exist. The new CE uses an increased correlation assumption for multiple predicates and an exponential backoff algorithm to derive cardinality estimates (a worked example follows this list).

  • Simple Join Containment becomes Base Join Containment: Under the legacy CE, the assumption is that non-join predicates are somehow correlated, which is called “Simple Containment”. For the new Cardinality Estimator, these non-join predicates are assumed to be independent (called “Base Containment”), which can translate into a reduced row estimate for the join. At a high level, the new CE derives the join selectivity from base-table histograms without first scaling them down by the associated filter predicates; the selectivity of the non-join filters is applied afterwards.
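
To make the backoff concrete, here is a minimal sketch comparing the two models. The row count and selectivities are invented for illustration; the new CE sorts the predicate selectivities from most to least selective and halves the exponent for each subsequent one:

    -- Hypothetical: a 1,000,000-row table with three filter predicates whose
    -- individual selectivities are 0.1, 0.2 and 0.5 (most selective first).
    DECLARE @rows float = 1000000,
            @s1   float = 0.1,
            @s2   float = 0.2,
            @s3   float = 0.5;

    SELECT @rows * @s1 * @s2 * @s3 AS legacy_estimate,                        -- full independence: 10,000 rows
           @rows * @s1 * POWER(@s2, 0.5) * POWER(@s3, 0.25) AS new_estimate;  -- exponential backoff: ~37,606 rows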

Part two looks at trace flag 9481:

When To Use Trace Flag 9481

Query Scope: You’ve moved (migrated/upgraded) to SQL Server 2014 / 2016, your databases are at compatibility level 120 / 130 and using the new CE, and your workload is performing well overall, but there are a few regressions where a small number of queries actually perform worse. Use Trace Flag 9481 on a per-query basis as a temporary measure until you can tune / rewrite the query so it performs well without the hint.
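
As a usage sketch (the table and predicate below are placeholder names, and note that QUERYTRACEON requires elevated permissions unless wrapped in a plan guide), the per-query form looks like this:

    -- Force the legacy (pre-2014) cardinality estimator for this query only.
    SELECT OrderID, OrderDate
    FROM dbo.Sales
    WHERE CustomerID = 42
    OPTION (QUERYTRACEON 9481);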

Part three discusses database scoped configurations in SQL Server 2016:

The problem with lowering the database compatibility level is that you can’t leverage the new engine functionality available under the latest compatibility level.

This problem was solved in SQL Server 2016 with the introduction of Database Scoped Configurations, which give you the ability to make several database-level configuration changes for properties that were previously configured at the instance level. In particular, the LEGACY_CARDINALITY_ESTIMATION database scoped configuration allows you to set the cardinality estimation model independent of the database compatibility level. This option allows you to leverage all new functionality provided with compatibility level 130 but still use the legacy CE on the off chance that the latest CE causes severe query regressions across your workload.
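
For reference, flipping that switch is a one-line, documented command, run in the target database:

    -- Keep compatibility level 130 but revert this database to the legacy CE.
    ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

    -- Return to the new CE later:
    ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = OFF;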

The article on statistics is quite long for a blog post and a great read.  I’m looking forward to reading more.

The Value Of Unused Indexes

Erik Darling provides a scenario in which an index that does not get used in an execution plan can nonetheless help query performance:

We can see an example of this with unique indexes and constraints, but another possibility is that the created index had better statistical information via the histogram. When you add an index, you get Fresh Hot Stats, whereas the index you were using could be many modifications behind current for various reasons: you have a big table and don’t hit auto-update thresholds often, you’re not manually updating statistics somehow, or you’re running into ascending key weirdness. These are all sane potential reasons. One insane potential reason is if you have auto-create stats turned off, and the index you create is on a column that didn’t have a statistics object associated with it. But you’d see plan warnings about operators not having associated statistics.

Again, we’re going to focus on how ADDING an index your query doesn’t use can help. I found out the hard way that both unique indexes and constraints can cease being helpful to cardinality estimation when their statistics get out of date.

This is sort of like a triple bank shot solution:  even if it works that one time, there are easier ways to do it—and those ways are more likely to succeed to boot.
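
The most obvious of those easier ways is to refresh the statistics you already have rather than create a new index just to get a fresh histogram; a minimal sketch with placeholder object names:

    -- Refresh one statistics object (here, the one backing an index) with a full scan.
    UPDATE STATISTICS dbo.BigTable IX_BigTable_SomeColumn WITH FULLSCAN;

    -- Or refresh every statistics object on the table.
    UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;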

Check Those Estimates

Grant Fritchey runs into a statistics issue:

While the number of rows for 1048 was the lowest, at 3, unfortunately it seems that the 1048 values were added to the table after the statistics for the index had been updated. Instead of using something from the histogram, my value fell outside the values in the histogram. When the value is outside the histogram, the Cardinality Estimator uses the average number of rows across the entire histogram, 258.181 (at least for any database that’s in SQL Server 2014 or greater and not running in a compatibility mode), as the row estimate.

Figuring out those boundaries can make the difference between a good plan and a bad plan.
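
If you want to look at those boundaries yourself, the histogram is exposed through DBCC SHOW_STATISTICS (the object and statistics names below are placeholders):

    -- RANGE_HI_KEY shows the upper boundary of each histogram step; a predicate
    -- value above the last RANGE_HI_KEY falls outside the histogram entirely.
    DBCC SHOW_STATISTICS ('dbo.SomeTable', 'IX_SomeTable_Value') WITH HISTOGRAM;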

Azure SQL DW Statistics

Emma Stewart looks at how statistics are created in Azure SQL Data Warehouse:

In Azure SQL Data Warehouse, statistics have to be created manually. On previous SQL Server projects, creating and maintaining statistics wasn’t something that we had to incorporate into our design (and really think about!); however, with SQL DW we need to make sure we think about how to include it in our process, in order to take advantage of the benefits of working with Azure DW.

The major selling point of Azure SQL Data Warehouse is that it is capable of processing huge volumes of data; one of the specific performance optimisations made to support this is the distributed query optimiser. Using the information obtained from the statistics (information on data size and distribution), the service is able to optimise queries by assessing the cost of specific distributed query operations. Since the query optimiser is cost-based, SQL DW will always choose the plan with the lowest cost.
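
Since the service won’t create them for you, the manual step looks something like this (table, column, and statistics names are placeholders):

    -- Azure SQL Data Warehouse: statistics must be created explicitly.
    CREATE STATISTICS stat_FactSales_DateKey
    ON dbo.FactSales (DateKey);

    -- Sampled by default; FULLSCAN trades a longer scan for better accuracy.
    CREATE STATISTICS stat_FactSales_CustomerKey
    ON dbo.FactSales (CustomerKey) WITH FULLSCAN;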

Azure SQL Data Warehouse is a bit of a strange animal, with differences in statistics being one of the smaller changes versus “classic” SQL Server.

Change With Automatic Stats Update

Jack Li notes that SQL Server 2016 has changed when automatic statistics update gets called:

Old threshold: it takes a 20% row change before auto update stats kicks in (there are some tweaks for small tables; for large tables, a 20% change is needed). For a table with 100 million rows, that means 20 million rows have to change before auto stats kicks in. For the vast majority of large tables, auto stats basically doesn’t do much.

New threshold: Starting with SQL Server 2008 R2 SP1, we introduced trace flag 2371 to control auto update statistics better (the new threshold). Under trace flag 2371, the percentage of changes required is dramatically reduced for large tables. In other words, trace flag 2371 can cause more frequent updates. This new threshold is off by default and is enabled by the trace flag. But in SQL Server 2016, the new threshold is enabled by default for a database with compatibility level 130.
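
The difference is easy to put into numbers. For large tables, the new threshold is commonly approximated as SQRT(1000 * rows); that formula is the widely cited approximation rather than anything from Jack’s post. For the 100-million-row table above:

    -- Old threshold: 20% of 100,000,000 rows must change.
    SELECT 0.20 * 100000000 AS old_threshold;         -- 20,000,000 rows

    -- New threshold (trace flag 2371 / compat level 130): ~SQRT(1000 * rows).
    SELECT SQRT(1000.0 * 100000000) AS new_threshold; -- ~316,228 rows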

Important to know.

DBCC SHOW_STATISTICS Update

Erik Darling notes that his Connect item to replace DBCC SHOW_STATISTICS has been marked as resolved:

So what does it look like?

I have no idea. I don’t know if it’s a DMV or a function, I don’t know what it’s called, and I don’t know what information it exposes. I also don’t know how it will get joined to other DMVs. There were no details offered up when the status changed. And I’m fine with that! I’m pretty psyched that it got enough traction to get a fix to begin with. If anyone from MS feels like shooting me an email with details, I won’t complain.

But since we don’t know, we’re free to speculate. Like all those History Channel shows about aliens and fake animals and where the Templars secretly buried Jesus’ gold teeth in Arizona. It’ll be fun!

It’ll be interesting to see the results.

Using Statistics For Index Design

Kendra Little argues that you should not use automatically created statistics as a guide for index creation:

We’ve talked a lot so far about how much statistics and indexes are related. This is why it seems like statistics might be useful for designing indexes!

But here’s the thing — SQL Server doesn’t track and report on how many times a statistic was used during optimization.

This is an interesting discussion.

Multi-Column Statistics

Raul Gonzalez looks at how the different cardinality estimators handle multi-column statistics:

The thing we can learn from this is that it is impossible to always be right when you have to estimate the number of rows and your only resource is statistics; it doesn’t matter whether they are single- or multi-column, there is a set of values out there ready to defeat your logic.

However, I think it’s a good idea that SQL Server 2016 goes back to looking into multi-column statistics, for a simple reason: these are user-created stats and therefore give us (DBAs, devs) more power over how rows are estimated.
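
Creating one is a single statement; a minimal sketch with placeholder names (column order matters, since the histogram covers only the leading column while the density vector covers the combinations):

    -- Multi-column statistics over a correlated pair of columns.
    CREATE STATISTICS st_Orders_Country_City
    ON dbo.Orders (Country, City) WITH FULLSCAN;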

Multi-column stats are probably among the most under-utilized tools in SQL Server.

Index-Based Statistics Updates

Michael Bourgon has a script to get information on statistics updates for statistics based on indexes:

Quickie, based off an earlier post. (http://thebakingdba.blogspot.com/2012/02/tuning-statistics-when-were-they.html)

Get the last 4 stat updates for every statistic based on an index. The filter is on the auto_created column; flip that to get all the system-created statistics as well.

This does use the DBCC SHOW_STATISTICS command, which reminds me of a rant (though not about Michael’s code; it’s about the need to use this DBCC command rather than having a nice DMV which returns all of the relevant information).
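
For the header-level pieces (last update time, row counts, modification counter) there is at least sys.dm_db_stats_properties on reasonably recent builds; the histogram itself still requires the DBCC command. A sketch, with a placeholder table name:

    -- Last update time and modification counter for the index-based statistics on a table.
    SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
    FROM sys.stats AS s
    CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
    WHERE s.object_id = OBJECT_ID('dbo.BigTable')
      AND s.auto_created = 0;  -- flip to 1 for the auto-created statistics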

Duplicate Statistics

Shaun J. Stuart discusses removing duplicate statistics:

I puzzled on this for a bit and got sidetracked by the strange way SSMS displays statistics columns on the Property page. Then it got to be the end of the day and I went home. The next day, I had a comment on my previous post from Aaron Bertrand, who mentioned there is a related bug with the stats_column_id column of the sys.stats_columns view. It does not contain what the MSDN documentation says it contains. The Connect item for this, along with a workaround, can be found here.

The script I was using did not reference that column, but it did get me thinking that perhaps the script was not correctly identifying the first column in an index.

Shaun has an updated version of a duplicate statistics checker script that you may want to check out.
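
The core idea behind such a checker, reduced to a sketch (this is a simplified illustration, not Shaun’s script): find auto-created single-column statistics whose column is also the leading key column of an index, since the index’s own statistics object makes them redundant.

    -- Auto-created statistics duplicating the leading column of an index.
    SELECT OBJECT_NAME(s.object_id) AS table_name,
           s.name                   AS auto_created_stat,
           i.name                   AS overlapping_index
    FROM sys.stats AS s
    JOIN sys.stats_columns AS sc
      ON sc.object_id = s.object_id AND sc.stats_id = s.stats_id
    JOIN sys.index_columns AS ic
      ON ic.object_id = s.object_id
     AND ic.column_id = sc.column_id
     AND ic.key_ordinal = 1          -- leading key column only
    JOIN sys.indexes AS i
      ON i.object_id = ic.object_id AND i.index_id = ic.index_id
    WHERE s.auto_created = 1;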
