Yeah, there you go: all these _WA_Sys_ stats tell me they have been automatically created (there is a flag in sys.stats if you don’t believe me), but I can see there are only 31, whereas I created 34 columns.
That’s funny, let’s see which data types did get statistics.
The results are pretty interesting.
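For readers who want to run a similar check, here is a minimal sketch (not the original author’s script) that lists the auto-created statistics on a table alongside the data type of the column each one covers. The table name dbo.StatsTest is a hypothetical placeholder.

```sql
-- Sketch: list auto-created (_WA_Sys_) statistics and the data type of each covered column.
-- dbo.StatsTest is a hypothetical table name; substitute your own.
SELECT s.name        AS stats_name,
       c.name        AS column_name,
       t.name        AS data_type,
       s.auto_created
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
    ON sc.object_id = s.object_id
   AND sc.stats_id  = s.stats_id
JOIN sys.columns AS c
    ON c.object_id = sc.object_id
   AND c.column_id = sc.column_id
JOIN sys.types AS t
    ON t.user_type_id = c.user_type_id
WHERE s.object_id = OBJECT_ID(N'dbo.StatsTest')
  AND s.auto_created = 1   -- the flag in sys.stats mentioned above
ORDER BY c.column_id;
```

Columns of the table that do not appear in the result are the ones that never received an automatic statistic.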
NOTE: The most important habit you can start with in Azure is putting everything into discrete, planned Resource Groups. These make management so much easier.
Once the account is set, the first thing you need is to create a Runbook. There is a collection of them for your use within Azure. None of them are immediately applicable for what I need. I’m just writing a really simple PowerShell script to do what I want:
Runbooks are an important part of Azure maintenance, and this is a gentle introduction to them.
Running multiple UPDATE STATISTICS commands for different statistics on a single table concurrently has been available under global Trace Flag 7471 since SQL Server 2014 SP1 CU6 and SQL Server 2016 CU1. Microsoft have documented this trace flag here and here.
It sounds like, for the most part, you might not want this flag turned on, but read the whole post.
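As a rough sketch of what turning the flag on looks like, the commands below enable Trace Flag 7471 globally, check its status, and turn it off again. The table and statistics names in the comments are hypothetical, and this is only an illustration, not a recommendation to enable the flag.

```sql
-- Sketch: enable trace flag 7471 globally (lasts until the instance restarts).
DBCC TRACEON (7471, -1);

-- Confirm the flag is active.
DBCC TRACESTATUS (7471);

-- With the flag on, statements like these could be run concurrently
-- from separate sessions (dbo.BigTable, Stat_A and Stat_B are hypothetical names):
--   Session 1: UPDATE STATISTICS dbo.BigTable (Stat_A);
--   Session 2: UPDATE STATISTICS dbo.BigTable (Stat_B);

-- Turn the flag back off when finished testing.
DBCC TRACEOFF (7471, -1);
```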
“Starting with SQL Server 2016, sampling of data to build statistics is done in parallel, when using compatibility level 130, to improve the performance of statistics collection. The query optimiser will use parallel sample statistics whenever a table size exceeds a certain threshold.”
As per the previous demos, prior to SQL Server 2016 the gathering of sampled statistics was a serial, single-threaded operation. Only statistics gathered using a full scan would qualify for parallelism. Now in SQL Server 2016, all automatic and manual statistics operations qualify for parallelism.
He also has a neat trick for invalidating stats on a large table, so check out this article-length blog post.
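To try the sampled case yourself, something like the following sketch should do, assuming a test database you are free to change and a hypothetical large table named dbo.BigTable. Whether the sample actually runs in parallel depends on the table-size threshold mentioned in the quote.

```sql
-- Sketch: sampled statistics update under compatibility level 130.
-- Run in a test database only; changing the compatibility level affects all queries.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;

-- dbo.BigTable is a hypothetical table name.
UPDATE STATISTICS dbo.BigTable WITH SAMPLE 10 PERCENT;
```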
It’s not exactly perfect
For instance, if you just let it loose without filters, you get a severe error. The same thing happens if you try to filter on one of the columns in the function, rather than a column in sys.stats, like this.
Very cool. It’s one step closer to us removing our dependencies on DBCC SHOW_STATISTICS.
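The excerpt does not name the function, so the sketch below assumes it is sys.dm_db_stats_histogram and shows the safe pattern described above: filter on a column from sys.stats rather than on a column returned by the function. The table name dbo.SomeTable is hypothetical.

```sql
-- Sketch, assuming the function in question is sys.dm_db_stats_histogram.
-- The filter is on sys.stats (s.object_id), not on the function's own columns.
SELECT s.name AS stats_name,
       h.step_number,
       h.range_high_key,
       h.range_rows,
       h.equal_rows,
       h.distinct_range_rows,
       h.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.object_id, s.stats_id) AS h
WHERE s.object_id = OBJECT_ID(N'dbo.SomeTable')  -- hypothetical table
ORDER BY s.stats_id, h.step_number;
```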
With that said, it would seem that asynchronous statistics are better suited to OLTP environments and synchronous statistics are better suited to OLAP environments. As synchronous statistics are the default, though, people are reluctant to change this setting without good reason. I’ve worked exclusively in OLTP environments over the last few years and have never seen asynchronous statistics rolled out as the default. I have personally been bitten by synchronous statistic updates on large tables causing query timeouts, which I resolved by switching to asynchronous statistic updates.
This is an interesting, nuanced take on the issue. My bias is toward asynchronous stats updates because I have been burned, but it’s interesting to read someone thinking through the implications of this seemingly simple choice.
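For reference, switching a database to asynchronous statistics updates is a one-line setting. This is a minimal sketch to run in the database you want to change; note that the async option only has an effect while AUTO_UPDATE_STATISTICS is also on.

```sql
-- Sketch: switch the current database to asynchronous auto-update of statistics.
ALTER DATABASE CURRENT SET AUTO_UPDATE_STATISTICS_ASYNC ON;

-- Verify both settings for the current database.
SELECT name,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
WHERE database_id = DB_ID();
```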
You might expect to see that the statistic on Quantity had updated. I expected it, before I ran through this demo.
But SQL Server never actually had to load up the statistic on Quantity for the query above. So it didn’t bother to update the statistic. It didn’t need to, because it knows that the table is empty, and this doesn’t show up in our column or index specific statistics.
Check it out.
A friend pointed out that the same reference indicates a maximum of 30,000 columns in a wide table. That got me thinking – maybe 30,000 stats is a per-table maximum?
Not too hard to test. Yep – limit per table.
Filed under Swart’s Ten Percent Rule.
The query optimiser hotfixes contained under Trace Flag 4199 are intentionally not enabled by default. This means that when upgrading from SQL Server 2008 R2 to SQL Server 2012, for example, new query optimiser logic is not enabled. The reason behind this, according to the article linked above, is to prevent plan changes that could cause query performance regressions. This makes sense for highly optimised environments where application-critical queries are tuned and rely on specific execution plans, and where any change in query optimiser logic could cause unexpected or unwanted query regressions.
Read the whole thing.
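As a quick sketch of the options for testing these hotfixes, the flag can be applied to a single query or to the whole instance. The sample query against sys.objects is only a stand-in for one of your own queries.

```sql
-- Sketch: try the optimiser hotfixes on a single query first
-- (QUERYTRACEON requires sysadmin-level permissions).
SELECT COUNT(*)
FROM sys.objects
OPTION (QUERYTRACEON 4199);

-- Or enable the flag for the whole instance (lasts until restart).
DBCC TRACEON (4199, -1);

-- On SQL Server 2016 and later there is also a database-scoped equivalent:
-- ALTER DATABASE SCOPED CONFIGURATION SET QUERY_OPTIMIZER_HOTFIXES = ON;
```

Testing per query before flipping the instance-wide switch keeps any plan regressions contained.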
For most modern versions of SQL Server, I like to join to sys.dm_db_stats_properties() — you can get a LOT of detail in a single query! (This works with SQL Server 2008 R2 SP2+ / SQL Server 2012 SP1+ / All higher versions)
Here’s the query, looking at a sample table in the WideWorldImporters database:
Click through for the script, as well as a version which works on 2005 and 2008.
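In the meantime, here is a minimal sketch of the general pattern (not the author’s script): apply sys.dm_db_stats_properties() to each row of sys.stats. Sales.OrderLines is my own pick of a WideWorldImporters table and may not be the one used in the post.

```sql
-- Sketch: statistics details for one table via sys.dm_db_stats_properties().
-- Sales.OrderLines is an assumed example table in WideWorldImporters.
SELECT s.name AS stats_name,
       s.auto_created,
       sp.last_updated,
       sp.rows,
       sp.rows_sampled,
       sp.steps,
       sp.unfiltered_rows,
       sp.modification_counter
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID(N'Sales.OrderLines')
ORDER BY s.stats_id;
```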