I mentioned in the introduction that not all eight-byte data types can fit in 64 bits. This fact is important because many columnstore and batch mode performance optimizations only work with data 64 bits in size. Aggregate pushdown is one of those things. There are many more performance features (not all documented) that work best (or at all) only when the data fits in 64 bits.
In our specific example, aggregate pushdown is disabled for a columnstore segment when it contains even one data value that does not fit in 64 bits. SQL Server can determine this from the minimum and maximum value metadata associated with each segment without checking all the data. Each segment is evaluated separately.
Paul goes deep into the concept, making this well worth your while.
This post is a follow-up to my prior post inspecting the performance of PARSE vs CAST & CONVERT, where we see that
PARSEis an order of magnitude slower than
CONVERT. In this post, we’ll check if there is a similar difference between using
CONVERT. But just to be clear,
CONVERToffers a lot more functionality than
CAST; this post will not help you decide which of these functions to use for a specific use-case – I leave that to the reader to decide for themselves.
Max gets slightly different numbers but under the covers they both call the same
CONVERT() function. The difference in numbers is noise: both of them have standard deviations of ~200ms, so a t-test can’t distinguish the two. The big choice is whether you’d rather have ANSI standard code (if so, use
CAST()) or if you’d prefer additional functionality around dates and times (like
With the query pasted and formatted in my SSMS editor window, I like retrieving the estimated execution plan first, and then pasting the query into a second editor window and executing the query with the “Include Actual Execution Plan” option turned on. For bonus points, I’ll split the SSMS window vertically so I can start looking at the estimated execution plan while the query runs and returns the actual execution plan: I like this combination because I (almost) immediately receive my estimated execution plan and can start looking for problems. Once the query on the right finishes executing and I get the actual plan with all of its lovely run-time stats, I usually switch to that looking at that one.
Even if your approach is quite different, it’s good to compare and contrast.
At first I was thinking maybe it is a short cut to help. Nope, it is actually an easy way to get the estimated execution plan. The key is estimated execution plan. While in SQL Server Management Studio we have been able to easily get both the estimated or actual execution plan.
Just as a reminder, the actual execution plan requires the query to actually run. While the estimated plan will use statistics to generate the plan. Generally, if the actual and estimated are capture very close to each other, you are more than likely not going to see a difference between the tow.
Now let’s talk about the Explain button. It will return two things, the XML of the execution plan and a graphical representation of the execution plan.
There are a few differences between Azure Data Studio’s implementation of execution plans and SQL Server Management Studio’s.
Scenario: A client reports a memory spike during processing. They have a Tabular semantic model deployed to Azure Analysis Services. They fully process the model daily. The model normally takes less than 50 GB RAM but during processing, it spikes five times and Azure Analysis Services terminates the processing task complaining that it “reached the maximum allowable memory in our pricing tier”. Normally, fully processing the model should take about twice the memory but five times?
Teo gives us the explanation for this problem as well as a recommendation on how to fix it.
In the Microsoft Research paper Plan Stitch: Harnessing the Best of Many Plans by Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, and Vivek Narasayya (2018), the authors propose something really cool: watching an execution plan change over time, splitting it into parts, and then combining parts to form a SuperPlan™. (They don’t call it a SuperPlan™ – it just seems appropriate, right?)
That looks like an interesting paper, and Brent has a few more if you agree.
I executed every single query enough times so that the execution would be run totally In-Memory (64GB of RAM is enough because we have our data compressed, as I mentioned earlier in the SETUP part). This would allow me to mimic a busy system that has enough resources to process the reading workload. Since the Batch Execution Mode is focusing on the CPU improvements, I decided to sample not only the total elapsed time, but the CPU times so that we can do some judgements of the CPU bandwidth variation. Lowering the CPU consumption fo the high-demanding CPU queries is a key in order to improve the overall system parallelism (well, watch-out for the memory, of-course).
Each successful execution was sampled at least 3 (and in some query cases over 5 times) and then the result would be averaged, so that we can have a higher confidence.
Niko has some interesting findings, some good for SQL Server and some not so good.
For everyone who hates caching, this is the section for you! Yes, I’m totally playing both sides.
Given the above and how drastic the wins are, why wouldn’t we cache something? Well, because every single decision has trade-offs. Every. Single. One. It could be as simple as time spent or opportunity cost, but there’s still a trade-off.
This is a long but very useful post.
Snapshot isolation avoids most locking and blocking by using row versioning. When data is modified, the committed versions of affected rows are copied to tempdb and given version numbers. This operation is called copy on write and is used for all inserts, updates and deletes using this technique. When another session reads the same data, the committed version of the data as of the time the reading transaction began is returned.
By avoiding most locking, this approach can greatly increase concurrency at a lower cost than transactional isolation. Of course, “There ain’t no such thing as a free lunch!” and snapshot isolation has a hidden cost: increased usage of tempdb.
Gerald covers both varieties, Read Committed Snapshot Isolation and proper Snapshot Isolation. RCSI is definitely worth understanding in almost any environment, and even Snapshot Isolation has its uses.
I had a server that looked like it had been suffering from memory contention. I wanted to see what queries were being run that had high memory requirements. The problem was that it wasn’t happening right now – I needed to be able to see what had happened over the last 24 hours.
Enter Query Store. In the run-time stats captured by Query Store are included details relating to memory.
Click through for a script which retrieves this data over a time frame.