In the SQL Server 2019 preview, we are further expanding query processing capabilities with several new features under the Intelligent Query Processing (QP) feature family. In this blog post we’ll discuss one of these Intelligent QP features that is now available in public preview, batch mode on rowstore. This feature unlocks the advantages of batch mode execution in cases where there is no columnstore participating in the query.
Batch mode is a different execution mode primarily targeted at analytics queries, which are characterized by scanning many rows and performing significant aggregations, sorts, and group-by operations across those rows. Until now, batch mode has been reserved for queries that involve columnstore indexes.
Performing scans and calculations using batches of ~900 rows at a time rather than row by row is much more efficient for analytic-type queries. For queries that can take advantage of it, batch mode can easily make queries execute many times faster than the same query against the same data in row mode.
Until now, the workaround was to create an empty filtered columnstore index on a rowstore table. The new feature is more architecturally pleasing and means one less hack to explain.
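As a sketch of that old workaround (table and column names here are hypothetical), the contradictory filter predicate keeps the columnstore index permanently empty, but its mere presence lets the optimizer consider batch mode for queries against the table:

```sql
-- Hypothetical rowstore fact table; the WHERE clause can never be true,
-- so the index stays empty and adds no maintenance overhead.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_BatchModeTrick
ON dbo.FactSales (SalesAmount)
WHERE SalesAmount = -1 AND SalesAmount = -2;
```

With batch mode on rowstore in SQL Server 2019, this dummy index is no longer needed.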
All it does is take the value of the Sales Amount measure at the lowest granularities of the Customer, Date and Product dimensions, multiply it by 0.08 to find a tax value, and because [Tax Amount] is a real, non-calculated measure, the result of the calculation aggregates up through the cube. [I know that I don’t have to aggregate the result of this specific calculation but remember that this is a simplified example – in the real case I did have to write the calculation using Scope statements – and anyway the best way of handling a basic multiplication like this would be with a measure expression]
The performance was sub-second for my test query and I was happy, but then I realised that the same tax rate was being used in other calculations and may change in the future, so I thought I would store the value 0.08 in a calculated measure:
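A minimal sketch of that refactoring in MDX script form (all measure and member names are assumptions based on the prose, and the real case used more elaborate Scope statements):

```mdx
-- Store the tax rate once so other calculations can reuse it.
CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS 0.08,
    VISIBLE = 0;

-- Derive the tax amount from the rate; because [Tax Amount] is a real,
-- non-calculated measure, the assigned values aggregate up through the cube.
SCOPE ([Measures].[Tax Amount]);
    THIS = [Measures].[Sales Amount] * [Measures].[Tax Rate];
END SCOPE;
```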
Chris walks through several iterations of this before landing on a solution which is both reasonable and fast.
I got fewer than 50 rows back, so I figured I had everything covered. But the total elapsed time across all the queries was less than ten minutes, and I knew the server hadn’t been rebooted for about a month, so potentially that could be as little as 20 seconds of query execution a night. Even if the full ten minutes was from the last 24 hours, that still didn’t account for the long run times.
So, I convinced myself it wasn’t the queries and started looking at other things they could check. It was a long list. Lots of theories revolved around the idea that something else was running on the host at the same time.
Click through for the rest of the story.
First let me give you a little background of why you would want to clear SSAS cache from C# code when you can do this using an XMLA command from SSMS.
If you have a slow MDX/DAX SSAS query, you have a couple of options for improving its performance (assuming no hardware changes):
- Rewrite the query differently if you have control over the query. (You will have two queries that you want to compare against the same database.)
- Make changes to the SSAS database to follow a better design. (You will have one query to run against two databases.)
Regardless of which route you go, you should compare the performance before and after the changes to see how much you gained from the change.
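For reference, the XMLA command you would run from SSMS to clear the cache looks like this (the database ID is a placeholder for your own SSAS database):

```xml
<ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>MySsasDatabase</DatabaseID>
  </Object>
</ClearCache>
```

Doing the same from C# lets you clear the cache automatically between timed runs of the query variants, rather than switching back to SSMS each time.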
Click through for more, including the code.
Four years ago, after a bunch of dithering and some negotiations with Tony Davis, my editor, I started to update my book, SQL Server Execution Plans. We managed to convince Hugo Kornelis to be the tech editor. I started to do the real writing in early 2015.
I was most of the way through a first draft and no one liked it. Tony was unhappy. Hugo was unhappy. I was unhappy. I was just trying to update the existing book, SQL Server Execution Plans. It wasn’t working.
We all came to the conclusion that the old book was wrong. Not simply in a technical sense, although there was a lot of that, but in a structural sense. So we started rearranging things. SQL Server 2014 came out, but I was ready for it, having been on a preview, so it was no big deal. We started a second draft.
The book Grant mentions is free in PDF format thanks to Red Gate, so go give it a download and enjoy the fruits of Grant’s labor.
SQL Server 2017 Automatic Tuning looks for queries where execution plans change and performance regresses. This feature depends on Query Store being enabled. Note that even if you don’t turn on Automatic Tuning, you still get the benefit of having access to the data. That’s right: Automatic Tuning will tell you what it would do if it were enabled. Think of this as free performance tuning training. Go look at the DMVs and try to understand why the optimizer would want to lock in an execution plan. We will actually go through a real-world example:
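A minimal sketch of inspecting those recommendations via the DMV (the JSON path follows the documented shape of the `details` column):

```sql
-- Inspect plan-regression recommendations (SQL Server 2017+).
-- Even with Automatic Tuning off, this shows what it *would* have done.
SELECT reason,
       score,
       state,
       JSON_VALUE(details, '$.implementationDetails.script') AS force_plan_script
FROM sys.dm_db_tuning_recommendations;
```

The `force_plan_script` column surfaces the `sp_query_store_force_plan` call Automatic Tuning would execute, which you can review and run manually if you agree with it.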
Click through for the example. I wouldn’t automatically trust these automatic results, but my experience has been generally positive.
As much as I love all this extra data, it’s important to note that some information is more relevant for an actual execution plan than for an estimated one (e.g., tempdb spill information). Some days we can capture and use the actual plan for troubleshooting; other times we have to use the estimated plan. Very often we get that estimated plan (potentially the plan that has been used for problematic executions) from SQL Server’s plan cache. Pulling individual plans is appropriate when tuning a specific query or set of queries. But what about when you want ideas on where to focus your tuning efforts in terms of patterns?
The SQL Server plan cache is a prodigious source of information when it comes to performance tuning, and I don’t simply mean troubleshooting and trying to understand what’s been running in a system. In this case, I’m talking about mining information from the plans themselves, which are found in sys.dm_exec_query_plan, stored as XML in the query_plan column.
When you combine this data with information from sys.dm_exec_sql_text (so you can easily view the text of the query) and sys.dm_exec_query_stats (execution statistics), you can suddenly start to look for not just those queries that are the heavy hitters or execute most frequently, but those plans that contain a particular join type, or index scan, or those that have the highest cost. This is commonly referred to as mining the plan cache, and there are several blog posts that talk about how to do this. My colleague, Jonathan Kehayias, says he hates to write XML yet he has several posts with queries for mining the plan cache:
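A minimal plan-cache mining sketch along those lines, finding the most CPU-hungry cached plans that contain an index scan (the XQuery predicate assumes the default showplan XML namespace):

```sql
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
       st.text           AS query_text,
       qp.query_plan,
       qs.execution_count,
       qs.total_worker_time
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('//RelOp[@PhysicalOp="Index Scan"]') = 1
ORDER BY qs.total_worker_time DESC;
```

Swapping the `PhysicalOp` value (e.g., "Clustered Index Scan", "Hash Match") turns the same skeleton into a search for whatever plan pattern you are hunting.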
It’s a good article with a lot of useful information.
While I was playing around with the new release (August 2018) of Power BI Desktop I noticed there was an undocumented change: similar to the OData improvements I blogged about here, there is a new option in the AnalysisServices.Database() and AnalysisServices.Databases() M functions that turns on a newer version of the MDX generation layer used by the Power Query engine. Like the OData improvements it is an option called Implementation=”2.0”, used like this: AnalysisServices.Databases("localhost", [TypedMeasureColumns=true, Implementation="2.0"])
…and also, as with the OData improvements, you will need to manually edit any existing M queries to take advantage of this.
Read on for Chris’s test and analysis of the resulting MDX output.
So no, user-defined functions are not the devil. Scalar user-defined functions can cause big problems if misused, but generally inline user-defined functions do not cause problems.
The real rule of thumb is not to avoid functions, but rather to avoid adorning your index fields with logic or functions. Because when you hide your intentions from the optimizer with complex syntax, you risk not getting the better performing index seek.
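A minimal sketch of that rule of thumb (table and column names are hypothetical), contrasting a predicate that wraps the indexed column in a function with an equivalent range predicate:

```sql
-- Wrapping the indexed column in a function hides your intent from the
-- optimizer, typically forcing a scan of an index on OrderDate:
SELECT OrderID
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2018;

-- The same logic as a range predicate leaves OrderDate bare,
-- so the optimizer can choose an index seek:
SELECT OrderID
FROM dbo.Orders
WHERE OrderDate >= '20180101'
  AND OrderDate <  '20190101';
```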
Riley shows an example where his inline table-valued UDF produced an execution plan just as efficient as the same query without the UDF.
In this sample there are absolutely no waits whatsoever on CXPACKET.
They are nonexistent.
If you were hoping to find out that they were way crazy out of control (call a priest and ditch your bar tab, we’re driving on the train tracks), you’ll be awfully disappointed.
There just aren’t any.
There’s only one core in use for nearly the entire duration, aside from some blips.
That’s disappointing. I was hoping to be able to ignore this wait altogether.