Press "Enter" to skip to content

Category: Performance Tuning

Batch Mode Processing On Rowstore Tables

Dmitry Pilugin shares some thoughts on the expansion of batch mode processing to rowstore tables:

The main advantages of Batch Mode are:

  • Algorithms optimized for modern multi-core CPUs;
  • Better CPU cache utilization and increased memory throughput;
  • Reduced number of CPU instructions per processed row.

All these features make Batch Mode much faster than Row Mode (typically one to two orders of magnitude, i.e. 10x-100x) for analytical queries with CS indexes. One major condition for Batch Mode is the presence of a CS index. If you don’t have a CS index on a table involved in a query, you won’t get Batch Mode.

However, some analytical queries may benefit from Batch Mode even when there is no CS index, or when a CS index cannot be created due to certain limitations.

There are a few tricks that allow you to enable Batch Mode on a Rowstore table, for example with a dummy filtered CS index (see this post from Itzik Ben-Gan), but SQL Server 2019 may use Batch Mode on Rowstore without any extra effort on your side.

Dmitry dives into the debugger and teases out the specific circumstances which can help get a query considered for Batch Mode on Rowstore. If you want a deep dive into what’s currently available, this is your post.
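
For reference, the dummy-index trick looks something like this: a nonclustered columnstore index with a contradictory filter, so it stays empty but its presence still unlocks Batch Mode. A minimal sketch (table and column names are hypothetical):

    -- The filter can never be true, so the index stays empty,
    -- but its mere presence makes the optimizer consider Batch Mode.
    CREATE NONCLUSTERED COLUMNSTORE INDEX ix_cs_dummy
        ON dbo.SalesFact (OrderID)
        WHERE OrderID = -1 AND OrderID = -2;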


Redshift Architecture Performance Tips

John Ryan has a few hints to help us build speedy Redshift clusters:

The Need to Vacuum

As Redshift does not reclaim free space automatically, update and delete operations can frequently lead to table growth. Equally, it’s important that, as new entries are added, the data is maintained in sorted sequence.

The VACUUM command is used to re-sequence data and to reclaim the disk space left behind by DELETE and UPDATE operations. Although it won’t block other processes, it can be a resource-intensive operation, especially for data stored using interleaved sort keys.

It should be run periodically to ensure consistent performance and to reduce disk usage.

Some of this is good Postgres advice; some of it is good MPP advice (and serves well, for example, when dealing with Azure SQL Data Warehouse); the rest is Redshift-specific.
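
As a quick reference, the command’s basic forms look like this (table name hypothetical):

    VACUUM FULL sales_fact;          -- re-sort rows and reclaim deleted space (the default)
    VACUUM SORT ONLY sales_fact;     -- re-sequence rows without reclaiming space
    VACUUM DELETE ONLY sales_fact;   -- reclaim space without re-sorting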


Improvements In Table Variable Performance In SQL Server 2019

Matthew McGiffen tries out SQL Server 2019 to test a scenario where table variables were giving poor estimates in prior versions:

One of the most popular posts on my blog last year was where I pretty much suggested that people not use table variables:

Think twice before using table variables

This wasn’t new information when I wrote it, but bad performance due to the use of table variables remained such a common anti-pattern that I thought it was worth stressing again.

So, when I saw the above 2019 feature I thought I’d better investigate and update what I’m telling people.

TL;DR It looks like table variables are no longer a problem.

Read the whole thing. This has the potential to change long-standing advice, going back a decade, regarding table variables.
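
A minimal sketch of the classic problem (object names hypothetical): under compatibility level 140 and below, the optimizer compiles the statement before the table variable is populated and assumes it holds one row; under SQL Server 2019’s compatibility level 150, deferred compilation waits until the variable is loaded and estimates from the actual row count.

    DECLARE @ids TABLE (id INT PRIMARY KEY);

    INSERT INTO @ids (id)
    SELECT object_id FROM sys.all_objects;  -- thousands of rows, not one

    -- Pre-2019: costed as if @ids held a single row, which often
    -- produces nested-loops plans that scale badly.
    -- 2019 (compat level 150): estimated from the real row count.
    SELECT o.name
    FROM sys.all_objects AS o
    JOIN @ids AS i
        ON i.id = o.object_id;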


Batch Mode On Rowstore

Kevin Farlee announces another query processing improvement in SQL Server 2019:

In the SQL Server 2019 preview, we are further expanding query processing capabilities with several new features under the Intelligent Query Processing (QP) feature family. In this blog post we’ll discuss one of these Intelligent QP features that is now available in public preview, batch mode on rowstore. This feature unlocks the advantages of batch mode execution in cases where there is no columnstore participating in the query.

Batch mode is a different execution mode primarily targeted at analytics queries, which are characterized by scanning many rows and doing significant aggregations, sorts, and group-by operations across those rows. Until now, batch mode has been reserved for queries which involve columnstore indexes.

Performing scans and calculations using batches of ~900 rows at a time rather than row by row is much more efficient for analytic-type queries. For queries that can take advantage of it, batch mode can easily make queries execute many times faster than the same query against the same data in row mode.

Until now, the workaround was to create an empty filtered columnstore index on the rowstore table. The new behavior is more architecturally pleasing and means one less hack to explain.
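
The feature rides on database compatibility level 150, and SQL Server 2019 exposes switches to control it; a sketch, with a hypothetical table name:

    -- Batch mode on rowstore requires compatibility level 150.
    ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 150;

    -- Toggle the feature for the whole database...
    ALTER DATABASE SCOPED CONFIGURATION SET BATCH_MODE_ON_ROWSTORE = OFF;

    -- ...or allow it for a single query via a hint.
    SELECT COUNT(*)
    FROM dbo.SalesFact
    OPTION (USE HINT('ALLOW_BATCH_MODE'));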


Storing Constants For MDX Calculated Measures

Chris Webb walks us through an interesting performance problem when using Analysis Services multidimensional:

All it does is take the value of the Sales Amount measure at the lowest granularities of the Customer, Date and Product dimensions, multiply it by 0.08 to find a tax value, and because [Tax Amount] is a real, non-calculated measure, the result of the calculation aggregates up through the cube. [I know that I don’t have to aggregate the result of this specific calculation but remember that this is a simplified example – in the real case I did have to write the calculation using Scope statements – and anyway the best way of handling a basic multiplication like this would be with a measure expression]

The performance was sub-second for my test query and I was happy, but then I realised that the same tax rate was being used in other calculations and may change in the future, so I thought I would store the value 0.08 in a calculated measure:

Chris walks through several iterations of this before landing on a solution which is both reasonable and fast.
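
The rough shape of the starting point in MDX script form, following the post’s simplified example (measure names hypothetical):

    // Store the rate once so it can be reused and changed in one place.
    CREATE MEMBER CURRENTCUBE.[Measures].[Tax Rate] AS 0.08, VISIBLE = 0;

    // Assign the calculation to the real measure so it aggregates up.
    SCOPE ([Measures].[Tax Amount]);
        THIS = [Measures].[Sales Amount] * [Measures].[Tax Rate];
    END SCOPE;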


What sys.dm_exec_query_stats Can Miss

Matthew McGiffen takes us through a scenario where sys.dm_exec_query_stats did not give a complete view of what was running on SQL Server:

I got fewer than 50 rows back, so I figured I had everything covered, but the total elapsed time across all the queries was less than ten minutes. I knew the server hadn’t been rebooted for about a month, so potentially that could be as little as 20 seconds of query execution a night. Even if the full 10 minutes was from the last 24 hours, that still didn’t account for the long run times.

So, I convinced myself it wasn’t the queries and started looking at other things they could check. It was a long list. Lots of theories revolved around the idea that something else was running on the host at the same time.

Click through for the rest of the story.
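
For context, this is the sort of query being described; a sketch of the standard pattern, keeping in mind the DMV only covers plans currently in cache, so anything evicted, recompiled, or still executing is invisible to it:

    -- Totals per cached statement since the plan entered the cache.
    SELECT TOP (50)
        qs.execution_count,
        qs.total_elapsed_time / 1000000.0 AS total_elapsed_s,
        qs.creation_time,
        qs.last_execution_time,
        st.text AS batch_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_elapsed_time DESC;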


Clearing The SSAS Cache Using C#

Shabnam Watson shows us a small console program to clear the SQL Server Analysis Services cache:

First let me give you a little background on why you would want to clear the SSAS cache from C# code when you can do this using an XMLA command from SSMS.

If you have a slow MDX/DAX SSAS query, you have a couple of options for improving the performance (assuming no hardware changes):

  1. Rewrite the query differently if you have control over the query. (You will have two queries that you want to compare against the same database.)
  2. Make changes to the SSAS database to follow a better design. (You will have one query to run against two databases.)

Regardless of which route you go, you should compare the performance before and after the changes to see how much you gained from the change.

Click through for more, including the code.
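
For reference, the XMLA route mentioned above looks something like this (the database ID is hypothetical); the C# in the post accomplishes the equivalent programmatically:

    <!-- Clears the cache for a single SSAS database; runnable from an XMLA query window in SSMS. -->
    <ClearCache xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Object>
        <DatabaseID>MySsasDatabase</DatabaseID>
      </Object>
    </ClearCache>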


The Story Behind Grant’s Execution Plan Book

Grant Fritchey shares a funny (in a Schadenfreude sense) look at what it took to get SQL Server Execution Plans, 3rd Edition out:

Four years ago, after a bunch of dithering and some negotiations with Tony Davis, my editor, I started to update my book, SQL Server Execution Plans. We managed to convince Hugo Kornelis to be the tech editor. I started to do the real writing in early 2015.

I was most of the way through a first draft and no one liked it. Tony was unhappy. Hugo was unhappy. I was unhappy. I was just trying to update the existing book, SQL Server Execution Plans. It wasn’t working.

We all came to the conclusion that the old book was wrong. Not simply in a technical sense, although there was a lot of that, but in a structural sense. So we started rearranging things. SQL Server 2014 came out, but I was ready for it, having been on a preview, so it was no big deal. We started a second draft.

The book Grant mentions is free in PDF format thanks to Red Gate, so go give it a download and enjoy the fruits of Grant’s labor.


A Look At Automatic Plan Tuning In SQL Server 2017

John Sterrett shows us how to review automatic plan tuning suggestions:

SQL Server 2017 Automatic Tuning looks for queries where execution plans change and performance regresses. This feature depends on Query Store being enabled. Note that even if you don’t turn on Automatic Tuning, you still get the benefits of having access to the data. That is right: Automatic Tuning will tell you what it would do if it were enabled. Think of this as free performance tuning training. Go look at the DMVs and try to understand why the optimizer would want to lock in an execution plan. We will actually go through a real-world example:

Click through for the example.  I wouldn’t automatically trust these automatic results, but my experience has been generally positive.
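
If you want to poke around yourself, the recommendations surface through a DMV, and the feature is a database-level setting; a minimal sketch:

    -- See what Automatic Tuning has done or would do (requires Query Store).
    SELECT reason, score, state, details
    FROM sys.dm_db_tuning_recommendations;

    -- Opt in to automatically forcing the last known good plan.
    ALTER DATABASE CURRENT
        SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);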


Mining The Plan Cache, Query Store, And More

Erin Stellato shows the benefit of digging through the plan cache, Query Store, and third-party performance tool databases (using SentryOne’s SQL Sentry as an example):

As much as I love all this extra data, it’s important to note that some information is more relevant for an actual execution plan, versus an estimated one (e.g. tempdb spill information). Some days we can capture and use the actual plan for troubleshooting, other times we have to use the estimated plan. Very often we get that estimated plan – the plan that has been used for problematic executions potentially – from SQL Server’s plan cache. And pulling individual plans is appropriate when tuning a specific query or set of queries. But what about when you want ideas on where to focus your tuning efforts in terms of patterns?

The SQL Server plan cache is a prodigious source of information when it comes to performance tuning, and I don’t simply mean troubleshooting and trying to understand what’s been running in a system. In this case, I’m talking about mining information from the plans themselves, which are found in sys.dm_exec_query_plan, stored as XML in the query_plan column.

When you combine this data with information from sys.dm_exec_sql_text (so you can easily view the text of the query) and sys.dm_exec_query_stats (execution statistics), you can suddenly start to look for not just those queries that are the heavy hitters or execute most frequently, but those plans that contain a particular join type, or index scan, or those that have the highest cost. This is commonly referred to as mining the plan cache, and there are several blog posts that talk about how to do this. My colleague, Jonathan Kehayias, says he hates to write XML, yet he has several posts with queries for mining the plan cache:

It’s a good article with a lot of useful information.
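
As a minimal sketch of the pattern, here is one way to find cached plans containing a particular operator (a clustered index scan, in this case) by shredding the plan XML:

    -- Showplan XML uses its own namespace; declare it for the XQuery.
    WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
    SELECT TOP (20)
        cp.usecounts,
        st.text AS query_text,
        qp.query_plan
    FROM sys.dm_exec_cached_plans AS cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
    WHERE qp.query_plan.exist('//RelOp[@PhysicalOp="Clustered Index Scan"]') = 1
    ORDER BY cp.usecounts DESC;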
