
Category: Performance Tuning

Inlining Scalar Functions

Brent Ozar covers a Microsoft paper which looks exciting:

Froid replaces the scalar UDF operator in the calling query with the newly constructed relational expression as a scalar sub-query.

That one statement comes with a lot of interesting gotchas that they discuss throughout the paper:

  • You won’t see the scalar function in the plan (just like we don’t currently see single-statement inline table-valued functions by name in the plan – we just see their effects, kinda like views)

  • The function and the underlying tables might have different permissions (you could have permissions on the function but not the tables, or vice versa, which makes compilation & execution a little trickier)

  • Code that doesn’t get used can just get removed outright (just like SQL Server can do join elimination)

  • Costs and row estimates are now useful inside the plan

To the extent that this works (and I hope it does), it helps fulfill the promise of SQL Server 2000: encapsulation of code. Today, one of the easiest big performance gains I can give is to strip something out of a user-defined function and inline it.
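
To make the before-and-after concrete, here is a minimal sketch of the kind of rewrite Froid performs, using a hypothetical dbo.TotalOrders function (the table and function names are my own, not from the paper):

    -- A typical scalar UDF: evaluated once per outer row, opaque to the optimizer.
    CREATE FUNCTION dbo.TotalOrders (@CustomerID int)
    RETURNS int
    AS
    BEGIN
        RETURN (SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = @CustomerID);
    END;
    GO

    -- The calling query, as most people write it:
    SELECT c.CustomerName, dbo.TotalOrders(c.CustomerID) AS OrderCount
    FROM dbo.Customers AS c;

    -- The hand-inlined equivalent: roughly what Froid-style inlining
    -- substitutes into the plan as a scalar sub-query.
    SELECT c.CustomerName,
           (SELECT COUNT(*)
            FROM dbo.Orders AS o
            WHERE o.CustomerID = c.CustomerID) AS OrderCount
    FROM dbo.Customers AS c;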


Avoid Scalar Functions In Computed Columns

Daniel Hutmacher shows why you should not include scalar functions inside computed column definitions:

Scalar functions can be a real headache when you’re performance tuning. For one, they don’t parallelize. In fact, if you use a scalar function in a computed column, it will prevent any query that uses that table from going parallel – even if you don’t reference that column at all!

Read on for a demonstration.
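
As a rough illustration of the failure mode (the table, column, and function names here are hypothetical):

    -- A scalar UDF wrapped into a computed column definition:
    CREATE FUNCTION dbo.TaxAmount (@Price money)
    RETURNS money
    WITH SCHEMABINDING
    AS
    BEGIN
        RETURN @Price * 0.08;
    END;
    GO

    ALTER TABLE dbo.Sales
        ADD TaxDue AS dbo.TaxAmount(Price);

    -- This query never touches TaxDue, yet it is now locked into a serial plan:
    SELECT SaleID, Price
    FROM dbo.Sales
    WHERE Price > 100;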


Using Computed Columns For Pattern Matching

Erik Darling shows how you can use a computed column to speed up leading-wildcard (LIKE '%something') searches:

We were talking about computed columns, and one of our students mentioned that he uses computed columns that run the REVERSE() function on a column for easier back-searching.

What’s back-searching? It’s a word I just made up.

The easiest example to think about and demo is Social Security Numbers.

One security requirement is often to show only the last four digits.

Obviously, running a search like WHERE ssn LIKE '%0000' would perform badly over large data sets.

One of the trickiest parts of performance tuning is understanding that the way people want to see data is not necessarily the way you should store the data.
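
The trick looks roughly like this (a sketch with assumed table and column names; real SSNs would of course be masked or encrypted rather than stored in plain text):

    -- Persist the reversed value and index it:
    ALTER TABLE dbo.People
        ADD ssn_reversed AS REVERSE(ssn) PERSISTED;

    CREATE INDEX IX_People_ssn_reversed
        ON dbo.People (ssn_reversed);

    -- The leading-wildcard search becomes a trailing-wildcard search,
    -- which can seek on the new index:
    SELECT ssn
    FROM dbo.People
    WHERE ssn_reversed LIKE '0000%';   -- equivalent to ssn LIKE '%0000'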


Using Powershell To Deploy Perfmon Collectors

Raul Gonzalez has a bonus post in his Perfmon data series:

As I said, when it's time to deploy the solution explained in my previous posts to a number of servers, it might get very tedious, especially if we have servers running multiple instances, since each has different counter names (the instance name is part of the counter name). If we create one template, it won't apply to all cases, so there would be a lot of manual intervention.

So I decided to do what I like the most and wrote some queries that, combined with some PowerShell, will do the job for me.

Read on for the script and more information.
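
The SQL half of that idea might look something like this sketch, which pulls instance-aware counter names straight from the server itself (the filter is just an example, not Raul's actual query):

    -- object_name embeds the instance name (e.g. MSSQL$INSTANCE1:Buffer Manager),
    -- so the results are correct per instance with no manual editing.
    SELECT DISTINCT
        RTRIM(object_name) + '\' + RTRIM(counter_name) AS counter_path
    FROM sys.dm_os_performance_counters
    WHERE object_name LIKE '%:Buffer Manager%'
       OR object_name LIKE '%:SQL Statistics%';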


Row Goals In SQL Server 2017

Erik Darling points out a new bonus when you upgrade to SQL Server 2017 CU3:

Don’t go looking in SSMS just yet. If you get an actual or estimated plan from a query in SSMS, it’s not in the XML.

However, if you get them from the plan cache later, you can see them in the XML.

According to People Much Smarter Than Me®, SSMS strips out XML that it doesn’t recognize, so we’ll have to wait for the next version to drop before we can access it easily.

Erik also has links to get more information.
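
The attribute to look for is EstimateRowsWithoutRowGoal. Until SSMS catches up, something like this sketch can dig it out of the plan cache (a blunt string match, purely illustrative):

    -- Hunt the plan cache for plans carrying the new row-goal attribute:
    SELECT TOP (20)
           qs.total_worker_time,
           qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE N'%EstimateRowsWithoutRowGoal%';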


Performance Testing Post-Updates

Joe Chang has some quick and dirty performance tests comparing SQL Server 2016 SP1 to SQL Server 2016 SP1 CU7 (the first post-Meltdown/Spectre release):

Linear sum, SQL Server 2016 SP1 CU7 (build 4466) plus OS patches vs. the SP1 base:
9% faster, 12% more CPU-efficient.
Individual queries range from 24% faster to 0.3% slower.
There is probably a penalty in the recent fixes, but fixes since SP1 also made improvements?

Click through for more details. We'll need to see a lot more testing to know for sure, but that's certainly not awful.


Using Relog To Load Perfmon Counters Into SQL Server

Raul Gonzalez shows how to use the Relog executable to bring Perfmon counters into SQL Server:

Relog is an application, and as such it can connect to SQL Server. In this case, the connection is made via an ODBC driver, so we have to create a new DSN on the computer where we want to run Relog.exe, which does not necessarily have to be the same server where SQL Server is installed.

If we click Start and start typing ODBC, it will show the two versions available in modern OSes, 32-bit and 64-bit; we need to choose the 64-bit version to make it work.

Relog.exe being a bit old, it seems the only driver that actually works is the oldest one, but it works, so we choose the SQL Server 10.00.xxx driver.

He also provides a bonus script at the end, which builds a dynamic pivot table of your counter information.
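
For reference, Relog creates three tables in the target database (CounterData, CounterDetails, and DisplayToID). A minimal sketch of reading the imported counters back, separate from Raul's pivot script:

    -- Joining CounterData to CounterDetails gives readable counter rows:
    SELECT cd.MachineName,
           cd.ObjectName,
           cd.CounterName,
           dat.CounterDateTime,
           dat.CounterValue
    FROM dbo.CounterData AS dat
    JOIN dbo.CounterDetails AS cd
        ON cd.CounterID = dat.CounterID;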


Automatic Tuning In SQL Server

Grant Fritchey is a fan of SQL Server 2017’s automatic tuning feature:

The core of automatic tuning at this point in time (because I’m sure it’s going to evolve) is the ability of the query engine to spot when a query has generated a new plan and that new plan is causing performance to degrade. This is known as a regression in the plan. It comes from bad parameter sniffing, changes in statistics, cumulative updates, or the big notorious one, the cardinality estimator introduced in SQL Server 2014 (it’s been almost four years, I’m not calling it new any more). Spotting a plan regression prior to SQL Server 2016 and the introduction of the Query Store was a major pain. However, now it’s easy. You can spot them by reading the data collected. Further, Microsoft can spot them by reading the data collected, and it does.

If the engine sees that a plan is causing a regression (you have to have Query Store enabled for this), then it will write out a suggestion for fixing it to the new DMV, sys.dm_db_tuning_recommendations. If you further enable automatic tuning on your database, then SQL Server can automatically substitute the last good plan for you, fixing the problem. Further, SQL Server monitors this fix and if, over time, it’s clear that the forced plan is causing problems, it will unforce the plan, again automagically.

Click through for more information, including a query to read from the tuning recommendations DMV.
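
A quick sketch of the moving parts, per the SQL Server 2017 syntax (run against a user database):

    -- Query Store has to be on for regression detection:
    ALTER DATABASE CURRENT SET QUERY_STORE = ON;

    -- Let the engine force the last good plan on its own:
    ALTER DATABASE CURRENT
        SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);

    -- See what it is recommending (or has already done):
    SELECT reason, score, state, details
    FROM sys.dm_db_tuning_recommendations;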


Data Skip Techniques In Impala

Mostafa Mokhtar, et al, explain a few methods for skipping unneeded data in Impala queries:

Each Apache Parquet file contains a footer where metadata can be stored including information like the minimum and maximum value for each column. Starting in v2.9, Impala populates the min_value and max_value fields for each column when writing Parquet files for all data types and leverages data skipping when those files are read. This approach significantly speeds up selective queries by further eliminating data beyond what static partitioning alone can do. For files written by Hive / Spark, Impala only reads the deprecated min and max fields.

The effectiveness of the Parquet min_value/max_value column statistics for data skipping can be increased by ordering (or clustering) data when it is written, reducing the range of values that fall between the minimum and maximum value for any given file. It was for this reason that Impala 2.9 added the SORT BY clause to table DDL, which directs Impala to sort data locally during an INSERT before writing the data to files.

Even if your answer is “throw more hardware at it,” there eventually comes a point where you run out of hardware (or budget).
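
The DDL side of that advice looks something like this sketch in Impala SQL (table and column names are assumed):

    -- Impala 2.9+: SORT BY clusters rows within each written file, tightening
    -- the per-file min/max column stats that drive data skipping.
    CREATE TABLE events_sorted (
        event_id BIGINT,
        event_ts TIMESTAMP,
        payload  STRING
    )
    PARTITIONED BY (event_date STRING)
    SORT BY (event_ts)
    STORED AS PARQUET;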


Fragmentation Can Affect Execution Plans

Jonathan Kehayias explains how index fragmentation can potentially affect execution plans:

Index fragmentation removal and prevention has long been a part of normal database maintenance operations, not only in SQL Server, but across many platforms. Index fragmentation affects performance for a lot of reasons, and most people talk about the effects of random small blocks of I/O that can happen physically to disk based storage as something to be avoided. The general concern around index fragmentation is that it affects the performance of scans through limiting the size of read-ahead I/Os. It’s based on this limited understanding of the problems that index fragmentation cause that some people have begun circulating the idea that index fragmentation doesn’t matter with Solid State Storage devices (SSDs) and that you can just ignore index fragmentation going forward.

However, that is not the case for a number of reasons. This article will explain and demonstrate one of those reasons: that index fragmentation can adversely impact execution plan choice for queries. This occurs because index fragmentation generally leads to an index having more pages (these extra pages come from page split operations, as described in this post on this site), and so the use of that index is deemed to have a higher cost by SQL Server’s query optimizer.

Let’s look at an example.

Check out the example, but definitely read the comments as there are some good conversations in there.
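
If you want to eyeball the page-count side of the argument on your own system, a quick sketch:

    -- Fragmentation and page counts side by side; extra pages from splits
    -- raise the optimizer's estimated cost of using the index.
    SELECT OBJECT_NAME(ips.object_id) AS table_name,
           i.name AS index_name,
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes AS i
        ON i.object_id = ips.object_id
       AND i.index_id = ips.index_id
    ORDER BY ips.page_count DESC;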
