Implicit Conversion (Sometimes) Harms Performance

Grant Fritchey looks at implicit conversion and the havoc it can wreak:

Letting SQL Server change data types automatically can seriously impact performance in a negative way. Because a calculation has to be run on each column, you can’t get an index seek. Instead, you’re forced to use a scan. I can demonstrate this pretty simply. Here’s a script that sets up a test table with three columns and three indexes and tosses a couple of rows in:

You might get lucky and have the database engine realize that it doesn’t need to give you a horrible execution plan, but it’s sound advice to ensure that data types match on joins and filters.
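As a minimal sketch of the kind of mismatch in question (this is not Grant's script; the table, index, and column names below are made up for illustration), comparing a VARCHAR column to an NVARCHAR literal makes SQL Server convert the column side, because NVARCHAR has higher data type precedence:

```sql
-- Hypothetical demo objects; names are illustrative only.
CREATE TABLE dbo.ConvertDemo
(
    Id          INT IDENTITY(1,1) PRIMARY KEY,
    AccountCode VARCHAR(20) NOT NULL
);

CREATE INDEX IX_ConvertDemo_AccountCode
    ON dbo.ConvertDemo (AccountCode);

INSERT INTO dbo.ConvertDemo (AccountCode)
VALUES ('A100'), ('B200');

-- Mismatched types: the N'' literal is NVARCHAR, the column is VARCHAR,
-- so the column is implicitly converted before the comparison.
SELECT Id
FROM dbo.ConvertDemo
WHERE AccountCode = N'A100';

-- Matching types: the literal is VARCHAR, so no conversion is needed
-- and the index on AccountCode can be used directly.
SELECT Id
FROM dbo.ConvertDemo
WHERE AccountCode = 'A100';
```

Comparing the actual execution plans for the two SELECT statements should show a CONVERT_IMPLICIT in the first one. Whether that turns into a full scan depends on factors such as the column's collation, which is the "might get lucky" part.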

Primary Keys On TVPs And Plan Forcing

Michael J. Swart notes that you cannot force query plans if you’re using a user-defined table type with an unnamed primary key constraint:

When defining table variables, avoid primary key or unique key constraints. Opt instead for named indexes if you’re using SQL Server 2014 or later. Otherwise, be aware that plan forcing is limited to queries that don’t use these table variables.

Helpful advice when dealing with user-defined table types. Read the whole thing.
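A rough sketch of what a named inline index looks like on a table variable, assuming SQL Server 2014 or later. The variable, column, and index names are hypothetical and not taken from Michael's post:

```sql
-- Inline index syntax requires SQL Server 2014 or later.
-- The index has an explicit name instead of a system-generated
-- primary key constraint name.
DECLARE @OrderIds TABLE
(
    OrderId INT NOT NULL,
    INDEX IX_OrderIds_OrderId CLUSTERED (OrderId)
);

INSERT INTO @OrderIds (OrderId)
VALUES (1), (2), (3);

SELECT OrderId
FROM @OrderIds
WHERE OrderId = 2;
```

Note that the index in this sketch is not unique, so unlike a primary key it does not reject duplicate values; whether that trade-off is acceptable depends on how the table variable gets populated.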

More On String_Split

Aaron Bertrand has another update on the String_Split function, specifically how it compares to user-defined table types:

For this specific test, with a specific data size, distribution, and number of parameters, and on my particular hardware, JSON was a consistent winner (though marginally so). For some of the other tests in previous posts, though, other approaches fared better. Just an example of how what you’re doing and where you’re doing it can have a dramatic impact on the relative efficiency of various techniques, here are the things I’ve tested in this brief series, with my summary of which technique to use in that case, and which to use as a 2nd or 3rd choice (for example, if you can’t implement CLR due to corporate policy or because you’re using Azure SQL Database, or you can’t use JSON or STRING_SPLIT() because you aren’t on SQL Server 2016 yet). Note that I didn’t go back and re-test the variable assignment and SELECT INTO scripts using TVPs – these tests were set up assuming you already had existing data in CSV format that would have to be broken up first anyway. Generally, if you can avoid it, don’t smoosh your sets into comma-separated strings in the first place, IMHO.

That’s a rather interesting result, given how poorly JSON fared in some of the previous tests.
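For anyone who has not tried the two built-in approaches yet, here is a small sketch of each, assuming SQL Server 2016 with database compatibility level 130. The variable names and sample values are made up; this is not Aaron's test harness:

```sql
DECLARE @csv  NVARCHAR(MAX) = N'1,2,3';
DECLARE @json NVARCHAR(MAX) = N'[1,2,3]';

-- STRING_SPLIT returns one row per element in a column named value.
SELECT value
FROM STRING_SPLIT(@csv, N',');

-- OPENJSON with no explicit schema returns key, value, and type columns;
-- here only value is selected, one row per array element.
SELECT value
FROM OPENJSON(@json);
```

Both return the elements as strings, so a cast is still needed before joining to an integer column.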

WhoIsActive For Azure

Adam Machanic has a new version of sp_whoisactive specifically for Azure SQL Database:

So I set about looking for a workaround. This week I think I’ve finally managed to get something working that approximates the number I need from that view, ms_ticks.

Attached is sp_whoisactive v11.112 — Azure Special Edition v2. Please give it a shot, and I am especially interested in feedback if you use the @get_task_info = 2 option when running sp_whoisactive. That is the main use case that’s impacted by the lack of ms_ticks information and my attempt at a workaround.

If you’re using on-prem SQL Server, this doesn’t add anything new, but if you’re on Azure SQL Database, give it a try.

Transaction Log Analysis

Michael Swart shows how to dig into the transaction log to track down those WRITELOG waits:

WRITELOG waits are a scalability challenge for OLTP workloads under load. Chris Adkin has a lot of experience tuning SQL Server for high-volume OLTP workloads. So I’m going to follow his advice when he writes we should minimize the amount of logging generated. And because I can’t improve something if I can’t measure it, I wonder what’s generating the most logging? OLTP workloads are characterized by frequent tiny transactions so I want to measure that activity without filters, but I want to have as little impact to the system as I can. That’s my challenge.

Check out the entire post, as this is a good exercise in investigating busy transactional systems.
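Before digging into the log itself, a quick way to see whether WRITELOG is even a significant wait on an instance is the wait stats DMV. This is only a starting-point sketch and not the technique Michael uses in his post:

```sql
-- Cumulative WRITELOG waits since the last restart
-- (or since wait stats were last cleared).
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms,
       signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type = N'WRITELOG';
```

The numbers are cumulative, so comparing two snapshots taken some minutes apart gives a better picture than a single reading.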

Keep Check Constraints Simple

Erik Darling shows performance implications around having scalar UDFs in check constraints:

Really. Every single time. It started off kind of funny. Scalar functions in queries: no parallelism. Scalar functions in computed columns: no parallelism, even if you’re not selecting the computed column. Every time I think of a place where someone could stick a scalar function into some SQL, it ends up killing parallelism. Now it’s just sad.

This is (hopefully. HOPEFULLY.) a less common scenario, since uh… I know most of you aren’t actually using any constraints. So there’s that! Developer laziness might be a saving grace here. But if you read the title, you know what’s coming. Here’s a quick example.

Yeah, putting UDFs in check constraints is a pretty bad idea most of the time.
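To make the pattern concrete, here is a hedged sketch of a check constraint that calls a scalar UDF next to one that expresses the same rule inline. The function, table, and constraint names are hypothetical and this is not Erik's example:

```sql
-- Hypothetical scalar UDF wrapping a trivial rule.
CREATE FUNCTION dbo.IsPositive (@Value INT)
RETURNS BIT
AS
BEGIN
    RETURN CASE WHEN @Value > 0 THEN 1 ELSE 0 END;
END;
GO

-- The pattern the post warns about: a check constraint that calls the UDF.
CREATE TABLE dbo.OrdersWithUdfCheck
(
    Quantity INT NOT NULL,
    CONSTRAINT CK_OrdersWithUdfCheck_Quantity
        CHECK (dbo.IsPositive(Quantity) = 1)
);

-- The same rule written directly in the constraint, with no function call.
CREATE TABLE dbo.OrdersWithPlainCheck
(
    Quantity INT NOT NULL,
    CONSTRAINT CK_OrdersWithPlainCheck_Quantity
        CHECK (Quantity > 0)
);
```

When the rule is this simple, the inline version enforces the same thing without involving a scalar function at all.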

Multi-Threaded Log Writer

Chris Adkin has a very detailed post digging into log writer changes affecting high-scale throughput:

To understand why we get this performance degradation with SQL Server 2016 RC1, three key parts of a transaction’s life cycle need to be understood, along with the serialisation mechanisms that protect them.

Chris digs into call stacks as part of his post. We’ll see if there are some performance improvements between now and RTM on this front.

Optimizing OR Clauses

Daniel Hutmacher looks at different ways of optimizing queries with multiple conditionals and different parameters:

The SQL Server query optimizer can find interesting ways to tackle seemingly simple operations that can be hard to optimize. Consider the following query on a table with two indexes, one on (a), the other on (b):

FROM #data
WHERE a<=10 OR b<=10000;

The basic problem is that we would really want to use both indexes in a single query.

We get to see a few different versions of the query as well as the execution plans which result.
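One commonly used rewrite for this shape of query is to split the OR into two queries combined with UNION, so each branch can use its own index. The sketch below is an assumption about what such a version could look like, not necessarily one of Daniel's; the id column and index names are invented for the example:

```sql
-- Hypothetical setup: a key column plus the two indexed columns.
CREATE TABLE #data
(
    id INT IDENTITY(1,1) PRIMARY KEY,
    a  INT NOT NULL,
    b  INT NOT NULL
);

CREATE INDEX IX_data_a ON #data (a);
CREATE INDEX IX_data_b ON #data (b);

-- Each branch filters on a single indexed column.
-- UNION (not UNION ALL) removes rows that satisfy both predicates;
-- including the key column keeps genuinely distinct rows apart.
SELECT id, a, b
FROM #data
WHERE a <= 10
UNION
SELECT id, a, b
FROM #data
WHERE b <= 10000;
```

Without a unique column in the select list, UNION would also collapse rows that merely have the same a and b values, which would change the result compared to the original OR query.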

Invalid Perfmon Calculations

Paul Popovich notes that certain Perfmon counters could be wrong on certain versions of Windows:

Performance Monitor uses incorrect calculation for certain types of counters in Windows 8, Windows Server 2012, Windows 7 SP1, or Windows Server 2008 R2 SP1

This only cost us a week of reviewing results.

Follow up on the link because there’s a fix available through Windows Update.

Live Query Stats Versus Actual Execution Plans

Kendra Little compares and contrasts Live Query Statistics against actual execution plans:

Getting plan details isn’t free. The amount of impact depends on what the query is doing, but there’s a stiff overhead to collecting actual execution plans and to watching live query statistics.

These tools are great for reproing problems and testing outside of production, but don’t time query performance while you’re using them: you’ll get too much skew.

Live Query Statistics is one additional tool, but won’t replace actual execution plans. At its best, it will make you think more about what’s going on with the system, whether row counts are what you’re expecting, and take account of which operators stream data through without blocking (such as nested loop joins) versus those which require all the data before continuing (sorts).

