Category: Performance Tuning

Apart from catching up on news during my commute, I only really use notifications for a certain number of hashtags, i.e., #SqlServer, #tsql2sday, #sqlhelp, and #PowerShell.

So during work, every so often a little notification will pop up on the bottom right of my window and I can quickly glance down and decide whether to ignore it or check it out.

That’s what happened with the following tweet:

Click through for Shane’s demo.

Comments closed

Legacy Cardinality Estimation In SQL Server

Published 2017-11-29 by Kevin Feasel

Kellyn Pot’vin-Gorman explains what the Legacy Cardinality Estimation setting does in SQL Server:

Oracle DBAs have used the CARDINALITY hint for some time and it should be understood that this may appear to be similar, but is actually quite different. As hinting in TSQL is a bit different than PL/SQL, we can compare similar queries to assist:

TSQL

SELECT CustomerId, OrderAddedDate
FROM OrderTable
WHERE OrderAddedDate >= '2016-05-01'
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
GO

PL/SQL

You might first mistake the CE hint for the following CARDINALITY hint in Oracle:

SELECT /*+ CARDINALITY(ORD,15000) */ ORD.CUSTOMER_ID, ORD.ORDER_DATE
FROM ORDERS ORD
WHERE ORD.ORDER_DATE >= '2016-05-01';

This would be incorrect; the closest hint in Oracle to SQL Server’s legacy CE hint would be the optimizer feature hint:

SELECT /*+ optimizer_features_enable('9.2.0.7') */ ORD.CUSTOMER_ID, ORD.ORDER_DATE
FROM ORDERS ORD
WHERE ORD.ORDER_DATE >= '2016-05-01';

If you’re wondering why I chose a 9i version to force the optimizer to, keep reading and you’ll come to understand.

Read on for the comparative explanation as well as more details on SQL Server’s legacy cardinality estimator hint and database-scoped configuration setting.
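
For reference, here is a minimal sketch of the database-scoped configuration route mentioned above, as opposed to the per-query hint. It assumes SQL Server 2016 or later and that you run it in the database you want to change; it is not taken from Kellyn's post.

```sql
-- Use the legacy cardinality estimator for every query in the current database.
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = ON;

-- Check what the database is currently set to.
SELECT name, value
FROM sys.database_scoped_configurations
WHERE name = N'LEGACY_CARDINALITY_ESTIMATION';

-- Go back to the estimator that matches the database's compatibility level.
ALTER DATABASE SCOPED CONFIGURATION SET LEGACY_CARDINALITY_ESTIMATION = OFF;
```

The hint in the quoted query affects one statement; the setting above affects the whole database, so it is worth testing on a non-production copy first.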

Finding The Right Batch Size For Bulk Loads

Published 2017-11-15 by Kevin Feasel

Dan Guzman has some bulk load batch size considerations:

Bulk load has long been the fastest way to mass insert rows into a SQL Server table, providing orders of magnitude better performance compared to traditional INSERTs. SQL Server database engine bulk load capabilities are leveraged by T-SQL BULK INSERT, INSERT…SELECT, and MERGE statements as well as by SQL Server client APIs like ODBC, OLE DB, ADO.NET, and JDBC. SQL Server tools like BCP and components like SSIS leverage these client APIs to optimize insert performance.

SQL Server 2016 and later improve performance further by turning on bulk load context and minimal logging by default when bulk loading into SIMPLE and BULK_LOGGED recovery model databases, which previously required turning on trace flags as detailed in this blog post by Parikshit Savjani of the MSSQL Tiger team. That post also includes links to other great resources that thoroughly cover minimal logging and data loading performance, which I recommend you peruse if you use bulk load often. I won’t repeat all that information here but do want to call attention to the fact that these new bulk load optimizations can result in much more unused space when a small batch size is used compared to SQL Server 2014 and older versions.

Click through for some tips.
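
As a rough illustration of where the batch size gets set, here is a sketch using T-SQL BULK INSERT. The table name dbo.StagingOrders, the file path, the file layout, and the batch size of 100000 are all assumptions made up for this example, not values from Dan's post and not a recommendation.

```sql
-- Hypothetical staging table and file; adjust both to your environment.
BULK INSERT dbo.StagingOrders
FROM 'C:\load\orders.csv'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    FIRSTROW        = 2,       -- assumes the file has a header row
    BATCHSIZE       = 100000,  -- each batch is committed as its own transaction
    TABLOCK                    -- table lock, one of the prerequisites for minimal logging
);

-- Compare reserved, used, and unused space after loading with different batch sizes.
EXEC sp_spaceused N'dbo.StagingOrders';
```

Running the same load with a few different BATCHSIZE values and comparing the unused column from sp_spaceused is one simple way to see the effect the quote describes.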

Actual Execution Plan Enhancements

Published 2017-11-13 by Kevin Feasel

Pedro Lopes points out some additional data available in the properties section when you generate an actual execution plan:

Looking at the actual execution plan is one of the most used performance troubleshooting techniques. Having information on elapsed CPU time and overall execution time, together with session wait information, in an actual execution plan allows a DBA to use showplan to troubleshoot issues away from the server, and to correlate and compare different types of waits that result from query or schema changes.

A few months ago we exposed some of the per-operator statistics in SSMS, such as CPU and elapsed time per thread. More recently, we have introduced overall query CPU and elapsed time tracking for statistics showplan XML (both in ms). These can be found in the root node of an actual plan. They are available using the latest versions of SSMS v17 when used with SQL Server 2012 SP4, SQL Server 2016 SP1, and SQL Server 2017. For SQL Server 2014, they will become available in a future Service Pack.

Also be sure to check out Geoff Patterson’s Connect item asking that the execution plan results show the top ten waits in descending order rather than ascending order. That’s the appropriate ordering in my mind: show me the most important things first.
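
If you want to see the new attributes without clicking through SSMS, a small sketch like the following returns the actual plan as XML. The query against sys.objects is just a placeholder; on a build that supports the feature, I would expect the plan XML to carry a QueryTimeStats element (CpuTime and ElapsedTime attributes) and a WaitStats element.

```sql
-- Return the actual execution plan as XML alongside the query results.
SET STATISTICS XML ON;
GO

-- Placeholder query; substitute the statement you are troubleshooting.
SELECT COUNT(*) AS object_count
FROM sys.objects;
GO

SET STATISTICS XML OFF;
GO
```

GO is a batch separator understood by SSMS and sqlcmd, not a T-SQL statement, so drop it if you run this through another client.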

Soft-NUMA Doesn’t Limit MAXDOP

Published 2017-11-07 by Kevin Feasel

Lonny Niederstadt tests whether soft-NUMA forces MAXDOP = 1:

I mentioned that I was planning to set up a soft-NUMA node for each vcpu on a 16 vcpu VM, to evenly distribute incoming connections and thus DOP 1 queries over vcpus. Thomas Kejser et al used this strategy to good effect in “The Data Loading Performance Guide”, which used SQL Server 2008 as a base.
https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

My conversation partner cautioned me that leaving this soft-NUMA configuration in place after the specialized workload would result in DOP 1 queries whether I wanted them or not. The claim was, effectively, a parallel query plan generated by a connection within a soft-NUMA node would have its MAXDOP restricted by the scheduler count (if lower than other MAXDOP contributing factors). Though I wasn’t able to test at the time, I was skeptical: I’d always thought that soft-NUMA was consequential to connection placement, but not to MAXDOP nor to where parallel query workers would be assigned.

I’m back home now… time to test!!

Read on for the test.
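
For anyone who wants to poke at this on their own server, here is a sketch of two DMV queries that seem relevant: one shows how many schedulers each node has, and the other shows the degree of parallelism of running requests. This is not Lonny's test script, and the dop column assumes SQL Server 2016 or later.

```sql
-- Schedulers per NUMA node (hardware or soft-NUMA) as SQL Server sees them.
SELECT node_id, node_state_desc, memory_node_id, online_scheduler_count
FROM sys.dm_os_nodes
WHERE node_state_desc <> N'ONLINE DAC';

-- Degree of parallelism of requests that are executing right now.
SELECT session_id, scheduler_id, dop
FROM sys.dm_exec_requests
WHERE session_id <> @@SPID;
```

If a request shows a dop higher than the online_scheduler_count of a single soft-NUMA node, that would be consistent with the headline.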

CXCONSUMER Waits And More From PASS Summit

Published 2017-11-06 by Kevin Feasel

Brent Ozar relays a couple of exciting announcements from PASS Summit:

Microsoft’s Joe Sack & Pedro Lopes held a forward-looking session for performance tuners at the PASS Summit and dropped some awesome bombshells.

Pedro’s Big Deal: there’s a new CXPACKET wait in town: CXCONSUMER. In the past, when queries went parallel, we couldn’t differentiate harmless waits incurred by the consumer thread (coordinator, or teacher from my CXPACKET video) from painful waits incurred by the producers. Starting with SQL Server 2016 SP2 and 2017 CU3, we’ll have a new CXCONSUMER wait type to track the harmless ones. That means CXPACKET will really finally mean something.

Read on to see what Joe has for us.
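
Once you are on a build that has the new wait type, a simple sketch like this shows the two side by side. On older builds the CXCONSUMER row will simply not appear.

```sql
-- Cumulative parallelism waits since the last restart or wait stats clear.
SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'CXPACKET', N'CXCONSUMER')
ORDER BY wait_time_ms DESC;
```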

Collation Compatibility And Linked Servers

Published 2017-11-03 by Kevin Feasel

Greg Low points out an important property which can help linked server performance:

The on-premises versions of SQL Server have the ability to connect one server to another via a mechanism called Linked Servers.

Azure-based SQL Server databases can communicate with each other by a mechanism called External Tables. I’ll write more about External Tables soon.

With Linked Servers, though, I often hear people describing performance problems, and yet there’s a configuration setting that commonly causes them. In Object Explorer below, you can see I have a Linked Server called PARTNER.

Read on for more.
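
Going by the title, I am assuming the setting in question is the linked server's collation compatible option. Here is a sketch of checking and changing it with T-SQL rather than through Object Explorer, using the PARTNER linked server name from the quote.

```sql
-- See how the linked server is currently configured.
SELECT name, is_collation_compatible
FROM sys.servers
WHERE name = N'PARTNER';

-- Tell SQL Server that the remote server uses a compatible collation.
EXEC sp_serveroption
    @server   = N'PARTNER',
    @optname  = N'collation compatible',
    @optvalue = N'true';
```

Only set this to true when the collations really do match; otherwise string comparisons pushed to the remote server may return different results than they would locally.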

Azure SQL DB Automatic Tuning FAQ

Published 2017-11-03 by Kevin Feasel

Arun Sirpal has a self-Q&A session regarding Azure SQL Database’s automatic tuning options:

What are the options?

CREATE INDEX: identifies the indexes that may improve performance of your workload, creates the indexes, and verifies that they improve performance of the queries.
DROP INDEX: identifies redundant and duplicate indexes, and indexes that were not used for a long period of time.
PLAN REGRESSION CORRECTION: identifies SQL queries that are using execution plans that are slower than the previous good plan, and uses the last known good plan instead of the regressed plan.

Very useful information.
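
The three options above map onto T-SQL settings you can flip per database. This is a minimal sketch, assuming you are connected to the Azure SQL Database you want to configure; it is not from Arun's post.

```sql
-- Enable all three automatic tuning options for the current database.
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (
    FORCE_LAST_GOOD_PLAN = ON,  -- plan regression correction
    CREATE_INDEX         = ON,
    DROP_INDEX           = ON
);

-- Compare what was requested with what is actually in effect, and why.
SELECT name, desired_state_desc, actual_state_desc, reason_desc
FROM sys.database_automatic_tuning_options;
```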

Replacing DAX PathContains With OR

Published 2017-10-19 by Kevin Feasel

Chris Koester shows the performance benefits of replacing the PathContains function in DAX with a simple OR operator:

This post shows how you can generate optimized multi-value DAX parameters in SSRS and achieve greater performance compared to the DAX PathContains function. This will be a short post that provides the SSRS expression to convert multiple SSRS parameters into a double-pipe delimited string for use in a DAX query. In other words, the goal is to use the DAX OR operator (||) instead of the PathContains function. I’m assuming the reader has experience with SSRS, so not all steps will be shown.

Read on for the example, which ended up being a 16X performance improvement.

Performance Tuning TVFs With Optional Parameters

Published 2017-10-17 by Kevin Feasel

Arvind Shyamsundar walks us through a scenario with user-defined functions with optional parameters:

If you look carefully, the above query is an example of ‘optional parameters’ wherein the same query caters to situations where there are specific values for the parameters as well as other cases where there are none. Due to the implementation of the query (specifically the usage of ISNULL(@paramname, ColName)), the query plan thus generated will not leverage any indexes on the table. While this query can be refactored into separate versions for cases where the parameter values are supplied and where they are not, another viable option is to use OPTION (RECOMPILE) at the statement level. This is an acceptable solution in most cases because the cost of scanning the table is often far higher than the cost of recompiling this query. So here is how we used OPTION (RECOMPILE) in this case:

Arvind walks us through three separate solutions. My fourth solution: don’t use user-defined table-valued functions.
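
To make the optional-parameter pattern concrete, here is a hypothetical sketch of the general shape. It is not Arvind's query: the table, columns, and variables are made up for illustration, and a plain batch stands in for the function body.

```sql
-- Hypothetical optional parameters: NULL means "do not filter on this column".
DECLARE @CustomerId     int  = NULL;
DECLARE @OrderAddedDate date = '2016-05-01';

SELECT CustomerId, OrderAddedDate
FROM OrderTable
WHERE CustomerId     = ISNULL(@CustomerId, CustomerId)
  AND OrderAddedDate = ISNULL(@OrderAddedDate, OrderAddedDate)
OPTION (RECOMPILE);  -- compile a plan for the actual values on every execution
```

With the hint, the optimizer sees the variable values at compile time and can simplify the predicates, which gives it a chance to use an index on the column that is actually being filtered. The trade-off is a compile on each execution.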
