Category: Performance Tuning

JDBC Roundtrip Costs

Published 2017-12-18 by Kevin Feasel

Lukas Eder shows the performance cost of doing row-by-row data retrieval:

The obvious difference between the JDBC benchmark and the PL/SQL one is the fact that the JDBC call has to traverse a vast amount of logic, APIs, “barriers” between the JVM and the Oracle kernel before it can actually invoke the really interesting part. This includes:

JVM overhead

JDBC logic

Network overhead

Various “outer” layers inside the Oracle database

Oracle’s API layers to get into the SQL and PL/SQL execution engines

The actual code running in the PL/SQL engine

In Toon’s talk (which again, you should definitely watch), the examples are running SQL code, not PL/SQL code, but the results are the same. The actual logic is relatively cheap inside of the database (as we’ve seen in the PL/SQL only benchmark), but the overhead is significant when calling database logic from outside the database.

Thus: It is very important to minimise that overhead

This particular example focuses on Oracle and JDBC, but it certainly applies to other database platforms and distributed architectures.

Comments closed

Reducing Reads In Queries

Published 2017-12-15 by Kevin Feasel

Bert Wagner has a few tips for improving query performance by reducing the number of reads:

If SQL Server thinks it only is going to read 1 row of data, but instead needs to read way more rows of data, it might choose a poor execution plan which results in more reads.

You might get a suboptimal execution plan like above for a variety of reasons, but here are the most common ones I see:

Parameter sniffing

Use of table variables (via Brent Ozar)

Outdated statistics (via Kimberly L. Tripp)

If you had a query that previously ran fine but doesn’t anymore, you might be able to utilize Query Store to help identify why SQL Server started generating suboptimal plans.

Click through for a few more ideas as well.

Comments closed

Powershell Speed Testing

Published 2017-11-30 by Kevin Feasel

Shane O’Neill shows off a Powershell script which allows you to simplify performance testing:

Apart from catching up on news during my commute I only really use notifications for a certain number of hashtags i.e. #SqlServer, #tsql2sday, #sqlhelp, and #PowerShell.

So during work, every so often a little notification will pop up on the bottom right of my window and I can quickly glance down and decide whether to ignore it or check it out.

That’s what happened with the following tweet:

Click through for Shane’s demo.

Comments closed

Legacy Cardinality Estimation In SQL Server

Published 2017-11-29 by Kevin Feasel

Kellyn Pot’vin-Gorman explains what the Legacy Cardinality Estimation setting does in SQL Server:

Oracle DBAs have used the CARDINALITY hint for some time and it should be understood that this may appear to be similar, but is actually quite different. As hinting in TSQL is a bit different than PL/SQL, we can compare similar queries to assist:

TSQL
SELECT CustomerId, OrderAddedDate 
FROM OrderTable 
WHERE OrderAddedDate >= '2016-05-01';
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
go
PL/SQL

Where you might first mistake the CE hint for the following CARDINALITY hint in Oracle:
SELECT /*+ CARDINALITY(ORD,15000) */ ORD.CUSTOMER_ID, ORD.ORDER_DATE 
FROM ORDERS ORD WHERE ORD.ORDER_DATE >= '2016-05-01';
This would be incorrect and the closest hint in Oracle to SQL Server’s legacy CE hint would be the optimizer feature hint:
SELECT /*+ optimizer_features_enable('9.2.0.7') */ ORD.CUSTOMER_ID, ORD.ORDER_DATE FROM ORDERS ORD 
WHERE ORD.ORDER_DATE >= '2016-05-01';
If you’re wondering why I chose a 9i version to force the optimizer to, keep reading and you’ll come to understand.

Read on for the comparative explanation as well as more details on SQL Server’s legacy cardinality estimator hint and database-scoped configuration setting.

Comments closed

Finding The Right Batch Size For Bulk Loads

Published 2017-11-15 by Kevin Feasel

Dan Guzman has some bulk load batch size considerations:

Bulk load has long been the fastest way to mass insert rows into a SQL Server table, providing orders of magnitude better performance compared to traditional INSERTs. SQL Server database engine bulk load capabilities are leveraged by T-SQL BULK INSERT, INSERT…SELECT, and MERGE statements as well as by SQL Server client APIs like ODBC, OLE DB, ADO.NET, and JDBC. SQL Server tools like BCP and components like SSIS leverage these client APIs to optimize insert performance.

SQL Server 2016 and later improves performance further by turning on bulk load context and minimal logging by default when bulk loading into SIMPLE and BULK LOGGED recovery model databases, which previously required turning on trace flags as detailed in this blog post by Parikshit Savjani of the MSSQL Tiger team. That post also includes links to other great resources that thoroughly cover minimal logging and data loading performance, which I recommend you peruse if you use bulk load often. I won’t repeat all that information here but do want to call attention to the fact that these new bulk load optimizations can result in much more unused space when a small batch size is used compared to SQL Server 2014 and older versions.

Click through for some tips.

Comments closed

Actual Execution Plan Enhancements

Published 2017-11-13 by Kevin Feasel

Pedro Lopes points out some additional data available in the properties section when you generate an actual execution plan:

Looking at the actual execution plan is one of the most used performance troubleshooting techniques. Having information on elapsed CPU time and overall execution time, together with session wait information in an actual execution plan allows a DBA to use showplan to troubleshoot issues away from the server, and be able to correlate and compare different types of waits that result from query or schema changes.

A few months ago we had introduced exposed in SSMS some of the per-operator statistics, such as CPU and elapsed time per thread. More recently, we have introduced overall query CPU and elapsed time tracking for statistics showplan xml (both in ms). These can be found in the root node of an actual plan. Available using the latest versions of SSMS v17, when used with SQL Server 2012 SP4, SQL Server 2016 SP1 and SQL Server 2017. For SQL Server 2014 it will become available in a future Service Pack.

Also be sure to check out Geoff Patterson’s Connect item asking that the execution plan results show the top ten waits in descending order rather than ascending order. That’s the appropriate ordering in my mind: show me the most important things first.

Comments closed

Soft-NUMA Doesn’t Limit MAXDOP

Published 2017-11-07 by Kevin Feasel

Lonny Niederstadt tests whether soft-NUMA forces MAXDOP = 1:

I mentioned that I was planning to set up a soft-NUMA node for each vcpu on a 16 vcpu VM, to evenly distribute incoming connections and thus DOP 1 queries over vcpus. Thomas Kejser et al used this strategy to good effect in “The Data Loading Performance Guide”, which used SQL Server 2008 as a base.
https://technet.microsoft.com/en-us/library/dd425070(v=sql.100).aspx

My conversation partner cautioned me that leaving this soft-NUMA configuration in place after the specialized workload would result in DOP 1 queries whether I wanted them or not. The claim was, effectively, a parallel query plan generated by a connection within a soft-NUMA node would have its MAXDOP restricted by the scheduler count (if lower than other MAXDOP contributing factors). Though I wasn’t able to test at the time, I was skeptical: I’d always thought that soft-NUMA was consequential to connection placement, but not to MAXDOP nor to where parallel query workers would be assigned.

I’m back home now… time to test!!

Read on for the test.

Comments closed

CXCONSUMER Waits And More From PASS Summit

Published 2017-11-06 by Kevin Feasel

Brent Ozar relays a couple exciting announcements from PASS Summit:

Microsoft’s Joe Sack & Pedro Lopes held a forward-looking session for performance tuners at the PASS Summit and dropped some awesome bombshells.

Pedro’s Big Deal: there’s a new CXPACKET wait in town: CXCONSUMER. In the past, when queries went parallel, we couldn’t differentiate harmless waits incurred by the consumer thread (coordinator, or teacher from my CXPACKET video) from painful waits incurred by the producers. Starting with SQL Server 2016 SP2 and 2017 CU3, we’ll have a new CXCONSUMER wait type to track the harmless ones. That means CXPACKET will really finally mean something.

Read on to see what Joe has for us.

Comments closed

Collation Compatibility And Linked Servers

Published 2017-11-03 by Kevin Feasel

Greg Low points out an important property which can help linked server performance:

The on-premises versions of SQL Server have the ability to connect one server to another via a mechanism called Linked Servers.

Azure-based SQL Server databases can communicate with each other by a mechanism called External Tables. I’ll write more about External Tables soon.

With Linked Servers though, I often hear people describing performance problems and yet there’s a configuration setting that commonly causes this. In Object Explorer below, you can see I have a Linked Server called PARTNER.

Read on for more.

Comments closed

Azure SQL DB Automatic Tuning FAQ

Published 2017-11-03 by Kevin Feasel

Arun Sirpal has a self-Q&A session regarding Azure SQL Database’s automatic tuning options:

What are the options?

CREATE INDEX that identifies the indexes that may improve performance of your workload, creates the indexes, and verifies that they improve performance of the queries.
DROP INDEX that identifies redundant and duplicate indexes, and indexes that were not used in the long period of time.
PLAN REGRESSION CORRECTION that identifies SQL queries that are using execution plan that are slower than previous good plan, and uses the last known good plan instead of the regressed plan.

Very useful information.

Comments closed