To sum up, the workflow for tuning your query is:
Make some changes to the LongQuery query that hopefully make it faster
Update the Trace Message parameter with some notes about which version of the LongQuery query it is that you’ll be testing
Click the Refresh Preview button for the Diagnostics query to test how long LongQuery now runs for
Refresh, or load, the query that reads the data from the trace logs so you can see how all of your changes have affected query execution times
I give it two months before the Power BI team releases a change to make this easier…
While presenting at SQLDay in Wroclaw, Poland, on the Query Store, I was asked a pretty simple question, which takes precedence, the Query Store or a Plan Guide?
One of my favorite answers to questions is “I don’t know” because it gives me the opportunity to learn. Let’s figure this one out together.
I’ll post the code to recreate this experiment within AdventureWorks at the end of the article. I’m doing this because the code for forcing execution plans using Plan Guides can be pretty doggone long (you may need to generate your own XML from a plan on your own system, fair warning).
The answer is not quite as clear-cut as I would have expected, and I’ll be interested to see what others find.
In order for SQL Server to read and update data, the data must be in the buffer. SQL Server does not work directly with the data in the files on disk. Once the pages of data are in the buffer, they can be used for multiple queries. This means that the data doesn’t have to be retrieved from disk every time it’s needed, thereby decreasing the amount of I/O work required.
You may have seen this yourself when selecting all the rows of a large table twice. The second time, the query can run much faster because the data does not have to be copied from the disk to the buffer. If you run another query from a different large table, it may cause the pages from the first table to be removed to make room. If there is not enough memory, pages will have to be read from disk more frequently causing your queries to be slow.
Even with extremely fast SSDs and flash storage arrays, RAM is still typically an order of magnitude faster, so having enough RAM and using it wisely is critical to a well-functioning SQL Server instance.
Letting SQL Server change data types automatically can seriously impact performance in a negative way. Because a calculation has to be run on each column, you can’t get an index seek. Instead, you’re forced to use a scan. I can demonstrate this pretty simply. Here’s a script that sets up a test table with three columns and three indexes and tosses a couple of rows in:
You might get lucky and have the database engine realize that it doesn’t need to give you a horrible execution plan, but it’s sound advice to ensure that data types match on joins and filters.
When defining table variables, avoid primary key or unique key constraints. Opt instead for named indexes if you’re using SQL Server 2014 or later. Otherwise, be aware that plan forcing is limited to queries that don’t use these table variables.
Helpful advice when dealing with user-defiened table types. Read the whole thing.
For this specific test, with a specific data size, distribution, and number of parameters, and on my particular hardware, JSON was a consistent winner (though marginally so). For some of the other tests in previous posts, though, other approaches fared better. Just an example of how what you’re doing and where you’re doing it can have a dramatic impact on the relative efficiency of various techniques, here are the things I’ve tested in this brief series, with my summary of which technique to use in that case, and which to use as a 2nd or 3rd choice (for example, if you can’t implement CLR due to corporate policy or because you’re using Azure SQL Database, or you can’t use JSON or
STRING_SPLIT()because you aren’t on SQL Server 2016 yet). Note that I didn’t go back and re-test the variable assignment and
SELECT INTOscripts using TVPs – these tests were set up assuming you already had existing data in CSV format that would have to be broken up first anyway. Generally, if you can avoid it, don’t smoosh your sets into comma-separated strings in the first place, IMHO.
That’s a rather interesting result, given how poorly JSON fared in some of the previous tests.
So I set about looking for a workaround. This week I think I’ve finally managed to get something working that approximates the number I need from that view, ms_ticks.
Attached is sp_whoisactive v11.112 — Azure Special Edition v2. Please give it a shot, and I am especially interested in feedback if you use the @get_task_info = 2 option when running sp_whoisactive. That is the main use case that’s impacted by the lack of ms_ticks information and my attempt at a workaround.
If you’re using on-prem SQL Server, this doesn’t add anything new, but if you’re on Azure SQL Database, give it a try.
WRITELOG waits are a scalability challenge for OLTP workloads under load. Chris Adkin has a lot of experience tuning SQL Server for high-volume OLTP workloads. So I’m going to follow his advice when he writes we should minimize the amount logging generated. And because I can’t improve something if I can’t measure it, I wonder what’s generating the most logging? OLTP workloads are characterized by frequent tiny transactions so I want to measure that activity without filters, but I want to have as little impact to the system as I can. That’s my challenge.
Check out the entire post, as this is a good exercise in investigating busy transactional systems.
Really. Every single time. It started off kind of funny. Scalar functions in queries: no parallelism. Scalar functions in computed columns: no parallelism, even if you’re not selecting the computed column. Every time I think of a place where someone could stick a scalar function into some SQL, it ends up killing parallelism. Now it’s just sad.
This is (hopefully. HOPEFULLY.) a less common scenario, since uh… I know most of you aren’t actually using any constraints. So there’s that! Developer laziness might be a saving grace here. But if you read the title, you know what’s coming. Here’s a quick example.
Yeah, UDFs in check constraints is a pretty bad idea most of the time.
To understand why we get this performance degradation with SQL Server 2016 RC1 three key parts of a transactions life cycle need to be understood along with the serialisation mechanisms that protect them
Chris digs into call stacks as part of his post. We’ll see if there are some performance improvements between now and RTM on this front.