Erik has had many great videos, though not enough text for me to copy and paste. Nonetheless, I consider it a civic duty to remind people once in a while that Erik consistently puts out great content and you should watch his videos.
Category: Query Tuning
Radim Marek takes us through the process of deleting rows:
Your database is ticking along nicely – until a simple DELETE brings it to its knees. What went wrong? While we tend to focus on optimizing SELECT and INSERT operations, we often overlook the hidden complexities of DELETE. Yet, removing unnecessary data is just as critical. Outdated or irrelevant data can bloat your database, degrade performance, and make maintenance a nightmare. Worse, retaining some types of data without valid justification might even lead to compliance issues.
Read on to learn about the process, some of the challenges, and a common pattern for resolving them. The solution is pretty similar in SQL Server as well: batch the delete operations, ideally with a supporting index.
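For SQL Server, a minimal sketch of that batching pattern might look like the following. The table and column names are made up for illustration, and the assumption is that an index leading on the date column supports the WHERE clause:

```sql
-- Hypothetical example: purge rows older than 90 days in small batches,
-- assuming dbo.EventLog has an index leading on CreatedDate.
DECLARE @BatchSize int = 5000,
        @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    DELETE TOP (@BatchSize)
    FROM dbo.EventLog
    WHERE CreatedDate < DATEADD(DAY, -90, GETDATE());

    SET @RowsDeleted = @@ROWCOUNT;
END;
```

Keeping each batch small limits transaction log growth and keeps lock durations short, so concurrent queries aren't starved while the purge runs.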
Hubert Lubaczewski digs into a performance issue:
So, we have some databases on PostgreSQL 14 (yes, I know, we should upgrade, but it is a LONG project to upgrade, so it's not really on the table now).
Last week someone noticed problems with the website. These were tracked to queries getting killed because of a statement_timeout of 3 minutes.
The query was relatively simple (most of the identifiers were obfuscated, but I tried to keep them somewhat memorable):
Click through for the story, analysis of the problem, and how creating a filtered index worked in this case. Filtered indexes are a beautiful thing when the optimizer knows how to make use of them.
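As a rough illustration of the idea (PostgreSQL calls these partial indexes, but the concept carries over), here is a hypothetical filtered index in T-SQL, with made-up table and column names:

```sql
-- Hypothetical example: most queries only care about unprocessed rows,
-- so the filtered index covers just that small slice of the table.
CREATE NONCLUSTERED INDEX IX_Orders_Unprocessed
    ON dbo.Orders (CustomerID, CreatedDate)
    WHERE IsProcessed = 0;
```

The index stays small and cheap to maintain, and queries whose predicates match the filter can seek it directly.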
The ability to write an efficient query starts with a well-designed database. If the database you are working with is poorly designed, your choices are limited. Even in those scenarios, you can still enhance your query design and follow best practices.
The goal of this post is to provide simple guidelines for writing efficient queries. These guidelines aren't advanced SQL techniques. These are just the basics that anyone can use to write fast, efficient queries. There are many functions and keywords available in T-SQL, so many scenarios aren't covered by these guidelines. But they are a good place to start for any query.
Click through for the process.
Brent Ozar thinks about consolidation:
I’ve got a BEGIN TRAN in there before the updates just so I can test the same queries repeatedly, and roll them back each time. The execution plan for the updates is quite nice: SQL Server divebombs into the supporting indexes:
Relatively few rows match, so our query does less than 1,000 logical reads – way less than there are pages in the table. In this case, separate UPDATE statements make sense.
Brent then continues with the harder question: what do we do when we need to perform multiple independent scans of the same table? Read on for Brent's answer, but definitely check out Thomas Franz's comments, which indicate a potential complicating factor.
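For reference, the "separate statements" side of the comparison might look something like this hypothetical sketch. The table, columns, and predicates here are invented for illustration rather than taken from Brent's demo:

```sql
-- Each UPDATE can seek its own supporting index, touching only matching rows.
BEGIN TRAN;

UPDATE dbo.Users
SET    DisplayName = N'(redacted)'
WHERE  Location = N'Unknown';      -- assumes an index leading on Location

UPDATE dbo.Users
SET    DisplayName = N'(redacted)'
WHERE  Reputation = 0;             -- assumes an index leading on Reputation

ROLLBACK;   -- rolled back only so the test can be re-run
```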
Steve Stedman lays out a warning:
When writing SQL queries, it's easy to focus on getting the right results without thinking too much about performance. One common mistake that can lead to significant slowdowns is using functions in the JOIN ON clause of a SELECT statement.
While SQL Server supports a wide range of built-in functions, using them incorrectly – especially in the join conditions – can severely impact performance. Let's break down why this happens, how it can affect your query speed, and what you can do to avoid it.
Read on for the answer.
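As a hypothetical illustration of the problem (the table and column names here are invented, not from Steve's post), compare a join that wraps the key columns in a function with one that compares them directly:

```sql
-- Non-sargable: the function on each join column blocks index seeks
-- and forces the engine to compute the expression for every row.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON UPPER(o.CustomerCode) = UPPER(c.CustomerCode);

-- Better: compare the columns directly (normalize casing at write time
-- if needed) so indexes on CustomerCode remain usable.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o
JOIN dbo.Customers AS c
    ON o.CustomerCode = c.CustomerCode;
```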
Andy Brownsword speeds up a query:
Bad code smells can run through a system, and one instance can quickly spread when code is recycled. Simon recently looked at a non-sargable example and was able to fix it by using an index and refactoring the query.
I wanted to consider an alternative approach if we saw the same issue repeated consistently. We don't want to fix each instance; we want a single fix. We'll solve this with an indexed computed column.
We can index computed columns to help resolve deterministic (i.e. won't change, no GETDATE() allowed) clauses. Let's get started.
Read on to learn more. This is a powerful approach to the problem of needing to perform some sort of data transformation before filtering your data.
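Here's a hypothetical sketch of the pattern, assuming made-up table and column names and a recurring non-sargable filter on a date conversion:

```sql
-- The computed column materializes the deterministic expression once...
ALTER TABLE dbo.Orders
    ADD CreatedDate AS CAST(CreatedAt AS date);

-- ...and an index on it makes that expression cheap to filter on.
CREATE NONCLUSTERED INDEX IX_Orders_CreatedDate
    ON dbo.Orders (CreatedDate);

-- Existing queries that repeat the same expression can often pick up the
-- index via expression matching, without being rewritten:
SELECT OrderID
FROM dbo.Orders
WHERE CAST(CreatedAt AS date) = '2025-01-01';
```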
Andrei Lepikhov and Alena Rybakina ask a question:
As usual, this project was prompted by multiple user reports with typical complaints, like 'SQL server executes the query times faster' or 'Postgres doesn't pick up my index'. The underlying issue that united these reports was frequently used VALUES sequences, typically transformed in the query tree into a SEMI JOIN.
I also want to argue one general question: should an open-source DBMS correct user errors? I mean optimising a query even before the search for an optimal plan begins, eliminating self-joins and subqueries, and simplifying expressions – everything that can be achieved by proper query tuning. The question is not that simple, since DBAs point out that the cost of query planning in Oracle grows excessively with the complexity of the query text, which is most likely caused, among other things, by the extensive range of optimisation rules.
My short answer is, yes. SQL is a 4th generation language, meaning that end users describe the results they need but leave it to the engine to determine how to get there. As performance tuners, we may understand some of the foibles of the database engine and how it does (or does not) perform these translations, but in an ideal world, every unique representation of an end state for a given query should have the same, maximally optimized internal way of getting there. This is impossible in practice, but it should be a guiding principle for engine behavior.
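For context, the pattern the authors describe looks something like this hypothetical, PostgreSQL-flavored query, where a VALUES list acts as a small constant relation and ends up planned as a semi-join:

```sql
-- Hypothetical example: the VALUES list is a row source, so the planner
-- treats the IN as a semi-join against a tiny constant relation.
SELECT o.order_id, o.total
FROM   orders AS o
WHERE  o.status IN (VALUES ('pending'), ('failed'), ('on_hold'));

-- Logically the same filter written as a plain IN list:
SELECT o.order_id, o.total
FROM   orders AS o
WHERE  o.status IN ('pending', 'failed', 'on_hold');
```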
Jared Westover performs a technique showdown:
A few weeks ago, I helped someone combine multiple integer columns into a single column. The common term for this process is unpivoting or transposing. The table they worked with had millions of rows, so they needed the most efficient method to accomplish this task. If you search online, you’ll find several suggestions for doing this. But which performs best? Is one easier to maintain than the others?
Click through for a review of three separate techniques: using the UNPIVOT operator, using UNION ALL, and using CROSS APPLY. The dataset was relatively small, but even at that size, CROSS APPLY did a good job. But I won't spoil too much here.
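For a sense of what the CROSS APPLY version looks like, here's a hypothetical sketch; the table and measurement columns are invented for illustration:

```sql
-- Unpivot three integer columns into (name, value) pairs with CROSS APPLY.
SELECT r.ReadingID,
       v.MetricName,
       v.MetricValue
FROM dbo.Readings AS r
CROSS APPLY (VALUES
    ('Temperature', r.Temperature),
    ('Humidity',    r.Humidity),
    ('Pressure',    r.Pressure)
) AS v (MetricName, MetricValue);
```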
Steve Jones makes a comparison:
Often I see running totals that are written in SQL using a variety of techniques. Many pieces of code were written in pre-2012 techniques, prior to window functions being introduced.
After SQL Server 2012, we had better ways to write a running total. In this case, let's see how much better. This is based on an article showing how you might convert code from the first query to the second. This is a performance analysis of the two techniques at different scales.
Steve shows a very fancy version of the self-join technique, which is actually even slower than using a cursor for this work. The fastest of the pre-2012 techniques was something nicknamed the 'quirky update' technique, but it relied on an accidental property of how clustered indexes worked on temp tables in SQL Server and was not something Microsoft ever officially supported, meaning that any service pack, cumulative update, or hotfix might have broken your code and the best you'd get is an indifferent shrug.
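For comparison, the post-2012 window function version of a running total is short and runs in a single pass. This is a generic sketch with made-up table and column names rather than Steve's exact demo:

```sql
-- Running total via a windowed SUM with explicit ROWS framing.
SELECT SaleDate,
       Amount,
       SUM(Amount) OVER (ORDER BY SaleDate
                         ROWS BETWEEN UNBOUNDED PRECEDING
                                  AND CURRENT ROW) AS RunningTotal
FROM dbo.Sales
ORDER BY SaleDate;
```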