Press "Enter" to skip to content

Category: Query Tuning

Explaining FIRST_VALUE() and LAST_VALUE() Execution Plans

Hugo Kornelis wraps up a mini-series on window functions:

In part twenty-eight of the plansplaining series, I’ll wrap up the six-part mini-series on window functions. After covering the basicsfast-track optimizationwindow frames ending at UNBOUNDED FOLLOWINGwindow frames specified with RANGE instead of ROWS, and LAG and LEAD, we will look at the LAST_VALUE and FIRST_VALUE analytical functions, and find that a function we would have expected to be available as an internal aggregate function does not exist at all! We’ll also find out how SQL Server works around that.

Click through to unravel that mystery.

Comments closed

Fixing Non-SARGable Predicates

Erik Darling makes the donuts:

Okay, deep breath. Deep, deep breath.

Would you believe I still have to fix queries with a bunch of functions in joins and where clauses? Like every day? All day?

Where things get tough is when there’s some third party vendor app where code can’t be changed, but customers have become so fed up with performance that they’re willing to make other changes to help things along.

Read on for an example of one scenario in which reversing an index can help improve performance without touching the code.

Comments closed

Hash Aggregates and Hash Joins in Postgres

Muhammad Ali plays matchmaker:

PostgreSQL employs various techniques for data joining and aggregation in its queries, among which the hash-based method stands out for its efficiency in particular situations and different data sizes. We will discuss hash joins and hash aggregates in PostgreSQL, providing insights on how they work and parameters which influence this algorithm.

Read on to learn more. This looks fundamentally similar to hash matches in SQL Server, so if you’re familiar with that, the concepts should be pretty clear.

Comments closed

The Problem with Automatic Plan Correction

Kendra Little lays it out:

I’ve written a bit about SQL Server’s Automatic Plan Correction feature before– I have an hour long free course with demos on Automatic Plan Correction here on the site.

Today I’m updating that course with a note: after using Automatic Plan Correction in anger for a good amount of time, I do not recommend enabling the feature. I’ve had it cause too many performance problems, and there are not a ton of options for an administrator when it’s causing those problems.

Kendra is still bullish about the potential of this but has some major issues with the current implementation. Read on to learn more about it.

Comments closed

Fixing Key Lookups

Erik Darling has some advice:

Key lookups represent a choice made by the optimizer between:

  • Clustered index scan
  • Nonclustered index seek/scan + Clustered index lookup

Lookups can be quite useful to avoid creating quite wide nonclustered index, but the optimizer is heavily biased against them. It does not like random I/O.

Read on for Erik’s thoughts on the matter and how you may be able to improve the performance of a query by removing a key lookup—or sometimes, how you may safely ignore a key lookup because hey, it’s a 1 millisecond difference on a query that runs once a day.

Comments closed

Implicit Join Elimination in JooQ

Lukas Eder talks about implicit join elimination:

One of jOOQ’s key features so far has always been to render pretty much exactly the SQL that users expect, without any surprises – unless some emulation is required to make a query work, of course. This means that while join elimination is a powerful feature of many RDBMS, it isn’t part of jOOQ’s feature set, so far.

As Lukas mentions, many relational database products already do this–SQL Server is an example of one product that does. But not all of them do, so it’s nice to have that option available in the data access layer.

Comments closed

Optimized Locking in Azure SQL DB

Aaron Bertrand tries out a new feature:

In a sentence: Instead of locking individual rows and pages for the life of the transaction, a single lock is held at the transaction level, and row and lock pages are taken and released as needed.

This is made possible by previous investments in Accelerated Database Recovery and its persistent version store. A modification can evaluate the predicate against the latest committed version, bypassing the need for a lock until it is ready to update (this is called lock after qualification, or LAQ). There’s a lot more to it than that, and I’m not going to dive deep today, but the result is simple: long-running transactions will lead to fewer lock escalations and will do a lot less standing in the way of the rest of your workload. Locks held for shorter periods of time will naturally help reduce blocking, update conflicts, and deadlocks. And with fewer locks being held at any given time, this will help improve concurrency and reduce overall lock memory.

Read on to learn more about how it works and Aaron’s initial thoughts on the feature.

Comments closed

Digging into Execution Plans for LAG() and LEAD()

Hugo Kornelis looks at a pair of useful window functions:

LAG and LEAD were introduced in SQL Server 2012. They require an OVER clause, but it can only specify PARTITION BY and ORDER BY. No ROWS / RANGE specification for a window frame. Which makes them the stand out as unusual in this series.

By default, they return a value from the last row before the current row, or from the first row after the current row, based on the specified sort order and while observing the specified partition boundaries. But there are two optional parameters, an offset to specify that you want, for instance, the third-last row or the second-next row. And the default parameter specifies a value to be used instead of NULL when the indicated row falls outside of the partition.

Click through to see what the plans look like, as well as how very welcome though potentially performance-impacting changes in SQL Server 2022 have affected this.

Comments closed

Pagination in Stored Procedures

Erik Darling hits on a bugbear of mine:

A common-enough practice to limit search results is to write paginated queries. You may give users the ability to specify how many rows per page they want to see, or you may have a set number of rows per page.

But the end goal is to allow users to very quickly get a smaller number of rows returned to them. Almost no one needs to get many thousands of rows back, unless they’re planning on exporting the data.

Pagination is such a common activity that I wish there were a way to say, “Here is my data. Hang onto it in memory and quickly retrieve a subset of rows upon request” without doing all kinds of shenanigans on my end. Something like a data snapshot that remains in memory as long as the session is active, until the service restarts, until there is memory pressure, or until the caller manually evicts the data. That would make OFFSET and FETCH really useful instead of barely usable for most pagination scenarios because you wouldn’t need to re-run the entire query for every offset/fetch page.

There are ways to make pagination efficient, but the efficient ones aren’t easy or intuitive.

Comments closed