Erik Darling has a new slide deck. This video focuses on how skewed data can potentially lead to parallelism not distributing the data on different CPU threads effectively. Erik demonstrates how you can identify this issue in practice, though solving the problem will come in a later video. It’s an interesting scenario and stick around for a poke at an argument Pedro Lopes has made.
Comments closedCategory: Query Tuning
Reitse Eskens tries out regular expression support in SQL Server 2025:
One of the new features in SQL Server 2025 is that you can now use regular expressions directly in your T-SQL queries. Now, regular expressions (or RegEx) have never been a syntax that’s easy to read. There are a lot of brackets, dashes and other symbols that make no sense when you first see them. Before delving into how these can be used in SQL Server, a few basics are provided to get you started, along with a link to a website for further learning.
Read on for a quick primer and a bit of pain when it comes to performance.
Comments closedErik Darling continues a series on transaction isolation levels. If you haven’t been watching this series, I’d recommend checking out his channel and catching up, as he’s spent a good amount of time on transaction isolation levels and what, exactly, that means in a relational database management system like SQL Server.
Comments closedForrest McDaniel joins tables together:
Even at a full cartesian join, 49 by 49 rows does not produce 27670.
Usually, when the optimizer applies transformations it updates cardinality estimates, but I ran into a particular construction and transformation where it doesn’t (and I don’t know why).
Read on for the example and a workaround that Forrest found.
Comments closedAndy Brownsword takes a closer look at the big three join operators in SQL Server:
In the previous post I explained Join Operators in SQL Server. Whilst compiling that I dug a little deeper and came across a few interesting points I thought were worth sharing.
Let’s look at behaviour of the operators which may occur under specific conditions. Hopefully you find them as interesting as I did:
Click through for an interesting tidbit about nested loops joins, merge joins, and hash joins.
Comments closedErik looks at a classic performance difference between ROWS and RANGE, as well as what batch mode does to even the score. This is particularly nice because ROWS and RANGE both have their utility and focusing on one versus the other for performance differences can lead to awkward development practices to get around a window spool.
Erik also focuses primarily on batch mode on rowstore, so keep in mind the minimum requirements for it: 131,072 (or 2^17) rows in at least one table in the query, at least one operator that benefits from batch mode (which we’d cover in the window function), at least one input of the batch with 2^17 rows, and where the batch mode cost is lower than the row mode cost.
Comments closedErik Darling walks through something that can slow down your temp table loads. The video covers a scenario in which a user creates a temp table, loads some data from it, and also wants to include an identity integer. Erik shows how this can lead to sub-optimal performance, as well as one alternative if you really do need a monotonically increasing value on your temp table.
Comments closedAndrei Lepikhov responds to feedback:
In the previous post, I explored some nuances of Postgres related to indexes and parallel workers. This text sparked a lively discussion on LinkedIn, during which one commentator (thanks to Ants Aasma) proposed an index that was significantly more efficient than those discussed in the article. However, an automated comparison of EXPLAINs did not clarify the reasons for its superiority, necessitating further investigation.
Click through for the index and what Andrei learned along the way.
Comments closedBrent Ozar flips everything around:
99.9% of you are never gonna need this.
But let’s say you need to run queries with leading (not trailing) wildcards, like this search for all the different national versions of the Encabulator, each of which has different prefixes depending on which government it’s being sold to:
This is indeed a pretty uncommon scenario. I’m pretty sure I’ve only ever needed to do this once. Well, twice, but in one case I couldn’t actually use the REVERSE() function because the column was itself an awful non-deterministic function and this solution wouldn’t work.
Andrei Lepikhov gets under the hood:
I wanted to explore whether Postgres could be improved by thoroughly utilising all available tools, and for this, I chose the RTABench benchmark. RTABench is a relatively recent benchmark that is described as being close to real-world scenarios and highly selective. One of its advantages is that the queries include expressions involving the JSONB type, which can be challenging to process. Additionally, the Postgres results on RTABench have not been awe-inspiring.
Ultimately, I decided to review all of the benchmark queries, and fortunately, there aren’t many, to identify possible optimisations. However, already on the zero query, there were enough nuances that it was worth taking it out into a separate discussion.
Click through for a dive into this particular query and what Andrei did and some of the lessons you can draw from it.
Comments closed