Press "Enter" to skip to content

Category: Query Tuning

An Overview of Parameter-Sensitive Plan Optimization

Erik Darling is diving into what we currently know about Parameter-Sensitive Plan Optimization, starting with an overview:

The way this feature works is, rather than caching a single query plan for every other execution to use, it creates what’s called a Dispatcher plan (if your query qualifies).

You’ll see something like this in the properties of the root node of your query plan, and your query will have some additional funny business at the end of it.

Read on to see what information is available to us and current feature limitations.

Comments closed

Resolving Implicit Conversion on a Join

Andrea Allred has a process:

As I was troubleshooting a performance issue, I noticed that there was an implicit conversion (SQL Server automatically converts the data from one data type to another) happening in my join. The join was on a column that was named the same in both tables, but one was datatype INT (integer) and the other was a datatype of VARCHAR(50) (variable character up to 50 places).

Read on for one way to resolve this issue…so long as no other calling code expects a string on a call.

Comments closed

Improving Join Performance with Skewed Datasets in Spark

Ajay Gupta gets into the topic of join performance:

Performing Joins on Skewed DatasetsA Dataset is considered to be skewed for a Join operation when the distribution of join keys across the records in the dataset is skewed towards a small subset of keys. For example when 80% of records in the datasets contribute to only 20% of Join keys.

Implications of Skewed Datasets for Join: Skewed Datasets, if not handled appropriately, can lead to stragglers in the Join stage (Read this linked story to know more about Stragglers). This brings down the overall execution efficiency of the Spark job. Also, skewed datasets can cause memory overruns on certain executors leading to the failure of the Spark job. Therefore, it is important to identify and address Join-based stages where large skewed datasets are involved.

Read on for five techniques which may help you out.

Comments closed

Lack of Fun with Scalar Functions

Tom Zika takes away the scalars:

I’m still surprised many people don’t realise how lousy Scalar functions are. So because it’s my current focus in work and this Stack Overflow question, I’ll be revisiting this topic.

The focus of part one is parallelism. Unfortunately, parallelism often gets a bad rep because of the prominent wait stats. Also, if there is a skew, it can run slow. But for the most part, it’s advantageous.

Whether or not you want parallelism should be an informed choice. But Scalar functions will enforce the query to run serially, even if you are unaware. That’s why I want to shine a light on this.

Read on for a demo of how even a no-op scalar function can affect query performance. Given the mess we normally see in scalar functions, it’s all downhill from there.

Comments closed

Why that Plan Didn’t Go Parallel

Erik Darling looks at another update in SQL Server 2022:

The thing is, the reason always seemed to be “Could Not Generate Valid Parallel Plan” for most of them, even though more explicit reasons were available.

They started cropping up, as things do, in Azure SQL DB, and have finally made it to the box product that we all know and mostly love.

Let’s explore some of them! Because that’s what we do.

Also check out Rob Volk’s comment, as he lists out all the ones he could find, noting that most of these do exist in SQL Server 2019.

Comments closed

Query Store Hints in SQL Server 2022

Erik Darling has thoughts:

When you’re dealing with untouchable vendor code full of mistakes, ORM queries that God has turned away from, and other queries that for some reason can’t be tinkered with, we used to not have a lot of options.

In SQL Server 2022, Query Store gains a new super power: you can add hints to queries without intercepting the code in some other manner.

There are a couple of useful hints which won’t be available but Erik seems mostly upbeat about what is there.

Comments closed

Continuing a Dive into Simple Parameterization

Paul White shows how not-simple simple parameterization really is:

The output of parsing is a logical representation of the statement called a parse tree. This tree does not contain SQL language elements. It’s an abstraction of the logical elements of the original query specification.

It’s frequently possible to write the same logical requirement in different ways. Using a SQL analogy, x IN (4, 8) is logically the same as writing x = 4 OR x = 8. This flexibility can be useful when writing queries, but it makes implementing an optimizer more difficult.

In general terms, normalization is an attempt to standardize. It recognises common variations expressing the same logic, and rewrites them in a standard way. For example, normalization is responsible for turning x BETWEEN y AND z into x >= y AND x <= z. The normalized, or standardized, form chosen is one the query processor finds convenient to work with internally.

This has been a very interesting series and Paul does promise one more article.

Comments closed

Understanding Missing Index Impact

Erik Darling delves into the depths of missing indexes:

Breaking each of those down, the only one that has a concrete meaning is Uses, but that of course doesn’t mean that a query took a long time or is even terribly inefficient.

That leaves us with Average Query Cost, which is the sum of each operator’s estimated cost in the query plan, and Impact.

But where does Impact come from?

Read on to learn where, as well as why you shouldn’t blindly trust that number.

Comments closed