Press "Enter" to skip to content

Category: Query Tuning

When Arrow Widths Mislead

Hugo Kornelis provides one example of misleading arrow widths in execution plans:

In my previous post, I talked about the Actual Number of Rows and Estimated Number of Rows properties, their visual representation in execution plans, and the most important ways in which this can be used.

But life is not always perfect, and Microsoft likes to remind us of that. Sometimes, these properties report values in weird and confusing ways. Sometimes the arrow width, as the visual representation in the execution plan, misleads us.

Let’s look at one such case.

Read the whole thing and vote for Hugo’s Feedback item to improve this experience.


Row Counts and Arrow Widths in SSMS

Hugo Kornelis takes us through one of the niceties SQL Server Management Studio (and related tools like SentryOne Plan Explorer and Azure Data Studio) has for reading execution plans:

Because both the Actual Number of Rows and the Estimated Number of Rows are very valuable for many use cases, Microsoft has built their primary tool for viewing execution plans (SQL Server Management Studio) to highlight the value of at least one of these two properties, in a visual way, when you look at the graphical representation of an execution plan.

Look at the arrows that show which operator (is called by and) returns rows to which operator, and that therefore can be thought of as representing the data stream. In most execution plans, they are not all the same width. Management Studio uses the width of the arrow as a quick, simple visual representation of the value of Actual Number of Rows (in an execution plan plus run-time statistics, aka “execution plan plus” or “actual execution plan”), or of the value of the Estimated Number of Rows in an execution plan only (aka “estimated execution plan”).
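
For reference, a minimal way to capture the same two properties outside the graphical view is to request the plan with run-time statistics; here is a sketch, with dbo.Orders standing in as an assumed table name. The plan XML that comes back carries EstimateRows and ActualRows for every operator, which is exactly the information the arrow widths summarize.

-- Sketch only: any query will do; dbo.Orders is an assumed example table.
SET STATISTICS XML ON;

SELECT o.CustomerID, COUNT(*) AS OrderCount
FROM dbo.Orders AS o
GROUP BY o.CustomerID;

SET STATISTICS XML OFF;
-- The returned showplan XML exposes EstimateRows on each RelOp element and
-- ActualRows in the RunTimeCountersPerThread elements; SSMS draws the arrow
-- width from the latter in an "actual" plan.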

This first post is the happy path post, where the information is correct and useful. Later posts promise to cover cases in which this does not hold.


Issues with Window Functions and Views / Derived Tables

Paul White takes us through some of the difficulties you are liable to see when using window functions in views and derived tables:

Our expectation is that the execution plan for this new query will be exactly the same as before we created the view. The query optimizer should be able to push the product filter specified in the WHERE clause down into the view, resulting in an index seek.

We need to stop and think a bit at this point, however. The query optimizer can only produce execution plans that are guaranteed to produce the same results as the logical query specification. Is it safe to push our WHERE clause into the view?
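
To make the question concrete, here is a minimal sketch (table, column, and view names are assumptions for illustration, not Paul's exact demo) of a view with a window function and two ways of filtering it:

-- View computing a window function partitioned by ProductID.
CREATE OR ALTER VIEW dbo.RankedSales
AS
SELECT
    s.ProductID,
    s.SaleDate,
    s.Quantity,
    ROW_NUMBER() OVER (PARTITION BY s.ProductID ORDER BY s.SaleDate) AS rn
FROM dbo.Sales AS s;
GO

-- A filter on the PARTITION BY column can safely be pushed below the window
-- function, so an index on ProductID can still be seeked.
SELECT ProductID, SaleDate, Quantity, rn
FROM dbo.RankedSales
WHERE ProductID = 42;

-- A filter on rn (or on a column the function does not partition by) cannot
-- be pushed down without changing the row numbers; that is the subtlety the
-- article works through.
SELECT ProductID, SaleDate, Quantity, rn
FROM dbo.RankedSales
WHERE rn = 1;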

Read the whole thing.


The Halloween Problem, Continued

Paul White continues a series on the Halloween Problem. Part 2 looks at INSERT and DELETE statements:

In the first part of this series, we saw how the Halloween Problem applies to UPDATE queries. To recap briefly, the problem was that an index used to locate records to update had its keys modified by the update operation itself (another good reason to use included columns in an index rather than extending the keys). The query optimizer introduced an Eager Table Spool operator to separate the reading and writing sides of the execution plan to avoid the problem. In this post, we will see how the same underlying issue can affect INSERT and DELETE statements.
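
The INSERT case is easy to reproduce with a self-referencing insert; a rough sketch (table and column names assumed) of the kind of statement that typically needs this protection:

-- Inserting rows selected from the very table being written to. Without
-- phase separation, the read side could see rows added by the write side,
-- so the plan places an Eager Table Spool between them.
INSERT INTO dbo.Staging (SomeKey, SomeValue)
SELECT s.SomeKey + 1000000, s.SomeValue
FROM dbo.Staging AS s
WHERE s.SomeKey <= 1000;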

Part 3 looks at the train wreck MERGE operator:

In the right circumstances, the SQL Server optimizer can recognize that the MERGE statement is hole-filling, which is just another way of saying that the statement only adds rows where there is an existing gap in the target table’s key.

For this optimization to be applied, the values used in the WHEN NOT MATCHED BY TARGET clause must exactly match the ON part of the USING clause. Also, the target table must have a unique key (a requirement satisfied by the PRIMARY KEY in the present case).

Where these requirements are met, the MERGE statement does not require protection from the Halloween Problem.
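
As a quick illustration, here is a sketch of a hole-filling MERGE (object names are assumed, not Paul's demo) that meets both requirements:

-- The ON clause joins on the target's unique key, and the WHEN NOT MATCHED
-- BY TARGET branch inserts exactly that same key value, so the optimizer can
-- recognize the statement as hole-filling and skip Halloween protection.
MERGE dbo.Target AS t
USING dbo.Source AS s
    ON t.SomeKey = s.SomeKey
WHEN NOT MATCHED BY TARGET THEN
    INSERT (SomeKey, SomeValue)
    VALUES (s.SomeKey, s.SomeValue);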

If only it weren’t busted in so many other ways!

Part 4 wraps up the series:

The SQL Server optimizer has specific features that allow it to reason about the level of Halloween Protection (HP) required at each point in the query plan, and the detailed effect each operator has. These extra features are incorporated into the same property framework the optimizer uses to keep track of hundreds of other important bits of information during its search activities.

Each operator has a required HP property and a delivered HP property. The required property indicates the level of HP needed at that point in the tree for correct results. The delivered property reflects the HP provided by the current operator and the cumulative HP effects provided by its subtree.

This last one goes into some nice detail.


Reading Query Plans in Spark

Daniel Ciocirlan has a primer on query plans in Apache Spark:

Let’s go over some examples of query plans and how to read them. Let’s go back to the one we’ve just shown:

== Physical Plan ==
*(1) Project [(id#0L * 5) AS id#2L]
+- *(1) Range (1, 1000000, step=1, splits=6)

We read this plan backwards, bottom to top: the Range operator generates the sequence of ids (from 1 up to 1,000,000, across 6 splits), and the Project above it computes id * 5 for every row.
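
If you want to generate a plan like this yourself, Spark's SQL interface will print it as well; a small sketch (the operator numbering and split count will vary with your session settings):

-- Spark SQL: EXPLAIN prints the physical plan; range() is the built-in
-- table-valued function, so the output resembles the Project-over-Range
-- plan quoted above.
EXPLAIN
SELECT id * 5 AS id
FROM range(1, 1000000);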

Spark does have some UI components which make this a bit easier, but you’ll probably end up in a situation where you need to read it in this format.


I Remember Halloween

Paul White talks about the Halloween Problem:

Much has been written over the years about understanding and optimizing SELECT queries, but rather less about data modification. This series looks at an issue that is specific to INSERT, UPDATE, DELETE and MERGE queries – the Halloween Problem.

The phrase “Halloween Problem” was originally coined with reference to a SQL UPDATE query that was supposed to give a 10% raise to every employee who earned less than $25,000. The problem was that the query kept giving 10% raises until everyone earned at least $25,000.

We will see later on in this series that the underlying issue also applies to INSERT, DELETE and MERGE queries, but for this first entry, it will be helpful to examine the UPDATE problem in a bit of detail.
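
In T-SQL, the problematic statement looks something like this sketch (schema assumed for illustration):

-- If this update drives off an index keyed on Salary, a row that has just
-- been raised can move ahead of the scan position and be raised again,
-- unless the plan separates the reading phase from the writing phase.
UPDATE dbo.Employees
SET Salary = Salary * 1.10
WHERE Salary < 25000;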

This is a classic problem in data management and has led to a good bit of confusion over time about why database updates can perform worse than you’d expect.


When a Non-Clustered Index on Clustered Columns Makes Sense

Allen White gives us a scenario where adding a non-clustered index on the same column as the clustered index can make sense:

Recently I was asked about adding a non-clustered index to a table (let’s call it Images) with just one column. It had been added in the development database and it improved performance dramatically. I looked at it and it had the same key as the clustered index on that table.

In reviewing the query I saw that Images was joined to the other tables in the query, but none of the columns were used, so Images was joined to ensure that values from the other tables had rows in Images. The query plan shows a significantly higher number of reads against Images without the new NCI (non-clustered index) than when it’s present.
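
A sketch of the shape of the fix (the table definition here is assumed, since the original schema isn't shown): the clustered index leaf level is the table itself, image data and all, while a one-column non-clustered index on the same key is far narrower, so a join that only checks for the existence of rows can read many fewer pages.

CREATE TABLE dbo.Images
(
    ImageID   int            NOT NULL,
    ImageData varbinary(max) NOT NULL,
    CONSTRAINT PK_Images PRIMARY KEY CLUSTERED (ImageID)
);

-- Same key as the clustered index, but without the wide image column,
-- giving the optimizer a much cheaper access path for existence checks.
CREATE NONCLUSTERED INDEX IX_Images_ImageID
    ON dbo.Images (ImageID);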

I do agree that this can help—as we obviously see. The backseat query tuner in me wonders if maybe there’s another way to write the query to prevent the scan by using CROSS APPLY, but that’d only help if they were getting a small percentage of rows from the parent table expression built from the combination of the clustered index scan and index seek in the second example.


The Peril of Local Variables

Erik Darling dives into the tradeoffs you make when using local variables in stored procedures to avoid parameter sniffing:

In a stored procedure (and even in ad hoc queries or within dynamic SQL, like in the examples linked above), if you declare a variable within that code block and use it as a predicate later, you will get either a fixed guess for cardinality, or a less-confidence-inspiring guess than when the histogram is used.

The local variable effect discussed in the rest of this post produces the same behavior as the OPTIMIZE FOR UNKNOWN hint, or executing queries with sp_prepare. I have that emphasized here because I don’t want to keep qualifying it throughout the post.
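
A minimal sketch of the difference (procedure and table names are assumed): the first predicate uses the sniffed parameter and the histogram, while the second uses a local variable and falls back to the generic estimate described above.

CREATE OR ALTER PROCEDURE dbo.GetOrders
    @CustomerID int
AS
BEGIN
    DECLARE @LocalCustomerID int = @CustomerID;

    -- Histogram-based estimate: the optimizer sniffs @CustomerID.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;

    -- "Unknown" estimate: same behavior as OPTIMIZE FOR UNKNOWN.
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @LocalCustomerID;
END;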

This deserves a careful read-through.


Figuring out How a Plan was Forced

Erin Stellato wants to know whether a DBA forced a plan or SQL Server did it automatically:

If you use Automatic Plan Correction, and thus also Query Store, you may wonder how a plan was forced: manually, or automatically with APC? The type of forced plan is tracked in sys.query_store_plan, and you can use this simple query to determine how a plan was forced:
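
Erin's query is in the post itself; as a rough sketch of the kind of check involved (not her exact query), sys.query_store_plan exposes both a flag and a description of how the plan was forced:

-- plan_forcing_type_desc is NONE, MANUAL (forced by a person via
-- sp_query_store_force_plan or the SSMS UI), or AUTO (forced by
-- Automatic Plan Correction).
SELECT qsq.query_id,
       qsp.plan_id,
       qsp.is_forced_plan,
       qsp.plan_forcing_type_desc
FROM sys.query_store_plan AS qsp
JOIN sys.query_store_query AS qsq
    ON qsq.query_id = qsp.query_id
WHERE qsp.is_forced_plan = 1;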

Click through for a simple query, as well as a more complex form which gives you a bit more info.


Improving Queries at the Margin

Jared Poche has a story about improving a query which is already pretty fast:

In my last post, I spoke about optimizing a procedure that was being executed hundreds of millions of times per day, and yes, that is expected behavior.

The difficult thing about trying to optimize this procedure is that it only takes 2.5ms on average to run. Tuning this isn’t a matter of changing a scan to a seek; we’ll have to look hard to find the opportunities here. A one millisecond improvement on a procedure running 100 million times a day would save 100,000 seconds every day.

Well, I’ve found a few more options since my last post, and wanted to share my findings.

Read on to see how Jared tries to tackle one specific case.
