Press "Enter" to skip to content

Category: Query Tuning

EF Core Merge Statements

Richie Rump looks at SQL that Entity Framework Core generates when inserting a batch of records:

If you’re an experienced SQL tuner, you’ll notice some issues with this statement. First off, the query has not one but two table variables. It’s generally better to use temp tables because table variables don’t have good statistics by default. Secondly, the statement uses a MERGE statement. The MERGE statement has had more than its fair share of issues. See Aaron Bertrand’s post “Use Caution with SQL Server’s MERGE Statement” for more details on those issues.

But that got me wondering: why would the EF team use SQL features that perform so poorly? So I decided to take a closer look at the SQL statement. Just so you know, the code that was used to generate the SQL saves three entities (Katana, Kama, and Tessen) to the database in a batch. (Julie used a Samurai theme so I just continued with it.)

Yeah…I’m not liking the MERGE statement very much here.
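For a rough sense of the shape involved, here is a hand-written sketch of the table-variable-plus-MERGE batching pattern. The dbo.Samurais table is hypothetical and this is not the exact SQL EF Core emits; see Richie’s post for the real statement.

    -- Hypothetical sketch of batching inserts through a table variable and MERGE.
    DECLARE @toInsert TABLE (Name nvarchar(100) NOT NULL, _Position int NOT NULL);

    INSERT @toInsert (Name, _Position)
    VALUES (N'Katana', 0), (N'Kama', 1), (N'Tessen', 2);

    MERGE dbo.Samurais AS target
    USING @toInsert AS source
    ON 1 = 0                              -- never matches, so every source row is inserted
    WHEN NOT MATCHED THEN
        INSERT (Name) VALUES (source.Name)
    OUTPUT inserted.Id, source._Position; -- hands the generated keys back to the caller

The MERGE-with-OUTPUT construction is what lets a single statement insert a batch and return the generated keys in a known order, which is presumably why EF Core reaches for it despite MERGE’s reputation.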


More With Adaptive Joins

Erik Darling continues his adaptive join exploration with two more posts. First, how local variables can affect the query plan:

The easiest way to look at this is to compare Adaptive Joins with literal values to the same ones using local variables. The results are a little… complicated.

Here are three queries with three literal values. In my copy of the Super User database (the largest Stack Overflow sub-site), I’ve made copies of all the tables and added Clustered ColumnStore indexes to them. That’s the only way to get Adaptive Joins at this point — Column Store has to be involved somewhere along the line.

The last day of data in this dump is from December 11. When I query the data, I’m looking at the last 11 days of data, the last day of data, and then a day where there isn’t any data.
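The literal-versus-local-variable comparison looks roughly like this; the table names follow the Stack Overflow schema Erik uses, but the date and the queries themselves are stand-ins rather than his exact tests.

    -- Literal value: the optimizer can use the column histogram for the estimate.
    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    JOIN dbo.Users AS u ON u.Id = p.OwnerUserId
    WHERE p.CreationDate >= '2016-12-11';

    -- Local variable: the value is unknown at optimization time, so the estimate
    -- falls back to a density-based guess, which can move the plan (and the
    -- adaptive join threshold) around.
    DECLARE @d datetime = '2016-12-11';

    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    JOIN dbo.Users AS u ON u.Id = p.OwnerUserId
    WHERE p.CreationDate >= @d;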

Then Erik takes on non-SARGable queries:

The queries with non-SARGable predicates on the Users table used Adaptive Joins.

The queries with non-SARGable predicates on the Posts table did not.

Now, there is an Extended Events… er… event to track this, called adaptive_join_skipped; however, it didn’t seem to log any information for the queries that didn’t get Adaptive Joins.

Bummer! But, if I had to wager a guess, it would be that this happens because there’s no alternative Index Seek plan for the Posts table with those predicates. Their non-SARGableness takes that choice away from the optimizer, and so Adaptive Joins aren’t a choice. The Users table is going to get scanned either way — that’s the nature of ColumnStore indexes, so it can withstand the misery of non-SARGable predicates in this case and use the Adaptive Join.
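As a generic illustration (not Erik’s exact queries), the difference between a SARGable and a non-SARGable predicate is whether the column is left bare or wrapped in an expression:

    -- SARGable: the predicate can be pushed into a seek on a CreationDate index.
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE p.CreationDate >= '2016-12-01';

    -- Non-SARGable: wrapping the column in a function means every row has to be
    -- evaluated, so the index seek alternative disappears, and with it the
    -- optimizer's option to build an Adaptive Join around that table.
    SELECT p.Id
    FROM dbo.Posts AS p
    WHERE YEAR(p.CreationDate) = 2016;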

Two more good posts in Erik’s series, and both definitely worth reading.


Interleaved Execution

Arun Sirpal looks at how Interleaved Execution affects table cardinality estimates with multi-statement table-valued functions in SQL Server 2017:

Joe states in the article: “MSTVFs have a fixed cardinality guess of ‘100’ in SQL Server 2014 and SQL Server 2016, and ‘1’ for earlier versions. Interleaved execution will help workload performance issues that are due to these fixed cardinality estimates associated with multi-statement table valued functions.”

This is exactly what I saw; the below is just a basic screenshot of one of many tests that I carried out.

Read the whole thing for more details.
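For context, a multi-statement table-valued function is one that declares its return table up front and populates it in the body, so the optimizer can’t see how many rows will come out at compile time. A minimal, hypothetical example (not Arun’s test objects):

    -- Minimal MSTVF: without interleaved execution the optimizer guesses a fixed
    -- row count (100 in SQL Server 2014/2016, 1 in earlier versions) for @result.
    CREATE OR ALTER FUNCTION dbo.RecentOrders (@Since datetime)
    RETURNS @result TABLE (OrderId int NOT NULL, OrderDate datetime NOT NULL)
    AS
    BEGIN
        INSERT @result (OrderId, OrderDate)
        SELECT o.OrderId, o.OrderDate
        FROM dbo.Orders AS o
        WHERE o.OrderDate >= @Since;

        RETURN;
    END;
    GO

Under compatibility level 140, interleaved execution pauses optimization, runs the function body, and uses the actual row count in place of the fixed guess.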


Deeper Into Adaptive Join Optimization

Erik Darling has a couple of blog posts getting deeper into adaptive join optimization in SQL Server 2017. First, Erik discusses the basics:

You see, in every plan, you see both possible paths the optimizer could have chosen. Right now it’s only limited to two choices, Nested Loops and Hash Joins.

Just guessing that Merge Joins weren’t added because there would have been additional considerations around the potential cost of a Sort operation to get the data in order.

Be sure to read Brent’s comment that in the initial release, it will just support columnstore indexes.  Then, Erik talks about execution plan details:

Some points of interest:

  • Actual Join Type: doesn’t tell you whether it chose Hash or Nested Loops
  • Estimated Join Type: Probably does
  • Adaptive Threshold Rows: If the number of rows crosses this boundary, Join choice will change. Over will be Hash, under will be Nested Loops.

The rest is fairly self-explanatory and is the usual stuff in query plans.

Good stuff here.
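If you want to poke at this yourself, the columnstore prerequisite from Brent’s comment and a hint to switch adaptive joins back off for comparison look roughly like this (table, column, and index names are placeholders):

    -- A columnstore index somewhere in the query is required for Adaptive Joins
    -- in the initial SQL Server 2017 release (compatibility level 140).
    CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Posts
        ON dbo.Posts (Id, OwnerUserId, CreationDate);

    -- Run the join once as-is and once with adaptive joins disabled, then compare
    -- the plans and the Adaptive Threshold Rows attribute on the join operator.
    SELECT COUNT_BIG(*)
    FROM dbo.Posts AS p
    JOIN dbo.Users AS u ON u.Id = p.OwnerUserId
    WHERE p.CreationDate >= '2016-12-01'
    OPTION (USE HINT('DISABLE_BATCH_MODE_ADAPTIVE_JOINS'));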


How Query Store And Plan Guides Interact

Grant Fritchey shows that query metadata gets a little weird when you have a plan guide trying to use one particular query and Query Store is forcing a different plan:

If we rerun the query and then take a look at the first operator in the execution plan, we can see that the Plan Guide is in use… and that the query hash has changed. It no longer matches the original query. Now it matches the query that included the query hint. This actually makes perfect sense. The Plan Guide is basically changing the query from the first example above, into the second.

Now, what happens when we toss in the Query Store?

The query behavior is exactly what you want, but some of the metadata is no longer correct.
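The two mechanisms Grant is combining look roughly like this; the statement, hint, and Query Store ids are placeholders for whatever your own workload and sys.query_store_plan report:

    -- A plan guide that attaches a hint to a matching ad hoc statement.
    EXEC sys.sp_create_plan_guide
        @name            = N'DemoGuide',
        @stmt            = N'SELECT * FROM dbo.Orders WHERE CustomerId = 42;',
        @type            = N'SQL',
        @module_or_batch = NULL,
        @params          = NULL,
        @hints           = N'OPTION (MAXDOP 1)';

    -- Meanwhile, Query Store can force a specific captured plan for the query.
    EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;

Grant’s post walks through what the plan and its metadata look like when both of these apply to the same statement at once.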


Join Simplification With Table Variables

Erik Darling has an example of how adding a key constraint to a table variable allowed the optimizer to filter it out:

I was trying to come up with a demo for something totally different. Don’t ask. Seriously. It’s top secret.

Okay, so it’s just embarrassing.

Anyway. I had these two queries. Which are actually the same query twice. The only difference is the table variable definition.

Click through for the demo and additional information.
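The general shape of the trick, as a hypothetical repro rather than Erik’s demo: give the table variable a key, reference nothing from it in an outer join, and the optimizer can prove the join changes nothing and remove it.

    -- The key constraint lets the optimizer prove each user matches at most one
    -- row in @ids.
    DECLARE @ids TABLE (Id int NOT NULL PRIMARY KEY);

    -- No columns from @ids are selected, and a left join to at most one row can't
    -- change the result, so the join can be simplified out of the plan entirely.
    -- Declare @ids without the key and the join has to stay.
    SELECT u.Id, u.DisplayName
    FROM dbo.Users AS u
    LEFT JOIN @ids AS i
        ON i.Id = u.Id;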


Trying Out Interleaved Execution

Dmitry Pilugin has a test of interleaved execution of a multi-statement table-valued function:

First of all, note that Estimated Number of Rows is 10 000 now, which is correct and equals Actual Number of Rows. Due to the correct estimate the optimizer decided that there are enough rows to benefit from a partial (local/global) aggregation and introduced a partial Hash aggregate before the join.

If you take Profiler, enable the events SP:StmtStarting, SP:StmtCompleted, SQL:StmtStarting, and SQL:StmtCompleted, and run the query without and with the TF, you’ll see what “interleaved” actually means in terms of the execution sequence.

During regular execution, the query starts executing, then the function is executed, and then the query continues and finishes execution. We see the following event sequence:

This is a very thorough post, but if you have multi-statement TVFs, you absolutely want to read it.
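One way to repeat the experiment on SQL Server 2017, though not necessarily with the same switch Dmitry used, is to flip the database to compatibility level 140 and then disable the feature per query to compare the estimates; the function and table names here are hypothetical:

    -- Interleaved execution for MSTVFs comes with compatibility level 140.
    ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 140;

    -- Same query with the feature switched off, for comparing the estimated row
    -- counts coming out of the function.
    SELECT COUNT_BIG(*)
    FROM dbo.RecentOrders('2017-01-01') AS r
    JOIN dbo.OrderLines AS ol ON ol.OrderId = r.OrderId
    OPTION (USE HINT('DISABLE_INTERLEAVED_EXECUTION_TVF'));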


Understanding Probe Residuals

Daniel Janik explains what a probe residual is in an execution plan:

Probe residuals are important because they can indicate key performance problems that might not otherwise be brought to your attention.

What is a probe residual?

Simply put, a probe residual is an extra operation that must be performed to complete the matching process. Extra being the left-over things to do.

Click through for an example brought about by implicit conversion.
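To make that concrete, here is a hypothetical setup rather than Daniel’s: joining an nvarchar column to a varchar column forces an implicit conversion, so the hash match can’t trust the hash alone and has to re-check the join predicate, which surfaces in the plan as a Probe Residual.

    -- Hypothetical tables with mismatched types on the join column.
    CREATE TABLE dbo.Accounts   (AccountNumber varchar(20)  NOT NULL PRIMARY KEY);
    CREATE TABLE dbo.Statements (AccountNumber nvarchar(20) NOT NULL, Amount money NOT NULL);

    -- The varchar side gets a CONVERT_IMPLICIT to nvarchar, and the Hash Match
    -- operator re-checks the predicate after probing the hash table; that
    -- re-check is the Probe Residual.
    SELECT s.AccountNumber, s.Amount
    FROM dbo.Accounts AS a
    JOIN dbo.Statements AS s
        ON s.AccountNumber = a.AccountNumber
    OPTION (HASH JOIN);  -- force a hash join so the residual is easy to spot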


Benchmarking

Lukas Eder notes that the best way to compare performance is to…compare performance:

To bust a myth, if you have good reasons to think that a differently written, but semantically equivalent query might be faster (on your database), you should measure. Don’t even trust any execution plan, because ultimately, what really counts is the wall clock time in your production system.

If you can measure your queries in production, that’s perfect. But often, you cannot – but you don’t always have to. One way to compare two queries with each other is to benchmark them by executing each query hundreds or even thousands of times in a row.
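A minimal sketch of that idea in T-SQL: run the candidate query in a tight loop, time the whole run, and then do the same for the alternative form. The query here is just a stand-in for whichever pair you are comparing.

    -- Crude benchmark harness: run the query many times and report elapsed time.
    DECLARE @i int = 0, @runs int = 1000, @cnt bigint;
    DECLARE @start datetime2 = SYSDATETIME();

    WHILE @i < @runs
    BEGIN
        -- Stand-in for the query under test; assigning to a variable avoids
        -- measuring the cost of streaming result sets to the client.
        SELECT @cnt = COUNT_BIG(*) FROM dbo.Users AS u WHERE u.Reputation > 1000;
        SET @i += 1;
    END;

    SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS total_elapsed_ms;

As Lukas stresses, the numbers only mean something for your own database and data, and a single run proves nothing.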

Lukas goes on to compare a left join to a correlated subquery in three separate database products, and the results are very interesting, although absolutely not comparable across products because the DeWitt Clause is still a thing. Great read.
