Press "Enter" to skip to content

Category: Performance Tuning

Finding Queries with Missing Index Requests in SQL Server 2019

Erik Darling shows off a nicety in SQL Server 2019 and later:

Note that this script does not assemble the missing index definition for you. That stuff is all readily available in the query plans that get returned here, and of course the missing index feature has many caveats and limitations to it.

You should, as often as possible, execute the query and collect the actual execution plan to see where the time is spent before adding anything in.

Read on for the script.
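
As a hedged sketch of the 2019+ feature in question (not Erik's actual script): sys.dm_db_missing_index_group_stats_query links missing index requests to the queries that generated them, and you can tie those back to query text and plans along these lines:

    -- Rough sketch: surface queries attached to missing index requests.
    -- Joining on query_hash may return multiple cached plans per request.
    SELECT TOP (20)
        misq.avg_total_user_cost,
        misq.avg_user_impact,
        misq.user_seeks,
        st.text AS query_text,
        qp.query_plan
    FROM sys.dm_db_missing_index_group_stats_query AS misq
    INNER JOIN sys.dm_exec_query_stats AS qs
        ON qs.query_hash = misq.query_hash
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    ORDER BY misq.avg_total_user_cost * misq.avg_user_impact * misq.user_seeks DESC;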

Perspective on Spinlocks

Erik Darling speaks with wisdom:

The more people want to avoid fixing what’s really wrong with their server, the more they go out and find all the weird stuff that they can blame on something else (usually the product), or keep doing the same things that aren’t fixing the problem.

Spinlocks are one of those things. People will measure them, stare at them, Google them, and have no idea what to make of them, all while being sure there’s something going on with them.

I don’t want to discount when spinlocks can actually cause a problem, but you shouldn’t treat every performance problem like it’s a bridge too far from what you can solve.

I have seen performance problems which actually did come down to spinlock issues. For every one of those, I’ve seen, oh, about 95-100 or so which came down to inefficient code.
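
For the "measure them" part, the raw counters live in sys.dm_os_spinlock_stats. A minimal sketch, with the caveat from above that big numbers here prove nothing by themselves:

    -- Spinlock statistics accumulated since instance startup.
    -- High spin counts alone are not evidence of a problem;
    -- look at inefficient code first.
    SELECT TOP (10)
        name,
        collisions,
        spins,
        spins_per_collision,
        backoffs
    FROM sys.dm_os_spinlock_stats
    ORDER BY spins DESC;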

An Index for Change Tracking Cleanup

James Ferebee creates an index:

If the issue persists, we stack up a lot of data that needs to be removed, which can be too much for autocleanup to manage. At that point, we recommend running (as is indicated in the error message) sp_flush_CT_internal_table_on_demand, which is discussed in detail in Amit’s blog post Change Tracking Cleanup–Part 1.

I will not discuss the internals of the process, as Amit Banerjee already touched on this in the referenced blog. However, if you are routinely encountering issues where autocleanup can’t keep up and/or manual cleanup is taking significant time, you can add the index and see if it helps; I have the process enumerated below. Keep in mind that this is not guaranteed to fix all cleanup issues, and it may still be necessary to run manual cleanup regularly. If you continue to have issues with cleanup, feel free to create a ticket with us here at CSS so we can assist you and gather data specific to your environment.

Read on for the index definition as well as some important notes about whether you might need it.
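
For reference, the manual cleanup procedure named above runs per database. A minimal, hedged sketch (the database name is hypothetical, and the procedure also accepts an optional table-name parameter, so check the documentation for your version):

    -- Run in the database where change tracking cleanup is falling behind.
    USE YourDatabase;  -- hypothetical name
    GO
    -- Documented manual cleanup procedure for change tracking side tables.
    EXEC sys.sp_flush_CT_internal_table_on_demand;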

Optimizing Hive Performance with Tez

Jay Desai has some recommendations around tuning Tez queries:

Tuning Hive on Tez queries can never be done in a one-size-fits-all approach. The performance of queries depends on the size of the data, file types, query design, and query patterns. During performance testing, evaluate and validate configuration parameters and any SQL modifications. It is advisable to make one change at a time during performance testing of the workload, and it would be best to assess the impact of tuning changes in your development and QA environments before using them in production environments. Cloudera WXM can assist in evaluating the benefits of query changes during performance testing.

Click through for several configuration and query considerations.

Slow File Open Times in Power BI

Marco Russo explains why opening some Power BI files might take so long:

There could be many reasons for that, but if you have calculated columns and/or calculated tables in your model, you should be aware that they could be the reason why this happens. It is not always the case, so I want to explain when this happens.

The short explanation is the following: when you open a PBIX file, Power BI Desktop automatically recalculates those calculated columns and calculated tables that depend on a volatile formula.

Read on for the longer explanation, which includes a (possibly incomplete) list of volatile formulas.

PyODBC vs C# ODBC Performance Differences

Jose Manuel Jurado Diaz explains a performance difference:

A customer asked today why, using ODBC Driver 17 for SQL Server in Python with PYODBC, we see a slight difference in terms of time taken when compared with C# System.Data.Odbc. In the following, I would like to share my lessons learned about it.

Read on for Jose’s explanation. My short version is, it seems particularly important when using the Python ODBC driver to write the exact query you want rather than a SELECT * or query which returns rows/columns you don’t need.
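
To make that advice concrete, a trivial sketch (table and column names hypothetical):

    -- Ask for exactly the columns and rows you need...
    SELECT CustomerID, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE OrderDate >= '20210101';

    -- ...rather than pulling everything and discarding it client-side:
    -- SELECT * FROM dbo.Orders;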

Entity Framework and Include Operations

Josh Darnell has a warning:

I can imagine someone reading that and not seeing the gravity of the situation. “Hey, 500 rows isn’t that many – we have modern hardware!”

I thought it was worth writing about a real world situation where this can get seriously out of hand.

Read on for a scenario in which 64 rows turns into 100,000 rows pretty quickly.
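
The blow-up comes from how eagerly loading multiple child collections can translate into a single joined query, where each parent row repeats once per combination of its children. A hedged T-SQL illustration of the shape (hypothetical tables, not EF's exact generated SQL):

    -- Hypothetical schema: one parent with 25 rows in each child table.
    -- Joining both collections in one statement returns 25 * 25 = 625 rows
    -- for that single parent (a cartesian explosion), with every parent
    -- column value duplicated on each row.
    SELECT p.Id, a.Detail AS DetailA, b.Detail AS DetailB
    FROM dbo.Parent AS p
    INNER JOIN dbo.ChildA AS a ON a.ParentId = p.Id
    INNER JOIN dbo.ChildB AS b ON b.ParentId = p.Id
    WHERE p.Id = 1;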

The User Transaction Scope for Temporary Objects

Bob Dorr troubleshoots a performance problem:

When the temporary table is bound to the user transaction it is both created and destroyed as part of the transaction.  The same logic in a procedure attempts to avoid the creation and destruction, for each execution, by using temporary table caching.

From the issue I was debugging, the user transaction scope mattered because creation and destruction of metadata may be an expensive operation.  
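
A hedged illustration of that binding (a sketch, not Bob's repro): a temp table created inside an explicit user transaction disappears if the transaction rolls back, so its metadata creation and destruction ride along with the transaction:

    BEGIN TRANSACTION;

    -- Created inside the user transaction: the metadata for #work
    -- is part of this transaction's footprint.
    CREATE TABLE #work (id int NOT NULL);
    INSERT INTO #work (id) VALUES (1);

    -- Rolling back destroys #work along with everything else.
    ROLLBACK TRANSACTION;

    -- This would now fail: the table no longer exists.
    -- SELECT id FROM #work;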

This post ties into two separate things: first, how temp objects tie to specific sessions; and second, the cost of creating and destroying temporary objects. For the latter, a couple quick pieces of advice:

  • Reduce the number of temporary objects you create. If you can solve a problem with fewer temp tables or table variables while maintaining acceptable performance, that can help on busy systems.
  • Never explicitly drop temp tables. There’s no benefit to explicitly dropping temp tables, as they’ll go away as soon as the session ends. Also, not dropping temp tables is the first step to:
  • Embrace temp table reuse. There are specific rules around when you can re-use a temp table. Each re-use of a temp table means two fewer metadata operations (one delete and one create).
  • Use memory-optimized table variables instead of temp tables or table variables (a minimal sketch follows this list).
  • Turn on memory-optimized tempdb metadata. The biggest issue here is that you lose cross-database queries into tempdb views. That can end up being painful and is why I can’t recommend it as a general solution.
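
The memory-optimized table variable option, as a minimal sketch assuming the database already has a MEMORY_OPTIMIZED_DATA filegroup (names are hypothetical):

    -- Memory-optimized table variables come from a user-defined table type;
    -- memory-optimized objects must declare at least one index.
    CREATE TYPE dbo.OrderKeyList AS TABLE
    (
        OrderID int NOT NULL,
        INDEX ix_OrderID NONCLUSTERED (OrderID)
    )
    WITH (MEMORY_OPTIMIZED = ON);
    GO

    -- Usage: behaves like a table variable, without tempdb metadata churn.
    DECLARE @OrderKeys AS dbo.OrderKeyList;
    INSERT INTO @OrderKeys (OrderID) VALUES (1), (2), (3);
    SELECT OrderID FROM @OrderKeys;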

Azure Redis Tips

Arun Sirpal enumerates some advice:

My learnings on Redis thus far which you may find useful:

1. Location of Redis should be close to your app.

2. Data structures within Redis: larger key-value sizes lead to fragmentation of memory space, and these larger memory requirements mean more network data transfer. Redis states to use 100KB maximum, and this will affect the transfer time allocated from the app. It could time out if the data request is big.

Click through for the rest of Arun’s advice. My advice on the 100KB maximum is that it really should be closer to 100 bytes or 1KB max in practice, especially for storing data which differs by entity (user, customer, organization, whatever your domain uses).

Measure Explosion and Optimization with Tabular Editor 3

Matt Allington pulls out the dynamite:

OK, so what’s the problem? The problem is that sometimes you get such a deep, nested path of measures that the underlying formula is more complex than it needs to be, and it is virtually impossible to debug. If you simply explode/expand a final measure into the full, underlying DAX code, it makes it much easier to see what is going on. Returning to the DAX example above, this is what the third measure actually looks like under the hood, removing all references to the dependent measures.

Click through for an article and a video.
