When messing about (developing) things that are based on the query store I came across an unusual phenomenon. We were missing query store intervals. It’s not that data had been deleted but rather than it was never created in the first place. Take a look at this;
I’d set the query store to 5 minute intervals for testing purposes and left it. Notice that there is no data between 15:20 and 15:25. It’s not that the data has been deleted, we know that because the runtime_stats_interval_id is still creating increments correctly.
Click through for the answer.
Whether you force plans manually, or let SQL Server force them with the Automatic Plan Correction feature, I still view plan forcing as a temporary solution. I don’t expect you to have plans forced for years, let alone months. The life of a forced plan will, of course, depend on how quickly code and schema changes are ported to production. If you go the “set it and forget it route”, theoretically a manually forced plan could get used for a very long time. In that scenario, it’s your responsibility to periodically check to ensure that plan is still the “best” one for the query. I would be checking every couple weeks; once a month at most. Whether or not the plan remains optimal depends on the tables involved in the query, the data in the tables, how that data changes (if it changes), other schema changes that may be introduced, and more.
Further, you don’t want to ignore forced plans because there are cases where a forced plan won’t be used (you can use Extended Events to monitor this). When you force a plan manually, forcing can still fail. For example, if the forced plan uses an index and the index is dropped, or its definition is changed to the point where it cannot be used in plan in the same manner, then forcing will fail. Important note: if forcing fails, the query will go through normal optimization and compilation and it will execute; SQL Server does not want your query to fail! If you’re forcing plans and not familiar with the reasons that it can fail, note the last_force_failure_reason values listed for sys.query_store_plan. If you have manually forced a plan for a query, and the force plan fails, it remains forced. You have to manually un-force it to stop SQL Server from trying to use that plan. As you can see, there are multiple factors related to plan forcing, which is why you don’t just force a plan and forget it.
There is much sound advice in this post.
The core of the answer is very simple. Query Store, like any other data written to a database, whether a system table or a user table, is a logged operation. So, when you backup the database, you’re backing up Query Store data. When you backup the logs, you’re also backing up Query Store data. A point in time will include all the data written to the Query Store at that point.
However, that’s the kicker. At what point was the Query Store information written to disk?
Read on to learn when, and what you can do about it if you prefer otherwise.
If you are trying to execute an ALTER DATABASE command to change a Query Store option (e.g. turn it off, change a setting) and it is blocked, take note of the blocking session_id and what that session_id is executing. If you are trying to execute this ALTER command right after a failover or restart, you are probably blocked by the Query Store data loading.
As a reminder, when a database with Query Store enabled starts up, it loads data from the Query Store internal tables into memory (this is an optimization to make specific capabilities of Query Store complete quickly). In some cases this is a small amount of data, in other cases, it’s larger (potentially a few GB), and as such, it can take seconds or minutes to load. I have seen this take over 30 minutes to load for a very large Query Store (over 50GB in size).
Erin has a story which ties this together, so check that out.
As much as I love all this extra data, it’s important to note that some information is more relevant for an actual execution plan, versus an estimated one (e.g. tempdb spill information). Some days we can capture and use the actual plan for troubleshooting, other times we have to use the estimated plan. Very often we get that estimated plan – the plan that has been used for problematic executions potentially – from SQL Server’s plan cache. And pulling individual plans is appropriate when tuning a specific query or set or queries. But what about when you want ideas on where to focus your tuning efforts in terms of patterns?
The SQL Server plan cache is a prodigious source of information when it comes to performance tuning, and I don’t simply mean troubleshooting and trying to understand what’s been running in a system. In this case, I’m talking about mining information from the plans themselves, which are found in sys.dm_exec_query_plan, stored as XML in the query_plan column.
When you combine this data with information from sys.dm_exec_sql_text (so you can easily view the text of the query) and sys.dm_exec_query_stats (execution statistics), you can suddenly start to look for not just those queries that are the heavy hitters or execute most frequently, but those plans that contain a particular join type, or index scan, or those that have the highest cost. This is commonly referred to as mining the plan cache, and there are several blog posts that talk about how to do this. My colleague, Jonathan Kehayias, says he hates to write XML yet he has several posts with queries for mining the plan cache:
It’s a good article with a lot of useful information.
For those of you not familiar with the Query Store settings, please check out my post which lists each one, the defaults, and what I would recommend for values and why. When discussing MAX_STORAGE_SIZE_MB, I mention monitoring via sys.database_query_store_options or Extended Events. As much as I love Extended Events, there isn’t an event that fires based on a threshold exceeded. The event related to size is query_store_disk_size_over_limit, and it fires when the space used exceeds the value for MAX_STORAGE_SIZE_MB, which is too late. I want to take action before the maximum storage size is hit.
Therefore, the best option I’ve found is to create an Agent job which runs on a regular basis (maybe every four or six hours initially) that checks current_storage_size_mb in sys.database_query_store_options and calculates the space used by Query Store as a percentage of the total allocated, and then if that exceeds the threshold you set, send an email. The code that you can put into an Agent job is below. Please note you want to make sure the job runs in the context of the user database with Query Store enabled (as sys.database_query_store_options is a database view), and configure the threshold to a value that makes sense to your MAX_STORAGE_SIZE_MB. In my experience, 80% has been a good starting point, but feel free to adjust as you see fit!
Click through for the script.
This is question I’ve gotten a few times in class…Can you force a plan for a different query with Query Store?
Assume you have two similar queries, but they have different query_id values in Query Store. One of the queries has a plan that’s stable, and I want to force that plan for the other query. Query Store provides no ability to do this in the UI, but you can try it with the stored procedure. Let’s take a look…
I can see the potential benefit, but the downside risk is huge, so it makes sense not to allow this.
I’ve had two comments recently on my blog about Query Store causing the plan cache to be flushed. There was a known issue related to the plan cache flushing after Query Store was enabled, but this was fixed in CU2 for SQL Server 2016 SP1. So I did some testing and here is what I think is causing the confusion:
When you enable Query Store, which is done with an ALTER DATABASE SET statement, the plan cache for the database is flushed.
Now, before anyone writes up a UserVoice item, understand that there are several ALTER DATABASE SET commands that cause the plan cache for a database to be flushed. For example, taking a database OFFLINE causes the database plan cache to be flushed. That one seems intuitive, right? So why is the plan cache cleared when you enable Query Store, or change one of the settings? To ensure that new Query Store data is not lost. This relates to the internals of how Query Store works, which aren’t essential to dig into, the point is that this behavior is known by Microsoft and expected.
There is what I’d consider a documentation bug around describing what happens when you enable Query Store, but the fact that the plan cache gets cleared makes sense.
This is an isolated test system, so I went to clean out Query Store as a reset. I didn’t need any of the old information in there, so I ran:
- ALTER DATABASE BabbyNames SET QUERY_STORE CLEAR ALL;
I was surprised when I didn’t see this complete very quickly, as it normally does.
Click through to see how Kendra diagnoses the issue.
Query Store has mechanisms for automatically cleaning your data. It is possible to cause them to break down. While presenting a session about the Query Store recently, I was asked what happened if you set the size of the Query Store below the amount of data currently in the store. I didn’t know the answer, so we tried it. Things got a little weird.
Click through to see how weird.