The first thing I found is that there were 16 attempts at promotion, and four successful promotions.
Why did this seem weird? I dunno.
Why would there be only 4 successful attempts with no competing locks from other queries?
Why wouldn’t all 16 get promotions?
Find out the answer to this and much, much more if you click the link.
You will find that the SELECT statement executes, ignoring the exclusive lock, because it is not a write lock, and the data on the page has not been changed.
The main reason people try to do this is to force access to a row in a single threaded manner. For example, building their own sequence number, either in a row they update, or by trying to do MAX() on all of the data in a table to make sure only one reader gets the same value.
This is generally a bad idea, since locking an entire table is a generally bad idea, but if you needed to block readers, you can couple the XLOCK with a PAGLOCK. So, change the first reader to:
FROM Demo.Test WITH (XLOCK,PAGLOCK);
As Louis points out in the summary, locking is complicated. Having a good understanding of the locking model will serve you well, though.
I have a client that used Itzik Ben-Gan’s solution of creating a filtered nonclustered columnstore index to achieve batch mode on a rowstore (in fact I proposed that the client consider it). They have an OLTP system, and often perform YTD calculations. When they tested, processing time was reduced by 30 to 50 percent, without touching a single line of application code. If that ain’t low hanging fruit, I don’t know what is —
However, during testing, I noticed some intermittent blocking that didn’t make sense to me. But I couldn’t nail it down, and they went live with the “filtered nonclustered columnstore index” solution.
Once they deployed – and there was a lot of concurrency – I could see what had eluded me during my proof of concept: blocking in tempdb.
Read on for the repro and check out Ned’s UserVoice bug report.
I can’t make heads or tails of that but I can tell you that seems like a really bad brawl for resources. It’s like a Jerry Springer show with a few extras thrown in. Since I knew that my graph wasn’t going to be helpful in this instance I went to the actual xml and tried to figure out how I could tune this to make it better in the future. I needed to know exactly where the issue was so the waitresource pointer is a good place to start.
You will see many blog articles on how to find SQL wait resources when the resource type is a key, a page, or an object (I suggest Kendra Little’s blog post) There is however a noticeable glut on articles explaining RID (a RID is a key on a table with no clustered index). I finally found how to tie a RID to an actual resource name but it was used for corruption so the details were a bit hazy at first.
Click through for this work of database art as well as a script which links RIDs back to specific object names.
So what causes Range Locks? Just ask Sunil. He knows everything (this assumes the serializable isolation level):
If the key value exists, then the range lock is only taken if the index is non-unique. In the non-unique index case, the ‘range’ lock is taken on the requested key and on the ‘next’ key.
If the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value. If the index is unique then a regular S lock on the key.
If the key does not exist, then the ‘range’ lock is taken on the ‘next’ key both for unique and non-unique index.
If the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value.
Range Predicate (key between the two values)
‘range lock on all the key values in the range when using ‘between’
‘range’ lock on the ‘next’ key that is outside the range. This is true both for unique and non-unique indexes. This is to ensure that no row can be inserted between the requested key and the one after that. If the ‘next’ key does not exist, then a range lock is taken on the ‘infinity’ value.
Erik has an interesting example and lets us see a potential concurrency problem with multi-table indexed views.
There’s a lot going on there, but much of it is noise. There is a whole bunch of contention on the table
SqlPerf.Session— session 342 is trying to perform an update, but it is stuck waiting on shared locks taken by two services. Now, let’s check the Optimize Layout box above, and look at the circular graph again. Simplified, right?
This checkbox is easily the most powerful option to discard noise and help you focus on the crux of the deadlock issue. In the original graph, you can see that many of the elements presented are simply innocent bystanders — waiters that are captured as part of the deadlock activity, but in no way contributing to it. We can detect this in a lot of cases and so, when you check the box, we hide them from view, allowing you to focus much more directly on the key players involved in the deadlock. There is no question that eliminating the noise can really speed up troubleshooting; with those extra nodes removed, I can clearly see that I have some kind of order-of-operations issue on the
SqlPerf.Sessiontable, between the transfer service and the processor service.
The truncate option is fast and efficient but did you know that it takes a certain lock where you could actually be blocked?
What am I talking about? When you issue a truncate it takes a Sch-M lock and it uses this when it is moving the allocation units to the deferred drop queue. So if it takes this lock and you look at the locking compatibility matrix below you will see what can cause a conflict (C).
Arun includes an image which shows what can block what, and also shows us an example.
Troubleshooting of the blocking and concurrency issues is, in the nutshells, a simple process. You need to identify the processes involved in blocking conditions or deadlocks and analyze why those processes acquire the locks on the same resources. In majority of cases, you need to analyze queries and their execution plans identifying possible inefficiencies that led to excessive number of locks being acquired.
Collecting this information is not a trivial task. The information is exposed through DMVs (you can download the set of scripts here); however, it requires you to run the queries at time when blocking occurred. Fortunately, SQL Server allows you to capture blocking and deadlock conditions with the blocked process report and deadlock graph, analyzing them later.
There is the caveat though. Neither blocked process report nor deadlock graph provide you execution plans of the statements. Nor do they always include affected statements in the plain text. You may need to query plan cache and other DMVs to get this information and longer you wait lesser is the chance that the information is available. Moreover, SQL Server may generate enormous number of blocked process reports in cases of prolonged blocking and complex blocking chains, which complicates the analysis.
Confirmed to work with SQL Server 2012 and later, but might work on earlier versions as well. Dmitri has released it to the public, so check it out.
This one isn’t bad, but imagine a multi-statement deadlock, or a server with several deadlocks in an hour – how do you easily see if there were other errors on the server at the same time?
With SQL Server 2012+, we have a better tool to see when deadlocks occur – and the deadlock graphs are saved by default, so we don’t have to read the text version to figure it out, or run a separate trace to capture them.
In SSMS, open Object Explorer and navigate to Extended Events > Sessions > system_health > package0.event_file. Double-click to view the data.
Click through for the entire process.
This is an isolated test system, so I went to clean out Query Store as a reset. I didn’t need any of the old information in there, so I ran:
- ALTER DATABASE BabbyNames SET QUERY_STORE CLEAR ALL;
I was surprised when I didn’t see this complete very quickly, as it normally does.
Click through to see how Kendra diagnoses the issue.