Press "Enter" to skip to content

Category: Locks, Blocks, and Deadlocks

Finding Locking Chains in SQL Server

David Fowler is looking for lead blockers:

By far the most common cause for poor performance that I see is blocking. Unless you’ve got some monitoring in place it’s near impossible to identify the cause of any historical blocking incidents. Luckily, if you’re doing as I do and running the First Responder Kit regularly into tables, you will have a history of running statements thanks to sp_BlitzWho. That data will contain the SPIDs of any blocking processes.

Now, if you’ve ever had to look at that data and tried to figure out what the blocking chain is then you’ll know that it can be a total headache.

But David has a solution to find those fullbacks and pulling guards.

Comments closed

Negative Blocking Session IDs

Bob Dorr explains what those negative session IDs actually mean:

SQL Server may report a blocking session id as a negative integer value. SQL Server uses negative sessions ids to indicate special conditions.​​ 

Click through for the table. Bob also includes information on -5, the “any task/session can release the latch” scenario. This also covers information on the latches themselves and is worth keeping around in case you run into an issue at some point.

Comments closed

Alerting on Blocking in SQL Server

Ajay Dwivedi sets up an alert:

Recently one of my LinkedIn friends contacted me for a blocking alert that would work on on-prem & cloud SQL Server instances alike. Previously I wrote https://ajaydwivedi.com/2018/08/how-to-setup-blocking-alerts-on-sql-server blog post for on-prem SQL Server blocking alert which makes use of WMI based event.

So I wrote the following blocking alert setup code for SQL Server that has the below features –

– Send mail notification to one or more recepients unlike SQL Agent job which is restricted to only one operator.

– Parameter to control the consistent blocking threshold. Nobody wants spontaneous blockings which comes/goes.

– Parameter to control the mail notification delay. Say, I want to be notified every X minutes (defult 15 minutes).

– Parameter to control the mail notification subject.

– Separate mail notification for Blocking & Script failure itself.

Auto-clearing feature. Means, if the blocking is resolved, we should get an automated mail notification saying Blocking is cleared.

Click through for the instructions.

Comments closed

Locking Issue with Columnstore Indexes

Joe Obbish troubleshoots an issue on tables with columnstore indexes:

I recently ran into a production issue where a SELECT query that referenced a NOLOCK-hinted table was hitting a 30 second query timeout. Query store wait stats suggested that the issue was blocking on a table with a nonclustered columnstore index (NCCI). This was quite unexpected to me and I was eventually able to produce a reproduction of the issue. I believe this to be a bug in SQL Server that’s present in both RTM and the current CU as of this blog post (CU14). The issue also impacts CCIs as well but I did significantly less testing with that index type.

Read on for the issue, how you can replicate it, and a couple ways to work around it.

Comments closed

Intent Shared Locks in RCSI

Tomas Zika troubleshoots a deadlocking problem:

I was analyzing a deadlock graph and there was a mystery lock of type IS (Intent Shared). That was weird by itself because the database has Read Committed Snapshot Isolation (RCSI) enabled, which is the Optimistic Concurrency model that shouldn’t take shared locks. All the statements were contained in this database. Also, the locked table was seemingly unrelated to anything that has been going on in the deadlock report.

Click through for an image, a repro script, and an explanation as to what exactly is going on.

Comments closed

Finding Processes Waiting on Spinlocks

David Fowler helps us find processes waiting on spinlocks:

Spinlock contention is always a real headache to deal with. I recently saw an issue when spinlock contention on SOS_CACHESTORE was making the server virtually unresponsive. The issue was very intermittent with no obvious pattern but the assumption is that it was caused by a particular process in the application. Finding what that process was the tricky part, they don’t show up as waiting tasks so your usual scripts for looking for waiting processes may not work here.

It’s almost never spinlocks, but when it is, this is how you figure it out.

Comments closed

BULK_OPERATION Locks during a NOLOCK Scan

Paul Randal does some explaining:

I had an email question over the weekend where someone noticed that while executing a scan of a heap using NOLOCK, there was a BULK_OPERATION lock held on the heap for the duration of the scan. The question was why is the BULK_OPERATION lock needed, as surely there’s no way for the NOLOCK scan to read a problematic page?

Well, the answer is that the extra lock is needed *precisely* because the NOLOCK scan *can* read a problematic page if there’s a bulk operation happening on the heap at the same time.

And don’t call me Shirley.

Click through for a demonstration of this answer.

Comments closed

When APPEND_ONLY_STORAGE_INSERT_POINT is a Bottleneck

Paul Randal explains the APPEND_ONLY_STORAGE_INSERT_POINT latch and describes when and why this might be a problem:

Continuing my series of articles on latches, this time I’m going to discuss the APPEND_ONLY_STORAGE_INSERT_POINT latch and show how it can be a major bottleneck for heavy update workloads where either form of snapshot isolation is being used.

I strongly recommend you read the initial post in the series before this one, so you have all the general background knowledge about latches.

As always, read the whole thing.

Comments closed

Finding Bad (Worse?) NOLOCK Statements across Instances

Aaron Bertrand powers up for about six episodes straight, but the results are amazing:

In Part 1 of this series, I showed how to use a Visitor pattern to walk through one or more T-SQL statements to identify a problematic pattern where NOLOCK hints are applied to the target of an update or delete. The method in my initial examples was very manual, though, and won’t scale if this problem might be widespread. We need to be able to automate collecting a potentially large number of statements across an entire environment, and then try to eliminate false positives without manual intervention.

Read on to see how you can take what Aaron wrote last time and make it scalable.

Comments closed