This one isn’t bad, but imagine a multi-statement deadlock, or a server with several deadlocks in an hour – how do you easily see if there were other errors on the server at the same time?
With SQL Server 2012+, we have a better tool to see when deadlocks occur – and the deadlock graphs are saved by default, so we don’t have to read the text version to figure it out, or run a separate trace to capture them.
In SSMS, open Object Explorer and navigate to Extended Events > Sessions > system_health > package0.event_file. Double-click to view the data.
Click through for the entire process.
This is an isolated test system, so I went to clean out Query Store as a reset. I didn’t need any of the old information in there, so I ran:
- ALTER DATABASE BabbyNames SET QUERY_STORE CLEAR ALL;
I was surprised when I didn’t see this complete very quickly, as it normally does.
Click through to see how Kendra diagnoses the issue.
There are two main kinds of SQL queries. SELECT/INSERT/UPDATE/DELETE statements are examples of Data Manipulation Language (DML). CREATE/ALTER/DROP statements are examples of Data Definition Language (DDL).
With schema changes – DDL – we have the added complexity of the SCH-M lock. It’s a kind of lock you don’t see with DML statements. DML statements take and hold schema stability locks (SCH-S) on the tables they need. This can cause interesting blocking chains between the two types where new queries can’t start until the schema change succeeds
Click through for suggestions with regard to schema locks, as well as a few tips for modifying large tables.
In order to update a row, SQL Server first needs to find that row, and only then it can perform the update. So every UPDATE operation is actually split into two phases – first read, and then write. During the read phase, the resource is locked for read, and then it is converted to a lock for write. This is better than just locking for write all the way from the beginning, because during the read phase, other sessions might also need to read the resource, and there is no reason to block them until we start the write phase. We already know that the SHARED lock is used for read operations (phase 1), and that the EXCLUSIVE lock is used for write operations (phase 2). So what is the UPDATE lock used for?
If we used a SHARED lock for the duration of the read phase, then we might run into a deadlock when multiple sessions run the same UPDATE statement concurrently.
Read on for more details.
We’ve only observed this deadlock with multiple concurrent sessions insert to the delta store for the same target CCI due to server memory pressure or very low cardinality estimates (less than 251 rows). The correct mitigation depends on why you’re seeing the issue in the first place. If you’re seeing it due to low cardinality estimates then fix your estimates, or at the very least get them above 250 rows. If you’re seeing them because the memory grant for the CCI build times out after 25 seconds then lower concurrency or increase server memory.
The problem can also be avoided by not doing concurrent inserts in the first place. In some cases a parallel insert may be a reasonable alterative. There’s also some evidence that the deadlock is only seen when the number of rows for insert is very close to 1048576, but we weren’t able to make any definitive conclusions around that.
Read the whole thing. Also check out his Connect item.
It might look complicated but it is actually very simple – query sys.sysprocesses with a cross apply using the sql_handle to get the text of the query, and then an outer apply with the same query again but you are joining to the blocking spid so you can get the text for the query that is doing the blocking. Beyond that, you can filter on various columns and refine your output
When blocking goes bad, it can go really bad. Sometimes it’s because someone (usually, that someone is me) forgets to commit a transaction before going to lunch, and those open locks cause a bunch of blocking. Sometimes a data load runs at a strange time, or an unusual amount of data gets loaded, or a query gets a bad plan and starts running long, or… you get the idea. There are a bunch of reasons this can come up.
The hardest part is that sometimes big blocking chains build up. The session I forgot to commit blocks 5 session. Each of those block 5 sessions. Each of those block 5 sessions… Eventually, I have 8000 sessions waiting on me, and I’m off eating a kale & farro salad. Oops.
The moral of the story is, don’t eat kale and farro salads; that sounds like rabbit food.
This is where my understanding of NOLOCK was wrong: while NOLOCKwon’t lock row level data, it will take out a schema stability lock.
A schema stability (Sch-S) lock prevents the structure of a table from changing while the query is executing. All SELECT statements, including those in the read uncommitted/NOLOCK isolation level, take out a Sch-S lock. This makes sense because we wouldn’t want to start reading data from a table and then have the column structure change half way through the data retrieval.
However, this also means there might be some operations that get blocked by a Sch-S lock. For example, any command requesting a schema modification (Sch-M) lock gets blocked in this scenario.
Read on to see which types of commands take schema modification locks, and ways to minimize the pain.
I was playing around with a nonclustered columnstore index on a disk-based table. Here’s what I was doing:
Session 1: this session repeatedly changed the value for a single row, back and forth. This puts it into the delta store.
Session 2: this session repeatedly counted the number of rows in the table, using the columnstore index.
With my sample data in this scenario, I found I frequently generated deadlocks.
Read on for more.
I ran this several times to see if there was a pattern to the madness, and it turned out it was. All waits were concentrated in database ID 2 – TEMPDB. Many people perk up by now and jump to the conclusion that this is your garden variety SGAM/PFS contention – easily remedied with more TEMPDB files and a trace flag. But, alas- this was further inside the TEMPDB. The output from the query above gave me the exact page number, and plugging that into DBCC PAGE gives the metadata object ID.
His conclusion is to reduce temp table usage and/or use memory-optimized tables. We solved this problem with replacing temp tables with memory-optimized TVPs in our most frequently-used procedures.
As noted, there are other ways to handle deadlocks in SQL Server. The approach presented here may have some drawbacks compared to others. There is an authorization issue for msdb.dbo.sp_send_dbmail that will need to be addressed for logins without elevated permissions. Additionally, you might get hit with an unexpected deluge of emails. (The first time I got deadlock alerts, there were more than 500 of them waiting for me in my Inbox.) Lastly, there’s the XML issue: it’s not everyone’s cup of tea. On the plus side, I really like the proactive nature: an event occurs, I get an email. I think most would agree it’s better to know something (bad) happened before the customers start calling. The automated generation of Deadlock Graph (*.xdl) files is convenient. And event notifications have been available since SQL Server 2005. As far as I know, the feature is available in all editions, including Express Edition.
Click through for all of the code Dave used to set this up.