
Category: Internals

Looking For Wait Types

Ewald Cress uses the debugger to search for particular waits:

In this case I was looking for PREEMPTIVE_COM_RELEASE, and sys.dm_xe_map_values tells me that on my 2014 RTM instance it has an index of 01d4 hexadecimal. Crazy as it sounds, I’m going to do a simple search through the code to look for places that magic number is used. As a two-byte (word) pattern we’ll get lots of false positives, but fortunately wait types are internally doublewords, with only one bit set in the high-order word. In other words, we’re going to look for the pattern 000101d4, 000201d4, 000401d4 and so forth up to 800001d4. Ignore the meaning of when which bit is going to be set; with only sixteen permutations, it’s quick enough to try them all.

Let’s focus on sqllang as the likely source – the below would apply to any other module too.
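As a rough illustration of the search space described in the excerpt (not Ewald's actual debugger commands), the sixteen candidate doublewords can be generated by setting one bit at a time in the high-order word and keeping the wait index in the low-order word. The function name `candidate_patterns` is made up for this sketch.

```python
# Wait type index quoted in the excerpt (PREEMPTIVE_COM_RELEASE on a 2014 RTM instance).
WAIT_INDEX = 0x01D4


def candidate_patterns(wait_index: int) -> list[str]:
    """Return the 16 doublewords that have exactly one bit set in the high-order word."""
    patterns = []
    for bit in range(16):
        # Shift a single bit into the high word, keep the wait index in the low word.
        dword = ((1 << bit) << 16) | wait_index
        patterns.append(f"{dword:08x}")
    return patterns


if __name__ == "__main__":
    for pattern in candidate_patterns(WAIT_INDEX):
        print(pattern)
```

The list starts at 000101d4 and ends at 800001d4, matching the range given in the excerpt. How each value is fed to the debugger's memory search is left to the original post.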

This post reminds me that my debugger skills aren’t very good.


Understanding Status In DMVs

Ewald Cress looks at a number of DMVs and how they expose query status:

sys.dm_exec_sessions: status

This metric is completely disjunct from the above ones, and mostly reflects attributes of a CSession class instance. The respective values are derived through the following decision tree:

  • If the internal Boolean member m_fIsConnReset is set, return dormant

  • Else if a flag living outside of CSession itself is set, return preconnect (I’ll touch on the source of this mystery flag below)

  • Else if a flag within the CSession itself is set (indicating that it has been provided some work to do) return running

  • Else return sleeping
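The decision tree above can be sketched as a short function. This is only a model of the ordering of the checks: apart from `m_fIsConnReset`, which the excerpt names, the field names below are hypothetical stand-ins, and nothing here reflects how SQL Server actually stores these flags.

```python
from dataclasses import dataclass


@dataclass
class SessionFlags:
    # Hypothetical stand-ins for the flags described in the excerpt.
    is_conn_reset: bool = False        # models the m_fIsConnReset member
    external_preconnect: bool = False  # models the flag living outside CSession
    has_work: bool = False             # models the "has been given work" flag


def session_status(flags: SessionFlags) -> str:
    """Apply the checks in the order listed in the excerpt; the first match wins."""
    if flags.is_conn_reset:
        return "dormant"
    if flags.external_preconnect:
        return "preconnect"
    if flags.has_work:
        return "running"
    return "sleeping"


if __name__ == "__main__":
    print(session_status(SessionFlags(has_work=True)))
```

Because the checks are ordered, a session with both the reset flag and the work flag set would report dormant rather than running.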

It’s interesting to see how a single idea, status, can mean such different things from one DMV to the next.


SQL Server EXE Size

Arun Sirpal points out that sqlservr.exe is much smaller in 2012 and later than it was in 2008 R2:

I never really noticed the difference before, but I understand why.

From 2012 onwards the architecture changed: it has been broken up into multiple DLLs. I can see the extra DLL files within the BINN folder, these being sqllang.dll and sqlmin.dll, which are roughly 30MB each.

Makes me a bit curious as to the reason behind the breakout.


Thinking About Linux Internals

Anthony Nocentino speculates on the internals of SQL Server on Linux:

OK, so everyone wants to know how Microsoft did it…how they got SQL Server running on Linux. In this article, I’m going to try to figure out how.

There’s a couple of approaches they could take…a direct port or some abstraction layer…A direct port would have been hard; basically any OS interaction would have had to be looked at, and that would have been time-consuming and risk-prone. Who comes along to save the day? Abstraction. The word you hear about a million times when you take Operating Systems classes in undergrad and grad computer science courses. 🙂

Anthony talks about picoprocesses, which prompts me to say that containers (like Docker) are probably the most important administrative concept of the decade. If you don’t yet understand the concept, learning it opens many doors.


Preemptive Scheduling

Ewald Cress looks at preemptive scheduling:

Cooperative scheduling is a relay race: you simply don’t stop without passing over the baton. If you write code which reaches a point where it may have to wait to acquire a resource, this waiting behaviour must be implemented by registering your desire with the resource, and then passing over control to a sibling worker. Once the resource becomes available, it or its proxy lets the scheduler know that you aren’t waiting anymore, and in due course a sibling worker (as the outgoing bearer of the scheduler’s soul) will hand the baton back to you.

This is complicated stuff, and not something that just happens by accident. The textbook scenario for such cooperative waiting is the traditional storage engine’s asynchronous disk I/O behaviour, mediated by page latches. Notionally, if a page isn’t in buffer cache, you want to call some form of Read() method on a database file, a method which only returns once the page has been read from disk. The issue is that other useful work could be getting done during this wait.
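To make the relay-race idea concrete, here is a toy model using Python generators. It is a sketch of cooperative yielding in general, not of SQLOS: each worker does a little work and then explicitly hands control back, and nothing ever interrupts a worker that does not yield. The names `worker` and `run_cooperatively` are invented for the example.

```python
from collections import deque


def worker(name, steps, log):
    """A toy worker that does one unit of work, then voluntarily yields."""
    for step in range(steps):
        log.append(f"{name} step {step}")
        yield  # pass the baton: nothing else runs until we reach this point


def run_cooperatively(workers):
    """Round-robin over the workers, resuming each one until it yields or finishes."""
    runnable = deque(workers)
    while runnable:
        current = runnable.popleft()
        try:
            next(current)             # run until the worker's next yield
        except StopIteration:
            continue                  # worker finished; drop it from the queue
        runnable.append(current)      # still has work; back of the queue


if __name__ == "__main__":
    log = []
    run_cooperatively([worker("A", 2, log), worker("B", 2, log)])
    print(log)
```

The two workers interleave one step at a time only because each one yields. A worker that looped without yielding would hold the single "scheduler" indefinitely, which is the discipline the excerpt describes.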

Read on for a detailed example looking at xp_cmdshell.


Scheduler Stories

Ewald Cress has a couple of posts about the scheduler. First, fiber mode scheduling:

The title of this post is of course an allusion to Ken Henderson’s classic article The perils of fiber mode, where he hammers home the point that fiber scheduling, a.k.a. lightweight pooling, appears seductive until you realise what you have to give up to use it.

We’ll get to the juicy detail in a moment, but as a reminder, the perils of fibers lie in their promiscuity: many fibers may share one thread, its kernel structures and its thread-local storage. This is no problem for code that was written with fibers in mind, including all of SQLOS, but unfortunately there are bodies of code for which this isn’t true.

Next up is the Windows scheduler:

Hardware interrupts, which run in kernel mode and return to user mode quickly, should be nothing more than tiny hiccups in a running thread’s quantum. The other 90% of the interrupt iceberg manifests in kernel mode as Deferred Procedure Calls (DPCs, or “bottom halves” to the Linux crowd) but should still only steal small change in terms of CPU cycles. Context switches to another thread represent a completely different story, because it could be ages before control returns to our thread, meaning that our fiber scheduler is completely out of commission for a while.

This possibility – a SQLOS scheduler losing the CPU for an extended period – is just one of those things we need to live with, but on a sane server, it shouldn’t be something to be too concerned about. Consider that this happens all the time in virtualised environments, where our vCPU can essentially cease to exist while another VM has a ride on the physical CPU.

These are fairly long reads, but we’re getting to levels where you can see these settings in the Database Engine (like Lightweight Pooling).


Database Snapshot Creation History

Paul Randal shows how to read the master transaction log to find when database snapshots were created:

Earlier today someone asked on the #sqlhelp Twitter alias if there is a history of database snapshot creation anywhere, apart from scouring the error logs.

There isn’t, unfortunately, but you can dig around the transaction log of the master database to find some information.

When a database snapshot is created, a bunch of entries are made in the system tables in master and they are all logged, under a transaction named DBMgr::CreateSnapshotDatabase. So that’s where we can begin looking.

Click through for the script and some explanation around it.


Phantom Reads

Wayne Sheffield discusses phantom reads:

Run both code scripts again (Code Script 1 first, and Code Script 2 within 10 seconds). This time, you will see that Code Script 2 completes immediately without being blocked, and when Code Script 1 finishes, it has spawned additional data in its result set. A phantom read in action.

If you want to hide from phantom reads completely, then you’ll need to use either the serializable or snapshot transaction isolation levels. Both of these have the same concurrency effects: No dirty reads, non-repeatable reads, or phantom reads. The difference is in how they are implemented: the serializable transaction isolation level will block all other transactions affecting this data, while the snapshot isolation level utilizes row versions to create connection-specific versions of the table for the transaction – all of these row versions will cause increased activity in the tempdb database. Let’s take a look at how the snapshot isolation level will eradicate the phantom reads. First off, we need to modify the database to accept this isolation level.

After reading Wayne’s post, if you want a more academic (i.e., less fun) read, you can also go back to the Microsoft Research isolation levels paper, which describes most of the isolation levels we have in place today for SQL Server.


Page And Key WaitResources For Deadlocks

Kendra Little explains page and key information in deadlock graphs and blocking chains:

1.4) Can I see the data on the page that was locked?

Well, yes. But … do you really need to?

This is slow even on small tables. But it’s kinda fun, so… since you read this far… let’s talk about %%physloc%%!

%%physloc%% is an undocumented piece of magic that will return the physical record locator for every row. You can use %%physloc%% with sys.fn_PhysLocFormatter in SQL Server 2008 and higher.
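For a sense of what the formatter is doing, here is a small sketch that decodes a locator value by hand. It assumes the commonly reported layout of the 8-byte value (4-byte page ID, 2-byte file ID, 2-byte slot number, all little-endian). Since %%physloc%% is undocumented, treat that layout as an assumption rather than a guarantee. The function name `format_physloc` is invented for this example.

```python
import struct


def format_physloc(physloc: bytes) -> str:
    """Decode an 8-byte locator into (file:page:slot) text.

    Assumed layout (undocumented): page ID (4 bytes), file ID (2 bytes),
    slot (2 bytes), each stored little-endian.
    """
    if len(physloc) != 8:
        raise ValueError("expected exactly 8 bytes")
    page_id, file_id, slot_id = struct.unpack("<IHH", physloc)
    return f"({file_id}:{page_id}:{slot_id})"


if __name__ == "__main__":
    # Illustrative value only, not taken from Kendra's post.
    print(format_physloc(bytes.fromhex("2D00000001000000")))  # (1:45:0)
```

In practice sys.fn_PhysLocFormatter does this inside SQL Server, so the sketch is only useful for reading a raw value that has already been copied out of a result set.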

This was a very interesting read; check it out.
