SOS_RWLock

Ewald Cress looks at SOS_RWLock, a reader-writer lock (at least the pre-2016 version):

This lock class can best be appreciated by comparing it to a mutex. Like the mutex, a reader-writer lock can only be acquired in exclusive mode by one requestor at a time, but instead of only exposing this exclusive-acquire (Writer) option, it alternatively allows acquisition in shared (Reader) mode. This stuff is completely natural to us database folks of course, because the semantics is a subset of the behaviours we get from familiar database locks.

Basic rules of the road:

  • Any number of simultaneous “clients” can share ownership of the lock in Read mode.

  • Readers block writers.

  • Writers block readers and other writers.

  • Blocking means that the requesting worker gets suspended (scheduled off the processor) and accrues a wait of a type specified in the lock acquisition request.

There’s a huge amount of detail here, and I for one am glad that there isn’t a quiz later..

SOS_Mutex

Ewald Cress continues his dive into system internals, this time looking at SOS_Mutex:

Put differently, we can build a mutex from an auto-reset EventInternal by tacking on an owner attribute, making a rule that only the owner has the right to signal the event, and adding assignment of ownership as a fringe benefit of a successful wait. A nonsignalled event means an acquired mutex, and a signalled event means that the next acquisition attempt will succeed without waiting, since nobody currently owns the mutex. The end result is that our SOS_Mutex class exposes the underlying event’sSignal() method and its own take on Wait(). From the viewpoint of the mutex consumer, the result of a successful wait is that it owns the mutex, and it should act honourably by calling Signal() as soon as it is done using the resource that the mutex stands guard over.

There’s some deep detail here, so this is definitely one of those “after your first cup of coffee” posts to read.

EventInternal

Ewald Cress continues to dig into scheduling, this time looking at EventInternal:

signalMode adds a twist. The behaviour described for the traffic light corresponds to a signal mode of 0, also known as a manual reset event. Here the event stays signalled irrespective of how many consumers pass through it (=successfully wait on it).

A signal mode of 1, however, turns it into an auto-reset event, where the act of successfully waiting on the event resets it to unsignalled. This is now more akin to a turnstile that only lets one person through after being signalled, e.g. by a scan of a valid transport pass or a button press by a security guard.

Interestingly, a event object is also sometimes known as a latch – that’s something to chew on for SQL Server folks. Don’t get hung up about who or what signals it; that is a separate issue altogether. Just keep in mind that the signal mode is a permanent attribute of the event – you construct it as manual reset or auto-reset. Full disclosure: there seems to be at least one more SignalMode (2, used by the related SOS_WaitableAddress), but let’s ignore it today.

This is part of a great series, and I hope Ewald keeps it up.  I’d probably drop a few bucks on a cleaned up and edited version of his discussion of internals in an 80-page or so e-book.

SystemThread

Ewald Cress looks at SystemThread:

SystemThread, a class within sqldk.dll, can be considered to be at the root of SQLOS’s scheduling capabilities. While it doesn’t expose much obviously exciting functionality, it encapsulates a lot of the state that is necessary to give a thread a sense of self in SQLOS, serving as the beacon for any code to find its way to an associated SQLOS scheduler etc. I won’t go into much of the SQLOS object hierarchy here, but suffice it to say that everything becomes derivable by knowing one’s SystemThread. As such, this class jumps the gap between a Windows thread and the object-oriented SQLOS.

The simplest way to conceptualise a thread in SQL Server is to think of a spid or connection busy executing a simple query, old skool sysprocesses style. It’s not hip, but it’s close enough to the truth to be useful. This conflates a few things that are separate entities, but it is a good starting point for teasing things apart as needed

This is another dive into internals; prepare your thinking caps.

Thread-Local Storage

Ewald Cress digs into fundamentals:

As a teaser for where this is heading, I’ll reframe the problem as classic SQL Server examples. Firstly, when a latch wait occurs somewhere in the bowels of a LatchBase subclass instance, how does that latch method know to track the wait against an instance of a Worker, or make it known to the world that it is holding up that Worker? And secondly, at a much higher abstraction level, when a task executes a user query and needs to access a table, how does the access methods code know what security principal to do security checks against? We are taking the first steps towards answering these questions here.

I enjoy Ewald’s explanations because when I’m done, I really feel like I have a clue of what’s going on.  It all fades away as soon as I look away from the screen, but that’s on me, not him.

Linked Lists

Ewald Cress digs into linked lists to explain (deep) SQLOS internals:

The memory layout of a linked list doesn’t imply specific usage semantics. If we consistently insert at the head and remove from the tail, we have a queue. If we both insert and remove items from the head, we have a stack. And it is possible to have variations of these as well.

Finally, it is clear that insert and remove operations are multi-step, and the list is in an inconsistent state – i.e. not safe to traverse or modify – in the middle of such an operation. For this reason, locking semantics must be implemented. This will typically take the form of a spinlock which must be aquired before trying to access the list for any purpose. The object which owns the list head will then normally have a spinlock as a data member associated with the list head, although it is possible to have one spinlock protect multiple items beyond just a single linked list; this could be a sign of sane design, but conversely it means a coarser locking grain, which can sometimes work against you.

Even at this “simple” level, we’re digging pretty deep here.

Analyze Transaction Log Hierarchy

Paul Randal has written code to determine how much log an operation has generated:

Over the weekend there was a discussion on the MVP distribution list about the sys.dm_tran_database_transactions DMV and how one cannot use it to accurately determine how much log an operation has generated because it doesn’t provide a roll-up of the sub-transaction metrics to the outer transaction. This makes the output somewhat non-intuitive.

The discussion prompted me to write some code I’ve been meaning to do since 2012, when SQL Server 2012 introduced a field in LOP_BEGIN_XACT log records that tracks the transaction ID of the parent transaction, allowing the hierarchy of transactions to be investigated.

The actual code is at the bottom of the article, and is available in a zip file here.

It provides two stored procs, sp_SQLskillsAnalyzeLog and sp_SQLskillsAnalyzeLogInner, with the former making use of the latter, and the latter calling itself recursively.

I’d consider this squarely in the advanced troubleshooting realm.  Definitely read the whole thing and Paul’s code.

Updated Scheduling

The CSS SQL Server Engineers team talks about a new scheduling algorithm in SQL Server 2016:

SQL Server 2016 gets a scalability boost from scheduling updates.   Testing uncovered issues with the percentile scheduling based algorithms in SQL Server 2012 and 2014.  A large, CPU quantum worker and a short, CPU quantum worker can receive unbalanced access to the scheduling resources.

Take the following example.  Worker 1 is a large, read query using read ahead and in-memory database pages and Worker 2 is doing shorter activities.   Worker 1 finds information already in buffer pool and does not have to yield for I/O operations.    Worker 1 can consume its full CPU quantum. 

On the other hand, Worker 2 is performing operations that require it to yield.  For discussion let’s say Worker 2 yields at 1/20th of its CPU, quantum target.  Taking resource governance and other activities out of the picture the scheduling pattern looks like the following.

I’m going to have to reserve judgment on this.  It’s been in Azure SQL Database for a while, so I’m not expecting bugs, but I wonder if there are any edge cases in which performance gets worse as a result of this.

Blocking Operators

Gail Shaw explains how operators like sort can reduce the actual row count:

A non-blocking operator is one that consumes and produces rows at the same time. Nested loop joins are non-blocking operators.

A blocking operator is one that requires that all rows from the input have been consumed before a single row can be produced. Sorts are blocking operators.

Some operators can be somewhere between the two, requiring a group of rows to be consumed before an output row can be produced. Stream aggregates are an example here.

Gail ends by explaining that this is why “Which way do you read an execution plan, right-to-left or left-to-right?” has a correct answer:  both ways.  This understanding of blocking, non-blocking, and partially-blocking operators will also help you optimize Integration Services data flows by making you think in terms of streams.

Finding A Page

Kenneth Fisher is looking for the database page tied to a particular row:

It’s not one of those things you have to do frequently but every now and again you need to know what page a particular row is on. It’s not terribly difficult. There is a virtual column called %%physloc%%, but unfortunately it’s not terribly useful on it’s own.

Finding a record’s page probably isn’t something you want to do every day; this is more one of those “once in a great while” activities which can help with troubleshooting.

Categories

May 2017
MTWTFSS
« Apr  
1234567
891011121314
15161718192021
22232425262728
293031