Press "Enter" to skip to content

Hypervisor-Driven Wait Stats

Paul Randal explains that delays in the hypervisor layer could be responsible for SOS_SCHEDULER_YIELD waits in SQL Server:

Specifically, I was concerned about SOS_SCHEDULER_YIELD waits. This is a special wait type that occurs when a thread is able to run for 4ms of CPU time (called the thread quantum) without needing to get suspended waiting for an unavailable resource. In a nutshell, a thread must call into the SQLOS layer every so often to see whether it has exhausted its thread quantum, and if so it must voluntarily yield the processor. When that happens, a context switch occurs, and so a wait type must be registered: SOS_SCHEDULER_YIELD. A deeper explanation of this wait type is in my waits library here.

My theory was this: if a VM is prevented from running for a few milliseconds or more, that could mean that a thread that’s executing might exhaust its thread quantum without actually getting 4ms of CPU time, and so yield the processor causing an SOS_SCHEDULER_YIELD wait to be registered. If this happened a lot, it could produce a set of wait statistics for a virtualized workload that appears to have lots of SOS_SCHEDULER_YIELDs, when in fact it’s actually a VM performance problem and the SOS_SCHEDULER_YIELD waits are really ‘fake’.

Read on for more details, and definitely check out the link.  It was an eye-opener when I learned that SOS_SCHEDULER_YIELD didn’t mean “need more/more powerful CPUs.”