OS Threads DMV

Ewald Cress moves up the internals stack a little further and looks at a DMV:

Broadly speaking, a DMV presents just another iterator that can be plugged into a query plan. The execution engine calls GetRow() repeatedly until it reaches the end, and the iterator emits rows. The only unusual thing is that the ultimate source of the data may have nothing to do with the storage engine.

Now if you asked me to guess where in the world we’d find a list of all threads to iterate over, I would have expected that we’d start with the NodeManager, iterating over all SOS_Nodes, and then for each of them iterating over its collection of associated SystemThreads. After all, we have a guaranteed 1:1 correspondence between threads and SystemThreads, and I figured that all SystemThreads enlist themselves into a parent SOS_Node upon creation. No-brainer, right?

Turns out that this guess would have been completely wrong, and the reason it would have been a broken implementation will become apparent when we look at the started_by_sqlservr column.

Definitely read the whole thing, but you may need to top off your caffeinated beverage of choice first.  Also, I’m pretty sure Ewald promised to hammer one out once per week; that’s how I read it, at least…

Context Switches In SQL Server

Ewald Cress continues his journey to the center of the SQLOS:

The SQLOS scheduler exists in the cracks between user tasks. As we’re well aware, in order for scheduling to happen at all, it is necessary for tasks to run scheduler-friendly code every now and again. In practice this means either calling methods which have the side effect of checking your quantum mileage and yielding if needed, or explicitly yielding yourself when the guilt gets too much.

Now from the viewpoint of the user task, the experience of yielding is no different than the experience of calling any long-running CPU-intensive function: You call a function and it eventually returns. The real difference is that the CPU burned between the call and its return was spent on one or more other threads, while the current thread went lifeless for a bit. But you don’t know that, because you were asleep at the time!

Definitely read the whole thing.

R Services Internals

Niels Berglund has an excellent series on R Services internals.  Here’s the latest post:

This post is the ninth post about Microsoft SQL Server R Services, and the eight post that drills down into the internal of how it works.

So far in this series we have been looking at what happens in SQL Server as well as the launchpad service when we execute sp_execute_external_script, and we have still no real “clue” to where the R engine comes into play.

Well, hopefully that will change (at least a little bit) with this post, as we here will look at what happens when we leave the launchpad service.

This series is like candy to me.  It’s the best write-up I’ve seen so far about what’s really happening when you run SQL Server R Services.

SQLOS Task Origins

Ewald Cress explains how SQLOS tasks come into being:

The above system tasks wear their purpose on their sleeves, because the function pointer in the SOS_Task::Param is, well, to the point. The tasks that run user queries are more abstract, because the I/O completion port listener can’t be bothered with understanding much beyond the rudiments of reading network packets – it certainly can’t be mucking about with fluent TDS skills, SQL parsing and compilation, or permission checks.

So what it does is to enqueue a task that speaks TDS well, pointing it to a bunch of bytes, and sending it on its way. Here is an example of such a dispatch, which shows the “input” side of the WorkDispatcher:

Read the whole thing.

Decomposing Power BI Desktop Files

Reza Rad wants to see exactly where the M scripts in a Power BI Desktop file are stored:

Talking about Power Query; DataMashup file is all you need. It includes everything from the structure of queries, tables, parameters, list, to the actual M scripts behind the scene. You can Fetch all of these information from this single file. Let’s look at the structure of this file. If you open this file with a text editor. you will see some binary things first (which are related to the zipped nature of this file), and also some XML information. Yes, this is a zipped file. Let’s start with unzipping it into a folder. I’ve done that with 7-zip application.

This is an interesting peek under the covers of a PBIX file.

Memory Grants Affecting Columnstore Load

Denzil Ribeiro explains how memory grant pressure can determine whether a columnstore bulk insert skips the deltastore or not:

We found that only at the beginning of the run, there was contention on memory grants (RESOURCE_SEMAPHORE waits), for a short period of time. After that and later into the process, we could see some latch contention on regular data pages, which we didn’t expect as each thread was supposed to insert into its own row group. You would also see this same data by querying sys.dm_exec_requests live, if you caught it within the first minute of execution, as displayed below.

This is useful in case you run into the issue, but also useful as a case study on effective troubleshooting.

Waits: External But Not Preemptive

Ewald Cress has some thoughts on external wait types:

Previously I dug into preemptive waits in SQLOS, and to be honest, I equated “preemptive” with “external”. For the most part the two go hand in hand after all.

To recap, a preemptive wait isn’t necessarily a wait at all. What happens is that a worker needs to run some code that can’t be trusted to play by cooperative scheduling rules. And rather than put the SQLOS scheduler (and all its sibling workers) at the mercy of that code, the worker detaches itself from the scheduler and cedes control to a sibling runnable worker.

Read the whole thing.

More On Adaptive Joins

Erik Darling has a couple more posts on adaptive joins in SQL Server 2017.  First, he wonders what happens when you add scalar functions to the mix:

See, this isn’t SARGable either (and no, SCHEMABINDING doesn’t change this). When a predicate isn’t SARGable, you take away an index seek as an access choice. You don’t see too many Nested Loops with an index scan on the other end, do you?


So there you go. It’s not the function itself that bops our Adaptive Join on the head, but the lack of SARGability.

Erik also looks at using APPLY and EXISTS to make sure you can use adaptive joins with that dynamic duo:

I do have to point out that Cross Apply used to only be implemented as a Nested Loops Join. I learned that many years ago from one of the best articles written about Cross Apply by Paul White. That changed recently — it’s possible to see it implemented with a Hash Join in at least 2016. I’ve seen it crop up in Cross Apply queries without a TOP operator.

The latter results are a bit surprising.

Disabling Nested Loop Join Optimization

Dmitry Pilugin explains the differences between trace flag 2340 and the DISABLE_OPTIMIZED_NESTED_LOOP query hint:

This optimization provides a great boost with a sufficient number of rows. You can read more about its test results in the blog OPTIMIZED Nested Loops Joins, created by Craig Freedman, an optimizer developer.

However, if the actual number of rows is less than the expected one, then CPU additional costs to build this sort may hide its benefits, increase CPU consumption and reduce its performance.

Read the whole thing.  I think the likelihood of using either this hint or the trace flag is near nil, but crazy things do come up.

Adaptive Join Internals

Dmitry Pilugin digs deep into how adaptive joins work in SQL Server 2017:

We have three types of physical join algorithms in SQL Server: hash, nested loops and merge. Adaptive join allows SQL Server automatically choose an actual physical algorithm on the fly between the first two – hash (HM) and nested loops (NL).

NL has two join strategiesnaive nested loops join (inner loop scans the whole inner table or index) and index nested loops join (index on the join column of the inner table is used to find necessary rows and then those rows are applied to the outer row, also called Nested Loops Apply). Typically, the second one performs very well if you have rather small input on the outer side and indexed rather big input on the inner side.

HM is more universal and uses hash algorithms to match rows, so no indexes are necessary. You may refer to my blog post Hash Join Execution Internals for more details.

Adaptive Join starts execution as a Hash Join. It consumes all the input of the build phase and looks at the adaptive join threshold, if the number of rows is more or equal this threshold it will continue as a hash join. However, if the number of rows is less than this threshold, it will switch to a NL.

If you want to get a better understanding of how adaptive joins works, Dmitry’s post is a great start.


August 2017
« Jul