Press "Enter" to skip to content

Category: Internals

Processing Data: Aggregate and Join Operators

Erik Darling continues a series on understanding plains. First up is aggregate operators:

Aggregates can be useful for all sorts of things in a query plan, and can show up in many different forms.

It would be tough to cover all of them in a single post, but what I’d like to do is help all you nice folks out there understand some of their finer points.

Then we have join operators:

Anyone who tells you there are only three types of joins in SQL Server isn’t your friend.

Okay, maybe that’s harsh. Maybe they’re just getting you prepared for the bonne promenade through all the many different faces a join can wear in your query plans.

Maybe they have a great explanation for Grace Hash Joins in their back pocket that they’re waiting to throw in your face like a bunch of glitter.

Do read both.

Comments closed

Tempdb Improvements over Time

Melody Zacharias takes us through the history of tempdb improvements:

In my previous blog and this one, there is a lot of talk of trace flags and you can see the life cycle of them.  They are implemented and then sometimes become part of the product over time. For a long time, the trace flag 1118 was a common performance improvement trick known only by industry experts. Over the years as a consultant, I have often been asked by clients if they should use trace flags, and generally speaking, as long as they are documented by Microsoft they are safe to use.  I would certainly not recommend using undocumented trace flags.  They are not supported and therefore not recommended.  So always be sure to check the trace flag list before setting trace flags.  So of course, as I say that, I have to offer another one.  

Read on for more details and also advice on getting the most out of tempdb.

Comments closed

Capturing a Call Stack without a Dump

Bob Dorr shows off Arvind’s SQL Call Stack Resolver:

Some outputs, such as the XEvent call stack action output the raw stack frame information and require a rebase to loaded module information in order to symbolize.   The security feature for random address virtualization loads images at different addresses each time the image is loaded.   This requires the module base address and the raw address in order to calculate the relative virtual address for symbolization.

Click through for more information.

Comments closed

Reader/Writer Synchronization in SQL Server

Bob Dorr gets synchronized:

This post is not about a specific SQL Server object but instead outlines a technique used in various locations to reduce contention while still providing thread synchronization.  There are hundreds of locations throughout the SQL Server code base that must account for multi-threaded access.   A common technique used in multi-threaded coding is a reader, writer lock.

The idea behind a reader, writer synchronization object is to allow reader parallelization in conjunction with writer synchronization.  Let’s look at a simple pattern of a single path synchronization object.  (Example: spinlock)

Click through for a bit of pseudo-code and explanation.

Comments closed

Learning about Locks in SQL Server

Bob Dorr migrates over a couple of posts on locking. The first one describes lock enumeration:

SQL Server has a lock iterator class which is used by Lock Monitor (deadlock detection), DMVs such as dm_tran_locks and other workers.   The iterator is designed to remain lock free whenever possible to avoid contention with active queries.  To accomplish this the iterator uses a marker (bookmark) approach and the lock type of NL.  The NL lock does not block and provides a bookmark (reference count and location) supporting the iterator.

The other post covers lock partitioning:

Shared access blocks exclusive acquire requests. The shared access needs only to acquire on the worker’s local partition. An attempt to acquire exclusive access is blocked by the shared access holders. Conversely to acquire exclusive access the worker must acquire the exclusive access of every partition. The exclusive acquisition path is longer than the shared path because N partitions must be acquired to achieve the exclusive access. Increasing the number of partitions increases the work required to acquire exclusive access. This means that only certain protection paths should use the partitioning approach.

Both get pretty deep into internals.

1 Comment

SQL Server Scheduler Affinity

Bob Dorr re-posts an oldie but goodie:

SQL Server uses 3 types of affinity to control where the SQL Server worker threads execute.  Before explaining the different scheduler affinity types let me clarify some terminology.

Click through to see what these three types of affinity are, as well as some of the terms you’ll need to know in order to understand affinity types.

Comments closed

The Transaction Log Architecture

Paul Randal continues a series on the transaction log:

In the first part of this series, I introduced basic terminology around logging, so I recommend you read that before continuing with this post. Everything else I’ll cover in the series requires knowing some of the architecture of the transaction log, so that’s what I’m going to discuss this time. Even if you’re not going to follow the series, some of the concepts I’m going to explain below are worth knowing for everyday tasks DBAs handle in production.

Read on to learn more about some key transaction log terminology.

Comments closed

Understanding the Transaction Log

Paul Randal has a new series:

With this post, I’m starting an occasional series on the transaction log and how it works and should be managed, and I’ll touch on all the problems above over its course. In this post, I’ll explain what logging is and why it’s required.

Basic Terminology Around Logging

When I’m talking about any mechanism in SQL Server, I find there’s a chicken-and-egg problem where I need to use a word or phrase before I’ve explained it. To avoid that problem in this series, I’m going to start by explaining some terminology that needs to be used when discussing logging, and I’ll expand on many of these terms as the series progresses.

This post starts off with some of the basics and it’s always good to get the occasional refresher on the basics.

Comments closed

Understanding SQL Server I/O Size

Anthony Nocentino dives into a topic:

Master File Table (MFT) is the data structure that describes files and directories on NTFS. In Figure 1, you can see an MTF record has several sections describing the metadata about the file and pointers to blocks that make up the file. A block, also referred to as a cluster in Windows, is an abstraction over one or more physical structures (sectors or pages depending on the media) presented by the underlying disk. A block/cluster is also the atomic allocation unit from a file system and has a configurable size. On NTFS, this is referred to as the NTFS Allocation Unit Size and is a configurable attribute of the file system. By default, it is 4KB and can be as large as 2MB. Since a block is a unit of allocation, if a file is between 1 byte and the file system’s allocation unit size, it will take up exactly one block/cluster on the file system. As the file grows, more blocks/clusters are allocated to represent the file. The MFT data structure tracks which blocks make up a file. The block allocator of the file system will try to ensure blocks are physically adjacent on the disk and groups them together in runs.

A 4KB NTFS Allocation Units size is considered best practice on general-purpose file systems. And 64KB is considered best practice for SQL Server…but why? Let’s keep digging…

I was sitting at a lunch table with Anthony when it all clicked and that was fun to see.

Comments closed