Press "Enter" to skip to content

Category: Internals

Understanding RID Lookups

Hugo Kornelis takes us through an operator I usually don’t want to see:

The RID Lookup operator offers the same logical functionality within the execution plan as the Key Lookup operator. But where Key Lookup is used for tables that have a clustered index, RID Lookup is instead used when a table is “heap” (table without clustered index). It is used when another operator (usually an Index Seek, sometimes an Index Scan, rarely a combination of two or more of these or other operators) is used to find rows that need to be processed, but the index used does not include all columns needed for the query. The RID Lookup operator is then used to fetch the remaining columns from the heap structure where the table data is stored.

Click through for a great deal of information about RID Lookups.

Comments closed

Internal Storage of Numeric Values

Randolph West continues a series on how SQL Server stores values:

As we know from before, integers are whole numbers, or numbers with no fractions (i.e. no decimal places). This is going to be in the test later, so pay attention. In other words, the numbers 0 through 9 are integers, but a floating point or decimal / numeric value is not an integer. As soon as you add decimal places, it stops being an integer even if the fraction equates to zero.

Inside the storage engine, integers are mostly signed values (they can have negative values), and each integer data type has a fixed size. The exception is TINYINT which only has positive values. Like many other data types, integer types are stored byte-reversed (known as little-endian).

Click through for some good information from Randolph.

Comments closed

Thread Pool Exhaustion and Availability Groups

Sean Gallardy lays down the gauntlet:

You’re probably wondering why you couldn’t spawn a new thread, why this error happened, why you shouldn’t just up the max worker threads, and probably also thinking this must be some kind of “bug” in SQL Server. So here’s where our awkward conversation starts… It’s you. Contrary to every relationship breakup you’ve ever had, it’s definitely you. I’m not saying this to be mean but to really drive the point home here. The major reasons for this occurring are large blocking chains, too much workload for the server size (databases, users, etc.), and/or your virtual infrastructure sucks. There aren’t too many reasons for getting yourself into this situation, and while what I’ll be putting forth here isn’t exhaustive of all edge cases and scenarios, these are by far the majority of all the items in the wild that I’ve either worked on or have been involved in at some level. Side Note: If you’ve read this far, are shaking your head, calling me names that an irate sailor might utter, and telling yourself that upping the max worker threads as the product error suggests and Microsoft should fix their bugs then you can stop reading here as you’re probably not open to learning why you have issues in your environments.

One more scenario I’ve seen is mirroring thousands of databases on a single instance. That scenario fit none of Sean’s criteria—there was very little blocking, most of the databases were small and infrequently-used, and the infrastructure was the right size. It was just a huge number of databases and each database requiring a minimum of X worker threads. Mind you, it was still a bad idea…

Comments closed

Understanding Memory Grants

Taiob Ali walks us through the concept of memory grants:

DesiredMemory: Memory estimated to fit intermediate results in KB for the chosen degree of parallelism. If the query runs in serial mode, this is the same as SerialDesiredMemory.

(Amount needed to store all temporary rows in memory. This depends on the cardinality estimate, expected number rows and average size of row). This is called additional because a query can survive lack of such memory by storing part of temporary rows on hard disk. A query is not guaranteed to have the full amount if the total exceeds the preset limit.)

Read on for explanations of each of the elements in MemoryGrantInfo.

Comments closed

Change Tracking and Internal Tables

Tim Weigel continues a series on change tracking:

In my last post, I showed you how to configure change tracking at the table level and how to get configuration information about change tracking from the database engine. We looked at sys.change_tracking_databases and sys.change_tracking_tables, and looked at some sample scripts that present the information in a more readable format.

Before moving on to working with change tracking, I’d like to show you a little bit about how SQL Server handles change tracking data under the hood. Let’s take a few minutes to talk about sys.internal_tablessys.dm_tran_commit_table, and sys.syscommittab. These aren’t objects that most DBAs interact with on a routine basis, but they’re useful for understanding how change tracking does what it does.

Click through to learn more about these internal tables.

Comments closed

Date and Time Storage in SQL Server

Randolph West covers the internals of how date and time data types are stored in SQL Server:

DATE is the byte-reversed number of days since the year 0001-01-01, stored as three bytes. It goes up to 9999-12-31, which is stored as 0xDAB937. You can check this value by reversing the bytes and sticking them into a hex calculator. 37 B9 DA equals 3,652,058, which is the number of days since 0001-01-01.

If you try to cast 0xDBB937 as a DATE value (by incrementing the least significant bit DA by 1), it will throw a conversion error. There is obviously some overflow detection that protects against corruption in a date type.

Randolph looks at DATE, TIME, DATETIME(2), and DATETIME and explains how each is storedon a page.

Comments closed

Understanding Key Lookups

Hugo Kornelis continues a series on SQL Server plan operators:

The Key Lookup operator provides a subset of the functionality of the Clustered Index Seek operator, but within a specific context. It is used when another operator (usually an Index Seek, sometimes an Index Scan, rarely a combination of two or more of these or other operators) is used to find rows that need to be processed, but the index used does not include all columns needed for the query. The Key Lookup operator is then used to fetch the remaining columns from the clustered index.

A Key Lookup operator will always be found on the inner input of a Nested Loops operator. It will be executed once for each row found. Since the key values passed in always come from another index, the requested row will always exist (except in rare race scenarios when read uncommitted isolation level is used).

Click through for a great deal of information on key lookups.

Comments closed

Extended Events and Query Store

Jason Brimhall takes us through some of the internals of Query Store as exposed by Extended Events:

One of my favorite questions to ask during some of presentations on XE is “What was the first version of SQL Server to have Query Store?” You can imagine the wide array of answers but what is interesting is how often  the correct answer is always missed. I hear lots of answers for 2012, some answers for 2017 and somewhere in between for 2016. But never does the correct answer pop up.

Right now, I hope you are scratching your head at that last statement. You see, the question is somewhat of a trick question. The first version of SQL Server that has QDS is SQL Server 2014. However, the first version where you can actually use it is SQL Server 2016. This fun fact is visible when we start exploring SQL Server from the realm of XE. Let’s take a look.

Read the whole thing.

Comments closed

When the Optimizer Can Use Batch Mode on Row Store

Erik Darling looks at some internals for us:

Things like Accelerated Database RecoveryOptimize For Sequential Key, and In-Memory Tempdb Metadata are cool, but they’re server tuning. I love’em, but they’re more helpful for tuning an entire workload than a specific query.

The thing with BMOR is that it’s not just one thing. Getting Batch Mode also allows Adaptive Joins and Memory Grant Feedback to kick in.

But they’re all separate heuristics.

Read on to see the extended events around batch mode to help you determine if it’s possible for the optimizer to use it for a given query.

Comments closed

Finding the Physical Location of a Row

Max Vernon breaks out the internals toolbag:

Occasionally I’ve needed to determine the physical location of a row stored in SQL Server. The code in this post uses the undocumented feature, %%PHYSLOC%%, which returns a binary representation in hexadecimal of the location of each row returned in a SELECT statement. The system table valued function, fn_PhysLocCracker, is used to decode the binary value returned by %%PHYSLOC%% to provide the file_idpage_id, and slot_id for each row.

Read on for a demo. Unlike most demos of this sort, Max is using a partitioned table, so that’s something new.

Comments closed