Press "Enter" to skip to content

Category: Internals

The Links That Tie Row To LOB

Steve Stedman shows how to use DBCC PAGE and DBCC IND to piece together where LOB data is stored for a particular row:

The question came up of how to find a link from blob storage that is corrupt back to the table and row that contain that data.

There is no link from the blob storage back to the table and row, but there is a link from the data page containing the table and row off to the blob data.
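
For reference, a minimal sketch of that approach looks something like this (the database name, table name, file ID, and page ID below are placeholders; take real values from the DBCC IND output):

  DBCC TRACEON(3604);  -- route DBCC output to the client session
  -- list every page allocated to the table, including LOB_DATA pages
  DBCC IND('CorruptionDemo', 'dbo.BlobTest', -1);
  -- dump one of the IN_ROW_DATA pages with per-row detail; the row's
  -- LOB pointer shows which file:page holds the blob root
  DBCC PAGE('CorruptionDemo', 1, 312, 3);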

Read the whole thing.

SQL Server Internal Row Structures

David Fowler gets to the guts of a row as stored in SQL Server:

DBCC PAGE will take in a database name or ID, a file ID, and a page ID, and return a representation of the specified page depending on the print options that you choose.

We’ve got four different print options that we can choose from:

0 – Return only the page header
1 – Return the page header and hex dump of each row
2 – Return the page header and full page hex dump
3 – Return the page header, hex dump of each row as well as the details on each column
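
As a quick illustration (the database name and page ID are placeholders), comparing print options 0 and 3 shows the difference:

  DBCC TRACEON(3604);  -- without this, DBCC PAGE writes to the error log
  DBCC PAGE('SomeDatabase', 1, 256, 0);  -- page header only
  DBCC PAGE('SomeDatabase', 1, 256, 3);  -- header, row hex dumps, and column details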

Read the whole thing.

SQL Server’s Referential Integrity Operator

Joe Obbish explains the purpose of the referential integrity operator in SQL Server 2016:

What would happen if a parent table was referenced by hundreds of child tables, such as for a date dimension table? Deleting or updating a row in the parent table would create a query plan with at least one join per incoming foreign key reference. Creating a query plan for that statement is equivalent to creating a query plan for a query containing hundreds or even thousands of joins. That query plan could take a long time to compile or could even time out. For example, I created a simple query with 2500 joins and it still hadn’t finished compiling after 15 minutes. That’s why I assume a table is limited to 253 incoming foreign key references in SQL Server 2014.

That restriction won’t be hit often but could be pretty inconvenient to work around. The referential integrity operator introduced with compatibility level 130 raises the limit from 253 to 10000. All of the joins are collapsed into a single operator which can reduce compile time and avoid errors.
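
As a rough sketch of the scenario (object names are hypothetical), you can generate enough incoming foreign key references to cross the old limit and then inspect the DELETE plan:

  CREATE TABLE dbo.Parent (ID INT NOT NULL PRIMARY KEY);

  -- create more child tables than the old 253-reference cap allowed
  DECLARE @i INT = 1, @sql NVARCHAR(MAX);
  WHILE @i <= 300
  BEGIN
      SET @sql = N'CREATE TABLE dbo.Child' + CAST(@i AS NVARCHAR(10))
               + N' (ParentID INT NOT NULL REFERENCES dbo.Parent (ID));';
      EXEC sys.sp_executesql @sql;
      SET @i += 1;
  END;

  -- under compatibility level 130+, the plan collapses the foreign key
  -- checks into a single referential integrity operator
  DELETE FROM dbo.Parent WHERE ID = 1;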

There’s some really good information in this post, and Joe has mixed feelings on the concept.

How Non-Clustered Index Key Columns Are Stored

Kendra Little walks through page-level details on a non-clustered index:

Just like in the root page and the intermediate pages, the FirstName and RowID columns are present.

Also in the leaf: CharCol, our included column appears! It was not in any of the other levels we inspected, because included columns only exist in the leaf of a nonclustered index.
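
If you want to poke at this yourself, here's a hedged sketch (the key and included column names come from the excerpt; the table and index names are placeholders, and the DMV is undocumented). Included columns should only appear when you dump a page whose page_level is 0:

  -- list the nonclustered index's pages by level (page_level 0 = leaf)
  SELECT allocated_page_file_id, allocated_page_page_id,
         page_level, page_type_desc
  FROM sys.dm_db_database_page_allocations(
           DB_ID(), OBJECT_ID('dbo.People'),
           INDEXPROPERTY(OBJECT_ID('dbo.People'), 'ix_FirstName', 'IndexID'),
           NULL, 'DETAILED');
  -- then DBCC PAGE a root page and a leaf page to compare:
  -- CharCol shows up only in the leaf-level rows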

Kendra does a great job of explaining the topic.

Base Versus Simple Containment

Joe Obbish takes a crack at explaining the difference between base containment and simple containment for cardinality estimation:

We know that the first query will return 500k rows and the second query will return 0 rows. However, can SQL Server know that? Each statistics object only contains information about its own column. There’s no correlation between the UNIQUE_ID and MOD_FILTER columns, so there isn’t a way for SQL Server to know that the queries will return different estimates. The query optimizer can create an estimate based on the filters on the WHERE clause and on the histograms of the join columns, but there’s no foolproof way to do that calculation. The presence of the filters introduces uncertainty into the estimate, even with statistics that perfectly describe the data for each column. The containment assumption is all about the modeling assumption that SQL Server has to make to resolve that uncertainty.
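
The query shape under discussion looks roughly like this sketch (the table names are placeholders; the column names come from the excerpt). The legacy cardinality estimator assumes simple containment while the new CE assumes base containment, so forcing the legacy CE is one way to compare the two estimates:

  SELECT COUNT(*)
  FROM dbo.T1
  INNER JOIN dbo.T2
      ON T1.UNIQUE_ID = T2.UNIQUE_ID
  WHERE T1.MOD_FILTER = 0
    AND T2.MOD_FILTER = 1
  OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));  -- simple containment
  -- run again without the hint to see the base containment estimate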

It’s an interesting post aimed at trying to get you to think like a simplified cardinality estimator. SQL Server doesn’t behave exactly like this, but it’s a good mental reference point.

R Internals: Data Sizes With Nullable Columns

Niels Berglund digs into the Binary Exchange Language (BXL) and notices something weird about data sizes:

When looking at the data sent, the size of the packets, and “drilling” into the TCP packets, we could deduce that:

  • Each column has an overhead of 32 bytes (at least for non-nullable data)

  • The size of the column in one row is the size of the data type for numeric types.

  • For decimal and numeric an extra byte is added to each column, where this byte indicates the precision.

  • Columns of alphanumeric type all had 2 bytes prepended to the data, except max types.

  • For char and nchar the storage size was 2 bytes plus the size the column was defined as.

  • For varchar and nvarchar the storage size was 2 bytes plus the size of the data stored.

  • For the varmax data types, the number of bytes prepended varied depending on the data size.
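
To reproduce the tracing, a script along these lines (a sketch; it assumes R Services is installed and Wireshark is capturing the loopback traffic) pushes a few differently-typed columns across the BXL channel:

  EXEC sp_execute_external_script
      @language = N'R',
      @script = N'OutputDataSet <- InputDataSet;',
      @input_data_1 = N'SELECT CAST(1 AS INT) AS int_col,
                               CAST(1.5 AS DECIMAL(5,2)) AS dec_col,
                               CAST(''abc'' AS VARCHAR(10)) AS varchar_col;'
  WITH RESULT SETS ((int_col INT, dec_col DECIMAL(5,2), varchar_col VARCHAR(10)));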

Read the whole thing.

Three Sessions And A Funeral

Solomon Rutzky explains what happens to sessions after they see the light at the end of the tunnel:

Sessions, in SQL Server, are born when a Connection is made from a client library to SQL Server. Temporary objects – Tables and/or Stored Procedures (yes, these are a thing) – may be created during a Session’s lifetime. The question is: for those temporary objects that are not explicitly dropped, what exactly happens to them? It is commonly known that they magically (ok fine, “automagically” — ok, ok, FINE, “automatically”) get dropped. But when do they get dropped? When the Session ends, right? And the Session ends when the Connection is closed, right? Well, that is certainly the common / conventional wisdom, at least. But is that understanding of the nature of Sessions and temporary objects correct?
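
A simple way to watch part of this yourself (a sketch): create a temp table and then check tempdb's catalog. The object sticks around as long as the session does, which is not necessarily the moment the client "closes" the connection, since pooled connections are reset for reuse rather than ended outright:

  CREATE TABLE #temp_demo (ID INT);

  -- the temp table exists under a uniquified name while the session lives
  SELECT name, create_date
  FROM tempdb.sys.objects
  WHERE name LIKE '#temp_demo%';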

It’s a more complicated topic than first appearances might suggest.

More R Services Internals Spelunking

Niels Berglund continues his series on R Services internals:

What happens is that straight after the AuthenticateConnection you will hit the WriteAsync breakpoint twice, the same way as we see in Code Snippet 4. The first hit at WriteAsync sort of makes sense, as it ties up with the call-stack we see in Code Snippet 5. But what about the second WriteAsync (the one that causes the second package to be sent), where does that come from? To try to figure that out, we start with the call-stack for that particular WriteAsync.

Execute the code in Code Snippet 1 again, and continue to the second WriteAsync. When you hit the breakpoint, do a kc again. The call-stack should now look somewhat like this (this time it is the full call-stack):

It’s interesting the kind of stuff you can find with Wireshark and a debugger.

When The Maximum Workspace Memory Isn’t The Internal Pool Maximum

Lonny Niederstadt answers the call from someone who needs the combination of Perfmon and DMV data:

When is a maximum not really the maximum?
When it’s a maximum for an explicitly or implicitly modified default.
Whether “the definitive documentation” says so or not.

Yesterday on Twitter #sqlhelp this question came up.

Aha! I thought to myself. For this I am purposed! To show how Perfmon and DMV data tie out!
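
The two data sources being tied together look roughly like this (a sketch; the counter lives under the Memory Manager object):

  -- perfmon's view of workspace memory, exposed through a DMV
  SELECT counter_name, cntr_value AS kb
  FROM sys.dm_os_performance_counters
  WHERE counter_name = 'Maximum Workspace Memory (KB)';

  -- the resource semaphore view of query workspace (grant) memory per pool
  SELECT pool_id, resource_semaphore_id, target_memory_kb, total_memory_kb
  FROM sys.dm_exec_query_resource_semaphores;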

Read on for the simple form of the answer, followed by the complication which makes life interesting.

Unboxing ISPACs

It’s an early Christmas for Richie Lee:

The first file that we’re going to look at is the [Content_Types].xml file, and this is the file that confirms that the ZipPackage class is used. There’s an article here that is ten years old but is still valid (scroll down to the “System.IO.Packaging Includes Zip Support” section to read up on this). This is because we know that the content_types file is part of the output when using the ZipPackage class to zip up a bunch of files into a .zip. The content_types file contains both the extension and content type of the three other files that are included in the ispac:

  • dtsx
  • params
  • manifest

Note that the content_types file does not specify the files, either in quantity or in content, other than the fact that they will contain xml.
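
If you want to inspect that file from T-SQL, here's a sketch (the path is hypothetical, and it assumes you've already renamed the .ispac to .zip and extracted it) that lists each declared extension and content type:

  WITH XMLNAMESPACES (DEFAULT 'http://schemas.openxmlformats.org/package/2006/content-types')
  SELECT t.c.value('@Extension', 'varchar(50)')    AS extension,
         t.c.value('@ContentType', 'varchar(200)') AS content_type
  FROM (SELECT CAST(BulkColumn AS XML) AS ct
        FROM OPENROWSET(BULK 'C:\temp\ispac\[Content_Types].xml', SINGLE_BLOB) AS x) AS f
  CROSS APPLY f.ct.nodes('/Types/Default') AS t(c);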

Read on for a good amount of detail on what’s included in an Integration Services package.
