Category: Internals

sys.dm_db_database_page_allocations is an undocumented SQL Server T-SQL Dynamic Management Function. This DMF provides details about allocated pages, allocation units, and allocation extents.

Read on for additional details. This is an undocumented function, so it might change between versions but it will give you an idea of how it works under the covers.

Comments closed

Diving Into Clustered Index Seeks

Published 2019-07-15 by Kevin Feasel

Hugo Kornelis looks in detail at the clustered index seek:

The Clustered Index Seek operator uses the structure of a clustered index to efficiently find either single rows (singleton seek) or specific subsets of rows (range seek). Because a clustered index always contains all columns in a table, a Clustered Index Seek is one of the most efficient ways for SQL Server to find single rows or small ranges, provided there is a filter that can be used efficiently.
The behavior of the Clustered Index Seek operator is in fact exactly the same as the behavior of the Index Seek operator, with only a very few differences as noted below. Though these two operators do have different names, not only in the graphical execution plan but also in the underlying XML, I suspect that in reality they are both using the same internal logic and not a copy of it.

Read the whole thing.

Comments closed

Deep Dive on Index Seeks

Published 2019-07-09 by Kevin Feasel

Hugo Kornelis gives us a great deal of information on index seeks in SQL Server:

Every Seek Keys specification can be either for a “singleton seek”, or for a “range seek”. A singleton seek applies when at most a single row can satisfy the requirement of the Seek Keys specification. A range seek means that (potentially) more than a single row can qualify.
For a singleton seek, the index structure is used to find the row that matches the specified condition. If it exists, it is returned and then the operator immediately continues to the next Seek Keys specification. If it doesn’t, then nothing is returned and the operator continues to the next Seek Keys specification.

Read the whole thing and pair it with index scans if you haven’t read that already.

Comments closed

Unexpected Results with ANY Aggregate

Published 2019-07-03 by Kevin Feasel

Paul White points out a couple odd scenarios with the ANY aggregate in SQL Server:

The execution plan erroneously computes separate ANY aggregates for the c2 and c3 columns, ignoring nulls. Each aggregate independently returns the first non-null value it encounters, giving a result where the values for c2 and c3 come from different source rows. This is not what the original SQL query specification requested.
The same wrong result can be produced with or without the clustered index by adding an OPTION (HASH GROUP) hint to produce a plan with an Eager Hash Aggregate instead of a Stream Aggregate.

Click through for the scenarios. Paul has also reported the second scenario as a bug.

Comments closed

SQL Server 2019 CTP3 T-Log Writers Increased

Published 2019-06-25 by Kevin Feasel

Lonny Niederstadt observes a change in SQL Server 2019 CTP 3.0:

In SQL Server 2016, transaction log writing was enhanced to support multiple transaction log writers. If the instance had more than one non-DAC node in [sys].[dm_os_nodes], there would be one transaction log writer per node, to a maximum of 4.
In SQL Server 2019, it seems the maximum number of transaction log writers has been increased. The system below with 4 vNUMA nodes (and autosoftNUMA disabled) has eight transaction log writer sessions, each on their own hidden online scheduler, all on parent_node_id = 3/memory_node_id = 3 on processor group 1.

Click through for the proof.

Comments closed

Diving Into Columnstore Index Scans

Published 2019-06-11 by Kevin Feasel

Hugo Kornelis continues a series of posts on index scans:

The Columnstore Index Scan is not really an actual operator. You can encounter it in graphical execution plans in SSMS (and other tools), but if you look at the underlying XML of the execution plan, you will see that it is either an Index Scan or a Clustered Index Scan operator.
SQL Server currently supports three types of index storage: rowstore, columnstore, and memory-optimized. Indexes of each of those types can be the target of an Index Scan or Clustered Index Scan, as indicated by the Storage property. When the Storage property is RowStore or MemoryOptimized, then the normal icon for (clustered) index scan is use, but when Storage is ColumnStore than SSMS (and other tools) choose to show a different icon instead.

Click through for more details.

Comments closed

Physical Operators: Apply and Nested Loops

Published 2019-06-10 by Kevin Feasel

Paul Whtie takes us through the Apply operator versus a classic nested loop join operator:

The optimizer’s output may contain both apply and nested loops join physical operations. Both are shown in execution plans as a Nested Loops Join operator, but they have different properties:
Apply
The Nested Loops Join operator has Outer References. These describe parameter values passed from the outer (upper) side of the join to operators on the inner (lower) side of the join. The value of the each parameter may change on each iteration of the loop. The join predicate is evaluated (given the current parameter values) by one or more operators on the inner side of the join. The join predicate is not evaluated at the join itself.
Join
The Nested Loops Join operator has a Predicate (unless it is a cross join). It does not have any Outer References. The join predicate is always evaluated at the join operator.

And to make things tricky, APPLY can generate either of these. Read the whole thing.

Comments closed

Flink’s Network Stack

Published 2019-06-07 by Kevin Feasel

Nico Kruber dives into the internals of Apache Flink’s network stack:

Flink’s network stack is one of the core components that make up the flink-runtime module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers which are using RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty.
This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning parameters, and common anti-patterns.

There’s a lot in here and it’s worth reading.

Comments closed

Diving Into Index Scans

Published 2019-06-04 by Kevin Feasel

Hugo Kornelis explains how index scans work in SQL Server:

The logic of the Index Scan operator itself is fairly simple, but the actual actions carried out can vary hugely depending on the type of index being scanned (as defined in the Storage and IndexKind properties). Most of this logic is carried out at the level of the storage engine. Since an understanding of this is important to get a proper understanding of the performance of this operator, the actual actions carried out at the level of the storage engine will be described on this page as well.
The current version of SQL Server (2017) supports four types of index storage. The Storage property distinguishes between RowStore, ColumnStore, and MemoryOptimized; for the latter type only IndexKind further differentiates this into NonClustered and NonClusteredHash.

Scans are an important part of the database engine and knowing how they work helps us understand when they’re the right choice for the job.

Comments closed

Diagnosing Why SYSUTCDATETIME is Faster than SYSDATETIME

Published 2019-05-30 by Kevin Feasel

Joe Obbish is on a quest:

On my machine the code takes about 11.6 seconds to execute. Replacing SYSDATETIME() with SYSUTCDATETIME() makes the code take only 4.3 seconds to execute. Why is SYSUTCDATETIME() so much faster than SYSDATETIME()?

There’s an interesting answer to the question, so read on.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31