Press "Enter" to skip to content

Category: Internals

Optimize For Sequential Key

Pam Lahoud explains the context behind a new option you can add to indexes in SQL Server 2019 CTP 3.1 and later:

With last page insert contention, as the number of insert threads increases, the queue for the page latch increases which in turn increases latency. Throughput will also decrease, but if something slows down one of the threads that is holding the latch, this can trigger a convoy and throughput suddenly falls off a cliff. This typically happens when a page fills up and a new page must be added to the index (also known as a page split). The insert that triggers the new page will naturally have to hold the latch for longer than normal while the new page operation completes. This causes the queue to build up behind the latch. Adding a new page also requires an exclusive latch on the parent page, which can cause latch requests to queue at that level as well. At this point, throughput falls off a cliff.

OPTIMIZE_FOR_SEQUENTIAL_KEY aims to do two things – control the rate at which new threads are allowed to request the latch, and favor threads that are likely to keep the throughput high. These techniques will not prevent the contention or reduce latency, but they will help keep throughput consistent as concurrency increases.

Read on to learn more about the process and by implication some places where this won’t work well at all for you.

Comments closed

Relating Nonclustered Indexes to the Clustered Index

Erin Stellato takes us through a bit of indexing strategy:

In the sp_SQLskills_helpindex output you’ll notice that OrderLineID is the only column in columns_in_tree, and both OrderLineID and OrderID_and_OrderLineID are in columns_in_leaf.

Remember that when you identify a nonclustered index as unique, it will not add the clustering key to the tree level, it just adds it to the leaf level. But if the nonclustered index is not declared as unique, then SQL Server adds the clustering key to the tree level to uniquely identify the row.

This post deserves a careful reading.

Comments closed

Database Page Allocations Function

Max Vernon takes us through the sys.dm_db_database_page_allocations Dynamic Management Function:

sys.dm_db_database_page_allocations is an undocumented SQL Server T-SQL Dynamic Management Function. This DMF provides details about allocated pages, allocation units, and allocation extents.

Read on for additional details. This is an undocumented function, so it might change between versions but it will give you an idea of how it works under the covers.

Comments closed

Diving Into Clustered Index Seeks

Hugo Kornelis looks in detail at the clustered index seek:

The Clustered Index Seek operator uses the structure of a clustered index to efficiently find either single rows (singleton seek) or specific subsets of rows (range seek). Because a clustered index always contains all columns in a table, a Clustered Index Seek is one of the most efficient ways for SQL Server to find single rows or small ranges, provided there is a filter that can be used efficiently.

The behavior of the Clustered Index Seek operator is in fact exactly the same as the behavior of the Index Seek operator, with only a very few differences as noted below. Though these two operators do have different names, not only in the graphical execution plan but also in the underlying XML, I suspect that in reality they are both using the same internal logic and not a copy of it.

Read the whole thing.

Comments closed

Deep Dive on Index Seeks

Hugo Kornelis gives us a great deal of information on index seeks in SQL Server:

Every Seek Keys specification can be either for a “singleton seek”, or for a “range seek”. A singleton seek applies when at most a single row can satisfy the requirement of the Seek Keys specification. A range seek means that (potentially) more than a single row can qualify.

For a singleton seek, the index structure is used to find the row that matches the specified condition. If it exists, it is returned and then the operator immediately continues to the next Seek Keys specification. If it doesn’t, then nothing is returned and  the operator continues to the next Seek Keys specification. 

Read the whole thing and pair it with index scans if you haven’t read that already.

Comments closed

Unexpected Results with ANY Aggregate

Paul White points out a couple odd scenarios with the ANY aggregate in SQL Server:

The execution plan erroneously computes separate ANY aggregates for the c2 and c3 columns, ignoring nulls. Each aggregate independently returns the first non-null value it encounters, giving a result where the values for c2 and c3 come from different source rows. This is not what the original SQL query specification requested.

The same wrong result can be produced with or without the clustered index by adding an OPTION (HASH GROUP) hint to produce a plan with an Eager Hash Aggregate instead of a Stream Aggregate.

Click through for the scenarios. Paul has also reported the second scenario as a bug.

Comments closed

SQL Server 2019 CTP3 T-Log Writers Increased

Lonny Niederstadt observes a change in SQL Server 2019 CTP 3.0:

In SQL Server 2016, transaction log writing was enhanced to support multiple transaction log writers.  If the instance had more than one non-DAC node in [sys].[dm_os_nodes], there would be one transaction log writer per node, to a maximum of 4.

In SQL Server 2019, it seems the maximum number of transaction log writers has been increased.  The system below with 4 vNUMA nodes (and autosoftNUMA disabled) has eight transaction log writer sessions, each on their own hidden online scheduler, all on parent_node_id = 3/memory_node_id = 3 on processor group 1.

Click through for the proof.

Comments closed

Diving Into Columnstore Index Scans

Hugo Kornelis continues a series of posts on index scans:

The Columnstore Index Scan is not really an actual operator. You can encounter it in graphical execution plans in SSMS (and other tools), but if you look at the underlying XML of the execution plan, you will see that it is either an Index Scan or a Clustered Index Scan operator.

SQL Server currently supports three types of index storage: rowstore, columnstore, and memory-optimized. Indexes of each of those types can be the target of an Index Scan or Clustered Index Scan, as indicated by the Storage property. When the Storage property is RowStore or MemoryOptimized, then the normal icon for (clustered) index scan is use, but when Storage is ColumnStore than SSMS (and other tools) choose to show a different icon instead.

Click through for more details.

Comments closed

Physical Operators: Apply and Nested Loops

Paul Whtie takes us through the Apply operator versus a classic nested loop join operator:

The optimizer’s output may contain both apply and nested loops join physical operations. Both are shown in execution plans as a Nested Loops Join operator, but they have different properties:

Apply
The Nested Loops Join operator has Outer References. These describe parameter values passed from the outer (upper) side of the join to operators on the inner (lower) side of the join. The value of the each parameter may change on each iteration of the loop. The join predicate is evaluated (given the current parameter values) by one or more operators on the inner side of the join. The join predicate is not evaluated at the join itself.

Join
The Nested Loops Join operator has a Predicate (unless it is a cross join). It does not have any Outer References. The join predicate is always evaluated at the join operator.

And to make things tricky, APPLY can generate either of these. Read the whole thing.

Comments closed

Flink’s Network Stack

Nico Kruber dives into the internals of Apache Flink’s network stack:

Flink’s network stack is one of the core components that make up the flink-runtime module and sit at the heart of every Flink job. It connects individual work units (subtasks) from all TaskManagers. This is where your streamed-in data flows through and it is therefore crucial to the performance of your Flink job for both the throughput as well as latency you observe. In contrast to the coordination channels between TaskManagers and JobManagers which are using RPCs via Akka, the network stack between TaskManagers relies on a much lower-level API using Netty.

This blog post is the first in a series of posts about the network stack. In the sections below, we will first have a high-level look at what abstractions are exposed to the stream operators and then go into detail on the physical implementation and various optimisations Flink did. We will briefly present the result of these optimisations and Flink’s trade-off between throughput and latency. Future blog posts in this series will elaborate more on monitoring and metrics, tuning parameters, and common anti-patterns.

There’s a lot in here and it’s worth reading.

Comments closed