I have highlighted in red what SQL Server adds to the table – the two system columns – graph id, which is bigint, and node id, which is nvarchar and stores json, and the unique index to help with queries.
We can also see from constraint type that this table is similar to other relational tables – it can be enabled for replication and can have related delete or update actions defined on it if need be.
This post gives a bit more insight into how graph tables work in SQL Server under the covers.
We’re attaching to an already running docker container running SQL. But what we get is an idle SQL Server process this is great if we have a running workload we want to analyze but my goal for all of this is to see how SQL Server starts up and this isn’t going to cut it.
My next attempt was to stop the sql19 container and quickly start the strace container but the strace container still missed events at the startup of the sql19 container. So I needed a better way.
Don’t worry—Anthony finds a better way.
I have the impression that
CSelCalcColumnInInterval“fails” if the predicate doesn’t fall within any of the histogram intervals. The estimation logic then chooses to try the
CSelCalcAscendingKeyFiltercalculator (a reference to the “ascending key problem”) if the predicate is specifically higher than the last histogram interval.
Josh includes a couple of demos as well, so check them out.
When a user creates a Delta Lake table, that table’s transaction log is automatically created in the
_delta_logsubdirectory. As he or she makes changes to that table, those changes are recorded as ordered, atomic commits in the transaction log. Each commit is written out as a JSON file, starting with
000000.json. Additional changes to the table generate subsequent JSON files in ascending numerical order so that the next commit is written out as
000001.json, the following as
000002.json, and so on.
It’s interesting that they chose JSON instead of a binary transaction log like relational databases use.
With last page insert contention, as the number of insert threads increases, the queue for the page latch increases which in turn increases latency. Throughput will also decrease, but if something slows down one of the threads that is holding the latch, this can trigger a convoy and throughput suddenly falls off a cliff. This typically happens when a page fills up and a new page must be added to the index (also known as a page split). The insert that triggers the new page will naturally have to hold the latch for longer than normal while the new page operation completes. This causes the queue to build up behind the latch. Adding a new page also requires an exclusive latch on the parent page, which can cause latch requests to queue at that level as well. At this point, throughput falls off a cliff.
OPTIMIZE_FOR_SEQUENTIAL_KEY aims to do two things – control the rate at which new threads are allowed to request the latch, and favor threads that are likely to keep the throughput high. These techniques will not prevent the contention or reduce latency, but they will help keep throughput consistent as concurrency increases.
Read on to learn more about the process and by implication some places where this won’t work well at all for you.
In the sp_SQLskills_helpindex output you’ll notice that OrderLineID is the only column in columns_in_tree, and both OrderLineID and OrderID_and_OrderLineID are in columns_in_leaf.
Remember that when you identify a nonclustered index as unique, it will not add the clustering key to the tree level, it just adds it to the leaf level. But if the nonclustered index is not declared as unique, then SQL Server adds the clustering key to the tree level to uniquely identify the row.
This post deserves a careful reading.
sys.dm_db_database_page_allocations is an undocumented SQL Server T-SQL Dynamic Management Function. This DMF provides details about allocated pages, allocation units, and allocation extents.
Read on for additional details. This is an undocumented function, so it might change between versions but it will give you an idea of how it works under the covers.
The Clustered Index Seek operator uses the structure of a clustered index to efficiently find either single rows (singleton seek) or specific subsets of rows (range seek). Because a clustered index always contains all columns in a table, a Clustered Index Seek is one of the most efficient ways for SQL Server to find single rows or small ranges, provided there is a filter that can be used efficiently.
The behavior of the Clustered Index Seek operator is in fact exactly the same as the behavior of the Index Seek operator, with only a very few differences as noted below. Though these two operators do have different names, not only in the graphical execution plan but also in the underlying XML, I suspect that in reality they are both using the same internal logic and not a copy of it.
Read the whole thing.
Every Seek Keys specification can be either for a “singleton seek”, or for a “range seek”. A singleton seek applies when at most a single row can satisfy the requirement of the Seek Keys specification. A range seek means that (potentially) more than a single row can qualify.
For a singleton seek, the index structure is used to find the row that matches the specified condition. If it exists, it is returned and then the operator immediately continues to the next Seek Keys specification. If it doesn’t, then nothing is returned and the operator continues to the next Seek Keys specification.
Read the whole thing and pair it with index scans if you haven’t read that already.
The execution plan erroneously computes separate
ANYaggregates for the
c3columns, ignoring nulls. Each aggregate independently returns the first non-null value it encounters, giving a result where the values for
c3come from different source rows. This is not what the original SQL query specification requested.
The same wrong result can be produced with or without the clustered index by adding an
OPTION (HASH GROUP)hint to produce a plan with an Eager Hash Aggregate instead of a Stream Aggregate.
Click through for the scenarios. Paul has also reported the second scenario as a bug.