Internals – Page 18 – Curated SQL

The Sequence Operator

Published 2021-02-16 by Kevin Feasel

Hugo Kornelis digs into the sequence operator:

The Sequence operator reads all rows from all its inputs, in order, and without modification. It returns all rows it reads from its last input only.
This operator is typically used in execution plans where some branches are needed to do preparation work (e.g. to invoke a Table-valued Function operator) and then other branches produce the final output, or multiple branches need to do work in sequence (such as in wide data modification plans). The operators on the non-last inputs often do not produce any output at all, but even if they do the rows returned are discarded. Only rows returned from the last input are returned to the parent of the Sequence operator.

Note that this is quite difference from sequences of numbers in SQL Server, which act similarly to identity integers.

Comments closed

Memory Grant Internals

Published 2021-02-08 by Kevin Feasel

Deepthi Goguri has two posts for us. First up, we learn about queries spilling to disk:

During the build time, the memory is allocated based on the cardinality estimates and the estimates are based on the input size and the probe. The buckets are created in the memory and we place the rows in those respective buckets. If the grant is exceeded then some of the buckets will be send to the disk. As some of the buckets are already in memory and some in disk, the initial probe of the data using the inner set. The probe rows will be scanned if the row hash to the in memory bucket the match is done for those rows. If the rows match to the on disk bucket, the probe row will be return to the disk along with the outer side bucket. So, because of this we have more disk writes to tempdb at the probe time. This is a build side spill.

Then, Deepthi wraps up this series with a bit of balance:

If this final part of the series, lets talk about how we balance these memory grants. If lot of your queries are requesting for more memory than they required, then that causes concurrency issues. Memory grants are allocated to a any query even before they are executed based on the cardinality estimates and based on the memory consuming iterators. Once the query memory is allocated to a query, that memory is only released once the query execution completes even if the query actually uses a part of the allocated query. If more memory is requested by the queries than they need, then we are wasting the memory. What happens if the queries only receive less memory grants than they need, then we there is a performance impact on the running queries.

Read on for ways to fix excessive or insufficient memory grant problems.

Comments closed

Memory Grants: Query Memory and Workspace Memory

Published 2021-02-04 by Kevin Feasel

Deepthi Goguri continues a series on memory grant internals:

The memory grant for the above query is 183496/8=22937 pages of memory grant. The total available grant for the queries is 92442. We can run the above query 4 times to hit the total memory grant (22937*4=91748). We can use the SQLQueryStress tool to run the query and see the query memory grants. Let us run the query with 100 number of iterations and 10 number of threads.

This has been an interesting series so far and I look forward to seeing the rest of it.

Comments closed

Resource Governor and Memory Grants

Published 2021-01-29 by Kevin Feasel

Deepthi Goguri explains how you can use Resource Governor to control memory utilization:

Queries doesn’t start unless available memory is 150% of requested query memory because sql server will always allocate a little bit of additional buffer memory. Lets take the 100MB query which will get 150MB with additional buffer. What if this query only get 100MB and doesn’t get that additional buffer in the beginning to run the query? In that case, the query will wait until the 25x the query cost which is in seconds. When the timeout occurs and if the exact grant (100MB) is available without the buffer amount then the query starts to run this time. In other scenario, even after waiting for so long and after the timeout, the exact memory (100MB) without the memory grant is not available and less than 100MB is available, in that case the grant can be decreased to the minimum required grant. What if even the minimum required grant is not available, in that case we will receive an error 8645. We can configure the timeout by using the resource governor.

Read on for more detail.

Comments closed

Parquet 1.x Footer Content

Published 2021-01-28 by Kevin Feasel

Dmitry Tolpeko shows us what the footer of a Parquet 1.x file looks like:

Every Parquet file has the footer that contains metadata information: schema, row groups and column statistics. The footer is located at the end of the file.
A parquet file content starts and ends with 4-byte PAR1 “magic” string. Right before the ending PAR1 there is 4-byte footer length size (little-endian encoding):

Click through for more details, as well as one downside to Parquet 1.x.

Comments closed

The Performance of sys.partitions

Published 2021-01-28 by Kevin Feasel

Aaron Bertrand tries to disentangle a query:

This question was posted to #sqlhelp by Jake Manske, and it was brought to my attention by Erik Darling.
I don’t recall ever having a performance issue with sys.partitions. My initial thought (echoed by Joey D’Antoni) was that a filter on the data_compression column should avoid the redundant scan, and reduce query runtime by about half. However, this predicate doesn’t get pushed down, and the reason why takes a bit of unpacking.

Click through for the results of Aaron’s research.

Comments closed

Memory Grant Internals

Published 2021-01-25 by Kevin Feasel

Deepthi Goguri starts a series on memory grant internals:

SQL Server have different types of components (Plan Cache, Buffer pool and other memory components) in the memory area and requires memory for many things. We also require memory to process our queries. Each operator in the execution plan requires memory to process the rows from one operator to the other operator. Some of the iterators need to be buffered as the rows come into them before sending them to the next iterator in the execution plan. These iterators needs more memory. These are called memory consuming iterators. The grants that we given to these memory consuming iterators are called Query memory grants.

Click through for more.

Comments closed

Performance Impact of Foreign Keys with Non-Default ON UPDATE or ON DELETE

Published 2021-01-06 by Kevin Feasel

Hugo Kornelis continues a dive into foreign keys:

Welcome to part fifteen of the plansplaining series. In the three previous parts I looked at the operators and properties in an execution plan that check a modification doesn’t violate foreign key constraints. That part is done. But I’m not done with foreign keys yet.
We normally expect foreign keys to throw an error on violations. But that’s actually only the default option: they can also be set to be self-correcting. This is done using the ON UPDATE and ON DELETE clauses, which provide the user with several choices on how to handle child data that would become orphaned, and hence violate the constraint, as a result of a change in the parent table.

Read on to see how these operate in SQL Server.

Comments closed

PFS Contention and Heaps

Published 2021-01-05 by Kevin Feasel

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.
This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

Comments closed

The Merge Interval Operator

Published 2020-12-15 by Kevin Feasel

Hugo Kornelis looks at another execution plan operator:

The Merge Interval operator reads dynamic seek range specifications, checks to see if their specified ranges overlap, and if so combines the overlapping ranges into one new range.
One typical use case is for a query that uses multiple BETWEEN specifications, connected with OR. When these ranges overlap, they must be combined into a single range. This saves performance, but more important is that it prevents rows that satisfy both range specifications from being returned multiple times. When the boundaries of the BETWEEN are given as constants, the optimizer analyzes for overlaps and combines ranges if needed when compiling the query. But when the boundaries of the BETWEEN specifications are only known at run-time (variables, column references), the Merge Interval operator is used for this task.

Click through to see how it works.

Comments closed

Category: Internals