Understanding Hash Match Aggregates

Itzik Ben-Gan continues his series on grouping and aggregating data by looking at the hash match aggregation process:

The estimated CPU cost for the Hash Aggregate in the plan for Query 8 is 0.166344, and in Query 9 is 0.16903.

It could be an interesting exercise to try and figure out exactly in what way the cardinality of the grouping set, the data types, and aggregate function used affect the cost; I just didn’t pursue this aspect of the costing. So, after making a choice of the grouping set and aggregate function for your query, you can reverse engineer the costing formula. For example, let’s reverse engineer the CPU costing formula for the Hash Aggregate operator when grouping by a single integer column and returning the MAX(orderdate) aggregate. The formula should be:

Operator CPU cost = <startup cost> + @numrows * <cost per row> + @numgroups * <cost per group>

Using the techniques that I demonstrated in the previous articles in the series, I got the following reverse engineered formula:

Operator CPU cost = 0.017749 + @numrows * 0.00000667857 + @numgroups * 0.0000177087
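
As a quick way to play with the quoted formula, here is a minimal sketch that plugs row and group counts into it. The constants come straight from the excerpt and apply only to the case it describes (grouping by a single integer column and returning MAX(orderdate)). The function name and the sample inputs are made up for illustration.

```python
# Constants copied from the reverse-engineered formula quoted above.
STARTUP_COST = 0.017749
COST_PER_ROW = 0.00000667857
COST_PER_GROUP = 0.0000177087


def hash_aggregate_cpu_cost(num_rows: int, num_groups: int) -> float:
    """Estimated CPU cost of the Hash Aggregate operator for this one query shape."""
    return STARTUP_COST + num_rows * COST_PER_ROW + num_groups * COST_PER_GROUP


# Hypothetical inputs: 1,000 rows falling into 10 groups.
print(round(hash_aggregate_cpu_cost(1000, 10), 9))
```

With zero rows and zero groups, only the startup cost remains. That makes it an easy sanity check against a plan.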

Definitely worth reading in detail.

Row Width And Snapshot Isolation

Kendra Little shows us the impact that row width has on snapshot isolation:

So I went to work to demonstrate row width impact on the version store — when only a tiny bit column is changed in the row.

Here’s how I did the test:

  • I created two tables, dbo.Narrow and dbo.Wide. They each have a bit column named bitsy, along with some other columns.
  • I inserted one row in each table, but I put a lot more data into the row in dbo.Wide.
  • I allowed snapshot isolation on the database
  • I began a transaction in another session under snapshot isolation and left the transaction open (so version store cleanup wouldn’t kick in while I looked around)
  • I updated the bit column named bitsy for the single row in each table, thereby generating a row-version in tempdb for each table

The code I ran to test this is here, if you’d like to play around with it.

Read on for the results.

Digging Into The SQL Compute Context With R Services

Niels Berglund dives into how the SQL Compute Context works with R Services:

In the code above we use the RxInSqlServer() function to indicate we want to execute in a SQL context. The connectionString property defines where we execute, and the numTasks property sets the number of tasks (processes) to run for each computation. In Code Snippet 4 it is set to 1, which from a processing perspective should match what we do in Code Snippet 3. Before we execute the code in Code Snippet 4, we do what we did before we ran the code in Code Snippet 3:

  • Run Process Explorer as admin.
  • Navigate to the devenv.exe process in Process Explorer.
  • In addition, also look at the Launchpad.exe process in Process Explorer.

When we execute we see that the BxlServer.exe processes under the Microsoft.R.Host.exe processes are idling, but when we look at the Launchpad.exe process we see this:

This is a bit deep but interesting reading.

Nested Loops And Implicit Reordering

Dmitry Piliugin shows how the SQL Server optimizer can end up reordering data in a nested loops join to improve performance:

The purpose is to minimize the impact of random access. If we perform an Index Seek (probably with a partial scan), we read the entries in index order, in our case in the order of CustomerID, which is clearly seen in the first result set. The index on CustomerID does not cover our query, so we have to ask the clustered index for the column SomeData, which means performing another seek, this time by the SalesOrderID column. This is a random seek. So what if, before searching by SalesOrderID, we sort by that key and then issue an ordered sequence of Index Seeks, turning the random access into sequential access? Wouldn’t that be more effective?

Yes, it would in some cases, and that is what the “optimized” property tells us. However, remember that it does not necessarily lead to real reordering. As for comparing the real impact, I will refer you to Craig’s post or leave it as homework.
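
To make the intuition concrete, here is a toy sketch of why sorting the lookup keys first can help. It uses the sum of gaps between consecutive keys as a crude stand-in for how far the clustered index lookups jump around. This says nothing about how SQL Server actually costs or implements the reordering, and the SalesOrderID values are invented.

```python
def total_seek_distance(keys):
    """Sum of gaps between consecutive lookup keys (a crude proxy for random access)."""
    return sum(abs(b - a) for a, b in zip(keys, keys[1:]))


# Hypothetical SalesOrderID values in the order the CustomerID index returns them.
lookup_keys = [907, 12, 455, 13, 908]

unsorted_cost = total_seek_distance(lookup_keys)
sorted_cost = total_seek_distance(sorted(lookup_keys))

print(unsorted_cost, sorted_cost)  # 2675 896
```

The sorted sequence visits the same keys but moves through them in one direction, which is the effect the quoted passage describes.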

Read the whole thing.  This is one reason why it’s important to emphasize that in SQL, you can only assume order if you have an explicit ORDER BY clause.

What Read Committed Isolation Level Gets You

Paul Randal explains the answer, which is “not much”:

The ‘weird’ behavior is that when the “Batch 2” select completes, after having been blocked by the “Batch 1” transaction, it doesn’t return all 1,000 rows (even though “Batch 1” has completed). Furthermore, depending on when the “Batch 2” select is started during the 10 seconds that “Batch 1” executes, “Batch 2” returns different numbers of rows. This behavior had also been reported on earlier versions of SQL Server. It’s easy to reproduce on SQL Server 2016/2017 and can be reproduced in all earlier versions with a single configuration change (more details in a moment).

Additionally, if the table has a clustered index created, 1,000 rows are returned every time, on all versions of SQL Server.

So why is this weird? Many people expect that all 1,000 rows will be returned every time AND that the structure of the table or the version of SQL Server should not make any difference.

Unfortunately, that assumption is not correct when using read committed.

Read Committed is a trade-off, not an ideal.

How The Hash Match Join Operation Works

Dmitry Piliugin explains what happens when SQL Server calls for a hash match to join two tables together:

Hash Match in join mode consumes two inputs, as we are joining two tables. The main idea is to build a hash table from the first “build” input, and then apply the same hash function to the second “probe” input to see whether any hashed values match.

The Query Processor (QP) puts a lot of effort into choosing the correct join order while building the plan. From the Hash Match perspective, this means the QP should choose which table is on the Build side and which is on the Probe side. The Build side should be the smaller one, as it will be stored in memory when building the hash table.

Building a hash table begins with hashing the join key values of the build table and placing them into a bucket determined by the hash value. Then the QP starts processing the probe side: it applies the same hash function to the probe values, determines the bucket, and compares the values inside that bucket. If there is a match, the row is returned.
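
The build and probe phases described above can be sketched in a few lines. This is only an in-memory illustration of the general technique, not SQL Server's implementation. The table contents, column names, and the hash_join function are invented for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Inner join two lists of dicts on `key` using a build phase and a probe phase."""
    # Build phase: hash each build row's join key and drop the row into that bucket.
    buckets = defaultdict(list)
    for row in build_rows:
        buckets[hash(row[key])].append(row)

    # Probe phase: hash the probe key, find the bucket, then compare actual values,
    # because different keys can land in the same bucket.
    for probe in probe_rows:
        for build in buckets.get(hash(probe[key]), []):
            if build[key] == probe[key]:
                yield {**build, **probe}


# Hypothetical inputs: the smaller table is used as the build side.
customers = [{"custid": 1, "name": "A"}, {"custid": 2, "name": "B"}]
orders = [
    {"orderid": 10, "custid": 1},
    {"orderid": 11, "custid": 1},
    {"orderid": 12, "custid": 3},
]

joined = list(hash_join(customers, orders, "custid"))
print(joined)
```

Customer 2 has no orders and order 12 has no customer, so neither appears in the inner join result.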

That would be the whole story if we had infinite memory, but in the real world we do not. More to the point, SQL Server allocates memory to the query before execution starts and does not change it during execution. That means that if the allocated memory is much smaller than the data that arrives during execution, a Hash Match must be able to partition the joining data and process it in portions that fit the allocated memory, while the rest of the data is spilled to disk, waiting to be processed. Here is where the dancing begins.
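
The partitioning idea can be sketched as follows, under heavy simplification. Both inputs are split by the same hash function, so matching keys always land in the same partition, and each partition pair is then joined on its own with a small hash table. Plain Python lists stand in for the partitions that a real engine would spill to disk. The function name, the partition count, and the data are assumptions, and SQL Server's actual spill handling is more involved than this.

```python
from collections import defaultdict


def partitioned_hash_join(build_rows, probe_rows, key, num_partitions=4):
    """Join by partitioning both inputs on hash(key), then joining one partition at a time."""
    build_parts = defaultdict(list)
    probe_parts = defaultdict(list)

    # Partition phase: equal keys hash to the same partition number on both sides.
    for row in build_rows:
        build_parts[hash(row[key]) % num_partitions].append(row)
    for row in probe_rows:
        probe_parts[hash(row[key]) % num_partitions].append(row)

    results = []
    for p in range(num_partitions):
        # Only this partition's build rows need to be in memory at once.
        table = defaultdict(list)
        for row in build_parts[p]:
            table[row[key]].append(row)
        for row in probe_parts[p]:
            for match in table.get(row[key], []):
                results.append({**match, **row})
    return results


# Hypothetical inputs: keys 4 through 7 appear on both sides.
build = [{"k": i, "b": i * 10} for i in range(8)]
probe = [{"k": i, "p": i} for i in range(4, 12)]

result = partitioned_hash_join(build, probe, "k")
print(sorted(r["k"] for r in result))  # [4, 5, 6, 7]
```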

Read on to learn more about the details of this operation.

The Value Of Spinlocks

Jeremiah Peschka walks us through the concepts behind spinlocks:

Here’s our first stab at a spinlock, badly written in C:

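// lock_value is an integer that's shared between multiple threads
while (lock_value != 0) {
    // spin until the lock looks free
}
lock_value = 1;  // take the lock (this is not atomic with the check above)
// ... critical section goes here ...
lock_value = 0;  // release the lock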

The general idea here is correct: we have some lock_value and we only want to allow processes into the critical section of code if another process doesn’t “hold the lock.” Holding the lock means that lock_value is set to a non-zero value.

There are a few things that make this code bad. These things, coincidentally, are part of what makes concurrent programming difficult.
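
One of the problems with the naive version is that checking the lock and taking it are two separate steps, so two threads can both see it as free. A minimal sketch of the usual remedy is to spin on a single atomic "test and set" operation instead. This is not the linked article's code. Here the non-blocking form of Python's threading.Lock.acquire stands in for the hardware atomic instruction a real spinlock would use, and the SpinLock class and run_demo function are invented for the illustration.

```python
import threading
import time


class SpinLock:
    """Spin until one atomic try-acquire succeeds, instead of check-then-set."""

    def __init__(self):
        # Stand-in for an atomic flag; acquire(blocking=False) tests and sets in one step.
        self._flag = threading.Lock()

    def acquire(self):
        while not self._flag.acquire(blocking=False):
            time.sleep(0)  # spin, giving other threads a chance to run

    def release(self):
        self._flag.release()


def run_demo(num_threads, iterations):
    """Increment a shared counter from several threads under the spinlock."""
    lock = SpinLock()
    state = {"count": 0}

    def worker():
        for _ in range(iterations):
            lock.acquire()
            try:
                state["count"] += 1  # critical section
            finally:
                lock.release()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


print(run_demo(4, 1000))  # 4000
```

Because only one thread can win the try-acquire at a time, no increments are lost.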

Spinlocks are a critical part of maintaining internal consistency, but they can also accidentally ruin performance on your system if done wrong.  Read the whole thing.

How LSNs Get Generated

Stuart Moore looks at how SQL Server builds log sequence numbers:

If you’ve ever dug down in the SQL Server transaction logs or had to build up restore chains, then you’ll have come across Log Sequence Numbers (LSNs). Ever wondered why they’re so large, why they all look suspiciously the same, why they don’t start from 0, and just how SQL Server generates these LSNs? Well, here we’re going to take a look at them.

Below we’ll go through examples of how to look inside the current transaction log and backed up transaction logs. This will involve using some DBCC commands and the undocumented fn_dblog and fn_dump_dblog functions. The last two are very handy for digging into SQL Server internals, but be wary about running them on a production system without understanding what’s going on. They can leave file handles and processes behind that can impact your system.
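
As a small aid for reading those outputs, the sketch below converts an LSN between two notations. It assumes the commonly described layout, in which fn_dblog shows an LSN as three colon-separated hexadecimal parts (VLF sequence number, log block offset, slot number) and backup history shows the same value as a single decimal number, with the offset zero-padded to 10 digits and the slot to 5. The function name and the sample LSN are invented for illustration.

```python
def lsn_hex_to_decimal(lsn: str) -> str:
    """Convert 'vlf:block:slot' (hex parts) to the concatenated decimal form.

    Assumed layout: VLF sequence number, then log block offset padded to
    10 digits, then slot number padded to 5 digits.
    """
    vlf_hex, block_hex, slot_hex = lsn.split(":")
    vlf = int(vlf_hex, 16)
    block = int(block_hex, 16)
    slot = int(slot_hex, 16)
    return f"{vlf}{block:010d}{slot:05d}"


# Hypothetical LSN: VLF 0x14 = 20, block 0x61 = 97, slot 1.
print(lsn_hex_to_decimal("00000014:00000061:0001"))  # 20000000009700001
```

Seen this way, the size of the decimal value comes mostly from the fixed-width padding rather than from a large underlying counter.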

It’s an interesting look into SQL Server’s internals.

Row Goals On Anti-Joins

Paul White continues his row goals series:

The optimizer assumes that people write a semi join (indirectly e.g. using EXISTS) with the expectation that the row being searched for will be found. An apply semi join row goal is set by the optimizer to help find that expected matching row quickly.

For anti join (expressed e.g. using NOT EXISTS) the optimizer’s assumption is that a matching row will not be found. An apply anti join row goal is not set by the optimizer, because it expects to have to check all rows to confirm there is no match.

If there does turn out to be a matching row, the apply anti join might take longer to locate this row than it would if a row goal had been used. Nevertheless, the anti join will still terminate its search as soon as the (unexpected) match is encountered.
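
The early-termination behavior in that last paragraph can be sketched with a simple scan that stops at the first match. This does not model row goals or the optimizer's plan choices at all. It only shows that semi join and anti join both stop probing once a match turns up, and that an anti join with no match has to examine every row. The function names and data are made up.

```python
def probe(inner_keys, wanted):
    """Scan until the first match; return (found, rows_examined)."""
    examined = 0
    for key in inner_keys:
        examined += 1
        if key == wanted:
            return True, examined  # stop at the first match
    return False, examined


def semi_join(outer_keys, inner_keys):
    """Keep outer keys that have at least one match (EXISTS)."""
    return [k for k in outer_keys if probe(inner_keys, k)[0]]


def anti_join(outer_keys, inner_keys):
    """Keep outer keys that have no match (NOT EXISTS)."""
    return [k for k in outer_keys if not probe(inner_keys, k)[0]]


inner = [10, 20, 30, 40, 50]

print(probe(inner, 20))  # (True, 2): stops after two rows
print(probe(inner, 99))  # (False, 5): must check every row
print(semi_join([20, 99], inner), anti_join([20, 99], inner))  # [20] [99]
```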

This is a shorter article but very useful in understanding row goals, along with the rest of his series.

Using Extended Properties For Documentation

Phil Factor shows us how we can use Extended Properties to build database documentation:

Once you’ve got into the habit of using Extended Properties to document your database, there are obvious benefits:

  • You can explain why you added that index or modified that constraint.
  • You can describe exactly what that rather obscure column does.
  • You can add a reasoned explanation to the use of a table.

You will often need these explanations because, sadly, DDL code isn’t ‘self-documenting’, and human memory is fallible. Extended Properties are easily searched because they are all exposed in one system view.

It is great to add explanations to lists of procedures, functions and views once the database becomes sizeable. Extended Properties are useful when exploring the metadata, but the requirement isn’t quite so essential because comments are preserved along with the source code. Tables, however, are a big problem because SQL Server throws away the script that produces the table, along with all the comments. The reason that this happens is that there are many ways you can alter parts of a table without scripting the entire table. How could one infallibly preserve all these ALTER statements in the preserved script? It’s tricky. Table scripts that you get from SSMS or script via SMO are therefore synthesised from the system tables but without those comments or even Extended Properties.

Extended Properties are useful, but I think the lack of tooling around them prevented widespread adoption.  Now that there are a few tools which support them (including SSMS’s data classification mechanism), I wonder if these will get a second look.

