HASHBYTES Performance In SQL Server

Joe Obbish takes a look at how HASHBYTES doesn’t scale well:

The purpose of the MAX aggregate is to limit the size of the result set. This is a cheap aggregate because it can be implemented as a stream aggregate. The operator can simply keep the maximum value that it’s found so far, compare the next value to the max, and update the maximum value when necessary. On my test server, the query takes about 20 seconds. If I run the query without the HASHBYTES call it takes about 3 seconds. That matches intuitively what I would expect. Reading 11 million rows from a small table out of the buffer pool should be less expensive than calculating 11 million hashes.

From my naive point of view, I would expect this query to scale well as the number of concurrent queries increases. It doesn’t seem like there should be contention over any shared resources, so as long as every query gets on its own scheduler I wouldn’t expect a large degradation in overall run time as the number of queries increases.
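The streaming MAX described in the quoted passage can be sketched outside of SQL Server. This is only an illustration of the idea, not how the engine implements it: the function name `max_hash` is hypothetical, and SHA-256 is an assumed stand-in for whichever HASHBYTES algorithm the original query used.

```python
import hashlib


def max_hash(rows):
    """Return the largest SHA-256 digest over rows, keeping only one value in memory.

    Mirrors the stream-aggregate idea: hold the maximum seen so far,
    compare each new value against it, and replace it when the new one is larger.
    """
    current_max = None
    for value in rows:
        # Hashing each row is the expensive step; the comparison is cheap.
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        if current_max is None or digest > current_max:
            current_max = digest
    return current_max


if __name__ == "__main__":
    # A small row count for illustration; the quoted test used about 11 million rows.
    print(max_hash(range(100_000)).hex())
```

The aggregate only ever stores a single digest, so nearly all of the work is in the per-row hash, which fits the quoted observation that the query is much faster without the HASHBYTES call.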

Joe’s research isn’t complete, but he does have a conjecture as to why HASHBYTES doesn’t scale well. That said, the most interesting thing in the post to me was seeing Microsoft potentially using bcrypt under the covers for HASHBYTES calculation. If that’s really the case, there is a chance that sometime in the future we’d be able to generate cryptographically secure hashes within SQL Server rather than the MD5, SHA1, and SHA2 hashes we have today.

