Press "Enter" to skip to content

Curated SQL Posts

Love Your Bugs

Allison Kaptur has an appreciation for strange bugs:

I love this bug because it shows that bitflips are a real thing that can happen, not just a theoretical concern. In fact, there are some domains where they’re more common than others. One such domain is if you’re getting requests from users with low-end or old hardware, which is true for a lot of laptops running Dropbox. Another domain with lots of bitflips is outer space – there’s no atmosphere in space to protect your memory from energetic particles and radiation, so bitflips are pretty common.

You probably really care about correctness in space – your code might be keeping astronauts alive on the ISS, for example, but even if it’s not mission-critical, it’s hard to do software updates to space. If you really need your application to defend against bitflips, there are a variety of hardware & software approaches you can take, and there’s a very interesting talk by Katie Betchold about this.
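
As a concrete illustration of the phenomenon (my own toy example, not from Allison's post), a single flipped bit is enough to silently change data:

    -- Flipping bit 5 (value 32) of ASCII 'c' (99) yields 'C' (67).
    SELECT CHAR(ASCII('c') ^ 32) AS flipped_character;  -- returns 'C'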

This is an interesting approach, although I fear I may not have the temperament to love my bugs so much as despise their existence.

Curl For Windows Now Uses WinSSL

David Smith notes that version 3.0 of the curl package for R uses the winSSL library rather than bundling OpenSSL:

To implement secure communications, the curl package needs to connect with a library that handles the SSL (secure socket layer) encryption. On Linux and Macs, curl has always used the OpenSSL library, which is included on those systems. Windows doesn’t have this library (at least, outside of the Subsystem for Linux), so on Windows the curl package included the OpenSSL library and associated certificate. This raises its own set of issues (see the post linked below for details), so version 3.0 of the package instead uses the built-in winSSL library. This means curl uses the same security architecture as other connected applications on Windows.

This shouldn’t have any impact on your web connectivity from R now or in the future, except the knowledge that the underlying architecture is more secure. Nonetheless, it’s possible to switch back to OpenSSL-based encryption (and this remains the default on Windows 7, which does not include winSSL).

Click through for more information.

ROWGROUP_FLUSH Deadlocks Inserting Into Clustered Columnstore Indexes

Joe Obbish ran into a strange deadlock when performing concurrent insertions into a clustered columnstore index:

We’ve only observed this deadlock with multiple concurrent sessions inserting to the delta store for the same target CCI due to server memory pressure or very low cardinality estimates (less than 251 rows). The correct mitigation depends on why you’re seeing the issue in the first place. If you’re seeing it due to low cardinality estimates then fix your estimates, or at the very least get them above 250 rows. If you’re seeing them because the memory grant for the CCI build times out after 25 seconds then lower concurrency or increase server memory.

The problem can also be avoided by not doing concurrent inserts in the first place. In some cases a parallel insert may be a reasonable alternative. There’s also some evidence that the deadlock is only seen when the number of rows for insert is very close to 1048576, but we weren’t able to make any definitive conclusions around that.
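
As a rough sketch of that parallel-insert alternative (table and column names are hypothetical; SQL Server 2016 or later is assumed for parallel INSERT...SELECT):

    -- One large parallel insert instead of many concurrent sessions.
    -- TABLOCK lets the INSERT...SELECT run in parallel, and sufficiently
    -- large batches compress directly into rowgroups, bypassing the delta store.
    INSERT INTO dbo.TargetCCI WITH (TABLOCK)
    SELECT col1, col2, col3
    FROM dbo.StagingTable;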

Read the whole thing.  Also check out his Connect item.

Making CXPACKET The Bad Guy

Pedro Lopes goes into detail regarding the new CXCONSUMER wait stat:

It’s important to mention a couple notes before getting into details:

  1. Parallelism use is intrinsically a benefit for costly queries, bound by the Cost Threshold for Parallelism and Max Degree of Parallelism server configurations, among other considerations. See the Query Processing Guide for more in-depth information on how SQL Server implements parallelism. Also refer to Craig Freedman’s blog series on the topic.
  2. Waits are a normal part of the waits and queues model, allowing SQL Server to concurrently execute many more requests than there are schedulers available. See more about waits and queues here.

It’s also important to understand that parallelism is implemented as if there were two operators: producers that push data to consumers, and consumers that may have to wait for data from producers.

And so it becomes easier to understand how producer waits are the ones that may require attention, while consumer waits are inevitable as a passive consequence of longer running producers.
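
To see how the split shakes out on an instance that has the new wait type (SQL Server 2017 CU3 and later; this is a generic wait-stats query, not something from Pedro's post):

    -- Compare actionable producer-side waits (CXPACKET) with the benign
    -- consumer-side waits (CXCONSUMER) accumulated since the last restart.
    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type IN (N'CXPACKET', N'CXCONSUMER')
    ORDER BY wait_time_ms DESC;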

I actually would have preferred the rename go the other way, creating a CXPRODUCER wait (which is bad) rather than a CXCONSUMER wait (which is benign).  The problem is that if you’re running in a heterogeneous environment, you have to remember that CXPACKET on Server A might indicate a problem, whereas it’s much fuzzier on Server B because you haven’t upgraded Server B to the latest CU yet.  Regardless, I’m happy that we’re going to be able to have a better idea of when parallelism is actually a problem.

JSON Parsing Performance

Jovan Popovic has a couple posts showing JSON parsing performance in SQL Server 2017.  First up, he makes the statement that JSON parsing can be 10x faster than XML:

I’m running 10,000 iterations and getting the average time spent.

The average time to parse JSON and get the value using the JSON_VALUE() function is around 3 microseconds, while the equivalent action with an XML typed variable and the value() method takes between 30 and 40 microseconds.

This shows that parsing JSON “plain text” is 10x faster than parsing a strongly typed XML variable.
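
A minimal harness along the lines Jovan describes might look like this (the document shapes and iteration count are illustrative, not his exact test; JSON_VALUE() requires SQL Server 2016 or later):

    -- Time 10,000 JSON_VALUE() extractions against 10,000 XML value() calls.
    DECLARE @json NVARCHAR(MAX) = N'{"x": 12.3}';
    DECLARE @xml  XML = N'<point><x>12.3</x></point>';
    DECLARE @i INT = 0, @v FLOAT, @start DATETIME2(7) = SYSDATETIME();

    WHILE @i < 10000
    BEGIN
        SET @v = CAST(JSON_VALUE(@json, '$.x') AS FLOAT);
        SET @i += 1;
    END;
    SELECT DATEDIFF(MICROSECOND, @start, SYSDATETIME()) / 10000.0 AS avg_json_us;

    SELECT @i = 0, @start = SYSDATETIME();
    WHILE @i < 10000
    BEGIN
        SET @v = @xml.value('(/point/x)[1]', 'float');
        SET @i += 1;
    END;
    SELECT DATEDIFF(MICROSECOND, @start, SYSDATETIME()) / 10000.0 AS avg_xml_us;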

He also compares JSON parsing to spatial type parsing:

I’m running 100,000 iterations and getting the average time spent in both cases. The results are:

  1. The spatial type needs around 11-14 microseconds to get the X coordinate
  2. The JSON parser needs around 1-2 microseconds to get the value from the X property (including a CAST to float)

We can see that parsing JSON text is much faster than the equivalent operation with the spatial type.

JSON uses the NVARCHAR type, which is the most optimized type in SQL Server because it is used in most queries. The JSON parser is built on the T-SQL parser, which is also heavily optimized. Spatial has its own type, but it is still a binary-serialized object that must be deserialized whenever we need to access its fields. In this case, scanning NVARCHAR is much faster than deserializing binary data into a spatial object.
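
The comparison in miniature, with a hypothetical point (geometry::Point builds the spatial value; STX reads back its X coordinate):

    -- Extracting X from a spatial point versus from its JSON representation.
    DECLARE @g GEOMETRY = geometry::Point(12.3, 45.6, 0);
    DECLARE @j NVARCHAR(100) = N'{"x": 12.3, "y": 45.6}';

    SELECT @g.STX AS x_from_spatial,
           CAST(JSON_VALUE(@j, '$.x') AS FLOAT) AS x_from_json;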

On the flip side, Jovan does note that there are benefits to using spatial types, like the ability to perform spatial calculations.

Upgrading A Cluster To Windows Server 2016

Ryan Adams shows how to upgrade a failover cluster running Windows Server 2012 R2 to Windows Server 2016 without having to start from scratch:

Starting in Windows Server 2012 R2 you now have a way to upgrade a cluster to Windows 2016.  The best part is it’s not an OS upgrade, but a rebuild.  The magic is that you can join a Windows 2016 server to a Windows 2012 R2 cluster.  You can upgrade your cluster with as little as one failover and thus very little down time.  Everything stays in compatibility mode until all nodes are upgraded to Windows 2016 and then you upgrade the cluster functional level.  This is great news for those of us running FCIs or AGs.

Click through for a listing of steps and a video.

The Magic Of Sampling

Nathan LeClaire reminds us of an important story that statisticians have been telling us for a couple of centuries:

It starts slowly. Maybe your home-grown centralized logging cluster becomes more difficult to operate, demanding unholy amounts of engineer time every week. Maybe engineers start to find that making a query about production is a “go get a coffee and come back later” activity. Or maybe monitoring vendors offer you a quote that elicits a response ranging anywhere from curses under the breath to blood-curdling screams of terror.

The multi-headed beast we know as Scale has reared its ugly visage.

As some of you may have already guessed from the title, I’m going to discuss one way to solve this problem, and why it might not be as bad as you might think.

Take some of your precious information and throw it in the garbage. In lots of cases, you can just drop those writes on the floor as long as your observability stack is equipped to handle it.

In other words, sample.

Read on for a couple of methods.  One thing I’ve taken a fancy to is collecting the first N of a particular type of message and keeping track of how often that message appears.  If you get the same error for every row in a file, then you might really only need to see that one time and the number of times it happened.  Or maybe you want to see a few of them to ensure that they’re really the same error and not two separate errors which are getting reported together due to insufficient error separation.
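
That first-N-plus-count idea is easy to sketch in SQL terms (the log table and its columns here are hypothetical):

    -- Keep the first 5 occurrences of each message type, plus a total count.
    WITH ranked AS (
        SELECT message_type, message_text, logged_at,
               ROW_NUMBER() OVER (PARTITION BY message_type ORDER BY logged_at) AS rn,
               COUNT(*) OVER (PARTITION BY message_type) AS occurrences
        FROM dbo.ErrorLog
    )
    SELECT message_type, message_text, logged_at, occurrences
    FROM ranked
    WHERE rn <= 5;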

Quickly Computing Area Under The Curve

Jean-Francois Puget has a fast method for computing Area Under the Curve in Python:

When the target only takes two values we have a binary classification problem at hand.  Examples of binary classification are very common. For instance, fraud detection, where examples are credit card transactions, features are time, location, amount, merchant id, etc., and the target is fraud or not fraud.  Spam detection is also a binary classification problem, where examples are emails, features are the email content as a string of words, and the target is spam or not spam.  Without loss of generality we can assume that the target values are 0 and 1; for instance, 0 means no fraud or no spam, while 1 means fraud or spam.

For binary classification, predictions are also binary.  Therefore, a prediction is either equal to the target or off the mark.  A simple way to evaluate model performance is accuracy: how many predictions are right? For instance, if our test set has 100 examples in it, how many times is the prediction correct?  Accuracy seems a logical way to evaluate performance: a higher accuracy obviously means a better model.  At least, this is what people think when they are exposed for the first time to binary classification problems.  The issue is that accuracy can be extremely misleading.
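
For reference, one standard fast formulation (possibly, though not certainly, the one in the post) computes AUC with a single sort of the prediction scores, via the Mann-Whitney rank statistic:

    \mathrm{AUC} = \frac{\sum_{i \in \mathrm{pos}} r_i - n_+(n_+ + 1)/2}{n_+ \, n_-}

where r_i is the rank of the i-th positive example’s score among all n_+ + n_- scores. The sort dominates, so the whole computation runs in O(n log n) rather than the O(n^2) of naive pairwise comparison.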

Read Jean-Francois’ explanation and scroll down for the Python sample.

Spotting SQL Agent Job Differences

Rob Sewell shows us how to compare SQL Agent jobs across Availability Group replicas:

On the plane home from PASS Summit I was sat next to someone who had also attended, and when he saw on my laptop that I was part of the SQL Community we struck up a conversation. He asked me how he could compare SQL Agent Jobs across availability group replicas to ensure that they were the same.

He already knew that he could use Copy-DbaAgentJob from dbatools to copy the jobs between replicas and we discussed how to set up an Agent job to accomplish this. The best way to run an Agent Job with a PowerShell script is described here.
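
If you just want a quick manual check before reaching for the dbatools route, a rough sketch is to run something like this against each replica and diff the results (Copy-DbaAgentJob remains the more complete answer):

    -- List Agent jobs with modification dates; run on each replica and compare output.
    SELECT name, enabled, date_modified
    FROM msdb.dbo.sysjobs
    ORDER BY name;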

Maybe I’m out in la-la land, but wouldn’t this be a perfect use case for MSX/TSX?
