Love Your Bugs

Allison Kaptur has an appreciation for strange bugs:

I love this bug because it shows that bitflips are a real thing that can happen, not just a theoretical concern. In fact, there are some domains where they’re more common than others. One such domain is if you’re getting requests from users with low-end or old hardware, which is true for a lot of laptops running Dropbox. Another domain with lots of bitflips is outer space – there’s no atmosphere in space to protect your memory from energetic particles and radiation, so bitflips are pretty common.

You probably really care about correctness in space – your code might be keeping astronauts alive on the ISS, for example, but even if it’s not mission-critical, it’s hard to do software updates to space. If you really need your application to defend against bitflips, there are a variety of hardware & software approaches you can take, and there’s a very interesting talk by Katie Betchold about this.

This is an interesting approach, although I fear I may not have the temperament to love my bugs so much as despise their existence.

Curl For Windows Now Uses WinSSL

David Smith notes that curl version 3.0 supports the winSSL library rather than installing OpenSSL:

To implement secure communications, the curl package needs to connect with a library that handles the SSL (secure socket layer) encryption. On Linux and Macs, curl has always used the OpenSSL library, which is included on those systems. Windows doesn’t have this library (at least, outside of the Subsystem for Linux), so on Windows the curl package included the OpenSSL library and associated certificate. This raises its own set of issues (see the post linked below for details), so version 3.0 of the package instead uses the built-in winSSL library. This means curl uses the same security architecture as other connected applications on Windows.

This shouldn’t have any impact on your web-connectivity from R now or in the future, except the knowledge that the underlying architecture is more secure. Nonetheless, it’s possible to switch back to OpenSSL-based encryption (and this remains the default on Windows 7, which does not include the winSSL).

Click through for more information.

Scraping SQL Saturday Statistics

Tomaz Kastrun shows how to use rvest to read the SQL Saturday website and parse schedule details:

I wanted to check a simple query: How many times has a particular topic been presented and from how many different presenters.

Sounds interesting, tackling the problem should not be a problem, just that the end numbers may vary, since there will be some text analysis included.

Read on for the code and some analysis.

ROWGROUP_FLUSH Deadlocks Inserting Into Clustered Columnstore Indexes

Joe Obbish ran into a strange deadlock when performing concurrent insertions into a clustered columnstore index:

We’ve only observed this deadlock with multiple concurrent sessions insert to the delta store for the same target CCI due to server memory pressure or very low cardinality estimates (less than 251 rows). The correct mitigation depends on why you’re seeing the issue in the first place. If you’re seeing it due to low cardinality estimates then fix your estimates, or at the very least get them above 250 rows. If you’re seeing them because the memory grant for the CCI build times out after 25 seconds then lower concurrency or increase server memory.

The problem can also be avoided by not doing concurrent inserts in the first place. In some cases a parallel insert may be a reasonable alterative. There’s also some evidence that the deadlock is only seen when the number of rows for insert is very close to 1048576, but we weren’t able to make any definitive conclusions around that.

Read the whole thing.  Also check out his Connect item.

Making CXPACKET The Bad Guy

Pedro Lopes goes into detail regarding the new CXCONSUMER wait stat:

It’s important to mention a couple notes before getting into details:

  1. Parallelism use is intrinsically a benefit for costly queries, bound by the Cost Threshold for Parallelism and Max Degree of Parallelismserver configurations among other considerations. See the Query Processing Guide for more in-depth information on how SQL Server implements parallelism. Also refer to Craig Freedman’s blog series on the topic.
  2. Waits are a normal part of the waits and queues model, allowing SQL Server to concurrently execute many more requests than there are schedulers available. See more about waits and queues here,

It’s also important to understand that parallelism is implemented as if it were two operators. There’s the producers that push data to consumers, and the consumers that may have to wait for data from producers.

And so it becomes easier to understand how producer waits are the ones that may require attention, while consumer waits are inevitable as a passive consequence of longer running producers.

I actually would have preferred the rename go the other way and create a CXPRODUCER which is bad, rather than a CXCONSUMER which is benign.  The problem is that if you’re running in a heterogeneous environment, you have to remember that CXPACKET on Server A might indicate a problem, whereas it’s much fuzzier on Server B because you haven’t upgraded Server B to the latest CU yet.  Regardless, I’m happy that we’re going to be able to have a better idea when parallelism is actually a problem.

JSON Parsing Performance

Jovan Popovic has a couple posts showing JSON parsing performance in SQL Server 2017.  First up, he makes the statement that JSON parsing can be 10x faster than XML:

I’m running 10000 iterations and getting the average spent time.

Average time to parse JSON and get the value using JSON_VALUE() function is around 3 microseconds while the equivalent action with XML typed variable and value() method takes between 30 and 40 microseconds.

This shows that parsing JSON “plain text” is 10x faster than parsing strongly typed XML variable.

He also compares JSON parsing to spatial type parsing:

I’m running 100.000 iterations and getting the average spent time in both cases. The results are:

  1. Spatial type needs around 11-14 microseconds to get the X coordinate
  2. JSON parser needs around 1-2 microseconds to get the value from X property (including CAST to float)

We can see that parsing JSON text is much faster than the equivalent operation in Spatial type.

JSON uses NVARCHAR type that is the most optimized type in SQL Server because it is used in most of the queries in SQL Server. JSON parser is based on T-SQL parser that is also very optimized. Spatial has a type, but it is still some binary serialized object that needs to be deserialized when we need to access the fields. In this case, scanning NVARCHAR is much faster than deserializing binary data into spatial.

On this side, Jovan does note that there are benefits to using spatial types, like performing spatial calculations.

Upgrading A Cluster To Windows Server 2016

Ryan Adams shows how to upgrade a failover cluster running Windows Server 2012 R2 to Windows Server 2016 without having to start from scratch:

Starting in Windows Server 2012 R2 you now have a way to upgrade a cluster to Windows 2016.  The best part is it’s not an OS upgrade, but a rebuild.  The magic is that you can join a Windows 2016 server to a Windows 2012 R2 cluster.  You can upgrade your cluster with as little as one failover and thus very little down time.  Everything stays in compatibility mode until all nodes are upgraded to Windows 2016 and then you upgrade the cluster functional level.  This is great news for those of us running FCIs or AGs.

Click through for a listing of steps and a video.

Categories

November 2017
MTWTFSS
« Oct  
 12345
6789101112
13141516171819
20212223242526
27282930