Press "Enter" to skip to content

Category: Bugs

Security Breach in Cosmos DB: ChaosDB

Nir Ohfeld and Sagi Tzadik discovered a flaw in Azure Cosmos DB:

Nearly everything we do online these days runs through applications and databases in the cloud. While leaky storage buckets get a lot of attention, database exposure is the bigger risk for most companies because each one can contain millions or even billions of sensitive records. Every CISO’s nightmare is someone getting their access keys and exfiltrating gigabytes of data in one fell swoop.

So you can imagine our surprise when we were able to gain complete unrestricted access to the accounts and databases of several thousand Microsoft Azure customers, including many Fortune 500 companies. Wiz’s security research team (that’s us) constantly looks for new attack surfaces in the cloud, and two weeks ago we discovered an unprecedented breach that affects Azure’s flagship database service, Cosmos DB.

Read on for details about the attack. Microsoft has already mitigated the issue by disabling the functionality necessary to pull off the attack. H/T Ben Stegink.

Comments closed

Dealing with Non-Yielding Schedulers

Sean Gallardy breaks up the party:

One of the most common items that will cause a memory dump in SQL Server is a non-yielding scheduler (generally referred to as NYS). What the heck does that mean? Why would it cause a memory dump? Is there anything that can be investigated? Good questions, let’s take a look.

Read on to learn what these are, why they’re not something you want to deal with on a regular basis, and how you can get more information on what happened out of a dump file. Which is also going to be helpful for Microsoft staff to diagnose and correct the underlying issue (if possible).

Comments closed

Ways to avoid the MERGE Operator

Michael J. Swart has important bullet points:

Aaron Bertrand has a post called Use Caution with SQL Server’s MERGE Statement. It’s a pretty thorough compilation of all the problems and defects associated with the MERGE statement that folks have reported in the past. But it’s been a few years since that post and in the spirit of giving Microsoft the benefit of the doubt, I revisited each of the issues Aaron brought up.

Some of the items can be dismissed based on circumstances. I noticed that:

– Some of the issues are fixed in recent versions (2016+).

– Some of the issues that have been marked as won’t fix have been fixed anyway (the repro script associated with the issue no longer fails).

– Some of the items are complaints about confusing documentation.

– Some are complaints are about issues that are not limited to the MERGE statement (e.g. concurrency and constraint checks).

Spoilers: some + some + some + some is still a lot less than all. Read the whole thing.

Comments closed

Memory-Optimized tempdb Metadata Bug

Mark Wilkinson shines a light on a bug:

Hey folks! Today I’ll share another bug we found recently when testing some new SQL Server 2019 instances in our development environment. This one is a bit concerning as it could affect a lot of shops and the bug presents itself during a pretty common use case.

The unfortunate thing is that ours is exactly the type of environment that memory-optimized tempdb metadata was made for.

Comments closed

Categorizing Why Bugs Can Be Tricky

Julia Evans has a list:

Hello! I’m very slowly working on writing a zine about debugging, so I asked on Twitter the other day:

If you’ve run into a bug where it felt “impossible” to understand what was happening – what made it feel that way?

Of course, bugs always happen for logical reasons, but I’ve definitely run into bugs that felt like they might be impossible for me to understand (until I figured them out!)

I got about 400 responses, which I’ll try to summarize here. I’m not going to talk about how to deal with these various kinds of “impossible” bugs in this post, I’ll just try to classify them.

Click through for the major categories, as well as explanations and sub-categories. I think an interesting follow-up to this is to ask why we find ourselves in situations where we get these sorts of bugs and what (if anything) we can do to minimize or eliminate the likelihood of their appearance.

Comments closed

Bug with Filtered Index on Computed Column

Erik Darling points out a weird bug:

At some point in the past, I blogged about a silent bug with computed columns and clustered column store indexes.

In this post, I’m going to take a quick look at a very loud bug.

Normally, you can’t add a filtered index to a computed column. I’ve always hated that limitation. How nice would that be for so many currently difficult tasks?

Click through to see how you can create a filtered index against a computed column, as well as all of the pain it provides.

Comments closed

Recursive UDF Bug in SQL Server 2019

Erik Darling finds the bugs so you don’t have to:

I see people do things like this fairly often with UDFs. I don’t know why. It’s almost like they read a list of best practices and decided the opposite was better.

This is a quite simplified function, but it’s enough to show the bug behavior.

While writing this, I learned that you can’t create a recursive (self-referencing) scalar UDF with the schemabinding option. I don’t know why that is either.

I’m going to go out on a limb and say that if you run into this bug, it’s your own fault.

Comments closed

sample_ms in sys.dm_io_virtual_file_stats and Data Types

Paul Randal points out an interesting bug:

A prior student emailed me yesterday about some strange behavior of the sample_ms column in sys.dm_io_virtual_file_stats. It’s supposed to be the number of milliseconds since the SQL Server instance has been started. He has a SQL Server 2016 instance that’s been running since August 2019, and it shows the following:

ms_ticks from sys.dm_os_sys_info: 51915112684 (which works out to be August 28, 2019)

sample_ms from sys.dm_io_virtual_file_stats: 375504432 (which works out to be about 4.5 days)

Read on to get a determination and an alternative in case you’re using that field.

Comments closed

Bug around Parallel Eager Spools and Batch Mode

Paul White digs into a nasty bug:

more accurate description of the issue would be:

This bug can cause wrong results, incorrect error messages, and statement failure when:

– A data-modification statement requires Halloween Protection.
– That protection is provided by a Parallel Eager Spool.
– The spool is on the probe side of a Batch Mode Hash Join.

This issue affects Azure SQL Database and SQL Server 2014 to 2019 inclusive.

Read on for a repro and Paul’s thoughts. As of March 2021, this is an active problem, so it’s worth keeping an eye on in your environment. I’d wager, though, that this probably doesn’t pop up on its own very frequently.

Comments closed

Hyperscale and RBIO_RG_STORAGE

Reitse Eskens runs into a strange bug in Azure SQL Database Hyperscale:

This single wait made sure our complete environment went dead in the water. Everything halted. To get some context, Microsoft has some documentation on this wait:

Occurs when a Hyperscale database primary compute node log generation rate is being throttled due to delayed log consumption at the page server(s).

Well, that’s not really helping, because that’s about everything they tell you about it.

Click through for Reitse’s findings and Microsoft’s advice.

Comments closed