Press "Enter" to skip to content

Curated SQL Posts

Using the Dedicated Admin Connection

Reitse Eskens has the key for the back door:

I got a call this morning from a coworker. One of the database instances was unreachable with the message that the TempDB log file was full. No processes could login and the only way to get things going again might be to restart either server or instance. Restarting the instance should be the last resort, because after a restart SQL Server will plough through all the logfiles to find transactions to either roll forward or roll back. That will take more time that you want. But, we had the famous DAC backdoor installed.

Read on to see how the DAC can save the day.

Comments closed

Finding Queries with Index Hints

Aaron Bertrand wants to find queries using index hints:

Index hints should be a last resort, but we all know how that tends to go in SQL Server. If you have committed to using index hints, be aware that at some point you may have to update those explicit references. A specific index may get renamed or dropped, making the hints invalid, or a new and better index may be created. How do we find these references to remove, update, or at least document them? I see a lot of queries out there that will search the plan cache for index hints (often using fuzzy matching), or that just happen to use a specific index, but what about queries with explicit index hints that aren’t currently in the plan cache, and is index usage really what we’re after?

Click through for a detailed investigation, and also congratulations to Aaron for landing at Stack Overflow.

Comments closed

Finding Processes Waiting on Spinlocks

David Fowler helps us find processes waiting on spinlocks:

Spinlock contention is always a real headache to deal with. I recently saw an issue when spinlock contention on SOS_CACHESTORE was making the server virtually unresponsive. The issue was very intermittent with no obvious pattern but the assumption is that it was caused by a particular process in the application. Finding what that process was the tricky part, they don’t show up as waiting tasks so your usual scripts for looking for waiting processes may not work here.

It’s almost never spinlocks, but when it is, this is how you figure it out.

Comments closed

Thinking like an Escalation Engineer

Stacy Gray shares stories:

“You new?” asked with an amused grin.

“Yes,” I replied floating 2 inches off the ground with a huge, toothy smile.

“Which team?”

“SQL!”

“Good luck.”

I glanced at the badge.  It was blue.  My opportunity to get some secret, inside wisdom!

“I want to become a blue badge.  Do you have any advice on that?” The elevator doors opened.

“Solve your own cases,” was the reply.

Read on for stories, advice, and more.

Comments closed

Azure Synapse Analytics Shared Security

Hiram Fleitas explains the value of workspace and storage account segregation in Azure Synapse Analytics:

Well, why?… perhaps you prefer not spinning more resources to segmentate the environment or decouple the workloads, but you still need to enforce data security across domains.

Lets look at how to secure an HR container in a shared Azure Synapse Analytics workspace that serves mixed workloads by using only RBAC permissions at the storage, and at container level.

It’s recommended to use a separate storage account. I will explain and demo why.

Click through for the demo and explanation.

Comments closed

Azure Synapse Data Explorer Pools

Manoj Raheja tries announces another pool type:

At Ignite, we announced the public preview of Azure Synapse data explorer that makes it possible to query huge amounts of structured, semi-structured, and free-text telemetry and time-series data. The following are some of the key capabilities that make this possible:

– Powerful distributed query engine that indexes all data including free text and semi-structured data. The data is automatically compressed, indexed, auto-optimized, and cached on local SSDs and persisted on storage. Compute and storage are decoupled that gives you full elasticity to auto scale in/out without a downtime.

– Intuitive Kusto Query Language (KQL) that is highly optimized for exploring raw telemetry and time series data using Synapse data explore’s best-in-class text indexing for efficient free-text search, regex, and parsing on traces\text data.

– Comprehensive JSON parsing capabilities for querying semi-structured data including arrays and nested structure.

– Native, advanced time series support for creation, manipulation, and analysis of multiple time series with in-engine Python and R execution support for model scoring.

Click through for a demonstration, showing that this is for more than just logs.

Comments closed

Sort-Based Blocking Shuffle in Flink

Yingjie Cao and Daisy Tsang have a multi-part series on sort-based blocking shuffles in Apache Flink. Part 1 acts as an overview:

The hash-based blocking shuffle has been supported in Flink for a long time. However, compared to the sort-based approach, it can have several weaknesses:

1. Stability: For batch jobs with high parallelism (tens of thousands of subtasks), the hash-based approach opens many files concurrently while writing or reading data, which can give high pressure to the file system (i.e. maintenance of too many file metas, exhaustion of inodes or file descriptors). We have encountered many stability issues when running large-scale batch jobs via the hash-based blocking shuffle.

2. Performance: For large-scale batch jobs, the hash-based approach can produce too many small files: for each data shuffle (or connection), the number of output files is (producer parallelism) * (consumer parallelism) and the average size of each file is (shuffle data size) / (number of files). The random IO caused by writing/reading these fragmented files can influence the shuffle performance a lot, especially on spinning disks. See the benchmark results section for more information.

By introducing the sort-based blocking shuffle implementation, fewer data files will be created and opened, and more sequential reads are done. As a result, better stability and performance can be achieved.

Part 2 provides some design considerations:

As discussed above, the hash-based blocking shuffle would produce too many small files for large-scale batch jobs. Producing fewer files can help to improve both stability and performance. The sort-merge approach has been widely adopted to solve this problem. By first writing to the in-memory buffer and then sorting and spilling the data into a file after the in-memory buffer is full, the number of output files can be reduced, which becomes (total data size) / (in-memory buffer size). Then by merging the produced files together, the number of files can be further reduced and larger data blocks can provide better sequential reads.

Flink’s sort-based blocking shuffle adopts a similar logic. A core difference is that data spilling will always append data to the same file so only one file will be spilled for each output, which means fewer files are produced.

Check it out for a behind-the-scenes look at

Comments closed

Testing Azure SQL MI Premium

Joe Obbish reaches for the top shelf:

At Microsoft Ignite 2021, public preview for new “premium-series” hardware was announced for Azure SQL Managed Instances. There’s even a black friday sort of sale during this month where you can do testing on premium-series VMs without paying for the compute costs. As someone without free cloud bucks: sign me up!

I did some basic query benchmarking to get an idea of the performance difference between the new premium VMs and the standard gen 5 VMs. The test VMs aren’t identical in specs: the standard-series has 4 vCore with 20.4 GB of memory and the premium-series has 8 vCore with 56 GB of memory. I will attempt to call out any situations where that spec difference had a measurable impact.

Read on for Joe’s findings.

Comments closed

Restoring a Database from Backup in Docker

Chad Baldwin has a container in search of a database:

Yesterday, I was watching a Pluralsight course which provided a database .bak file to follow along with the examples. I generally like to use Docker when working with SQL Server locally…but as a somewhat novice user, I have found it to be a bit of a pain if you need to deal with restoring or attaching a database.

When I run into these scenarios, I usually spin up an AWS EC2 instance, install SQL server, and work with it that way. There’s probably a simpler way to do it using RDS or Azure, but I’m not familiar with those just yet. The other option is if I have a Linux machine at hand, I will use that with Docker and mapped volumes work great.

I do happen to have a Linux machine ready to use…but I was determined to figure out how to get this working on Windows.

Bonus points for using RESTORE DATABASE syntax. Every SQL Server user should know how to back up and restore a database using only T-SQL. That’s a skill which will definitely pay dividends.

Comments closed

The Cause of Resource Semaphore Waits

Chad Callihan has an analogy for us:

Have you ever spent Black Friday shopping, filled up your car, and then ran out of space at the end for a big purchase? Your vehicle is already full but that oversized exercise equipment is too big of a deal to pass up! You’re going to have to wait until you can unload at home first before there’s room for that new clothes rack…um…I mean exercise equipment.

That’s kind of the same idea as RESOURCE_SEMAPHORE waits. SQL Server has a large query (that exercise equipment) but not enough memory to execute (purchase).

Read on for Chad’s explanation of how you can deal with RESOURCE_SEMAPHORE waits, but as someone who drives a Miata, of course I have stories. My wife and I went to the mall one day, a few years back, and we bought a space heater for the room above our garage. Well, that thing was just a little too big for the trunk and so, on a 40-something degree day (approximately -15,000 in Celsius, I’m pretty sure), here we are driving down the highway in a Miata with the top down and my wife’s arms wrapped around this box (and mind you, the box was bigger around than she is, so those arms don’t quite get all the way around that box) sitting in the passenger’s seat. Good times.

Now, in fairness to that car, you can fit a lot of stuff in a Miata trunk. Another time, we’d bought a large area rug for our dining room, as well as a lengthy floor runner. When we made the purchase, I thought it was going to be delivered, but nope. I did learn that day, however, just how compressible a rug is, as they proceeded to bind that thing so together that it fit comfortably into the trunk, where by “comfortably” I mean “barely but I’m trying to sell up how large this trunk is.” As for the floor runner, we had to cram it into the space between the headrests on our seats and the raised top, where it just barely fit. Sure, I couldn’t see behind me, but winners never look back.

Comments closed