Press "Enter" to skip to content

Curated SQL Posts

Architecting a Jenkins Replacement

Li Haoyi takes us through an internal Databricks tool for continuous integration:

Runbot is a bespoke continuous integration (CI) solution developed specifically for Databricks’ needs. Originally developed in 2019, Runbot incrementally replaces our aging Jenkins infrastructure with something more performant, scalable, and user friendly for both users and maintainers of the service. This blog post will explore the motivations behind developing Runbot, the core design decisions that went into it, and how we used it to greatly improve the experience of all the developers within the Databircks engineering organization.

It doesn’t look like the tool is available externally, but it’s an interesting read and helps understand some of the “why” behind the solution.

Leave a Comment

Moving Files Associated with Availability Groups

Eitan Blumin has a doozy of a short script:

Today, I’m sharing with you a cool Powershell script that basically implements the methodology necessary to move database files to a new location in AlwaysOn Availability Groups, without breaking HADR.

It’s based on a few very useful step-by-step guides on the topic such as this one and this one and this one. But it takes it a step further by being a single cohesive Powershell script that does everything end-to-end.

Well… Almost everything… The only thing it’s missing is somehow disabling any SQL Agent jobs that may be performing backups. I still haven’t figured out how to possibly automate such a thing, so you’d have to do that manually on your own.

Click through for instructions, notes, and warnings, as well as the script itself.

Leave a Comment

Things You Might Not Need in SQL Server

Erik Darling has two posts of a similar theme. First up is that you might not need to offload reads:

Duplicating data for reporting, outside of moving it to a data warehouse where there’s some transformations involved, can be an expensive and perilous task.

Your options come down to a native solution like AGs, Replication, or Log Shipping. You can brew something up yourself that relies on native stuff too, like Change Data Capture, Change Tracking, Temporal Tables, or triggers.

Erik’s suggestion here is that appropriate query tuning (and I’ll add proper database design!) does more for you than scaling out.

Then, Erik takes it one step further and recommends against certain features in SQL Server:

Consulting gives you a lot of opportunities to talk to a lot of people and deal with interesting issues.

Recently it occurred to me that a lot of people seem to confer magic button status to a lot of things that always seem to be If-I-Could-Only-Do-This features that would solve all their problems, and similarly a Thing-That-Solved-One-Problem-Once turned into something that got used everywhere.

I do agree with Erik on partitioning (which makes administration easier but usually only helps with read queries on columnstore indexes with huge numbers of rows), In-Memory OLTP (which could have been an incredible feature if it worked as I’d hoped), dirty reads, and over-use of recompilation. For sufficiently busy environments, I disagree with Erik’s take on fill factor, having been convinced by Jeff Moden that there’s a lot of value in well-thought-out fill factor settings, but to make the most of it requires more knowledge of the data than DBAs typically take the time to learn.

Leave a Comment

Long-Term Storage in Gold

Alex Woodie discusses a new storage solution:

In the hunt for the optimal long-term storage, media comes and media goes. From cave paintings and daguerreotypes to tape and optical disk, humans continue to search for new ways to preserve information for use by subsequent generations. Now an Upstate New York company is touting a new type of archive that leverages an element with the ultimate in longevity: gold.

Totenpass today announced the launch of its novel storage solution that’s made of gold. Data is written directly to the surface of the gold cartridge (which features some nickel) using a laser etching process. The data takes the form of shrunken down human-readable images or documents, or machine-readable data that can be encrypted and read with a smartphone app (which is still in development).

This makes perfect sense to me. If you write it in metal, that means Ruin can’t read or modify it.

Leave a Comment

Measuring the Performance Impact of TDE

Manvendra Singh does the math:

Transparent Data Encryption (TDE) encrypts database files to secure your data. It also encrypts the tempdb database to secure your data in a temporary space. The process of encryption and decryption adds additional overhead to the database system. Even non-encrypted databases hosted on the same SQL Server instance would have some performance degradation because of tempdb encryption. Today I will show you performance impact analysis using few simple T-SQL statements by comparing their stats gathered before and after enabling TDE.

I will execute T-SQL statements (INSERT, UPDATE, SELECT, BACKUP DATABASE) before and after enabling encryption (Transparent Data Encryption) and gather their performance statistics during each execution. Finally, once we will have performance stats taken before and after enabling encryption then we will compare them to understand the performance impact analysis. I have used DBCC DROPCLEANBUFFERS before executing each query to clean the buffer cache.

Click through for the results. My rule of thumb is about a 5% performance overhead, but Manvendra shows us some of the particulars of what that means.

Leave a Comment