Press "Enter" to skip to content

Curated SQL Posts

Alternative Storage Engines for PostgreSQL

Cristophe Pettus puts together a list:

PostgreSQL 12 shipped the table access method API in October 2019, and the community spent the next six years figuring out what to do with it. The early prediction was that within a few releases we would have a thriving ecosystem of pluggable storage engines — columnar for analytics, undo-log for OLTP, in-memory for hot workloads — and the heap would become “the default, not the only choice.”

That is almost what happened. The ecosystem exists. It is thriving in the sense that there are more credible options now than at any point in the project’s history. It is also messier than the early vision implied, more concentrated around a couple of design philosophies than was expected, and littered with the bones of projects that ran out of funding or hit walls in the API itself. Before the next post in this series puts numbers on any of this, it is worth taking stock of what is actually out there, what each project is trying to do, and where the architectural fault lines run.

Click through for a survey of who’s tried what and what’s still around today.

Leave a Comment

T-SQL Snapshot Backups on Hyper-V

Anthony Nocentino takes a backup:

If you’ve been following my T-SQL Snapshot Backup series, you’ve seen this technique work on bare-metal and standard VM deployments where database files live on volumes directly presented to the SQL Server OS. In this post, I’m bringing T-SQL Snapshot Backup into a Hyper-V cluster environment, with database files on VHDXs backed by a Pure Storage FlashArray Cluster Shared Volume (CSV). Hyper-V adds a few extra layers to manage at the hypervisor level, but the SQL Server side of the story is identical. Let’s walk through it.

Click through to see how it all works. It has taken me a long time to accept the idea that storage volume snapshots could fit the bill, and it’s been people like Anthony, Andy Yun, and Mark Wilkinson who have allayed my concerns over the years.

Leave a Comment

Tips for Disaster Response

Christophe Pettus shares some advice:

  • Wind your watch.

No one has a watch that winds anymore, but the point is: take a deep breath. Give yourself a minute, or two, or five to gather data. A too-fast response is the main way a problem becomes a disaster.

Click through for all seven of them. I fully agree with doing drills. If you don’t practice in the easy times, you probably won’t respond well in the harder times. Also, I recommend having the process written down on a one-pager that everyone has a copy of. This should include the most important details around emergency response: how to escalate, what information to start capturing early on, etc.

Leave a Comment

Page Splits and Readaheads on Clustered Tables

Erik Darling learns us some T-SQL:

But we can see that the number of page splits that have occurred on the server have gone up a bit here. We can see that that number has increased. So if we look back at the table itself now, and we look in here, right, we’re still not going to have any forwarded fetches because that’s never a thing.

But we do have a lot more pages in the table now, and the average page space used in percent has gone down dramatically. This was at 99-something percent. We are now below 50%.

We are at 46% full. Well, that doesn’t feel too good. What this means is that SQL Server has a lot more pages in the table now that are a whole lot less full, which means we are sort of like yesterday when we deleted a bunch of data from the heap and we still read empty pages.

Click through for the video.

Leave a Comment

Cold Storage Data Archival

Brent Ozar answers a question:

At least once a month, I get this question from a client:

We have big data, and we want to save money. We want to move the older data down to some kind of archives that will cost us less.

I ask exactly one followup question:

Are you willing to modify the app that queries the data, or do you want the users to still be able to query the old data in their existing screens and reports?

Brent recommends trying a linked server approach. I’d also toss PolyBase in to the mix, though it’s going to have similar pros and cons to linked servers in this specific scenario. I have an old post on cold storage with PolyBase but the mechanisms haven’t changed much. I do have some sample scripts in my PolyBase talk’s GitHub repo, though fair warning that these are going away soon as I rebuild the talk for 2025.

Leave a Comment

The Power of COALESCE()

Lukas Vileikis shows off a bit of ANSI SQL syntax:

When the need to deal with NULL values arises, multiple queries come onto the scene. The SQL COALESCE function is one of them. In simple terms, the SQL COALESCE function is a ‘fallback’ mechanism for missing data. Its only task is to return the first non-NULL value from a list of values.

I used to be a big believer in COALESCE() all of the time, but it turns out that ISNULL() is faster if you only have two things to compare. Granted, it’s not a huge difference in speed, as I recall, but the difference is there.

Leave a Comment

MSG 10054 Failures over VPN

Tim Radney takes us through a connection failure error:

Over the past few weeks, I’ve been contacted by multiple customers experiencing the same frustrating issue. Applications and SSMS sessions that had been rock-solid for years suddenly started throwing errors when connecting over VPN:

  • Msg 10054: “An existing connection was forcibly closed by the remote host.”
  • “Connection Failure (status code = 3000, [Microsoft][ODBC Driver 17 for SQL Server] The connection is broken and recovery is not possible…”

Read on to learn what’s going on and how you can remediate it.

Leave a Comment

Setting Function Parameters for Debugging in R

Jason Bryer has a function:

I tend to write a lot of functions that create specific graphics implemented with ggplot2. Although I try to pick graphic parameters (e.g. colors, text size, etc.) that are reasonable, I will typically define all relevant aesthetics as parameters to my function. As a result, my functions tend to have a lot of parameters. When I need to debug the function I need to have all those parameters set in the global environment which usually requires me highlighting each assignment and running it. This function automates this process.

Click through to see how it works. H/T R-Bloggers.

Leave a Comment

Database Deployment Variables with SQLCMD

Andy Brownsword changes a variable:

A regular Database Project deployment is static and delivers consistent results regardless of environment. When it comes to schema, that’s usually desired, but data is a different story.

Data is environment specific. You want a Database Project that works across all environments. You want smarter deployments. You need SQLCMD Variables.

These have been the go-to method for handling different environments and other things that change between releases since I started using database projects about 15 years ago. Looks like not a lot has changed on this front, but it’s good to see that they still work as expected.

Leave a Comment