Press "Enter" to skip to content

Curated SQL Posts

Copying Azure SQL Databases between Subscriptions

Kenneth Fisher is back in the fight:

I recently had to copy an Azure SQL database (SQL db) from one subscription to an Azure SQL Server instance in another subscription. All of the help I found suggested going to the database and hitting the COPY option. Unfortunately, when I did, I ran into a problem.

Read on for the issue, as well as one way to fix it. The route Kenneth landed on was the same one I ended up going with when I had a similar problem and very limited access to SQL DB on both subscriptions.
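
If you only need the gist of the T-SQL route, cross-server copies of an Azure SQL database are kicked off from the master database of the target logical server with CREATE DATABASE ... AS COPY OF. Here is a minimal sketch; the server and database names are placeholders, and this is not necessarily the exact sequence Kenneth walks through:

-- Connect to the master database on the TARGET logical server
-- (the one in the destination subscription) with a login that has
-- rights to create databases there and to read the source database.
CREATE DATABASE [SalesDb_Copy]
AS COPY OF [source-server].[SalesDb];

-- The copy runs asynchronously; check on it from master.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'SalesDb_Copy';

SELECT *
FROM sys.dm_database_copies;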


Fixing Slow Row-Level Security Policies in PostgreSQL

Dian Fay troubleshoots some row-level security slowness:

At my day job, we use row-level security extensively. Several different roles interact with Postgres through the same GraphQL API; each role has its own grants and policies on tables; whether a role can see record X in table Y can depend on its access to record A in table B, so these policies aren’t merely a function of the contents of the candidate row itself. There’s more complexity than that, even, but no need to get into it.

Read on for a dive into row-level security and several tips to make the operation faster.
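
To picture the kind of policy Dian is describing, here is a minimal, hypothetical sketch in which whether you can see a document depends on rows in a separate memberships table rather than on the document row itself; the schema and the trailing index are illustrative, not Dian's actual setup:

-- Hypothetical schema: document visibility depends on membership rows.
CREATE TABLE memberships (
    member  text    NOT NULL,
    team_id integer NOT NULL
);

CREATE TABLE documents (
    id      serial  PRIMARY KEY,
    team_id integer NOT NULL,
    body    text
);

ALTER TABLE documents ENABLE ROW LEVEL SECURITY;

CREATE POLICY team_docs ON documents
    FOR SELECT
    USING (EXISTS (
        SELECT 1
        FROM memberships m
        WHERE m.team_id = documents.team_id
          AND m.member  = current_user
    ));

-- Without an index backing the policy's lookup, that EXISTS is evaluated
-- the hard way for every candidate row; an index is a common first fix
-- when policies like this get slow.
CREATE INDEX ON memberships (member, team_id);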


Adaptive Joins and Equivalent Plans

Forrest McDaniel has a public service announcement:

In case you haven’t heard, moral decline is everywhere. The latest generation of query operators is a prime example of this rot. “But this is programming” you say, “what do morals have to do with anything?” No. This is SQL. And for us, we have the term morally equivalent plans which allows me to finally have an ethical excuse for sprinkling a post with degenerate puns.

I bit my tongue specifically to avoid doing exactly what Forrest does. So click through to read all of the terrible puns.


Efficiency of Sparse Hash Tables in PostgreSQL

Ashutosh Bapat runs some tests:

The hash_create() API in PostgreSQL takes an initial size as an argument. It allocates memory for that many hash entries upfront. If more entries are added, it will expand that memory later. The point of argument was what should be the initial size of the hash table, introduced by that patch, containing the derived clauses. During the discussion, David hypothesised that the size of the hash table affects the efficiency of the hash table operations depending upon whether the hash table fits in a cache line. While I thought it was reasonable to assume so, the practical impact wouldn’t be noticeable. I thought that beyond saving a few bytes choosing the right hash table size wasn’t going to have any noticeable effects. If a derived clause lookup or insert became a bit slower, nobody would even notice it. It was practically easy to address David’s concern by using the number of derived clauses at the time of creating the hash table to decide the initial size of the hash table. The patch was committed.

Read on to see how things didn’t quite turn out this way, and what the results of testing look like.


LakeBench Now Available

Miles Cole makes an announcement:

I’m excited to formally announce LakeBench, now in version v0.3, the first Python-based multi-modal benchmarking library that supports multiple data processing engines on multiple benchmarks. You can find it on GitHub and PyPi.

Traditional benchmarks like TPC-DS and TPC-H focus heavily on analytical queries, but they miss the reality of modern data engineering: building complex ELT pipelines. LakeBench bridges this gap by introducing novel benchmarks that measure not just query performance, but also data loading, transformation, incremental processing, and maintenance operations. The first of such benchmarks is called ELTBench and is initially available in light mode.

Click through to see how it works and grab a copy if you’re interested.


Extending Ola’s Backup Scripts for Scale

Aaron Bertrand looks at some extensions to a venerable solution:

Where I work, we have hundreds of mission-critical databases. We use Ola Hallengren’s SQL Server backup solution, but, out of necessity, have wrapped it with several enhancements to help us accomplish goals around retention, SQL Server backup frequency, and recovery point objectives. I touched on some of this last year when I talked about automating some of our restore testing, in It’s a recovery strategy, not a backup strategy. Today, we do a lot more than that, and we have to be more flexible about it.

Click through to see what Aaron’s environment looks like and some of the things they’ve done to make recovery work better.
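
For reference, a plain, unwrapped call to Ola's DatabaseBackup procedure looks something like the following; the path, retention, and database selection are illustrative values, and the sort of wrapper Aaron describes layers its scheduling, retention, and recovery-point logic on top of calls like this one:

-- A typical standalone call to Ola Hallengren's DatabaseBackup procedure.
EXECUTE dbo.DatabaseBackup
    @Databases   = 'USER_DATABASES',
    @Directory   = N'\\backupshare\sql',  -- illustrative path
    @BackupType  = 'FULL',
    @Compress    = 'Y',
    @CheckSum    = 'Y',
    @Verify      = 'Y',
    @CleanupTime = 24;                    -- hours of backups to keep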


Looking at PostgreSQL Internals with pg_visibility

Ian Parker introduces us to an extension:

When most people think about PostgreSQL internals, they picture tables, indexes, and perhaps the VACUUM process. Tucked away in the contrib extensions, however, is a tool that exposes what really sits on disk: the pg_visibility extension. By querying it, you see how PostgreSQL tracks visibility and freezing at the page level—information that directly affects vacuum efficiency and index-only scans.

Read on to see how this extension works.
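
If you want to poke around yourself, pg_visibility ships with contrib and its functions are easy to try; here is a quick sketch against a placeholder table name:

CREATE EXTENSION IF NOT EXISTS pg_visibility;

-- Per-page visibility-map bits: all-visible pages can satisfy
-- index-only scans; all-frozen pages can be skipped when freezing.
SELECT blkno, all_visible, all_frozen
FROM pg_visibility_map('my_table')
LIMIT 10;

-- Rolled-up counts for the whole relation.
SELECT * FROM pg_visibility_map_summary('my_table');

-- Sanity checks: TIDs whose visibility-map bits look inconsistent.
SELECT * FROM pg_check_visible('my_table');
SELECT * FROM pg_check_frozen('my_table');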


The Limits of Transparent Data Encryption

Steve Stedman breaks down some myths around Transparent Data Encryption in SQL Server:

Transparent Data Encryption (TDE) in SQL Server is a powerful feature designed to protect data at rest by encrypting database files. While TDE is an excellent tool for securing data stored on disk, it’s often misunderstood, leading to assumptions about its capabilities. In this blog post, we’ll clarify what TDE does not do to help you better understand its scope and limitations, ensuring you implement the right security measures for your SQL Server environment.

Click through for the list.
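
As a quick refresher on what TDE actually covers, it is enabled per database with a certificate-protected database encryption key, and you can check its state through sys.dm_database_encryption_keys. Here is a minimal sketch with placeholder names and password; the point to keep in mind is that this only protects the files at rest:

-- Minimal TDE setup; key, certificate, and database names are placeholders.
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';

USE SalesDb;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;
ALTER DATABASE SalesDb SET ENCRYPTION ON;

-- Which databases are encrypted and how (encryption_state 3 = encrypted).
-- TDE encrypts data and log files on disk; it does nothing for data in
-- memory, on the wire, or for who is allowed to SELECT it.
SELECT DB_NAME(database_id) AS database_name,
       encryption_state,
       key_algorithm,
       key_length
FROM sys.dm_database_encryption_keys;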


Not Patching: the False Sense of Security

Mike Walsh explains why it’s important to keep up to date on patches:

At first, I was a bit shocked, but then I realized it actually makes sense and it matches a regret I hear far too often from folks who call us too late: “If we don’t change too much, we’re safe.”

Sadly, nothing can be further from the truth. When a software vendor releases a fix for a vulnerability, it certainly could be caused by a recent patch or update – but more often than not that fix is for a longstanding issue that has been until recently unknown to the world.

Read on for Mike’s argument. This also applies to people running versions of SQL Server no longer receiving patches (hint: 2016 SP3 and earlier). Just because there isn’t a patch doesn’t mean there isn’t a vulnerability.
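
If you are not sure where a given instance stands, checking the build and servicing level is a cheap first step; a small sketch (ProductUpdateLevel is only populated on newer builds):

-- Report version and servicing level to compare against current releases.
SELECT SERVERPROPERTY('ProductVersion')     AS product_version,
       SERVERPROPERTY('ProductLevel')       AS product_level,        -- e.g. RTM, SP3
       SERVERPROPERTY('ProductUpdateLevel') AS product_update_level, -- e.g. CU12, where available
       @@VERSION                            AS full_version_string;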
