
Curated SQL Posts

ForEach Loops in PowerShell

Garry Bargsley continues a series on PowerShell capabilities:

Welcome back to PowerShell Strikes Back. We’re three weeks in, and the training is paying off. In Week 1, we learned that quotes are not interchangeable. In Week 2, we put variables to work – storing server names, config values, service objects, and boolean results. If you’ve been following along and running the examples in your own environment, you’re already writing better PowerShell than you were a month ago.

This week, we tackle the concept that transforms a script from a one-time operation into an actual tool: the ForEach loop.

Garry also ties in error handling, which is important during loop iteration.


Performing ELT with Python and DuckDB

Jamal Hansen shows off a capable in-memory analytic database:

This is a real-world example of a common data engineering pattern. You may have heard of ETL (Extract, Transform, Load), where data is transformed before it reaches its destination. What we are actually building today is the more modern variant, ELT: Extract, Load, Transform.

Read on for the process. I like DuckDB a lot and this is one of the types of use cases in which it excels.
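To make the ELT ordering concrete, here is a minimal sketch. It is not Jamal's code: it uses Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without extra packages, and the table names, column names, and sample rows are all hypothetical. The point is only the order of operations: land the raw data first, then transform it with SQL inside the database.

```python
import sqlite3

# Hypothetical raw records, standing in for whatever the extract step returns.
RAW_ROWS = [
    ("2024-01-01", "widget", "3", "4.50"),
    ("2024-01-01", "gadget", "1", "10.00"),
    ("2024-01-02", "widget", "2", "4.50"),
]


def run_elt(rows):
    """Load raw rows untouched, then transform them with SQL in the database."""
    conn = sqlite3.connect(":memory:")
    # Load: land the data as-is, with every column still stored as text.
    conn.execute(
        "CREATE TABLE raw_sales "
        "(sale_date TEXT, product TEXT, qty TEXT, unit_price TEXT)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?, ?)", rows)
    # Transform: cast and aggregate after loading, using the database engine.
    conn.execute(
        """
        CREATE TABLE daily_sales AS
        SELECT sale_date,
               SUM(CAST(qty AS INTEGER) * CAST(unit_price AS REAL)) AS revenue
        FROM raw_sales
        GROUP BY sale_date
        """
    )
    result = conn.execute(
        "SELECT sale_date, revenue FROM daily_sales ORDER BY sale_date"
    ).fetchall()
    conn.close()
    return result


daily = run_elt(RAW_ROWS)
for sale_date, revenue in daily:
    print(sale_date, revenue)
```

In an ETL version of the same job, the casting and aggregation would happen in Python before anything was inserted; here the raw table survives, so the transform can be rerun or changed later.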


T-SQL Tuesday 198 Round-Up

Meagan Longoria wraps up another T-SQL Tuesday:

Thank you to everyone who participated in T-SQL Tuesday #198! When I wrote the invitation post, I intentionally kept the prompt broad because change detection looks different depending on your source system, your infrastructure, your data volumes, and what you need to do with the changes once you have them. The responses covered SQL Server internals, Microsoft Fabric and Synapse, hashing strategies, metadata-driven frameworks, and Synapse workspace diffing with Python. Here’s a summary of each contribution.

Read on for links to eight responses.
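For a sense of what a hashing strategy for change detection can look like, here is a generic sketch. It is not taken from any of the responses; the column names, sample rows, and helper functions are hypothetical. It hashes the tracked columns of each row and compares the result with hashes stored from the previous load.

```python
import hashlib

COLUMNS = ["name", "city"]  # hypothetical columns to track for changes


def row_hash(row, columns):
    """Return a SHA-256 hex digest over the chosen columns of a dict row."""
    # repr() keeps None distinct from the string 'None'; the unit separator
    # keeps ("ab", "c") distinct from ("a", "bc").
    parts = [repr(row[c]) for c in columns]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def detect_changes(source_rows, stored_hashes, key, columns):
    """Classify keys as inserted, updated, or deleted since the last load."""
    inserted, updated = [], []
    seen = set()
    for row in source_rows:
        k = row[key]
        seen.add(k)
        if k not in stored_hashes:
            inserted.append(k)
        elif stored_hashes[k] != row_hash(row, columns):
            updated.append(k)
    deleted = [k for k in stored_hashes if k not in seen]
    return inserted, updated, deleted


previous = [
    {"id": 1, "name": "Ann", "city": "Oslo"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Cy", "city": None},
]
stored = {r["id"]: row_hash(r, COLUMNS) for r in previous}

current = [
    {"id": 1, "name": "Ann", "city": "Oslo"},   # unchanged
    {"id": 2, "name": "Bob", "city": "Paris"},  # city changed
    {"id": 4, "name": "Dee", "city": "Lima"},   # new key
]
inserted, updated, deleted = detect_changes(current, stored, "id", COLUMNS)
print(inserted, updated, deleted)
```

Storing one hash per key keeps the comparison cheap, at the cost of not knowing which column changed.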


Migrating testthat to testit

Yihui Xie explains how to switch test frameworks in R:

Back in 2013, I wrote about testing R packages when I first released testit. Thirteen years later, I still believe that unit testing should be nothing more than “tell me if something unexpected happened.” Recently I converted a large testthat test suite to testit, and I thought I’d share a practical guide for anyone considering the same move.

Click through for that guide.


PostgreSQL Removing MD5 Hashing for Authentication

Lukas Vileikis covers the consequences:

In late 2024, a message by Nathan Bossart hit the database spotlight. Within it, he proposed a “multi-year, incremental approach to remove MD5 password support from PostgreSQL.”

Before we dive in completely, let’s establish one important thing first: what exactly is MD5?

One thing I strongly disagree with: Lukas’s comment that “A decade or so ago, when computing power was far smaller than it is now, MD5 was considered an ‘okay’ hashing mechanism.” There were MD5 rainbow tables readily available 15 years ago and people already realized MD5 was not good for password hashing, even with a salt. To the extent that these platform vendors thought it was “okay” a decade ago, they were already way out of date.
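To illustrate why a salt does not rescue MD5 here, the following sketch reproduces the legacy PostgreSQL stored format, which is the string md5 followed by the MD5 hex digest of the password concatenated with the username. The username is the only salt, so the same credentials produce the same stored value on every cluster. For contrast, it also derives a key with PBKDF2-HMAC-SHA-256 and a random salt, which is the key-stretching step that SCRAM-SHA-256 builds on; this is not a full SCRAM verifier, and the 4096 iteration count is an assumption based on the PostgreSQL default. The sample credentials are hypothetical.

```python
import hashlib
import os


def postgres_md5_verifier(username, password):
    """Legacy PostgreSQL format: 'md5' + md5(password || username) in hex."""
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return "md5" + digest


def pbkdf2_key(password, salt, iterations=4096):
    """Stretch a password with PBKDF2-HMAC-SHA-256 (key stretching only)."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )


# Hypothetical credentials. The MD5 form depends only on name and password,
# so two unrelated servers end up storing the identical string.
server_a = postgres_md5_verifier("postgres", "hunter2")
server_b = postgres_md5_verifier("postgres", "hunter2")
print(server_a == server_b)

# With a random per-password salt, the same password yields different keys,
# and each guess costs thousands of hash iterations instead of one.
key_a = pbkdf2_key("hunter2", os.urandom(16))
key_b = pbkdf2_key("hunter2", os.urandom(16))
print(key_a != key_b)
```

A single MD5 call per guess, with a salt as predictable as a common username, is what makes precomputed tables practical.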


Recovering from a Full Transaction Log File

Jeff Iannucci sneaks in a fix:

We received an emergency call from a client that noted that their SQL Server instance was unresponsive. (This was an Amazon RDS instance, although that didn’t play much into the ultimate root cause.) The client had some technical staff already looking at the issue, and when we joined the call we were informed that the transaction log for their main production database was completely full, and all transactional activity in the database had stopped.

Read on to see how Jeff and team were able to fix it.


Comparing Postgres Kubernetes Operators: CloudNativePG vs Crunchy PGO

Gabriele Bartolini makes a comparison:

For years, I resisted writing a direct comparison between CloudNativePG and Crunchy PGO. It felt like the wrong kind of article to write from where I sit. But after several years of both projects maturing, and particularly since Crunchy Data was acquired by Snowflake, I have been asked with increasing frequency how the two operators compare. I now think the time is right. Last week, I wrote Recipe 24 to answer the practical question of how to migrate. This post attempts something harder: an honest assessment of why the two operators differ and what those differences mean for teams choosing a long-term platform for PostgreSQL on Kubernetes.

I will acknowledge Crunchy’s legacy, explain the architectural choices that I believe make CloudNativePG the stronger foundation, point to data where it exists, and flag the areas where my view is unavoidably subjective. I will not pretend this is a neutral document.

Gabriele is a maintainer on CloudNativePG, so take that into consideration. I do appreciate the upfront statement of bias and think the post is well worth reading.


A Primer on Partitioned Views

Erik Darling talks about an old-style way of partitioning in SQL Server:

Erik Darling here with Darling Data. And we’re going to finish off this Friday by talking about partitioned views. And look, there are a lot of things I could say about partitioned views that are great and grand and that have come in handy for me over the years in ways that I’m like, wow, thank you partitioned views. Thank you for not being normal table partitioning. Thank you for existing. 

Read on to see how they work, how you can write into them, things that might prevent you from writing into partitioned views directly, and why you probably don’t want writable partitioned views anyhow.


Exceeding the Capacity Limit for Power BI Dataset Refreshes

Chris Webb explains an error:

If you have a lot of Power BI semantic models that are scheduled to refresh at the same time in the Service then you may find that some of them fail with the following error:

You’ve exceeded the capacity limit for dataset refreshes. Try again when fewer datasets are being processed.

[Note: “dataset” is the old name for a Power BI semantic model. Someone should update the error message.]

Read on to see what can cause this error and what you can do about it.


What’s New in Cassandra 6

Mariah McLaughlin lays out some of the new features in the latest version of Cassandra:

Accord is a general-purpose transaction framework that uses a leaderless consensus protocol to have highly available transactions and is used in Cassandra 6. The goal is broader transactional support across multiple keys, with strict serializable isolation and without a central bottleneck.

This matters because multi-key consistency is hard to handle cleanly in application code. Once a workflow spans more than one partition, the application often ends up doing coordination work that really belongs in the database.

Accord enables ACID behavior on transactional tables, which lets developers coordinate multi-step, multi-partition changes with stronger correctness guarantees, reducing the amount of custom consistency logic they have to build in the application.

Click through for more information on this, as well as a few other significant features.
