
Curated SQL Posts

Function Generators versus Partial Application in R

Jonathan Carroll digs in:

The blog post (www.tidyverse.org) describing the latest updates to the tidyverse {scales} package neatly demonstrates the usage of the new functionality, but because the examples are written outside of actual plotting code, one feature stuck out to me in particular…

label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))
# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin

Read on for a dive into what makes the actual invocation interesting. H/T R-Bloggers.
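As a rough illustration of the distinction in the title, here is a minimal sketch in Python rather than R. The names `label_glue` and `glue_labels` below are hypothetical stand-ins written for this example, not the {scales} implementation. A function generator returns a new function closed over the template, while partial application fixes one argument of an ordinary two-argument function.

```python
from functools import partial


def label_glue(template):
    """Function generator: build and return a labeller closed over `template`."""
    def labeller(values):
        return [template.format(x=value) for value in values]
    return labeller


def glue_labels(template, values):
    """Ordinary two-argument function, used below for partial application."""
    return [template.format(x=value) for value in values]


penguins = ["Gentoo", "Chinstrap", "Adelie"]

# Function generator: the first call returns a function, the second call runs it.
generated = label_glue("The {x} penguin")(penguins)

# Partial application: fix `template` now and supply `values` later.
partial_labeller = partial(glue_labels, "The {x} penguin")
applied = partial_labeller(penguins)

print(generated)
```

Both routes produce the same labels. The difference is whether the function itself hands back a function or the caller fixes an argument from the outside.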


Diskless Topics in Apache Kafka

Filip Yonov and Josep Prat work through a challenge:

KIP-1150 isn’t a distant, strange planet; it just reroutes Kafka’s entire replication pathway from broker disks to cloud object storage. Flip one topic flag and your data bypasses local drives altogether:

  • No disks to babysit: Hot-partition drama, IOPS ceilings, and multi-hour rebalances vanish—freeing up time (for more blog posts).
  • Cloud bill trimmed by up to 80%: Object storage replaces triple-replicated setups with pricey SSDs and every byte of cross‑zone replication, erasing the “cloud tax”.
  • Scale in real time: With nothing pinned to brokers, you can spin brokers up (or down) in seconds to absorb traffic spikes.

Because Diskless is built into Kafka (no client changes, no forks), we had to solve a 4D puzzle: How do you make a Diskless topic behave exactly like a Kafka one, inside the same cluster, without rewriting Kafka? This blog unpacks the first‑principles, deep dive into the thought process, and trade‑offs that shaped the proposal.

Click through for a deep dive on this from the perspective of a platform host.


Running Cron Jobs in Azure Database for PostgreSQL Flexible Server

Josephine Bush schedules a task:

pg_cron is a simple cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It allows you to schedule PostgreSQL commands directly from your database, similar to using cron jobs at the operating system level. pg_cron on PG Flex is pretty easy to use, making it easy to schedule regular database maintenance and processing tasks directly from within PostgreSQL.

Read on to see how to install the extension, and then how to manage cron jobs. Josephine also lays out some limitations when using pg_cron on Azure and how to track failed jobs.


The Power of Invoke-DbaQuery in dbatools

David Seis looks at a powerful cmdlet in dbatools:

In this blog post, we will audit the dbatools command Invoke-DbaQuery. I will test, review, and evaluate the script based on a series of identical steps. Our goal is to provide insights, warnings, and recommendations to help you use this script effectively and safely. Invoke-DbaQuery is the Swiss army knife of all dbatools commands as you can execute almost any T-SQL script you can think of via PowerShell.

Click through for an overview of what the cmdlet does, some tips on proper usage, and an important note around possible misuse.


Item Limits in Microsoft Fabric Workspaces

Sakshi Jain announces a change:

Previously, there were no restrictions on the number of Fabric items that could be created in a workspace, with a limit for Power BI items already being enforced. Even though this allows flexibility for our users, having too many items in workspaces reduces the overall user friendliness and effectiveness of the platform.

As of April 10, 2025, Microsoft Fabric has implemented updates to the total number of items permissible in a workspace. This change introduces a combined limit of 1,000 Fabric items (including Power BI items) per workspace. In other words, a workspace may now contain up to 1,000 items from both Fabric and Power BI collectively.

This improves usability of the workspace and simplifies organization of Fabric items. This also improves service quality and reliability for users.

Well, that’s one way to spin it.

That limit of 1,000 items seems quite restrictive to me, knowing how quickly you can accrue Fabric items.


Log Rotation in PostgreSQL

Ajay Dwivedi switches out log files:

In my organisation, we have started building PostgreSQL Clusters with Patroni + Consul. In PostgreSQL we enable a few extensions like pg_stat_statements to ensure we don’t miss any performance impacting query.

But this generates too much log output on active servers, leading to PostgreSQL log bloat. Thus, it becomes important to ensure log files do not consume more than an agreed amount of disk space. For this reason, I implemented the following log rotation steps for PostgreSQL:

  • Set a proper log file name for the PostgreSQL logging_collector.
  • Add a logrotate policy on the Linux system for the postgres logs directory.
  • Add a cron job to run the logrotate policy more frequently.

Click through to see how.
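To make the disk-budget idea concrete, here is a minimal sketch in Python of the retention decision: keep the newest log files until an agreed budget is used up, then mark everything older for deletion. This is not logrotate and not Ajay's configuration. The function name, file names, sizes, and budget are all hypothetical values chosen for illustration.

```python
def files_to_delete(log_files, budget_bytes):
    """Return the names of log files to delete so the kept files fit the budget.

    log_files is a list of (name, modified_time, size_bytes) tuples.
    The newest files are kept first; once one file no longer fits,
    that file and every older file are marked for deletion.
    """
    used = 0
    over_budget = False
    doomed = []
    newest_first = sorted(log_files, key=lambda entry: entry[1], reverse=True)
    for name, _modified_time, size in newest_first:
        if not over_budget and used + size <= budget_bytes:
            used += size
        else:
            over_budget = True
            doomed.append(name)
    return doomed


# Hypothetical log directory listing: (name, modified_time, size_bytes).
logs = [
    ("postgresql.log.2", 100, 30),
    ("postgresql.log.1", 200, 50),
    ("postgresql.log", 300, 40),
]

to_delete = files_to_delete(logs, budget_bytes=100)
print(to_delete)
```

With a budget of 100 bytes, the two newest files (40 + 50) fit and the oldest one is marked for deletion.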


Two Direct Lakes in Microsoft Fabric

Nikola Ilic does a bit of digging:

Before you proceed, in case you don’t know what Direct Lake is, I’ve got you covered in this article, where you can learn and understand various Direct Lake concepts, as well as in which scenarios you might consider implementing Direct Lake semantic models. Now that you know what Direct Lake is, let’s digest the latest news…

A couple of days ago, I was reading the official blog post about the latest enhancement to the Direct Lake storage mode for semantic models in Microsoft Fabric. The official blog post can be found here.

Click through for that announcement and what it means.


Graphs are Not Always Necessary

Alex Velez drives home a point:

Our current #SWDchallenge has been on my mind. For those who don’t know, we pose a different monthly suggestion for community members to build their data visualization and presentation skills. It could be a prompt to try a novel graph type, redesign an existing example, or practice a specific technique like chart animation. This month, data storyteller Simon asked us to consider whether data always needs to be communicated in a graph. Simon shares, “When you have just a number or two, writing the numbers themselves can be much more powerful than burying them in a table or graph and potentially losing the impact of the main number you’re looking to share.”

This statement came to mind when I reviewed a chart over the weekend. The chart I’m referring to is a bar graph displaying the weekly weight measurements of my dog, Nemo.

Click through for Alex’s argument. My take on the matter is that the point of visualization is to convey relevant information to your audience. If you can do that with a single word or a single number, you don’t need to go further.


Equal Sign Alignment with PowerShell Hash Tables

Mike Robbins lays out an argument:

If you ever formatted a hash table in PowerShell, you know how easy it is to focus on function over form. But what if one minor formatting tweak could improve readability, reduce syntax errors, simplify code reviews, and enhance script maintainability? During a recent documentation update, I stumbled on a subtle but powerful practice—aligning the equals signs in hash tables. What began as a style suggestion proved to be a practical improvement that changed how I write PowerShell every day. Here’s why this seemingly minor change deserves a place in your scripting toolbox.

Click through to learn why. This doesn’t apply only to hash tables in PowerShell, of course, so you could take this concrete example and extend it to other situations. As an example, this is a very common pattern for managing lengthy configuration files for the same reasons Mike points out. Just as long as your programming language is okay with extra whitespace around the equal sign (or equivalent), you can do this.
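As a small sketch of the same idea applied to configuration-style lines, the Python helper below pads each key so that every equals sign lands in the same column. The function name `align_equals` and the sample keys and values are hypothetical and chosen only for illustration. It assumes each line contains at least one equals sign.

```python
def align_equals(lines):
    """Pad keys so that the '=' of every 'key = value' line lines up."""
    # Split on the first '=' only, so values may themselves contain '='.
    pairs = [line.split("=", 1) for line in lines]
    width = max(len(key.strip()) for key, _value in pairs)
    return [f"{key.strip():<{width}} = {value.strip()}" for key, value in pairs]


# Hypothetical settings, written without alignment.
settings = [
    "SqlInstance = sql01",
    "Database = master",
    "Query = SELECT 1",
]

aligned = align_equals(settings)
for line in aligned:
    print(line)
```

With the keys padded to the longest name, the values form a single column that is easier to scan in a review.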


Contained Database Users and Creating Logins

Rob Sewell does a bit of testing:

A contained user can create a Windows login as its own account, although, as it cannot grant connect permissions, it is then unable to connect at all.

So if your vendor application is running as a contained user and during an upgrade it tries to create a login for itself, it will succeed in the creation but then be unable to connect to the SQL Server instance and the upgrade will fail.

Click through for the context and the proof.
