
Day: April 25, 2025

Function Generators versus Partial Application in R

Jonathan Carroll digs in:

The blog post (www.tidyverse.org) describing the latest updates to the tidyverse {scales} package neatly demonstrates the usage of the new functionality, but because the examples are written outside of actual plotting code, one feature stuck out to me in particular…

label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))
# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin

Read on for a dive into what makes the actual invocation interesting. H/T R-Bloggers.
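
To make the distinction in the title concrete, here is a minimal sketch in Python rather than R. It is only an analogue of the idea and does not reproduce the {scales} API: the names `label_glue` and `glue_labels` below are hypothetical stand-ins, and `str.format` plays the role of glue-style interpolation. The first function is a function generator (calling it returns another function, which is why the quoted snippet has two sets of parentheses); the second is an ordinary two-argument function with its first argument fixed through partial application.

```python
from functools import partial


def label_glue(template):
    """Function generator: calling it returns a new labelling function."""
    def labeller(values):
        # Fill the {x} placeholder once per input value.
        return [template.format(x=value) for value in values]
    return labeller


def glue_labels(template, values):
    """Ordinary two-argument function, used for the partial-application variant."""
    return [template.format(x=value) for value in values]


penguins = ["Gentoo", "Chinstrap", "Adelie"]

# Function generator: the first call builds the labeller, the second applies it.
generated = label_glue("The {x} penguin")(penguins)

# Partial application: fix the template argument of an existing function.
penguin_label = partial(glue_labels, "The {x} penguin")
applied = penguin_label(penguins)

for label in generated:
    print(label)
```

Both routes produce the same labels here. The practical difference is where the flexibility lives: a generator has to be written as one up front, while partial application can be bolted onto any existing function after the fact.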


Diskless Topics in Apache Kafka

Filip Yonov and Josep Prat work through a challenge:

KIP-1150 isn’t a distant, strange planet; it just reroutes Kafka’s entire replication pathway from broker disks to cloud object storage. Flip one topic flag and your data bypasses local drives altogether:

  • No disks to babysit: Hot-partition drama, IOPS ceilings, and multi-hour rebalances vanish—freeing up time (for more blog posts).
  • Cloud bill trimmed by up to 80%: Object storage replaces triple-replicated setups with pricey SSDs and every byte of cross‑zone replication, erasing the “cloud tax”.
  • Scale in real time: With nothing pinned to brokers, you can spin brokers up (or down) in seconds to absorb traffic spikes.

Because Diskless is built into Kafka (no client changes, no forks), we had to solve a 4D puzzle: How do you make a Diskless topic behave exactly like a Kafka one, inside the same cluster, without rewriting Kafka? This blog unpacks the first‑principles deep dive into the thought process and trade‑offs that shaped the proposal.

Click through for a deep dive on this from the perspective of a platform host.


Running Cron Jobs in Azure Database for PostgreSQL Flexible Server

Josephine Bush schedules a task:

pg_cron is a simple cron-based job scheduler for PostgreSQL that runs inside the database as an extension. It allows you to schedule PostgreSQL commands directly from your database, similar to using cron jobs at the operating system level. pg_cron on PG Flex is pretty easy to use, making it easy to schedule regular database maintenance and processing tasks directly from within PostgreSQL.

Read on to see how to install the extension, and then how to manage cron jobs. Josephine also lays out some limitations when using pg_cron on Azure and how to track failed jobs.


The Power of Invoke-DbaQuery in dbatools

David Seis looks at a powerful cmdlet in dbatools:

In this blog post, we will audit the dbatools command Invoke-DbaQuery. I will test, review, and evaluate the script based on a series of identical steps. Our goal is to provide insights, warnings, and recommendations to help you use this script effectively and safely. Invoke-DbaQuery is the Swiss army knife of all dbatools commands as you can execute almost any T-SQL script you can think of via PowerShell.

Click through for an overview of what the cmdlet does, some tips on proper usage, and an important note around possible misuse.


Item Limits in Microsoft Fabric Workspaces

Sakshi Jain announces a change:

Previously, there were no restrictions on the number of Fabric items that could be created in a workspace, with a limit for Power BI items already being enforced. Even though this allows flexibility for our users, having too many items in workspaces reduces the overall user friendliness and effectiveness of the platform.

As of April 10, 2025, Microsoft Fabric has implemented updates to the total number of items permissible in a workspace. This change introduces a combined limit of 1,000 Fabric items (including Power BI items) per workspace. In other words, a workspace may now contain up to 1,000 items from both Fabric and Power BI collectively.

This improves usability of the workspace and simplifies organization of Fabric items. This also improves service quality and reliability for users.

Well, that’s one way to spin it.

That limit of 1,000 items seems quite restrictive to me, given how quickly you can accrue Fabric items.


Log Rotation in PostgreSQL

Ajay Dwivedi switches out log files:

In my organisation, we have started building PostgreSQL Clusters with Patroni + Consul. In PostgreSQL we enable a few extensions like pg_stat_statements to ensure we don’t miss any performance impacting query.

But this generates too much log output on active servers, leading to PostgreSQL log bloat. Thus, it becomes important to ensure log files do not consume more than an agreed amount of disk space. For this reason, I implemented the following log rotation steps for PostgreSQL:

  • Set a proper log file name for the PostgreSQL logging_collector.
  • Add a logrotate policy on the Linux system for the postgres logs directory.
  • Add a cron job to run the logrotate policy more frequently.

Click through to see how.


Two Direct Lakes in Microsoft Fabric

Nikola Ilic does a bit of digging:

Before you proceed, in case you don’t know what Direct Lake is, I’ve got you covered in this article, where you can learn and understand various Direct Lake concepts, as well as in which scenarios you might consider implementing Direct Lake semantic models. Now that you know what Direct Lake is, let’s digest the latest news…

A couple of days ago, I was reading the official blog post about the latest enhancement to the Direct Lake storage mode for semantic models in Microsoft Fabric. The official blog post can be found here.

Click through for that announcement and what it means.
