Press "Enter" to skip to content

Curated SQL Posts

Exploring Associations in R with AssociationExplorer

Antoine Soetewey announces a new tool:

I am pleased to announce the publication of our paper “AssociationExplorer: A user-friendly Shiny application for exploring associations and visual patterns” in the journal SoftwareX, together with the official release of the AssociationExplorer2 R package on CRAN.

Both the paper and the software are part of an open-science effort aimed at making exploratory data analysis more accessible to non-technical users.

Read on to learn more about the tool and how you can get it. H/T R-Bloggers.

Leave a Comment

Expiring Old Postgres Backups with pgBackRest

Stefan Fercot demonstrates some new functionality in a tool:

A useful new feature was introduced on 11 December 2025: Allow expiration of the oldest full backup regardless of current retention. Details are available in commit bf2b276.

Before this change, it was awkward to expire only the oldest full backup while leaving the existing retention settings untouched. Users had to temporarily adjust retention (or write a script) to achieve the same result. Expiring the oldest full backup is particularly helpful when your backup repository is running low on disk space.

Let’s see how this works in practice with a simple example.

Click through for the example.

Leave a Comment

Optional Parameter Plan Optimization in SQL Server 2025

Tomaz Kastrun continues looking at new functionality in SQL Server 2025:

Part of new features in IQP (Intelligent query processing) is also OPPO – Optional parameter plan optimization – which refers to a specific variation of the parameter-sensitive plan (PSP) or Parameter Sensitive Plan Optimization (PSPO) problem in which the sensitive, parameter value that exists during query execution, controls whether we need to perform a seek into or scan a table. It is part of mitigating the parameter sniffing problem.

Read on for more information, as well as a demonstration of how it works.

Leave a Comment

Using TimescaleDB in Postgres

Adron Hall put a database in his database:

I’ve been using TimescaleDB for time-series data on and off for a while now. I recently fired up Postgres.app for local development. It’s one of the cleanest ways to get PostgreSQL running on macOS, and adding TimescaleDB is surprisingly straightforward once you know where to look.

Time-series data is everywhere—sensor readings, application metrics, user events, IoT data. Regular PostgreSQL can handle it, but once you’re dealing with millions of rows, you’ll notice queries slowing down. TimescaleDB solves this by turning your time-series tables into hypertables that automatically partition by time, compress old data, and optimize queries. The best part? It’s still PostgreSQL, so all your existing tools and SQL knowledge work exactly the same.

Read on for a demo of how it all works.

Leave a Comment

Postgres Index Corruption after an OS Upgrade

Laurenz Albe has me concerned:

Most major Linux distributions have upgraded to the GNU C library version 2.28 or later. Therefore, there is a growing awareness that an operating system upgrade can lead to index corruption. However, rebuilding all your indexes can slow down the upgrade process considerably. In this article, I want to discuss how you can avoid rebuilding more indexes than necessary.

Read the whole thing if you are a Postgres DBA.

Leave a Comment

Monitoring Replication in Redgate Monitor

Steve Jones shows off some functionality:

The monitoring capabilities in Redgate Monitor were originally fairly limited to a few counters from PerfMon. A few people had written custom metrics on sqlmonitormetrics.com that clients could use, but we’ve had customers asking for more native integrations.

We’ve done it. With version 14.2, we have added an estate view of your replication environment. In the Estate menu, there is a new entry for Replication Monitoring.

I had to check the documentation to see if merge replication was included, and it was. That’s usually the form that people avoid because it’s too hard to implement.

In practice, this doesn’t replace ReplMon or more detailed mechanisms for detailing system behavior, but it does at least provide the ability to alert if things are going wrong with the SQL Agent jobs. That probably catches about 60-70% of issues with replication, with the remaining 30-40% requiring some sort of time of flight indicator to see if data on the subscribers are keeping up to date with data on the publisher.

Leave a Comment

Filtering RPC_COMPLETED Extended Events

Kendra Little builds a filter:

The rpc_completed event in Extended Events is useful when troubleshooting SQL Server performance. It captures detailed information about Remote Procedure Calls: that means stored procedure executions, including the calls to sp_executesql often used by applications (including Entity Framework) to run parameterized queries against SQL Server. The output for rpc_completed includes the parameters that were specified along with values provided, and the CPU time, logical reads, and duration used by the query.

It can be frustrating to figure out how to filter this in Extended Events. Struggling with this is one of the primary reasons I sometimes use ye Olde Profiler for initial investigations and to speedily observe something in SQL Server.

Here is my survival guide to filtering rpc_completed, which makes using XEvents suck less.

Kendra digs in deep, including several common gotchas and solid recommendations, including the use of Erik Darling’s sp_humanevents.

Leave a Comment

Creating a SQL Agent Job via SSMS

Jim Evans creates a job:

In this article, I’ll show how to create and schedule a SQL Server Agent Job. I’ll also show how to setup an Operator to receive notifications for failed or successful Job completions.

Jim lays out the UI-based approach and has scripting the job as an optional step. I highly recommend scripting and understanding the T-SQL it generates. It may look like a mess at first and the T-SQL it outputs is not idempotent (meaning, if you re-run the script multiple times, you do not end up with the same outcome and a successful run of the script). But changing this to become an idempotent template that successfully completes each time will allow you to store the code in source control and also build out future jobs that much faster. Additionally, it prevents issues where you have the “same” job but it’s actually set up differently across your different SQL Server instances.

Leave a Comment

Asynchronous Disk I/O in Postgres 18

Josef Machytka gives us the skinny:

PostgreSQL 17 introduced streaming I/O – grouping multiple page reads into a single system call and using smarter posix_fadvise() hints. That alone gave up to ~30% faster sequential scans in some workloads, but it was still strictly synchronous: each backend process would issue a read and then sit there waiting for the kernel to return data before proceeding. Before PG17, PostgreSQL typically read one 8kB page at a time.

PostgreSQL 18 takes the next logical step: a full asynchronous I/O (AIO) subsystem that can keep multiple reads in flight while backends keep doing useful work. Reads become overlapped instead of only serialized. The AIO subsystem is deliberately targeted at operations that know their future block numbers ahead of time and can issue multiple reads in advance:

Read on to see some of the consequences of this change, as well as more detail on how it works.

Leave a Comment