Press "Enter" to skip to content

Curated SQL Posts

A Hidden Slowdown in PostgreSQL Restores

Warda Bibi explains why a large PostgreSQL database restoration can take a long time:

In July 2025, during the PG19-1 CommitFest, I reviewed a patch targeting the lack of parallelism when adding foreign keys in pg_restore. Around the same time, I was helping a client with a large production migration where pg_restore dragged on for more than 24 hours and crashed multiple times.

In this blog, I will talk about the technical limitations in PostgreSQL, the proposed fix, and a practical workaround for surviving large restores.

The reason for the slowdown is a bit wild to me, though I suppose somebody could edit the backup files manually after pg_dump writes them out.

Leave a Comment

Choosing between Random Forest and Gradient Boosting

Jayita Gulati compares two popular algorithms for classification:

When working with machine learning on structured data, two algorithms often rise to the top of the shortlist: random forests and gradient boosting. Both are ensemble methods built on decision trees, but they take very different approaches to improving model accuracy. Random forests emphasize diversity by training many trees in parallel and averaging their results, while gradient boosting builds trees sequentially, each one correcting the mistakes of the last.

This article explains how each method works, their key differences, and how to decide which one best fits your project.

Click through for the explanation.

Leave a Comment

SQL Injection Vulnerabilities Fixed in SQL Server

Vlad Drumea tests out a pair of fixes:

In this post I demo two PoCs for SQL injection vulnerabilities fixed in SQL Server 2025 CU20 GDR KB5063814.

This August’s Patch Tuesday came with a security patch for SQL Server 2022, 2019, 2017, and 2016.

The number of SQL injection vulnerabilities caught my attention and I decided to see what system stored procedures have changed to see if I can find anything useful.

Vlad looks at a pair of spatial stored procedures and puts together a method to exploit the old versions.

Leave a Comment

Date Intervals in PostgreSQL Window Functions

Hubert Lubaczewski solves a problem:

Since I can’t copy paste the text, I’ll try to write what I remember:

Given table sessions, with columns: user_id, login_time, and country_id, list all cases where single account logged to the system from more than one country within 2 hour time frame.

The idea behind is that it would be a tool to find hacked account, based on idea that you generally can’t change country within 2 hours. Which is somewhat true.

Solution in the blogpost suggested joining sessions table with itself, using some inequality condition. I think we can do better…

Click through for a solution that works for PostgreSQL but not SQL Server because the latter doesn’t offer date and time intervals on window function frames.

To do this in SQL Server, I’d probably use LAG() and get the prior value of country ID and the prior login time. Something like the following query, though I didn’t run detailed performance checks.

WITH records AS
(
	SELECT
		s.user_id,
		s.login_time,
		s.country_id,
		LAG(s.login_time) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_login_time,
		LAG(s.country_id) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_country_id
	FROM sessions s
)
SELECT *
FROM records r
WHERE
	r.prior_country_id <> r.country_id
	AND DATEDIFF(HOUR, r.prior_login_time, r.login_time) <= 2;
Leave a Comment

Trying Time-Bound Extended Events in SQL Server 2025 RC0

Reitse Eskens check the state of the art:

What’s new in SQL 2025 is that the team behind SQL Server 2025 has addressed an issue that I’ve encountered once or twice as well. If you start an extended event, it will run until you stop it. Yes, you have to manually stop it. If you don’t, it will stop eventually. Together with some other services, because the disk space has run out. In my case, I was fortunate that the servers were monitored and alerts were triggered due to low disk space.

In this new version, starting with Release Candidate 0 (RC0), you can configure the duration.

Click through for a demo, as well as current limitations (generally around this not being in the UI yet).

Leave a Comment

Generating Calendar Dates in T-SQL

Rick Dobson writes some code:

Sometimes it is convenient to generate calendar dates, such as for Fridays across multiple years. This tip demonstrates two different solutions for this kind of task. The demonstrations implement recursive CTEs, chained CTEs, as well as a stored procedure with both input and output parameters. These two different demonstrations are described so that you can incorporate either or both into your own calendar date generation projects.

Honestly, this is where a good calendar table (or date dimension) would be extremely handy.

Leave a Comment

Time Series Forecasting in Python

Myles Mitchell builds an ARIMA model:

In time series analysis we are interested in sequential data made up of a series of observations taken at regular intervals. Examples include:

  • Weekly hospital occupancy
  • Monthly sales figures
  • Annual global temperature

In many cases we want to use the observations up to the present day to predict (or forecast) the next N time points. For example, a hospital could reduce running costs if an appropriate number of beds are provisioned.

Read on for a primer on the topic, a quick explanation of ARIMA, and a sample implementation using several Python packages.

Leave a Comment

Architectural Guidance for IoT Deployments in Azure

Bhimraj Ghadge shares some tips:

Edge computing, a strategy for computing on location where data is collected or used, allows IoT data to be gathered and processed at the edge, rather than sending the data back to a data center or cloud. Together, IoT and edge computing are a powerful way to rapidly analyze data in real-time.

In this Tutorial, I am trying to lay out the components and considerations for designing IoT solutions based on Azure IoT and services.

Read on for an overview of IoT components in Azure, as well as several things to keep in mind during systems design and implementation.

Leave a Comment

Automating a SQL Server Build

K. Brian Kelley doesn’t need that George Jetson button-clicking finger:

I am frequently building up my SQL Server environments, especially my test lab. However, manually performing SQL Server backups, running scripts to set up security, and the rest of the tasks are time-consuming. What can I do to automate things?

Read on for the answer. If you are administering a larger number of SQL Server instances than you can count with, let’s say, one hand, it’s a great idea to script out your server setup and configuration process.

Leave a Comment

Resetting the sa Password on a Locked-Out SQL Server Instance

Tim Radney jimmies the lock:

Getting locked out of a SQL Server instance can happen due to a number of situations. The most common scenario I’ve encountered is when a SQL Server is moved from one domain to another without the domain being trusted or having a local SQL admin account. I recently encountered another incident where a DBA was attempting to fail over a log-shipped instance to another instance. Part of their run book included a script to disable a set of users to block production access to the instance. The problem was, the production instance was on a cluster, unlike the development and QA systems where the script had been tested. Disabling the users in production took down the instance preventing the tail log backups from occurring. We had to get the instance back up in order to take those final backups.

What can you do if you find that you’re locked out of a SQL Server instance?

Read on for that answer.

1 Comment