Press "Enter" to skip to content

Category: Administration

Changes to Accelerated Database Recovery in 2025

Jordan Boich points out something interesting coming in SQL Server 2025:

Accelerated Database Recovery (ADR) was introduced in SQL Server 2019. Its main purpose is to allow for faster database recovery in the event of a crash or unexpected shutdown. Traditionally, the database engine handles crash recovery through a series of phases—analysis, redo, and undo—which can be inefficient and slow, especially when dealing with long-running transactions.

To make a long story short, ADR “shortcuts” the recovery process by introducing a new approach to handling undo operations. Instead of relying heavily on scanning the transaction log—which can be painfully slow for uncommitted or long-running transactions—ADR maintains a version store within the user database to track row-level changes. This allows SQL Server to quickly roll back uncommitted transactions without scanning the entire log. The result is much faster crash recovery, quicker rollbacks, and improved overall database availability, particularly in high-transaction environments.

Read on to see what’s new, as well as some of the consequences of enabling this feature.
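If you just want to flip the switch and confirm it, the basic commands are straightforward; a minimal sketch is below, with the database name as a placeholder:

```sql
-- Enable Accelerated Database Recovery for a database (SQL Server 2019+).
ALTER DATABASE [YourDatabase]
    SET ACCELERATED_DATABASE_RECOVERY = ON;

-- Confirm the current setting per database.
SELECT name, is_accelerated_database_recovery_on
FROM sys.databases
WHERE name = N'YourDatabase';
```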


Changing a Busy Column’s Data Type in SQL Server

Matt Gantz makes a staggered change:

In a previous post I showed how to use a batching strategy to remove large amounts of data from a table while it is being used. Today I will apply the same technique to another common problem: changing the datatype of a column. A common use of this is to normalize a text column into an integer (that references another table), but it could be used to transition to and from any datatype. Many of the considerations in the previous post apply, so I advise you to read it as well before using this technique.

Click through for the process.
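For a sense of the shape of the technique, here is a rough sketch of one common variant (not necessarily Matt's exact script): add the new column, backfill it in small batches so locks stay short, then swap during a quiet window. Table, column, and lookup names are hypothetical.

```sql
-- Add the replacement column as nullable so the change is metadata-only.
ALTER TABLE dbo.Orders ADD StatusId int NULL;
GO

-- Backfill in small batches to keep transactions and locks short.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) o
    SET    o.StatusId = s.StatusId
    FROM   dbo.Orders AS o
           JOIN dbo.OrderStatus AS s
             ON s.StatusName = o.Status   -- hypothetical lookup table
    WHERE  o.StatusId IS NULL;

    SET @rows = @@ROWCOUNT;
END
GO

-- Once fully backfilled, a short maintenance window handles the swap:
-- ALTER TABLE dbo.Orders DROP COLUMN Status;
-- EXEC sp_rename 'dbo.Orders.StatusId', 'Status', 'COLUMN';
```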


Myths of the DBA-less Cloud

Kevin Hill has a reminder for us:

Here’s a common theme I hear from small IT teams:

“Our SQL Server is in the cloud now. We don’t need a DBA.”

Not quite.

When you’re running SQL Server on cloud virtual machines like Azure SQL VM or AWS EC2, you’re still responsible for a lot of the same challenges you faced on-prem. The difference? Now you’re paying cloud rates for the same headaches. But you don’t have to deal with hardware and some of the other infrastructure hassles.

Read on to see what that entails in practice. Though I’m pretty sure my target audience generally understands this; it’s the people two or three levels up who should give Kevin’s post a read.


Databases and Reboots

Rob Douglas will reboot many things, but not the database server:

I am taking a slightly different tangent. My problem is neither strange nor unique – in fact it’s infuriatingly common, and it stems from one of the most common troubleshooting techniques in IT. While asking users “Have you tried turning it off and on again?” is a common go-to for tech support call handlers, it is not a great idea when the “it” you are talking about is a database server.

Click through for a cautionary tale, as well as an explanation of why this usually isn’t the smart play.


Optimizing SQL Server via Indirect Checkpoints

Jon Russell covers a quiet feature:

A checkpoint is a background process that writes dirty pages to disk. A checkpoint performs a full scan of the pages in the buffer pool, lists all the dirty pages that are yet to be written to disk, and finally writes those pages to disk. On SQL instances that do not have many dirty pages in the buffer pool, this is a trivial operation. However, on SQL instances that host OLTP databases, use more memory, and/or involve sequential scanning of all pages, the performance of the system can be impacted.

With SQL Server 2012, indirect checkpoints were introduced. In this case, the dirty page manager manages the dirty page list and keeps track of all the dirty page modifiers of the database. By default, it runs every 60 seconds and tracks the dirty pages that need to be flushed.

Read on to learn more about why indirect checkpointing exists, the kinds of capabilities it offers, and the extent to which you might want to tweak its settings.
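For reference, the knob behind indirect checkpoints is the per-database TARGET_RECOVERY_TIME setting (0 means classic automatic checkpoints; new databases default to 60 seconds from SQL Server 2016 onward). A minimal sketch, with the database name as a placeholder:

```sql
-- Enable or tune indirect checkpoints for a database.
ALTER DATABASE [YourDatabase]
    SET TARGET_RECOVERY_TIME = 60 SECONDS;

-- See the current target for every database (0 = automatic checkpoints).
SELECT name, target_recovery_time_in_seconds
FROM sys.databases;
```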


Sockets vs Cores and SQL Server

Vlad Drumea checks server settings:

It’s not uncommon that I run into a VM that’s configured with something like 6 or more cores, with each core on its own socket.

Here’s an example of how this would show up in Task Manager for a VM with 16 CPU cores configured with 1 core per socket.

Read on to learn why this particular configuration can turn out so poorly with SQL Server, particularly when you use Standard Edition.
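The short version of the licensing angle: Standard Edition uses the lesser of 4 sockets or 24 cores, so a 16-vCPU VM carved up as 16 sockets with 1 core each leaves most of those cores idle. A quick way to see how the topology looks from inside SQL Server (the socket/core columns exist in SQL Server 2016 SP2 and later):

```sql
-- How SQL Server sees the VM's CPU topology.
SELECT socket_count,
       cores_per_socket,
       cpu_count,
       numa_node_count
FROM sys.dm_os_sys_info;

-- Schedulers showing as VISIBLE OFFLINE are cores the edition cannot use.
SELECT status, COUNT(*) AS scheduler_count
FROM sys.dm_os_schedulers
GROUP BY status;
```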


High Water Mark and PostgreSQL Vacuum Operations

Shane Borden troubleshoots an issue:

The first thing we came to understand is that the pattern of work on the primary is a somewhat frequent large DELETE statement followed by a data refresh accomplished by a COPY from STDIN command against a partitioned table with 16 hash partitions.

The problem being observed was that periodically the SELECTs occurring on the read replica would time out and not meet the SLA. Upon investigation, we found that the “startup” process on the read replica would periodically request an “exclusive lock” on some random partition. This exclusive lock would block the SELECT (which is partition unaware) and then cause the timeout. But what is causing the timeout?

Read on for the answer and tips on how to determine if you have problems with High Water Mark growth in PostgreSQL.
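One common source of that kind of exclusive lock is VACUUM truncating empty pages from the end of a relation, which takes an ACCESS EXCLUSIVE lock that then replays on the replica. Whether or not that is the exact culprit here, PostgreSQL 12+ lets you disable the truncation per table; a sketch with hypothetical object names:

```sql
-- Disable vacuum truncation on a leaf partition to avoid the ACCESS EXCLUSIVE
-- lock, at the cost of not shrinking the file (the high water mark stays put).
ALTER TABLE my_table_part_01 SET (vacuum_truncate = off);

-- Rough look at how much reclaimable space a table is carrying (needs pgstattuple).
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT approx_free_percent, approx_free_space
FROM pgstattuple_approx('my_table_part_01');
```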


Setting the Optimal logical_decoding_work_mem in PostgreSQL

Ashutosh Bapat shares a tip with us:

Logical replication is a versatile feature offered in PostgreSQL. I have discussed the theoretical background of this feature in detail in my POSETTE talk. At the end of the talk, I emphasize the need for monitoring a logical replication setup. If you are using logical replication and have set up monitoring, you will be familiar with pg_stat_replication_slots. In some cases this view shows high amounts of spill_txns, spill_count, and spill_bytes, which indicates that the WAL sender corresponding to that replication slot is using a high amount of disk space. This increases load on the IO subsystem, affecting performance. It also means that there is less disk space available for user data and regular transactions to operate. This is an indication that logical_decoding_work_mem has been configured too low. That’s the subject of this blog: how to decide the right configuration value for logical_decoding_work_mem. Let’s first discuss the purpose of this GUC. That blog might serve as a good background before reading further.

Read on to learn a bit more about how this value works and what you can do to set it correctly.
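If you want to check the symptoms Ashutosh describes before diving in, the spill counters and the setting itself are easy to inspect; a minimal sketch (the 256MB value is illustrative, not a recommendation; the default is 64MB):

```sql
-- Spill counters per replication slot (PostgreSQL 14+); steadily growing
-- spill_bytes suggests logical_decoding_work_mem is too small for the workload.
SELECT slot_name, spill_txns, spill_count, spill_bytes
FROM pg_stat_replication_slots;

-- Raise the limit and reload the configuration.
ALTER SYSTEM SET logical_decoding_work_mem = '256MB';
SELECT pg_reload_conf();
```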


Estimating Query Percentiles in PostgreSQL

Michael Christofides makes an assertion:

I recently saw a feature request for pg_stat_statements to be able to track percentile performance of queries, for example the p95 (95th percentile) or p99 (99th percentile).

That would be fantastic, but isn’t yet possible. In the meantime, there is a statistically-dodgy-but-practically-useful (my speciality) way to approximate them using the mean and standard deviation columns in pg_stat_statements.

Click through for the code. Michael even covers the immediate objection I have (that the data isn’t normally distributed, so you shouldn’t use the same Z score estimators that exist for the normal). That said, if you’re interested in “p99…ish” then this is a clever approach to take.
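To give the flavor of the approach (this mirrors the spirit of Michael's idea rather than his exact query, using the PostgreSQL 13+ column names):

```sql
-- Statistically dodgy but practically useful: treat each query's timings as
-- roughly normal and estimate p95/p99 as mean + z * stddev.
SELECT queryid,
       calls,
       mean_exec_time,
       mean_exec_time + 1.645 * stddev_exec_time AS approx_p95_ms,
       mean_exec_time + 2.326 * stddev_exec_time AS approx_p99_ms
FROM pg_stat_statements
ORDER BY approx_p99_ms DESC
LIMIT 20;
```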


Pain Points around tempdb

Kevin Hill has a list:

TempDB is the SQL Server equivalent of a junk drawer – everyone uses it, nobody monitors it, and eventually it becomes a bottleneck you can’t ignore.

Whether it’s poorly configured from the start or getting hammered by bad execution plans, TempDB often becomes the silent killer of performance. The good news? A few targeted changes can make a big impact.

Read on for some of tempdb’s greatest hits. Kevin also has a few quick tips for tempdb.
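A quick first pass at the "poorly configured from the start" angle is just to look at tempdb's file layout; multiple equally sized data files with matching, fixed-size autogrowth is the usual baseline:

```sql
-- Current tempdb file layout: file sizes and growth settings.
SELECT name,
       type_desc,
       size * 8 / 1024 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + '%'
            ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
       END AS growth_setting,
       physical_name
FROM tempdb.sys.database_files;
```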
