Press "Enter" to skip to content

Category: Administration

Increasing CPU Capacity or Tuning Queries

John Deardurff explains how to make a choice:

Recently while discussing the Task Execution Model and Thread Scheduling, I was asked the following question, When discussing worker threads, how can we determine whether we should increase CPU capacity or focus on query tuning? This is when our worker threads are under pressure and the instance is becoming exhausted?

In my brain, I thought, that is a great question, and it’s exactly the right way to think about worker thread pressure vs. real CPU starvation, especially when worker threads are getting tight. Let’s write a post.

John has a nice discussion of the trade-offs and signals associated with each approach. One third approach I might add is caching in the application(s), if applicable. This is especially useful if a significant fraction of the queries access static or nearly-static data.

Leave a Comment

Write Storms and PostgreSQL

Shaun Thomas talks checkpoints:

Every database has to reconcile two uncomfortable truths: memory is fast but volatile, and disk is slow but durable. Postgres handles this tension through its Write-Ahead Log (WAL), which records every change before it happens. But the WAL can’t grow forever. At some point, Postgres needs to flush all those accumulated dirty pages to disk and declare a clean starting point. That process is called a checkpoint, and when it goes wrong, it can bring throughput to its knees.

One thing I would note is that direct-attached nVME storage is approximately 1 order of magnitude slower than RAM. Yeah, that’s still a lot slower, but the gap has closed significantly. If you have PCIe 5 nVME drives (call that 12-14 GB/sec) and relatively slow RAM (20 GB/sec), it’s getting close to on par. But once you move past the top-of-the-line for disk speed, you add more orders of magnitude and everything Shaun describes becomes a problem again.

Jeremy Schneider offers a follow-up involving autovacuum_cost_delay:

A few days ago, Shaun Thomas published an article over on the pgEdge blog called [Checkpoints, Write Storms, and You]. Sadly a lot of corporate blogs don’t have comment functionality anymore. I left a few comments [on LinkedIn], but overall let me say this article is a great read, and I’m always happy to see someone dive into an important and overlooked topic, present a good technical description, and include real test results to illustrate the details.

I don’t have any reproducible real test results today. But I have a good story and a little real data.

Check out both of those articles.

Leave a Comment

Accelerated Database Recovery in tempdb for SQL Server 2025

Rebecca Lewis looks into a feature:

Two weeks ago I covered the Resource Governor changes in SQL Server 2025 — specifically, capping how much tempdb data space a workload group can consume. That was the data-file side. For the log side, SQL Server 2025 now lets you enable Accelerated Database Recovery (ADR) on tempdb. Enable it and cancelled queries stop grinding, the tempdb log stops bloating, and recovery gets faster. Sounds like an easy yes — but you’ve got to read the fine print.

Click through for that fine print.

Leave a Comment

Updates to Straight Path Solutions sp_Check Procedures

Jeff Iannucci has some updates:

This month though there are mostly a few small updates for the tools, as next month’s updates should also include many of the issues noted in GitHub (thanks for those!) Most of this month’s updates were to sp_CheckSecurity, although we did add “Initial File Size” to the output of sp_CheckTempdb since that had been requested by a few folks.

We hope these new updates can help you, especially if you’ve never used these stored procedures. Here are the updates for March 2026, with links to the GitHub repositories where you can download the latest versions.

Click through for those links.

Leave a Comment

Capacity Overage in Microsoft Fabric

Pankaj Arora has a new ‘give us money’ lever:

Capacity overage, is a new opt‑in capability in Microsoft Fabric designed to help organizations keep their workloads running—even during unexpected compute spikes. Now available in preview, this feature allows for automatic billing for excess capacity usage, based on limits you set, instead of throttling operations, ensuring smoother experiences when workloads exceed the limits of your purchased capacity.

I will say that I think it’s reasonable to have the two options of throttling (you went over by 30%, so for a stretch of time you’ll be capped until you get back under the limit) or simply paying. The controversy around this was mostly in the fact that, if you shut off and restart your Fabric capacity, you’d automatically be charged for the overages you created. To that end, providing more options on how to work off that overage debt is useful.

Leave a Comment

The Importance of the Transaction Log in SQL Server

Deb Melkin lays out the case:

Lately, I’ve been feeling like we’re not paying as much attention to transaction logs as we should. In fact, I’ve been saying there needs to be a flashing neon sign that says “Transaction Logs are more important than ever.” I thought I’d take a minute and explain why.

Click through for some important functionality that makes heavy use of the transaction log, Deb also has a call to action on how to keep them going.

Leave a Comment

Monitoring Fabric Mirroring of SQL Server 2025

Meagan Longoria wants to make sure things are working as expected:

previously wrote about how the underlying technology for Fabric mirroring changed with SQL Server 2025. The latest version of mirroring that uses the SQL Server Change Feed is reading from the database transaction logs and pushing the data to a landing zone in OneLake. The data is then merged into the Delta tables for the Fabric mirrored database.

In this blog post, we will look at how to monitor this process, both in SQL Server and in Fabric.

Click through for information on the right DMVs to query and what you can find within Microsoft Fabric itself.

Leave a Comment