Press "Enter" to skip to content

Author: Kevin Feasel

The Downside of Sticking to the Legacy Cardinality Estimator

Stephen Planck recommends taking the plunge:

Cardinality estimation (CE) is how the optimizer predicts the number of rows that will flow through each operator in a plan. Those estimates drive cost, join choices, memory grants, and ultimately latency and resource usage. SQL Server has shipped multiple CE models over time. The pre-2014 model—commonly called the legacy CE—dates back to SQL Server 7.0. Starting in SQL Server 2014, Microsoft introduced a new CE and has continued refining it in later releases, including SQL Server 2022. Keeping the legacy CE turned on in SQL Server 2022 is usually the wrong long-term choice.

One thing to note is that the “new” cardinality estimator has been out for a decade. It’s not really that new anymore, and it’s not going anywhere soon. Yeah, there are still trade-offs where some queries are better on the legacy estimator, but for the people who use that as their reason for not using the new estimator, what have you done in the past decade to address this and tune those queries to work better? If the answer is “nothing,” it’s not the cardinality estimator’s fault here.

Leave a Comment

Fun with the Data API Builder

Jess Pomfret tries out the Data API Builder:

I’ve been hearing about the Data API Builder (dab) for a while now, but I hadn’t found a reason to play with it myself.

Well I recently found I had a SQL Server database that could use an API so I could interact with it from an Azure Function. I immediately thought about DAB and was excited to have a reason to test it out.

Let me tell you – this thing is pretty neat!

Jess has started a new series and the first post involves installing and trying out the service.

Leave a Comment

Error 845 Timeout in DBCC CHECKTABLE

Eitan Blumin troubleshoots an odd issue:

A customer reported that running DBCC CHECKTABLE on several different tables kept failing with the exact same error:

Msg 845, Sev 17: Time-out occurred while waiting for buffer latch type 4 for page (1:27527325), database ID 10.
Msg 1823, Sev 17: A database snapshot cannot be created because it failed to start.
Msg 7928, Sev 17: The database snapshot for online checks could not be created.

Read on to learn more about Eitan’s troubleshooting process, what the cause of the issue was, and the fixes (both the immediate and complete ones) needed to resolve the issue.

Leave a Comment

A Hidden Slowdown in PostgreSQL Restores

Warda Bibi explains why a large PostgreSQL database restoration can take a long time:

In July 2025, during the PG19-1 CommitFest, I reviewed a patch targeting the lack of parallelism when adding foreign keys in pg_restore. Around the same time, I was helping a client with a large production migration where pg_restore dragged on for more than 24 hours and crashed multiple times.

In this blog, I will talk about the technical limitations in PostgreSQL, the proposed fix, and a practical workaround for surviving large restores.

The reason for the slowdown is a bit wild to me, though I suppose somebody could edit the backup files manually after pg_dump writes them out.

Leave a Comment

Worst-Case Testing for Direct Lake Semantic Models

Chris Webb updates a prior post:

Two years ago I wrote a detailed post on how to do performance testing for Direct Lake semantic models. In that post I talked about how important it is to run worst-case scenario tests to see how your model performs when there is no model data present in memory, and how it was possible to clear all the data held in memory by doing a full refresh of the semantic model. Recently, however, a long-awaited performance improvement for Direct Lake has been released which means a full semantic model refresh may no longer page all data out of memory – which is great, but which also makes running performance tests a bit more complicated.

Read on to learn more about the improvement as well as how you can still perform your performance testing.

Leave a Comment

Choosing between Random Forest and Gradient Boosting

Jayita Gulati compares two popular algorithms for classification:

When working with machine learning on structured data, two algorithms often rise to the top of the shortlist: random forests and gradient boosting. Both are ensemble methods built on decision trees, but they take very different approaches to improving model accuracy. Random forests emphasize diversity by training many trees in parallel and averaging their results, while gradient boosting builds trees sequentially, each one correcting the mistakes of the last.

This article explains how each method works, their key differences, and how to decide which one best fits your project.

Click through for the explanation.

Leave a Comment

Date Intervals in PostgreSQL Window Functions

Hubert Lubaczewski solves a problem:

Since I can’t copy paste the text, I’ll try to write what I remember:

Given table sessions, with columns: user_id, login_time, and country_id, list all cases where single account logged to the system from more than one country within 2 hour time frame.

The idea behind is that it would be a tool to find hacked account, based on idea that you generally can’t change country within 2 hours. Which is somewhat true.

Solution in the blogpost suggested joining sessions table with itself, using some inequality condition. I think we can do better…

Click through for a solution that works for PostgreSQL but not SQL Server because the latter doesn’t offer date and time intervals on window function frames.

To do this in SQL Server, I’d probably use LAG() and get the prior value of country ID and the prior login time. Something like the following query, though I didn’t run detailed performance checks.

WITH records AS
(
	SELECT
		s.user_id,
		s.login_time,
		s.country_id,
		LAG(s.login_time) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_login_time,
		LAG(s.country_id) OVER (PARTITION BY s.user_id ORDER BY s.login_time) AS prior_country_id
	FROM sessions s
)
SELECT *
FROM records r
WHERE
	r.prior_country_id <> r.country_id
	AND DATEDIFF(HOUR, r.prior_login_time, r.login_time) <= 2;
Leave a Comment

SQL Injection Vulnerabilities Fixed in SQL Server

Vlad Drumea tests out a pair of fixes:

In this post I demo two PoCs for SQL injection vulnerabilities fixed in SQL Server 2025 CU20 GDR KB5063814.

This August’s Patch Tuesday came with a security patch for SQL Server 2022, 2019, 2017, and 2016.

The number of SQL injection vulnerabilities caught my attention and I decided to see what system stored procedures have changed to see if I can find anything useful.

Vlad looks at a pair of spatial stored procedures and puts together a method to exploit the old versions.

Leave a Comment

Trying Time-Bound Extended Events in SQL Server 2025 RC0

Reitse Eskens check the state of the art:

What’s new in SQL 2025 is that the team behind SQL Server 2025 has addressed an issue that I’ve encountered once or twice as well. If you start an extended event, it will run until you stop it. Yes, you have to manually stop it. If you don’t, it will stop eventually. Together with some other services, because the disk space has run out. In my case, I was fortunate that the servers were monitored and alerts were triggered due to low disk space.

In this new version, starting with Release Candidate 0 (RC0), you can configure the duration.

Click through for a demo, as well as current limitations (generally around this not being in the UI yet).

Leave a Comment