Press "Enter" to skip to content

Category: Indexing

JSON Data and Columnstore Indexes

Niko Neugebauer continues a series on columnstore:

Not since SQL Server 2008 that Microsoft has added a new base data type to SQL Server, but in SQL Server 2025 they have added not 1 but whole 2 new data types – Vector and JSON. The first one (Vector) and the corresponding index (Vector Index) are described in details in the Columnstore Indexes – part 134 (“Vectors and Columnstore Indexes”) and this post is dedicated to the new JSON data type and the new JSON Index and their compatibility with the Columnstore Indexes and the Batch Execution mode.

One common trait for the Vector & JSON Indexes is that both come with a big number of limitations and they are all enabled under a “Preview” option, making them unsuitable for the most production environments.

Niko has a somewhat-humorous and somewhat-infuriating table at the beginning describing just how much support columnstore indexes have for JSON data types.

And it is another example of the frustrating way in which Microsoft will release something before it’s even half-baked, demand consumer adoption to continue working on it, and then can the feature because people can’t use the not-even-half-baked feature in its current state. There’s a fine line between rapid prototyping and quick market feedback versus strangling products in the crib, and I think they’re pretty far onto the wrong side of things when it comes to most SQL Server functionality.

Leave a Comment

The Cost of Heaps in SQL Server

Heaps killed Vlad Drumea’s dog and now it’s time to get revenge:

In this post I cover some unpleasant side-effects of using heaps in SQL Server for anything else than staging tables.

In the past couple of months I’ve been running into performance issues caused by the strange decision to have very active tables as heaps.
And, at this point, I’m on a personal vendetta against heaps in SQL Server.
As a result, I’m writing this blog post so that I can refer people to it the next time I’ll have to explain why heaps don’t match their use cases.

I, for one, wholeheartedly support Vlad’s vendetta. There are very rare circumstances in which it makes sense to use a heap in a non-temporary SQL Server table. Otherwise, define that clustered index.

Leave a Comment

Postgres Index Corruption after an OS Upgrade

Laurenz Albe has me concerned:

Most major Linux distributions have upgraded to the GNU C library version 2.28 or later. Therefore, there is a growing awareness that an operating system upgrade can lead to index corruption. However, rebuilding all your indexes can slow down the upgrade process considerably. In this article, I want to discuss how you can avoid rebuilding more indexes than necessary.

Read the whole thing if you are a Postgres DBA.

Comments closed

Page Compression on Heaps

Vlad Drumea explains why page compression might not give you quite what you expect:

I recently ran into SQL Server’s page compression being applied to a heap, and I figured I’d cover why that won’t work how some folks expect.

SQL Server’s page compression is really neat when applied on tables and indexes that are good candidates for it.
Even more so in cloud environments where storage costs can quickly add up.

Honestly, this is just a good reason to push for clustered indexes on all tables in SQL Server. I’ll call it good reason 5 out of 12.

Comments closed

What’s Missing in Columnstore Indexes

Niko Neugebauer has a list:

After spending some time thinking about the best way to come back to writing about Columnstore Indexes, after 5 and half years hiatus, I came to a conclusion that I have never published a post on what is still missing. With that in mind, I decided to mark my comeback to writing technical posts on my blog with rather simple post on the things that are needed, but did not made into the SQL Server – based engines so far (as of December 2025).

Niko has seven items on his list. I tend not to cover wish lists on Curated SQL, but when it’s Niko and columnstore indexes, I’m willing to make an exception.

Comments closed

Scan Types in PostgreSQL

Elizabeth Christensen has some neat imagery:

The secret to unlocking performance gains often lies not just in what you ask in a query, but in how Postgres finds the answer. The Postgres EXPLAIN system is great for understanding how data is being queried. One of secretes to reading EXPLAIN plans is understanding the type of scan done to retrieve the data. The scan type can be the difference between a lightning-fast response or a slow query.

Click through for the list, as well as images that clearly explain what’s happening.

Comments closed

Dealing with Index Bloat in Postgres

Kendra Little trims down the database after Thanksgiving:

Index bloat in Postgres can cause problems, but it’s easy to miss.

I’ve written about how vacuum problems can prevent PostgreSQL from using covering indexes, and index bloat is one of the things that can make vacuum struggle.

Here’s what you need to know about index bloat, how to find it, and how to fix it.

Read on to learn what Kendra means by index bloat, ways in which it can occur, why this is an issue, and how to identify and correct it.

Comments closed

Parallel Performance and SSMS Outputs

Joe Obbish looks at some execution plans:

Getting back to the query, it doesn’t look that offensive to me. The row mode sort is a parallel top N sort and the overall number of rows to return is low, so each thread can independently sort its rows and return 1000 locally sorted rows to the parent operator. This is about as good as it gets with parallel row mode sorting. This is a row mode only query so the operator times that you see are the sum of that operator’s work and its children. In terms of real work done by the query, the scan clocks in at 1.168 seconds and the sort clocks in at 0.84 seconds. The final accounting at the end by the parent Parallelism (Gather Streams) is misleading at best and an outright LIE at worst. There wasn’t 4 seconds of work done by this query. There was only 2 seconds. 

Joe looks at two separate things in this post: first, a way of trying to optimize OFFSET/FETCH style paging queries; and second, how the gather streams parallel operator can report wrong information.

Comments closed

Querying JSON Quickly in SQL Server 2025

Brent Ozar follows up on yesterday’s post:

SQL Server 2025 and .NET 10 bring several new improvements to storing JSON natively in the database and querying it quickly.

On the SQL Server 2025 side, the two big ones are the new native JSON indexes and the new JSON_CONTAINS function. Let’s see their improvements in action. On the .NET 10 side, EF 10 not only supports the new JSON data type, but on databases of compatibility level 170 or higher, EF will automatically migrate JSON data from NVARCHAR(MAX) data types over to JSON the next time you do a migration, as explained in the What’s New in EF Core 10 doc. That makes it especially important for you to understand how the new JSON indexes work, because they may be coming at you quickly the instant you change your compatibility level.

Read on to see Brent’s take.

Comments closed

Index Skip Scans in PostgreSQL 18

Hans-Jürgen Schönig demonstrates a new capability in PostgreSQL:

PostgreSQL 18 brings a couple of performance related features to the table which will help applications to run more efficiently, providing a better and more enjoyable user experience. One of those performance features is called “skip scans”. Most of you might ask yourself at this point: Wow, sounds cool, but what is a skip scan? The purpose of this post is to shed some light and explain how this works, what it does and most importantly: How one can benefit from this feature in real life.

Click through for the demo.

Comments closed