Category: Columnstore

JSON Data and Columnstore Indexes

Niko Neugebauer continues a series on columnstore:

Not since SQL Server 2008 has Microsoft added a new base data type to SQL Server, but in SQL Server 2025 they have added not one but two new data types – Vector and JSON. The first one (Vector) and the corresponding index (Vector Index) are described in detail in Columnstore Indexes – part 134 (“Vectors and Columnstore Indexes”), and this post is dedicated to the new JSON data type and the new JSON Index and their compatibility with Columnstore Indexes and the Batch Execution mode.

One common trait of the Vector & JSON Indexes is that both come with a large number of limitations and both are enabled only under a “Preview” option, making them unsuitable for most production environments.

Niko has a somewhat-humorous and somewhat-infuriating table at the beginning describing just how much support columnstore indexes have for JSON data types.

And it is another example of the frustrating way in which Microsoft will release something before it’s even half-baked, demand consumer adoption to continue working on it, and then can the feature because people can’t use the not-even-half-baked feature in its current state. There’s a fine line between rapid prototyping and quick market feedback versus strangling products in the crib, and I think they’re pretty far onto the wrong side of things when it comes to most SQL Server functionality.

Vectors and Columnstore Indexes

Niko Neugebauer continues a series on columnstore indexes:

In this post we are going to test one of the more promising technologies in SQL Server-based offerings – Vector data types and their relationship with Columnstore Indexes. The tests I am running right now are executed against SQL Server 2025 RTM, the latest and greatest SQL Server version available to customers. Given that some parts of SQL Server 2025 were delivered as Preview Features, the current situation might change in the future for SQL Server 2025 (at least, Half-precision float support should evolve into a fully supported feature, in my opinion). At the very least, I do expect reasonably fast evolution of the space on Azure SQL Database & Azure SQL Managed Instance.

This seems like more pain than joy, which is the unfortunate reality of v1 features in SQL Server anymore.

Data Replication and Columnstore

Niko Neugebauer continues a series on columnstore:

In the Columnstore Indexes space, there is a long-standing “tradition” at Microsoft of ignoring customers’ needs for data replication. It started with the original SQL Server 2012 release not supporting any data manipulation operations besides partition switching. Since then it has improved from version to version up until SQL Server 2016, where Nonclustered Columnstore Indexes received support for Transactional Replication, and voila – that’s where it stopped!

Read on for the frustration involved in moving around columnstore data.

What’s Missing in Columnstore Indexes

Niko Neugebauer has a list:

After spending some time thinking about the best way to come back to writing about Columnstore Indexes after a five-and-a-half-year hiatus, I came to the conclusion that I have never published a post on what is still missing. With that in mind, I decided to mark my comeback to writing technical posts on my blog with a rather simple post on the things that are needed but have not made it into the SQL Server-based engines so far (as of December 2025).

Niko has seven items on his list. I tend not to cover wish lists on Curated SQL, but when it’s Niko and columnstore indexes, I’m willing to make an exception.

What’s New for Columnstore Indexes in SQL Server 2025

Ed Pollack gives us the lowdown:

Columnstore indexes are a powerful tool for storing analytic data directly in SQL Server. This feature has improved in every version of SQL Server since its inception over ten years ago, and SQL Server 2025 is no exception!

The newest enhancements are laser-focused on business continuity and performance. Ordered clustered columnstore indexes, ordered non-clustered columnstore indexes, and database/file shrink operations are all given significant boosts that are worth the time to introduce and learn. 

In this article we will dive into each of these changes, how they impact columnstore workloads in SQL Server, and demonstrate their operation. 

Read on to see what we’ve got. Nothing in here is ground-breaking, but it’s a set of nice quality of life improvements.

Columnstore Key Lookups are Bad News

Forrest McDaniel does not want to perform that key lookup:

I’ve read it repeatedly: columnstore key lookups are extra slow. The question, of course, is why?

In my mental model, it makes sense. A normal key lookup adds about 3 reads.

While a columnstore lookup should add at least a read per column, since each column lives in its own segments.

But it turns out that it’s not a read per column, oh no. Columnstore indexes are amazing for large-scale aggregations and awful for individual lookups.

Search Patterns in T-SQL

Erik Darling puts on the fedora and grabs the bullwhip:

First, what you should not do: A universal search string:

The problem here is somewhat obvious if you’ve been hanging around SQL Server long enough. Double wildcard searches, searching with a string type against numbers and dates, strung-together OR predicates that the optimizer will hate you for.

These aren’t problems that other things will solve either. For example, using CHARINDEX or PATINDEX isn’t a better pattern for double wildcard LIKE searching, and different takes on how you handle parameters being NULL don’t buy you much.
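To make the contrast with a universal search string concrete, here is a minimal sketch of one commonly used alternative: typed, per-column predicates that are added only when a parameter is actually supplied. This is my own illustration and not necessarily the pattern the linked post lands on. SQLite (via Python's standard `sqlite3` module) stands in for SQL Server, and the `orders` table, its columns, and the `build_search` helper are all hypothetical.

```python
import sqlite3

def build_search(customer_id=None, order_date=None, city_prefix=None):
    """Return (sql, params) containing only the predicates actually supplied."""
    clauses, params = [], []
    if customer_id is not None:
        clauses.append("customer_id = ?")  # numeric comparison, not LIKE on text
        params.append(customer_id)
    if order_date is not None:
        clauses.append("order_date = ?")   # exact match on the date column
        params.append(order_date)
    if city_prefix is not None:
        clauses.append("city LIKE ?")      # trailing wildcard only, no leading %
        params.append(city_prefix + "%")
    sql = "SELECT order_id FROM orders"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY order_id", params

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2024-01-05", "Lisbon"),
        (2, 10, "2024-02-11", "Porto"),
        (3, 22, "2024-02-11", "Lisbon"),
    ],
)

sql, params = build_search(customer_id=10, city_prefix="Lis")
rows = [r[0] for r in conn.execute(sql, params)]

all_sql, all_params = build_search()
all_rows = [r[0] for r in conn.execute(all_sql, all_params)]
```

Values are always passed as parameters rather than concatenated into the statement, and each column is compared using its own type, which avoids the string-against-numbers-and-dates problem the quote calls out.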

Read on for an example of a terrible search query, a mediocre search query, a good search query, and a possible unicorn: an actually valid reason to use a non-clustered columnstore index.

Ordered Columnstore Indexes in SQL Server 2022

Ed Pollack gives us the scoop on ordered columnstore indexes:

One of the more challenging technical details of columnstore indexes that regularly gets attention is the need for data to be ordered to allow for segment elimination. In a non-clustered columnstore index, data order is automatically applied based on the order of the underlying rowstore data. In a clustered columnstore index, though, data order is not enforced by any SQL Server process. This leaves managing data order to us, which may or may not be an easy task.

To assist with this challenge, SQL Server 2022 has added the ability to specify an ORDER clause when creating or rebuilding an index. This feature allows data to be automatically sorted by SQL Server as part of those insert or rebuild processes. This article dives into this feature, exploring both its usage and its limitations.
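A toy model helps show why data order matters for segment elimination. The sketch below is plain Python, not SQL Server: it assumes each "segment" keeps only a minimum and maximum value, uses a made-up rowgroup size of 1,000 rows rather than the real (much larger) size, and counts how many segments a range predicate would have to read for shuffled versus sorted input. The function names are mine.

```python
import random

ROWGROUP_SIZE = 1_000  # toy size for the sketch; real rowgroups are far larger

def build_segments(values, size=ROWGROUP_SIZE):
    """Split values into fixed-size chunks and keep (min, max) for each chunk."""
    return [
        (min(values[i:i + size]), max(values[i:i + size]))
        for i in range(0, len(values), size)
    ]

def segments_read(segments, low, high):
    """Count segments whose [min, max] range overlaps the predicate [low, high]."""
    return sum(
        1 for seg_min, seg_max in segments
        if seg_max >= low and seg_min <= high
    )

random.seed(42)
shuffled = list(range(20_000))
random.shuffle(shuffled)

unordered = build_segments(shuffled)        # data loaded in arbitrary order
ordered = build_segments(sorted(shuffled))  # data sorted before being loaded

unordered_reads = segments_read(unordered, 5_000, 5_999)
ordered_reads = segments_read(ordered, 5_000, 5_999)
print(f"unordered: {unordered_reads} of {len(unordered)} segments read")
print(f"ordered:   {ordered_reads} of {len(ordered)} segments read")
```

In the sorted case the predicate range lines up with exactly one segment, while in the shuffled case nearly every segment's min/max range spans the predicate, so almost nothing can be skipped.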

I’ve seen a couple places where ordered columnstore indexes make enough sense to use, though not as many as I had first anticipated. That might change over time, as we see additional columnstore development.

Finding Columnstore Index Storage and Memory Allocations

Jose Manuel Jurado Diaz has a few scripts for us:

Today, we got a new question: how much space does a columnstore index table use at the storage level, and how much memory does it use?

T-SQL to obtain the total number of rows and the size per schema, table, and index.

Using the view sys.column_store_row_groups, we can see the total number of rows and the space usage at the storage level.

Click through for that script, as well as a few more to learn how much space and memory that columnstore index is taking.

Bringing Order to a Columnstore Index

Tibor Karaszi puts columnstore ducks in a row:

Data for a columnstore index is divided into groups of approximately 1 million rows, called rowgroups. Each rowgroup has a set of pages for each column. The set of pages for a column in a rowgroup is called a segment. SQL Server has meta-data for the lowest and highest value for a segment. There are no SEEKs in a columnstore index. But, SQL Server can use this meta-data to skip reading segments, with the knowledge that “this segment cannot contain any data that I need based on my predicates in my WHERE clause”.

Also, you might want to do these operations using MAXDOP 1, so we don’t have several threads muddling our neat segment alignment.
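The MAXDOP 1 advice is easier to picture with a toy model. The Python sketch below is only an illustration under a loud assumption: that with several writers, sorted rows get dealt out round-robin and each writer compresses its own rowgroups. That is not a description of how SQL Server actually distributes rows across threads. The rowgroup size of 1,000 and all names are made up for the sketch.

```python
def rowgroup_ranges(rows, size=1_000):
    """Return the (min, max) metadata for each rowgroup-sized chunk of rows."""
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    return [(min(chunk), max(chunk)) for chunk in chunks]

def can_skip(segment, low, high):
    """True when the segment's range cannot contain any value in [low, high]."""
    seg_min, seg_max = segment
    return seg_max < low or seg_min > high

sorted_rows = list(range(8_000))

# One writer consumes the sorted stream in order.
single = rowgroup_ranges(sorted_rows)

# ASSUMPTION: four hypothetical writers, rows dealt out round-robin,
# each writer building its own rowgroups from the rows it received.
workers = 4
parallel = []
for w in range(workers):
    parallel.extend(rowgroup_ranges(sorted_rows[w::workers]))

low, high = 2_000, 2_499
single_skipped = sum(can_skip(seg, low, high) for seg in single)
parallel_skipped = sum(can_skip(seg, low, high) for seg in parallel)
print(f"single writer: skipped {single_skipped} of {len(single)} segments")
print(f"four writers:  skipped {parallel_skipped} of {len(parallel)} segments")
```

Under this model, the single writer produces tight, non-overlapping ranges, so all but one segment can be skipped. Each of the four writers produces rowgroups that span a much wider range of values, so fewer segments can be ruled out for the same predicate.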

I’m not sure I actually set the ORDER BY clause on columnstore indexes all that often—a quick mental survey says maybe once, though that could be my own failing rather than a statement on the utility of ordered columnstore indexes.
