Press "Enter" to skip to content

Category: Columnstore

Clustered Columnstore Index Updates

Koen Verbeeck discusses what updates do to clustered columnstore indexes:

Turns out the majority of the rows belonged to the second scenario. Whoops. The initial run took a little over 20 hours. Not exactly rocket speed. The problem was that for each period, a large number of rows in the clustered columnstore index (CCI) had to be updated, just to set the range of the interval. Updates in a CCI are expensive, as they are split into inserts and deletes. Doing so many updates resulted in a heavily fragmented CCI and with possibly too many rows in the delta storage (which is row storage).

Read the whole thing.  Koen links to a Niko Neugebauer post, which you should also read.  After that, read my warning on trickle loading.  The major querying benefits you get from clustered columnstore indexes is great, but it does come at a cost when you aren’t simply inserting new rows.

Comments closed

LOB In Columnstore

Niko Neugebauer discusses LOB support in SQL Server vNext:

In the upcoming version of SQL Server (for the moment known as SQL Server vNext), Microsoft has finally announced the upcoming support for the LOBs within Columnstore Indexes – thus enabling the usage of the NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX) data types on the tables with Columnstore Indexes that include those columns.

For the tests, I have decided to spin a Virtual Machine in Azure with an installation of the currently available CTP1 of the SQL Server vNext, which has a version 14.0.1.126.

Read the whole thing.  It’s hard to tell at this point if these are bugs, incomplete functionality, or what, so it’ll be interesting to track changes over the CTPs.

Comments closed

Columnstore On Standard Edition

Niko Neugebauer extends his Columnstore series with a post on what you can do with these indexes in Standard Edition:

Given the improvements and the availability of the of the programability surface for every edition (with some insignificant & logical limitations) that I have blogged about in
SQL Server 2016 SP1 – Programmability Surface for everyone!, I believe everyone using Microsoft Data Platform has rejoyced greatly. Of course, now everyone can have Columnstore Indexes on every SQL Server edition!
There are some noticeable limitations that were announced right from the start, such as the maximum size of the Columnstore Object Pool (you can find more information about it here – Columnstore Indexes – part 38 (“Memory Structures”)), but there are more limitations to the Standard Editions and inferior ones and it is extremely important to know them, to understand them in order to make the right decision – when your Business is ready/needed to upgrade to the Enterprise Edition of the SQL Server.

If you’re on Standard Edition and excited about using Columnstore, do read Niko’s post.  Columnstore won’t work as fast as it does on Enterprise Edition (gotta have a reason to upgrade) but based on what he’s shown thus far, Columnstore is still a good reason to upgrade to 2016 SP1 if you’re on Standard Edition.

Comments closed

In-Memory Columnstore Updates

Niko Neugebauer looks at improvements in SQL Server 2016 SP1 with respect to adding columnstore indexes to memory-optimized tables:

With an incredible Service Pack 1 Microsoft has triumphantly announced that all editions (Standard, Web, Express and even Local) will get the most advanced programming capabilities of Columnstore, In-Memory, Database Snapshot, Compression, Partition & many others, plus that there are some incredible features for the T-SQL (CREATE OR ALTER) and Execution Plan details (Actual Information on processed Rows, used Trace Flags, etc),
but at the same time there are some quite important improvement under the hood that will make you want to use Service Pack 1 for SQL Server 2016 immediately. One of this features is the fast addition of the Columnstore Index to the Memory-Optimised tables. Let take it to the test by restoring a copy of the ContosoRetailDW free database:

These results look nice.

Comments closed

Loading Columnstore Data

I have a post on an issue I had with loading columnstore data:

In this design, I have data coming from the transactional system, undergoing some ETL processing, and going into a staging table on the warehouse.  From there, I perform the remainder of the ETL work and insert into a rowstore table.  This rowstore table has the same attribute names and data types as the columnstore table, but instead of having a clustered columnstore index, it has a standard B-tree index and can have additional non-clustered indexes.  From there, I expose the combination table using a view which simply unions the two sets of data so the application doesn’t have to see rowstore versus columnstore tables.

If you’re doing nightly insertions, results tend to be a lot better.  But if you need close-to-but-not-quite real-time data, there are still ways to solve the problem.

Comments closed

Filtered Indexes

Kendra Little explains the two types of filtered indexes:

These two filtered indexes are very different – and the SQL Server optimizer can use them very differently!

While classic filtered nonclustered rowstore indexes must reliably “cover” parts of the query to be used to the optimizer, filtered nonclustered columnstore indexes may be combined with other indexes to produce a plan returning a larger range of data.

This sounds a little weird. I’ll show you what I mean using the WideWorldImporters database.

Kendra exposes an interesting difference in the two types of index and a case where filtered indexes simply fail  (though that’s not a situation you want to be in anyhow!).

Comments closed

CISL 1.4.0

Niko Neugebauer has released the latest version of his Columnstore Index Scripts Library:

Another happy release of the CISL (Columnstore Indexes Script Library) is live – this time it is 1.4.0!

This release is focusing on the addition of the Extended Events, so that a user of CISL can easily set up the events for each of the SQL Server (2012,2014,2016) or Azure SQL Database versions.

This is an open source library which I recommend if you deal with columnstore indexes in any fashion.

Comments closed

Memory-Optimized Columnstore

Niko Neugebauer clears the air regarding memory-optimized columnstore tables:

I would like to dedicate this blog post to the Memory-Optimised (also known and LOVED as Hekaton) Columnstore Indexes and their limitations in SQL Server 2016.
Disclaimer: the Memory-Optimised Technology is the ground-breaking development, which will be truly appreciated only in the next couple of years, and it has its incredible use cases (and maybe I will be blogging more about this space in the next couple of months), but people needs to understand that mapping InMemory Columnstore Indexes to disk-based Columnstore Indexes 1:1 is a very wrong idea, and that because InMemory technology is significantly younger and less stable than Columnstore Indexes – there are some very significant hidden cornerstones.

It’s important to read this post as “this is not yet a fully-mature product” rather than “this will always be worse.”  But it’s just as important to understand the limitations of the product and not think you’re getting something that you aren’t.

Comments closed

Minimal Logging With Columnstore

Niko Neugebauer continues his columnstore series by looking at columnstore insert logging in SQL Server 2016 versus 2014:

Ladies and gentlemen! That’s quite a difference to SQL Server 2014!
We better check the total length of the transaction log to see the final result: 384.032 bytes! Ok, that is significantly more than for the rowstore heap table for sure, but what about the comparison to the SQL Server 2014 ? Did this minimal logging bring any improvement ?
Well … 🙂
In SQL Server 2014 we had 1.255.224 bytes spent on the transaction log – meaning over 1.2 MB, meaning around 3 times more, for the Delta-Store insertion! For such a simple table, this is a huge improvement, but let’s take a look at the total length of the transaction log entries in both environments (SQL Server 2014 & SQL Server 2016)

This is worth a careful read.  If you’ve spent time working with 2014 clustered columnstore indexes, there are a few changes which might affect you.  The most interesting thing for me was that the deltastore is no longer page compressed.

Comments closed

Nonclustered Columnstore Indexes On Indexed Views

Niko Neugebauer notes that non-clustered columnstore indexes can now sit on top of indexed views, as of SQL Server 2016:

From the perspective of the disk access, this is where you will definitely win at least a couple of times with the amount of the disk access while processing the information, amount of memory that you will need to store and process (think hashing and sorting for the late materialisation phases), and you will pay less for the occupied storage.

Another noticeable thing was that the memory grants for the Indexed Views query was smaller compared to the query that was processing the original columnstore table FactOnlineSales.

Clustered indexes are currently not available as an option; we’ll see if that changes in the next version of SQL Server.

Comments closed