Press "Enter" to skip to content

Category: Columnstore

CISL 1.4.0

Niko Neugebauer has released the latest version of his Columnstore Index Scripts Library:

Another happy release of the CISL (Columnstore Indexes Script Library) is live – this time it is 1.4.0!

This release is focusing on the addition of the Extended Events, so that a user of CISL can easily set up the events for each of the SQL Server (2012,2014,2016) or Azure SQL Database versions.

This is an open source library which I recommend if you deal with columnstore indexes in any fashion.

Comments closed

Memory-Optimized Columnstore

Niko Neugebauer clears the air regarding memory-optimized columnstore tables:

I would like to dedicate this blog post to the Memory-Optimised (also known and LOVED as Hekaton) Columnstore Indexes and their limitations in SQL Server 2016.
Disclaimer: the Memory-Optimised Technology is the ground-breaking development, which will be truly appreciated only in the next couple of years, and it has its incredible use cases (and maybe I will be blogging more about this space in the next couple of months), but people needs to understand that mapping InMemory Columnstore Indexes to disk-based Columnstore Indexes 1:1 is a very wrong idea, and that because InMemory technology is significantly younger and less stable than Columnstore Indexes – there are some very significant hidden cornerstones.

It’s important to read this post as “this is not yet a fully-mature product” rather than “this will always be worse.”  But it’s just as important to understand the limitations of the product and not think you’re getting something that you aren’t.

Comments closed

Minimal Logging With Columnstore

Niko Neugebauer continues his columnstore series by looking at columnstore insert logging in SQL Server 2016 versus 2014:

Ladies and gentlemen! That’s quite a difference to SQL Server 2014!
We better check the total length of the transaction log to see the final result: 384.032 bytes! Ok, that is significantly more than for the rowstore heap table for sure, but what about the comparison to the SQL Server 2014 ? Did this minimal logging bring any improvement ?
Well … 🙂
In SQL Server 2014 we had 1.255.224 bytes spent on the transaction log – meaning over 1.2 MB, meaning around 3 times more, for the Delta-Store insertion! For such a simple table, this is a huge improvement, but let’s take a look at the total length of the transaction log entries in both environments (SQL Server 2014 & SQL Server 2016)

This is worth a careful read.  If you’ve spent time working with 2014 clustered columnstore indexes, there are a few changes which might affect you.  The most interesting thing for me was that the deltastore is no longer page compressed.

Comments closed

Nonclustered Columnstore Indexes On Indexed Views

Niko Neugebauer notes that non-clustered columnstore indexes can now sit on top of indexed views, as of SQL Server 2016:

From the perspective of the disk access, this is where you will definitely win at least a couple of times with the amount of the disk access while processing the information, amount of memory that you will need to store and process (think hashing and sorting for the late materialisation phases), and you will pay less for the occupied storage.

Another noticeable thing was that the memory grants for the Indexed Views query was smaller compared to the query that was processing the original columnstore table FactOnlineSales.

Clustered indexes are currently not available as an option; we’ll see if that changes in the next version of SQL Server.

Comments closed

Columnstore Improvements

Warner Chaves discusses improvements in columnstore indexes in SQL Server 2016:

SQL Server first introduced Columnar Storage with the SQL 2012 Enterprise release. In this release, Columnstores were read-only indexes, so it required to drop the index, load the table or partition and then rebuild the index to refresh it with the latest data.

SQL Server 2014 upgraded Columnstores with full read-write capabilities, allowing the Columnstore to become the ‘clustered’ index for the table and hold all the data instead of just being one more index on top of row-organized data. 2014 also introduced many improvements to batch operations so more pieces of an execution plan could take advantage of this faster processing mode.

Read on to see changes in 2016.

Comments closed

CISL 1.3.1

Niko Neugebauer has released the newest version of his columnstore index library:

Here is the small description of what is new in this release:

  • The database snapshots (.dacpac) for all platforms are now included in the Releases\DacPacs.

  • Includes new Powershell functions for installing and removing CISL from the Instances:

    • Install-CISL.ps1 will allow you to install the CISL at multiple databases of a SQ Server Instance (or Azure SQLDB).
    • Remove-CISL.ps1 will allow you to remove the CISL from the multiple databases of a SQL Server Instance (or Azure SQLDB).
  • Support for the different collation is included.

  • Includes information on all recent SQL Server updates.

  • Included support of the new Columnstore Indexes Trace Flags in SQL Server 2016.

  • Basic Unit Tests (based on t-sqlt) are included for SQL Server 2012 & SQL Server 2014, guaranteeing the quality of the released code.

  • A good number of bug fixes.

  • Further parameter enhancements for the existing functions.

Sounds like there’s a lot packed into this release.

Comments closed

New Columnstore Trace Flags

Niko Neugebauer looks at a few trace flags which modify columnstore index behavior:

Starting with SQL Server 2016, if you have enough RAM and suffering from the TempDB Spills that do have a significant impact on your workload, then you can enable the Trace Flag 9389 that will enable Batch Mode Iterators to request additional memory for the work and thus avoiding producing additional unnecessary I/O.

I am glad that Microsoft has created this functionality and especially that at the current release, it is hidden behind this track, and so Microsoft can learn from the applications before enabling it by default, hopefully in the next major release of SQL Server.

There’s a lot of good stuff in here.  Read the whole thing.

Comments closed

CISL 1.3.0

Niko Neugebauer has released version 1.3.0 of his Columstore script library:

I am extremely proud to share with everyone the news that the long-awaited and quite overdue release of the CISL – Columnstore Indexes Scripts Library is finally public – the 1.3.0 version. The most important part of this release is the support of the SQL Server 2016 & Azure SQLDatabase – all 3 scenarios (Nonclustered Columnstore, Disk-based Clustered Columnstore & the Memory-Optimised Clustered Columnstore Indexes) are included.
You will be able to explore all the important new architecture objects, such as Deleted Buffer & Deleted Table, plus the scripts for every version supports the new output results, even though there were no In-Memory tables in SQL Server 2012 for example.

If you use columnstore indexes, check this out.

Comments closed

Parallel Loading Columnstore Indexes

Sunil Agarwal shows how to bulk load with parallelism into a clustered columnstore index from a staging table:

SQL Server 2016 requires following conditions to be met for parallel insert on CCI

  • Must specify TABLOCK
  • No NCI on the clustered columnstore index
  • No identity column
  • Database compatibility is set to 130

While these restrictions are enforced in SQL Server 2016 but they represent important scenarios. We are looking into relaxing these in subsequent releases. Another interesting point is that you can also load into ‘rowstore HEAP’ in parallel as well.

The restriction I’d most like to see reduced would be the “no non-clustered indexes” part.  The rest seem forgivable for most clustered columnstore setups (i.e., fact tables).

Comments closed

Columnstore Basics

Sunil Agarwal has a couple of posts explaining columnstore indexes.  First, how columnstore indexes differ from classic B-tree indexes:

  • Index Fragmentation: For rowstore based indexes, it is considered fragmented if (a) the physical order of pages in out of sync with the index-key order. (b) the data pages (clustered index) or index pages (for nonclustered index) are partially filled. A fragmented index will lead to significantly higher physical IOs and can potentially put more pressure on memory which can ultimately slowdown queries. Most organizations run a periodic index maintenance job to defragment indexes. For details, please refer to  https://msdn.microsoft.com/en-us/library/ms189858.aspx#Fragmentation best practices on how to maintain btree indexes. For columnstore index, an index fragmentation is considered fragmented if (a) there are 10% or more rows marked as deleted in a compressed rowgroup (b) one or more smaller compressed rowgroups can be combined to create a larger compressed rowgroup such that the resultant compressed rowgroup has less than or equal to 1 million rows. Note, if a compressed rowgroup has less than 1 million rows due to dictionary size, it is not considered fragmented because there is nothing that can be done to increase its size.  Also recall that a columnstore index consists of zero or more delta rowgroups as shown the in the picture below.

Also, clustered and non-clustered columnstores:

SQL Server 2016 provides two flavors of columnstore index; clustered (CCI) and nonclustered (NCCI) columnstore index. As shown in the simplified picture below, both indexes are organized as columns but NCCI is created on an existing rowstore table as shown on the right side in the picture below while a table with CCI does not have a rowstore table. Both tables can have one or more btree nonclustered indexes.

If you haven’t looked at columnstore indexes yet, 2016 is a great time to start.

Comments closed