Clustered Columnstore Index Updates

Koen Verbeeck discusses what updates do to clustered columnstore indexes:

Turns out the majority of the rows belonged to the second scenario. Whoops. The initial run took a little over 20 hours. Not exactly rocket speed. The problem was that for each period, a large number of rows in the clustered columnstore index (CCI) had to be updated, just to set the range of the interval. Updates in a CCI are expensive, as they are split into inserts and deletes. Doing so many updates resulted in a heavily fragmented CCI and with possibly too many rows in the delta storage (which is row storage).

Read the whole thing.  Koen links to a Niko Neugebauer post, which you should also read.  After that, read my warning on trickle loading.  The major querying benefits you get from clustered columnstore indexes is great, but it does come at a cost when you aren’t simply inserting new rows.

Related Posts

Cloning And Columnstore Statistics

Niko Neugebauer points out a fix in SQL Server 2019: I have a huge love for the DBCC CLONEDATABASE command – it has been made available (backported) to every SQL Server version starting with SQL Server 2012, since the original release in SQL Server 2014, while being constantly improved in the Service Packs and Cumulative Updates. This […]

Read More

Hybrid Columnstore And B+ Tree Designs

Adrian Colyer reviews a Microsoft paper on the combination of columnstore and B+ tree indexes on a single table: The authors conducted a series of microbenchmarks as follows: scans with single predicates with varying selectivity to study the trade-off between the range scan of a B+ tree vs a columnstore scan sort and group-by queries […]

Read More

Categories

December 2016
MTWTFSS
« Nov Jan »
 1234
567891011
12131415161718
19202122232425
262728293031