Clustered Columnstore Index Updates

Koen Verbeeck discusses what updates do to clustered columnstore indexes:

Turns out the majority of the rows belonged to the second scenario. Whoops. The initial run took a little over 20 hours. Not exactly rocket speed. The problem was that for each period, a large number of rows in the clustered columnstore index (CCI) had to be updated, just to set the range of the interval. Updates in a CCI are expensive, as they are split into inserts and deletes. Doing so many updates resulted in a heavily fragmented CCI and with possibly too many rows in the delta storage (which is row storage).

Read the whole thing.  Koen links to a Niko Neugebauer post, which you should also read.  After that, read my warning on trickle loading.  The major querying benefits you get from clustered columnstore indexes is great, but it does come at a cost when you aren’t simply inserting new rows.

Related Posts

Preventing Issues With Columnstore Indexes

Kevin Chant has a post covering issues in his experience with the creation of columnstore indexes: Error due to online statement It’s a very common error caused usually by somebody copying a rowstore “Create Index” command. The “online=on” option you use with rowstore indexes does not work with creating columnstore indexes yet. For the record the […]

Read More

Avoid Key Lookups On Clustered Columnstore Indexes

Joey D’Antoni points out a potential big performance problem with clustered columnstore indexes: In the last year or so, with a large customer who makes fairly heavy use of this pattern, I’ve noticed another concern. Sometimes, and I can’t figure out what exactly triggers it, the execution plan generated, will do a seek against the […]

Read More

Categories

December 2016
MTWTFSS
« Nov Jan »
 1234
567891011
12131415161718
19202122232425
262728293031