Identity Integers And Columnstore

Niko Neugebauer looks at how clustered columnstore indexes handle identity integers with respect to data load times:

This blog post will try to respond this question from the perspective of the data loading performance.

For this research I decided to pick 3 distinct scenarios to investigate, which refer to different ways to approach the solution:
– a CCI table with an Identity column
– a CCI table with a Sequence as a default value
– a CCI table without Identity

There’s a pretty substantial performance difference, so this is well worth the read for large columnstore data loads.

Related Posts

Trickle Insertion With Clustered Columnstore Indexes

Sunil Agarwal provides a pattern for trickle loading clustered columnstore indexes: A traditional scenario of loading data into CCI is a nightly load from one or more data files containing millions of rows. Recommended technique is to load the data with batchsize >= 102400 as explained https://blogs.msdn.microsoft.com/sqlserverstorageengine/2014/07/27/clustered-column-store-index-bulk-loading-the-data/. However, we are seeing many scenarios where data source is parallel […]

Read More

Wait Stats Related To Columnstore Indexes

Niko Neugebauer has some documentation on important wait stats around columnstore: I split the known wait types into the following distinct groups: – Batch Execution Mode (HTBUILD, HTDELETE, HTMEMO, HTREINIT, HTREPARTITION, PWAIT_QRY_BPMEMORY, BPSORT) – Columnstore Indexes (ROWGROUP_VERSION, ROWGROUP_OP_STATS, SHARED_DELTASTORE_CREATION, COLUMNSTORE_BUILD_THROTTLE, COLUMNSTORE_MIGRATION_BACKGROUND_TASK, COLUMNSTORE_COLUMNDATASET_SESSION_LIST) With an appearance in the next SQL Server version of the Batch Execution Mode for the RowStore […]

Read More

Categories

March 2017
MTWTFSS
« Feb Apr »
 12345
6789101112
13141516171819
20212223242526
2728293031