Columnstore Elimination

Sunil Agarwal has a two-part series on columnstore data elimination.  First up is column elimination:

Now, let us run the same query on the table with clustered columnstore index as shown in the picture below. Note, that the logical IOs for the LOB data is reduced by 3/4th for the second query as only one column needs to be fetched. You may wonder why LOB? Well, the data in each column is compressed and then is stored as BLOB. Another point to note is that the query with columnstore index runs much faster, 25x for the first query and 4x for the second query.

Next up is rowgroup elimination:

In the context of rowgroup elimination, let us revisit the previous example with sales data

  • You may not even need partitioning to filter the rows for the current quarter as rows are inserted in the SalesDate order allowing SQL Server to pick the rowgroups that contain the rows for the requested date range.
  • If you need to filter the data for a specific region within a quarter, you can partition the columnstore index at quarterly boundary and then load the data into each partition after sorting on the region. If the incoming data is not sorted on region, you can follow the steps (a) switch out the partition into a staging table T1 (b) drop the clustered columnstore index (CCI) on the T1 and create clustered btree index on T1 on column ‘region’ to order the data (c) now create the CCI while dropping the existing clustered index. A general recommendation is to create CCI with DOP=1 to keep the prefect ordering.

From these two articles, queries which hit a small percentage of columns and stick to a relatively small number of rowgroups will likely perform better.  For people who understand normal B-tree indexes, the second point seems clear enough, but the first point is at least as important.

Related Posts

Reserved Memory Allocation Waits And Trace Flag 834

Joe Obbish has another post looking at sub-optimal columnstore index performance: It is possible to see a scalability bottleneck in the form of high average wait time for the RESERVED_MEMORY_ALLOCATION_EXT wait if a highly concurrent workload is run on a server that consumes memory with batch mode operators. I believe that the severity of the bottleneck depends […]

Read More

Columnstore And Merge Replication

Niko Neugebauer tests whether merge replicated tables can use columnstore indexes: Adding this table to the publication will end up with the following, self-explaining error message, being very clear that the Clustered Columnstore Indexes are not supported for the Merge Replication[.] There is no surprise here, as the same Clustered Columnstore Indexes are not supported […]

Read More


January 2017
« Dec Feb »