Press "Enter" to skip to content

Category: Columnstore

Linked Servers And Columnstore

Niko Neugebauer continues his columnstore series by looking at how columnstore indexes interact with linked servers:

Let us make sure everything is fine for the data transfer. As we are using our source server (SQL Server 2014) with a Linked Server to SQL Server 2016, let us insert a couple of ObjectIds into the T1 table that we have created in the [Test] database:

This statement will result in the error message that you can find below, telling us something about Cursors (????):

WHAT? SORRY? THERE ARE NO CURSORS HERE!
We have no cursors, we just have a Clustered Columnstore Index on our table!
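
For context, the failing pattern is a four-part-name INSERT over the linked server. Here is a minimal sketch of that shape (the server, database, and table names are hypothetical stand-ins for Niko's setup):

-- The remote INSERT is processed through a cursor behind the scenes,
-- which is why a statement with no explicit cursors produces a
-- cursor error against a clustered columnstore table.
INSERT INTO [SQL2016].[Test].[dbo].[T1] (ObjectId)
SELECT object_id
FROM sys.objects;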

Read on to see how to get around this error, to the extent that you can.


Columnstore Segment Elimination Pains

Niko Neugebauer shows us two pain points with row group (segment) elimination on clustered columnstore indexes:

Still, this gives us processing of just 1 Segment with 253 lob logical reads vs 6 Segments and 2669 lob logical reads in the original query.

But wait, that’s not all! It can get significantly worse!
Let’s use the inequality search with the OR condition to see what happens – should be safe, right?

Spoilers: it’s not safe. I’ve been burned by the Min + Max issue that Niko shows, where you’d think the engine could eliminate all segments except the first and the last, but that’s not how the process works today.
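
To make that concrete, here is a minimal sketch of the two predicate shapes (the table and column names are hypothetical):

SET STATISTICS IO ON;

-- A single inequality predicate can still eliminate row groups:
SELECT COUNT(*)
FROM dbo.FactSales
WHERE SaleId <= 100000;

-- Inequalities combined with OR defeat elimination; compare the
-- lob logical reads reported for the two queries:
SELECT COUNT(*)
FROM dbo.FactSales
WHERE SaleId <= 100000
   OR SaleId >= 9900000;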


Columnstore Predicate Pushdown Tipping Point

Lonny Niederstadt is hot on the trail, looking for evidence of a tipping point for COUNT(*) aggregates performing pushdown against a clustered columnstore index:

Below is what we want post-execution plans to look like when counting rows in a range – the thin arrow coming out of the Columnstore scan is a hint that predicate pushdown was successful. I didn’t specify MAXDOP in a query hint, and Resource Governor isn’t supplying MAXDOP; MAXDOP 8 is coming from the Database Scoped Configuration. The degree of parallelism turns out to be a significant factor in determining the tipping point. The [key] column is a BigInt. Maybe it’s surprising that I’m using 27213.9 as the upper bound. But… check out the estimated number of rows 🙂 Again – this estimate is coming from the Legacy CE, specified in the database scoped configuration for my database.

Interesting findings, although it looks like the specific values will be more settings-dependent than is usual for findings of this nature.
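
For reference, the query under test is a simple range count; a sketch of its shape (the table name is hypothetical, while [key] and the 27213.9 upper bound come from Lonny's post):

-- When predicate pushdown succeeds, the filter is applied inside the
-- Columnstore Index Scan and only matching rows flow out of it.
SELECT COUNT(*)
FROM dbo.fact_table
WHERE [key] <= 27213.9;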


Columnstore Segment Alignment

Niko Neugebauer walks us through a new segment alignment detection function he has written:

There are 3 important factors that I use for determining the column that is really good for Segment Elimination and hence Segment Alignment:
– The support for Segment Elimination (and then for Predicate Pushdown). If the data type does not support Segment Elimination, then why would someone optimise for it?
– The frequency with which the column is used in predicates (not in joins, because that is generally where Segment Elimination/Predicate Pushdown does not function);
– The number of distinct values within a table/partition (if we have more Segments than distinct values, it is generally not a very good sign: for example, 10 million rows with 5 distinct values).
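
As a rough illustration of that third factor, you can compare the number of row groups to the number of distinct values yourself; a sketch with hypothetical names (Niko's function does considerably more than this):

-- More row groups (segments) than distinct values means many row groups
-- must share the same min/max range, so elimination cannot skip much.
SELECT COUNT(*) AS row_groups
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('dbo.FactSales');

SELECT COUNT(DISTINCT Quantity) AS distinct_values
FROM dbo.FactSales;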

Read on for more details.


Rowgroup Elimination In Stored Procedures

Erik Darling notes a parameter sniffing problem when trying to use rowgroup elimination in a stored procedure:

So where are we? Well, we found that Rowgroup Elimination is possible in stored procedures with ColumnStore indexes, but that the cached plan doesn’t change based on feedback from that elimination.

  • Good news: elimination can occur with variables passed in.
  • Bad news: that cached plan sticks with you like belly fat at a desk job

Remember our plan? It used a Stream Aggregate to process the MAX. Stream Aggregates are preferred for small and/or ordered sets.
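
To picture the pattern, here is a hypothetical sketch of such a procedure (all names invented for illustration):

-- Rowgroup elimination can happen for the runtime value of @Id, but the
-- plan cached on the first execution (including the Stream Aggregate
-- choice) is reused as-is for every subsequent parameter value.
CREATE OR ALTER PROCEDURE dbo.GetMaxAmount
    @Id BIGINT
AS
BEGIN
    SELECT MAX(t.Amount) AS MaxAmount
    FROM dbo.FactTable AS t
    WHERE t.Id >= @Id;
END;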

Great post, Erik.


The Trivial Plan Problem With Columnstore

Niko Neugebauer shows that trivial columnstore plans can lead to poor performance, but SQL Server 2017 has a fix:

Do you remember one of the major problems in SQL Server 2014 with Columnstore Indexes? It was the lack of support for Batch Execution Mode with just a single core. We would get wonderful, fast execution plans with MAXDOP >= 2, which would go terribly slow if there was not enough memory to run the query with 2 or more cores, if the internal query cost was below the parallel execution threshold (cost threshold for parallelism), or if the execution plan was deemed TRIVIAL by the Query Optimiser, thus getting a single-core execution and running really slow.
Once we upgraded to SQL Server 2016, the problem of being unable to run Batch Mode on a single core faded away, but still, sometimes some queries would run terribly slow for some reason…
One of the reasons behind this is trivial execution plans, which run the Columnstore Index Scan in Row Execution Mode – also known as VerySlowExecutionMode for big amounts of data.

Read on to see the change in 2017, as well as a workaround for 2016.
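
As a rough sketch of the problem and one commonly cited way around it (this is illustrative, not necessarily Niko's exact workaround; the table name is hypothetical):

-- A statement this simple can get a TRIVIAL plan on SQL Server 2016,
-- and with it a row-mode Columnstore Index Scan:
SELECT COUNT(*)
FROM dbo.FactTable;

-- Making the statement ineligible for trivial optimization (here via a
-- redundant scalar subquery) forces full optimization, which can
-- restore Batch Mode:
SELECT COUNT(*)
FROM dbo.FactTable
WHERE 1 = (SELECT 1);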


Columnstore Dictionaries

Niko Neugebauer explains some interesting facts about columnstore index dictionaries:

From a recent experience at a customer, I had an opportunity to dive into the details of Columnstore Index Dictionaries. I have to admit that my understanding of them was pretty limited, judging from what I have learned in recent days, and I would like to share what I have learned with everyone.

These are some of the key findings that I have discovered:
– The local dictionaries are not exclusively connected with just 1 Row Group, but with multiple ones;
– The dictionaries within Columnstore Indexes are compressed in a different way, depending on the type of the compression applied (Columnstore vs Columnstore Archival);

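If you want to poke at the dictionaries yourself, there is a catalog view for them; a sketch (the table name is hypothetical):

-- dictionary_id = 0 is a column's global dictionary; higher ids are
-- local dictionaries, which, as Niko notes, can span multiple row groups.
SELECT p.object_id, d.column_id, d.dictionary_id,
       d.type, d.entry_count, d.on_disk_size
FROM sys.column_store_dictionaries AS d
JOIN sys.partitions AS p
    ON d.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID('dbo.FactSales');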

Read the whole thing.


Trivial Plans And Columnstore Indexes

Erik Darling warns us that trivial plans against clustered columnstore indexes could lead to row execution rather than batch execution:

Let’s look at one query with a few variations.

SELECT COUNT(*) AS [Records], SUM(CONVERT(BIGINT, t.Amount)) AS [Total]
FROM   dbo.t1 AS t
WHERE  t.Id > 0
       AND t.Id < 3;

The plan for it is alright. It’s fairly straightforward and the query finishes in about 170ms.

We can see from the graphical execution plan that it’s been Simple Parameterized. SQL Server does this to make plan caching more efficient.
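
If you want to verify this against your own plan cache, the plan XML carries both pieces of information; a sketch (assumes the query is still cached):

-- StatementOptmLevel="TRIVIAL" in the plan XML marks a trivial plan, and
-- parameterized text (@1, @2, ...) marks Simple Parameterization.
SELECT st.text, qp.query_plan
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE st.text LIKE N'%dbo.t1%';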

Check out the entire post.


Finding Clustered Columnstore Index Candidates

Sunil Agarwal has a script that helps you find potential clustered columnstore index candidates:

Most of us understand that a clustered columnstore index can typically provide 10x data compression and can speed up query performance up to 100x. While this all sounds great, the question is: how do I know which tables in my database could potentially benefit from CCI? For a traditional DW scenario with a star schema, the FACT table is an obvious choice to consider. However, many workloads, including DW, have grown organically, and it is not trivial to identify tables that could benefit from CCI. So the question is: how can I quickly identify a subset of tables suitable for CCI in my workload?

Interestingly, the answer lies in leveraging the DMVs that collect data access patterns for each of the tables. The following DMV query provides a first-order approximation for identifying the list of tables suitable for CCI. It queries the heap or the rowstore clustered index using the DMV sys.dm_db_index_operational_stats to examine the access pattern on the base rowstore table and find tables that meet the criteria listed in the comments below:
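
As a much-simplified sketch of that idea (Sunil's actual script applies more criteria than this):

-- Large rowstore tables with lots of range scans and relatively few
-- writes are the classic CCI candidates.
SELECT OBJECT_NAME(os.object_id) AS table_name,
       SUM(os.range_scan_count) AS range_scans,
       SUM(os.leaf_insert_count + os.leaf_update_count
           + os.leaf_delete_count) AS writes
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS os
WHERE os.index_id IN (0, 1)   -- heap or rowstore clustered index
GROUP BY os.object_id
ORDER BY range_scans DESC;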

Read on for the script, which has a sensible set of criteria.


Memory Grants Affecting Columnstore Load

Denzil Ribeiro explains how memory grant pressure can determine whether a columnstore bulk insert skips the deltastore or not:

We found that only at the beginning of the run, there was contention on memory grants (RESOURCE_SEMAPHORE waits), for a short period of time. After that and later into the process, we could see some latch contention on regular data pages, which we didn’t expect as each thread was supposed to insert into its own row group. You would also see this same data by querying sys.dm_exec_requests live, if you caught it within the first minute of execution, as displayed below.
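
If you want to catch this live yourself, the DMV Denzil mentions makes it a one-liner; a minimal sketch:

-- Sessions stuck waiting on a memory grant show up with the
-- RESOURCE_SEMAPHORE wait type:
SELECT r.session_id, r.wait_type, r.wait_time, r.granted_query_memory
FROM sys.dm_exec_requests AS r
WHERE r.wait_type = N'RESOURCE_SEMAPHORE';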

This is useful in case you run into the issue, but also useful as a case study on effective troubleshooting.
