I have a very simple query. It’s running against a table with a nonclustered columnstore index.
SELECT COUNT(*) FROM pt.FirstNameByBirthDate_1966_2015;
The query returns one row, as expected. Here’s my count:
For the record, that is the correct number of rows in the table. Here’s where things get weird. In the actual execution plan, the columnstore index returns zero rows.
Yes, this is really the actual execution plan. I’m not tricking you, I promise.
Click through for the answer.
This blog post is focused on the MERGE statement for the Columnstore Indexes, or as I call it – the worst enemy of the Columnstore Indexes. It is extremely difficult to imagine some statement or way of making the worst out of the Columnstore Indexes, if not the infamous MERGE statement. Why ? Because it is not only making Columnstore Indexes perform slow, it will make them perform MUCH SLOWER then any Rowstore Indexes. Yes, you have read right – slower then ANY_ROWSTORE_INDEXES. In fact, this should be a hint that one should apply to the Merge statement, when it is executed against Columnstore Indexes! 🙂
I decide to dedicate a whole blog post on this matter, mainly to warn people of this pretty problematic statement – I hope not to see it being used for Columnstore Indexes in the future!
There is very little room for misunderstanding in Niko’s post.
Note: Notice, that I created a varchar(max) column in a table, that was done deliberately, because I wanted to test if LOB columns are supported in vNext. Unfortunatelly, when I tried to create a nonclustered Columnstore index, I got an error:
The statement failed. Column ‘Note’ has a data type that cannot participate in a columnstore index. Omit column ‘Note’.
However, if you create a clustered Columnstore, you will succeed! Though, you will get the following warning in CTP 1.2:
Warning: Using Lob types (NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX)) with Clustered Columnstore Index is in public preview. Do not use it on production data without backup during public preview period.
I think it is great that we can now have a LOB varchar columns in Columnstore, because it was a blocker for some scenarios, at least I have seen some of those.
There are a few interesting findings here, so check it out.
Both queries are now fully optimized and that lead to different plans. First of all, both queries run in a Batch Mode, which is much faster than a Row Mode.
In the first query, we see Hash Match Aggregate instead of Stream Aggregate, more to the point you may see that Actual Number of Rows is 0, because all the rows were aggregated locally at the Storage Engine level, you may see property Actual Number of Locally Aggregated Rows = 60855. This is faster than a regular aggregation and is known as Aggregate Pushdown.
In the second query, you may observe a new Window Aggregate operator which is faster than a Window Spool and runs in Batch Mode also.
Read the whole thing. Dmitry also looks at SQL Server vNext and how it handles the same trivial-plan-generating scenario.
You can see a filter iterator taking estimated 9% of the resources and filtering over 5,1 million rows that were taken out of the lineitem_cci table (Columnstore Index Scan operation). Before it took place all of the table data was read and the values for the aggregate values were calculated, meaning a significant resource waste (when the data was filtered and while it was occupying memory).
The predicate properties of the Columnstore Index Scan are shown on the left picture and you can see that nothing has been pushed into the Storage Engine, making this query perform quite slow. On my local test VM it takes on average 0.7 seconds to execute while spending over 1.9 seconds of the CPU time. This happens because of the inability of the Columnstore Indexes to push NULL expression predicate into the storage engine.
To solve this problem in this test case I will need to rewrite the query and remove the NULL expression by substituting it by = ‘RAIL’ OR IS NULL logic:
Definitely worth a read.
Many of the in-memory features in the Microsoft platform rely on the xVelocity (formerly known as VertiPaq) engine. The implementations do differ somewhat between products, such as the requirements for data to be truly memory-resident.
The remainder of this post will focus on the two columnstore technologies in SQL Server: clustered and nonclustered.
If you’re not very familiar with columnstore and aren’t quite ready to tackle Niko Neugebauer’s columnstore series, this is a good way to get started.
Ultimately I think any thought of the readable secondary having a vastly different plan was a red herrings. Statistics are going to be the same on both instances, and if there were a missing statistic on the secondary, SQL Server would create it in TempDB. Anyway, columnstore indexes don’t use statistics in the traditional sense.
Fortunately I was able to catch a query in the process of waiting on HTDELETE, so I no longer had to look for the needle in the haystack, and I could get to tuning the plans. I was able to grab the SELECT part of the query and generate an estimated plan on both the primary and secondary nodes. The plans were virtually the same on both nodes, with just a minor difference in memory grant between them.
Click through for the solution.
CLUSTERED COLUMNSTORE INDEXES provide extreme data compression in SQL Server and Azure SQL Database. With NVARCHAR(MAX) support in CCI indexes you can use them on your JSON data stored is database and get high 25x compression. Therefore, CCI is a perfect solution if you need to store a large volume of JSON data in your SQL Database.
ContosoDW database is publicly available for download, so you can use this database and the script below to re-create this table and try this in your environment.
I’m curious whether this will also apply to non-JSON data.
This is equivalent to collections that you might find in classic NoSQL database because they store each JSON document as a single entity and optionally create indexes on these documents. The only difference is CLUSTERED COLUMNSTORE index on this table that provides the following benefits:
Data compression – CCI uses various techniques to analyze your data and choose optimal compression algorithms to compress data.
Batch mode analytic – queries executed on CCI process rows in the batches from 100 to 900 rows, which might be much faster than row-mode execution.
I think it’s worth reading this in conjunction with Niko Neugebauer’s comments regarding strings in columnstore.
Having Strings in Fact tables is something that is quite normal, but to be honest, in the most cases – does not make a lot of sense, since we are trying to keep there the information that can be calculated and/or aggregated. Notice that I have written in the most cases and NOT in all cases, because there are some noticeable exceptions. Additionally if you are “feeding” SSAS Tabular with your table this might be much easier to do it directly (hey, there is a solution through the views for that, I was told :)).
In this blog post, I am focusing not on the exceptions but on the typical cases where its not the best option and so here is a basic solution I just wanted to present you an optimised structure, which contains a tinyint column referring to the new table with distinct data for the ShipMode.
The string experience with columnstore can be troublesome. It’s great for numeric values, but less great for strings.