I call this the “setup, dump, & swap”. Essentially we need to create an empty copy of the table, with the desired index(es), dump all the data into it, and then swap it in. There are couple ways you can do this, but it boils down to the same basic premise: It’s “better” (probably not in terms of speed, but definitely in terms of efficiency and overhead) to fill this new copy of the table & its indexes, than it is to build the desired index on the existing table.
As Nate notes, “very large” here will depend on your environment, but this is a useful technique because the old table can be live until the moment of the swap. As Nate writes this, I’m actually in the middle of one of these sorts of swaps—one that will take a week or two to finish due to pacing.
Remember that myITU was implemented on an Oracle database. Unlike on SQL Server, all tables in an Oracle database are physically represented as heaps, unless explicitly specified otherwise. That means no indexes. And I didn’t know anything about database performance back then, so I didn’t add any. Effectively any query against the course table would give a full table scans in the EnumerationType and EnumerationValue tables as a side effect. Fast forward to course enrolments…
At ITU, we had implemented a selection algorithm, which considered both the study program you were enrolled into as well as how early you registered for a course. Early birds got the course, so students would be ready when course enrolment was opened at 12pm. And at 12.05 myITU would start failing with 500 Internal Server Errors.
This is a pretty common occurrence, followed up by the “let’s add all the indexes” phase.
100k runs of the query in a ten minute interval? yeowch. Yeah, this should be optimised if possible. The primary wait type was CPU – indicating that the data was all in RAM, but the CPU was having to schlep through the entire table to find what it needed. Or to find that it didn’t need anything. Or something.
The interesting story buried in here is that none of the tooling Thomas used indicated that things could be improved with a better index, and yet there was a tremendous opportunity here based on the graphics at the end.
Most of us understand that clustered columnstore index can typically provide 10x data compression and can speed up query performance up to 100x. While this sounds all so good, the question is how do I know which tables in my database could potentially benefit from CCI? For a traditional DW scenario with star schema, the FACT table is an obvious choice to consider. However, many workloads including DW have grown organically and it is not trivial to identify tables that could benefit from CCI. So the question is how can I quickly identify a subset of tables suitable for CCI in my workload?
Interestingly, the answer lies in leveraging the DMVs that collect data access patterns in each of the tables. The following DMV query provides a first order approximation to identify list of tables suitable for CCI. It queries the HEAP or the rowstore Clustered index using DMV sys.dm_db_index_operational_stats to identify the access pattern on the base rowstore table to identify tables that meet the criteria listed in the comments below:
Read on for the script, which has a sensible set of criteria.
To spare all the Wows & how can’s, Microsoft was well aware of this problem and has delivered a solution with Cumulative Update 3 for SQL Server 2016 with Service Pack 1:
FIX: SQL Server 2016 consumes more memory when you reorganize a columnstore index, and here it is – a new trace flag 6404 (documented in the link and thus should be supported), that will allow you to lower the memory requirements for the ALTER INDEX … REORGANIZE command.
Let’s take it for the test, by once again running the setup workload for the FactOnlineSales_Reindex table and then executing the following command, enabling the Trace Flag 6404 and then reorganising our Clustered Columnstore Index:
This is a rather interesting post and once again makes me wish that clustered columnstore indexes could be rebuilt online.
Pretty cool, huh? Full information on this can be found here: –https://docs.microsoft.com/en-us/sql/t-sql/statements/alter-index-transact-sql
I think this is very useful but we do need to be careful. The documentation says that pausing an online index rebuild for a long time may affect query performance and disk utilisation. This is due to the newly rebuild index being created side-by-side to the original one so we’ll need to watch out for that.
It’s a good first look at what might be a very interesting solution for companies tight on maintenance window time.
After multiple executions, the first process (Resumable Online Index Rebuild) on the average took 65.8 seconds, while the second one (a simple online) took only 60.8 seconds, representing 8% of the improvement of the overall performance. I can’t say if it looks acceptable to you or not, but for me this is something I will be definitely considering to be as an advantage for the cases where the resumable process is needed.
I decided to run a test on much bigger table, the lineitem which for 10GB TPCH database contains 60 Million Rows. My expectation here was to see if the percentage would stay the same or will jump to a whole new level (please make sure that you do execute the following script at least a couple of times, to get the real results and not the results of your disk-drive prefetching :)):
The big table example result was somewhat surprising. Niko is his normal, informative self, so definitely read the whole thing.
Here are my general rules of thumb, although of course, you should always use your best judgment. Also, this is for OLTP systems (ex: data entry systems) not OLAP systems (ex: data warehouses).
No Clustered Index: You really should add a clustered index. Clustered indexes are important for a number of reasons, so if you don’t have one spend some time, figure one out and create it. When possible I like dates as a clustered index but of course, it’s highly specific to the table you are working on. There are a few types of tables, some load tables, for example, were not having a clustered index is appropriate but they are few and far between.
Indexes are like cookies: you can’t have too many. No, wait, that doesn’t sound right…
Implicit conversions often happen when a query is comparing two or more columns with different data types. In the below example, the system is having to perform extra I/O in order to compare a varchar(max) column to an nvarchar(4000) column, which leads to an implicit conversion, and ultimately a scan instead of a seek. By fixing the tables to have matching data types, or simply converting this value before evaluation, you can greatly reduce I/O and improve cardinality (the estimated rows the optimizer should expect).
There’s some good advice here if your main hardware constraint is being I/O bound.
My first thought was that perhaps there is some process that runs against the production system and the test system that goes to sleep with an open transaction, holding an X or an IX lock against this table. If the index create can’t get its shared lock, then it could be part of a blocking chain.
So I asked first if the index create was the head of the blocking chain, or if it was perhaps blocked by something else. The answer came back that no, the index create was NOT blocked. It was holding the shared lock for a long time.
My new friend even sent a screenshot of the index create running against the test instance in sp_WhoIsActive with blocking_session_id null.
Read on for the full story and keep those systems patched.