In several of my last few blog posts, I’ve shared several methods of getting internal information from a database by using the DBCC PAGE command and utilizing the “WITH TABLERESULTS” option to be allowed to automate this process for further processing. This post will also do this, but in this case, we’ll be using it to bust a common myth—data in a clustered index is physically stored on disk in the order of the clustered index.
Busting this myth
To bust this myth, we’ll create a database, put a table with a clustered index into this database, and then we’ll add some rows in random order. Next, we will show that the rows are stored on the pages in logical order, and then we’ll take a deeper look at the page internals to see that the rows are not stored in physical order.
Read on for the proof.
True to my typical post style which focuses on small shops, accidental DBAs, and junior DBAs I went looking for something that could very easily benefit people that are using the basic SQL Server features. In this case they may not even realize how limited they were.
I chose to write about Index Maintenance in the built-in Maintenance Plan portion of SQL Server.
A brief summary of the built-in Maintenance Plans is that they allow you to drag-and-drop your way to basic SQL Server maintenance items such as Backups, Index maintenance, CheckDB, Statistics updating, etc. This a tool that has been around since at least version 7 that I know of. It wasn’t always very good, and it gets a bad rap from a lot of DBAs. It has been dramatically improved over the years in flexibility and reliability.
Read on for the changes. I’m really not a fan of maintenance plans, but if they’re going to exist, they should at least be as good as possible.
Ultimately I think any thought of the readable secondary having a vastly different plan was a red herrings. Statistics are going to be the same on both instances, and if there were a missing statistic on the secondary, SQL Server would create it in TempDB. Anyway, columnstore indexes don’t use statistics in the traditional sense.
Fortunately I was able to catch a query in the process of waiting on HTDELETE, so I no longer had to look for the needle in the haystack, and I could get to tuning the plans. I was able to grab the SELECT part of the query and generate an estimated plan on both the primary and secondary nodes. The plans were virtually the same on both nodes, with just a minor difference in memory grant between them.
Click through for the solution.
The “dirty little secret” about full-text search indexes is that they don’t help with ‘%blabla%’ predicates.
Well, it’s not a secret, it’s right there in the documentation.
A lot of us get the impression that full-text search is designed to handle “full wildcard” searches, probably just because of the name. “Full-Text Searches” sounds like it means “All The Searches”. But that’s not actually what it means.
Kendra’s take is a bit more optimistic than mine; I’m definitely more inclined to dump text out to a Lucene-based indexing system (like Solr or ElasticSearch), as they’ll typically perform faster and solve problems that full-text cannot. Some of that may just be that I was never very good at full-text indexing, though.
So, when SHOULD you explictly define the clustering key columns in a nonclustered index? When they ARE needed by the query.
This sounds rather simple but if the column is USED by the query then the index MUST have the column explicitly defined. Yes, I realize that SQL Server will add it… so it’s not necessary NOW but what if things change? (this is the main point!)
One of the more common cases I could think of is multi-part clustered indexes, like on a junction table.
For the Columnstore Indexes, the only online operation for the indexes that was available so far, was the Row Group Merging and Removal with ALTER INDEX REORGANIZE (as well as the Tuple Mover operations). With appearance of HTAP scenarios (Hybrid Transactional Analytical Processing aka Operational Analytics) in SQL Server 2016, there was a huge need for the online index maintenance, making sure that the operational part of the HTAP runs smoothly. For any online business, taking their application down for an hour means loosing real money and even worse – loosing credibility from their customers. To my knowledge, Microsoft was very much aware and was working on improving this missing part.
For the SQL Server vNext version (after SQL Server 2016) in CTP 1.2, yesterday, we have finally received the first Online Rebuild operation for the Columnstore Indexes – in this case for the Nonclustered Columnstore Indexes, and this is a huge news for anyone using the HTAP scenarios.
Naturally this feature is Enterprise Edition Only, and like ever before – if you are running a critical workload, you need to step up and use the Enterprise Edition.
Online clustered columnstore reorganization in 2016 was a life-saver, and I’m looking forward to online clustered columnstore rebuilding at some point in the future.
It Is Known
That when you use filtered indexes, they get ignored when your queries are parameterized. This is a Plan Caching Thing©, of course. The simplest example is a bit column with a filtered index. If your index is on WHERE Bit = 1, it doesn’t have data for WHERE Bit = 0. That index would only be suitable for one variation of the query, so caching a plan that uses an index which can’t be reused for every variation isn’t feasible.
Read on for a couple examples, and check the comments on this as well.
Regardless of using unique constraint or unique index, the field can accept null values, however the uniqueness will result in only accepting a single row with null value.
The solution to allow nulls in unique fields is create a unique filtered index excluding the nulls of the index, due to that the uniqueness of the nulls will not be validated and multiple rows with nulls will be accepted.
Click through for the code. I enjoy asking this as an interview question. It’s a non-trivial problem with a non-trivial solution and isn’t a trick question.
SQL Server doesn’t really track index create or modification date by default
I say “really”, because SQL Server’s default trace captures things like index create and alter commands. However, the default trace rolls over pretty quickly on most active servers, and it’s rare that you’re looking up the creation date for an index you created five minutes ago.
I think it’s fine that SQL Server doesn’t permanently store the creation date and modification date for most indexes, because not everyone wants this information — so why not make the default as lightweight as possible?
That said, Kendra has several methods for answering the question of when a particular index was created.
The best way is to run the stored procedure yourself to generate and save an “actual” execution plan, which contains the estimates SQL Server used when it generated the plan as well as actual rowcounts, actual memory granted, etc. It will also contain a green tooltip with a “missing index request” if SQL Server thinks an index would help.
This is an introductory-level post which contains good advice.