Press "Enter" to skip to content

Category: Indexing

Hybrid Columnstore And B+ Tree Designs

Adrian Colyer reviews a Microsoft paper on the combination of columnstore and B+ tree indexes on a single table:

The authors conducted a series of microbenchmarks as follows:

  • scans with single predicates with varying selectivity to study the trade-off between the range scan of a B+ tree vs a columnstore scan

  • sort and group-by queries to study the benefit of the sort order supported by B+ trees (columnstores in SQL Server are not sorted).

  • update statements with varying numbers of updated rows to analyze the cost of updating the different index types

  • mixed workloads with different combinations of reads and updates

It’s interesting to read an academic paper covering the topic, particularly when you can confirm that it works well in practice too.

Comments closed

Column Order Matters For Indexes

Bert Wagner violates Betteridge’s Law of Headlines:

When beginning to learn SQL, at some point you learn that indexes can be created to help improve the performance of queries.

Creating your first few indexes can be intimidating though, particularly when trying to understand what order to put your key columns in.

Today we’ll look at how row store indexes work to understand whether index column order matters.

Despite the flagrant violation, you should check out Bert’s post, as it’s a good one.

Comments closed

Creating Indexed Views

Eduardo Pivaral shows how to create a fairly simple indexed view:

Views help our query writing by simplifying writing the same sentences and/or aggregations over and over again, but it has a drawback, the views just store our query definition, but the performance is not improved by using them.

Since SQL Server 2008, the option to create an index over a view was introduced, of course, there are some limitations, but if your view can use them, the performance improvement could be great!

I will show you how to create a simple index over a view

I’ve used indexed views in the past, but they’re a much less common tool in the belt these days.

Comments closed

Index Maintenance With Replication

Ajay Dwivedi shares his rules of thumb for index maintenance on replicated databases:

Like any other DBA, I fell into the trap of using straight maintenance solution using Reorganize operation for Indexes with avg fragmentation with 30% or less with Index Rebuild for avg fragmentation greater than 30%.

Well above approach works fine in common scenarios, but can create problems for servers using transaction log based High Availability technologies, such as AlwaysOn Availability Groups, database mirroring, log shipping, and replication. Both index rebuild and reorganize introduce heavy transaction log activity and generate a large number of log records. This becomes an issue in case of node failover, server with limited storage, database file with restricted growth, wrong file auto growth setting, or database with high VLF counts.

The best option for servers with High Availability is to identify kind of server workload (OLTP/OLAP/mixed), fill factor (based on Page Splits/sec), fragmentation, underlying storage load (random/sequential), Index Scans vs Index Searches, job time frame (low activity outside business hours) etc. After calculating all the above factors, all we need is to have a robust Index Maintenance solution. This is where I find Ola Hallengren’s SQL Server Maintenance Solution a perfect fit.

Ajay uses Ola Hallengren’s solution and gives us the breakdown percentages he uses.

Comments closed

Configuring An Azure Runbook For Index Maintenance

Jim Donahoe explains how to perform index and statistics maintenance for Azure SQL Database, where you don’t have SQL Agent available:

I had a lot of issues when I created my first one, and after discussing with some folks, they had the same issues.  I searched for the best blog posts that I could find on the subject, and the one I LOVED the most was here: Arctic DBA.  He broke it down so simply, that I finally created my own pseudo installer and I wanted to share it with all of you.  Please, bear in mind, these code snippets may fail at anytime due to changes in Azure.

**IMPORTANT**

These next steps assume the following:

You have created/configured your Azure Automation Account and credential to use to execute this runbook.

Read on for a reasonably short Powershell script and a modified version of Ola Hallengren’s index maintenance procedures.

Comments closed

Betteridge’s Law And Index Hints

Bert Wagner asks a question in his title, Should You Use Index Hints?  Those familiar with Betteridge’s Law of Headlines know the general answer already:

One way to “fix” a poor performing plan is to use an index hint.  While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the  query optimizer to use the index specified in the hint to retrieve the data (hence, it’s really more of a “command” than a “hint”).

Sometimes when I feel like I’m losing control I like using an index hint to show SQL Server who’s boss.  I occasionally will also use index hints when debugging poor performing queries because it allows me to confirm whether using an alternate index would improve performance without having to overhaul my code or change any other settings.

About the only place I consistently use index hints is with filtered indexes, where the combination of parameter sniffing and inexactitude in filters will convince the optimizer that the filtered index isn’t helpful when it really is.

Comments closed

Generating Index Drop And Create Statements

Drew Furgiuele says “Game over, man, game over!” to indexes:

The premise is simple: it will generate a series of DROP and then CREATE INDEX commands for every index. The process is a little more complex in practice, but at a high level it:

  1. Creates a special schema to house a temporary object,
  2. Creates a special stored procedure to run the code,
  3. Calls said stored procedure,
  4. Generates a bunch of PRINT statements that serve as the output (along with new line support for readability),
  5. Cleans up the stored procedure it generated,
  6. And finally deletes the schema it created.

Nifty.

Click through for the script, as well as a bonus Powershell script.  Because hey, it’s only six lines of code.

Comments closed

Index That Column Or Include It?

Jeanne Combrinck lays out her recommendations on whether to make a particular column part of an index or have it be an included column:

The original question we wanted to answer was whether we would see a performance difference when a query used the index with all columns in the key, versus the index with most of the columns included in the leaf level. In our first set of tests there was no difference, but in our third and fourth tests there was. It ultimately depends on the query. We only looked at two variations – one had an additional predicate, the other had an ORDER BY – many more exist.

What developers and DBAs need to understand is that there are some great benefits to including columns in an index, but they will not always perform the same as indexes that have all columns in the key. It may be tempting to move columns that are not part of predicates and joins out of the key, and just include them, to reduce the overall size of the index. However, in some cases this requires more resources for query execution and may degrade performance. The degradation may be insignificant; it may not be…you will not know until you test. Therefore, when designing an index, it’s important to think about the columns after the leading one – and understand whether they need to be part of the key (e.g. because keeping the data ordered will provide benefit) or if they can serve their purpose as included columns.

Read the whole thing and be willing to test different approaches.

Comments closed

Resumable Online Index Creation In Azure SQL Database

Niko Neugebauer looks at a feature coming in SQL Server vNext:

It is about the time to create our first Clustered Online Resumable Index:

CREATE CLUSTERED INDEX CI_SampleDataTable	ON dbo.SampleDataTable (c1)	WITH ( ONLINE = ON, RESUMABLE = ON ) ;

But all we shall get is an error message:

Msg 155, Level 15, State 1, Line 25
'RESUMABLE' is not a recognized CREATE CLUSTERED INDEX option.

I was shocked and I was disappointed, but I have understood that it was my own mind’s fault. Nobody, I repeat – NOBODY has told me that it would work for the CLUSTERED Indexes, but when I see an announcement that the Indexes are supported, I was totally believing that the traditional (not XML, no CLR, no LOB’s) Rowstore Indexes would be totally supported. Oh yes, I know that it is crazy difficult. I know that this is a pretty forward-facing feature, but come on – my mind played trick on me, telling me the story that does not exist, for now, at least.

After realising my mind’s mistake I took a deeper breath and decided to try out the Resumable Nonclustered Index Creation with the following command:

CREATE NONCLUSTERED INDEX NCI_SampleDataTable	ON dbo.SampleDataTable (c1)	WITH ( ONLINE = ON, RESUMABLE = ON );

Hopefully we get a bit more support as SQL Server vNext is developed and eventually released.  In the meantime, Niko hits some limitations but his timings for the feature look good.

Comments closed

Smarter Indexes Based On Column Cardinality

Eric Blinn has a function which organizes columns in the missing index DMV by cardinality:

Bryan Rebok and Brent Ozar recently opened my eyes to something I didn’t know.  When SQL Server recommends missing indexes to you it puts the columns in order in which they are found in the table.  That’s it.  I always thought there was more logic into it.  But there isn’t.  Upon reading this I had a terrible realization that I’ve made a lot of awful indexes in my time.  I owe the world an apology.  I hope this post can serve as that apology.

I’ve written a function that accepts the equality column list from dm_db_missing_index_details as a parameter and spits those columns back out in order by their cardinality.  This won’t necessarily be the proper order for the columns in every index, but it is far more likely to be correct than the initial result from the DMV.

I’m amazed that the missing index DMV generates column names in such a simplistic manner.

Comments closed