Press "Enter" to skip to content

Category: Indexing

How DynamoDB Indexing Works

Shubham Agarwal explains how indexing works within DynamoDB:

Global secondary index in DynamoDb – An index with a partition key and a sort key that can be different from the base table. A global secondary index is very helpful when you need to query your data without primary key.

  •  The primary key of a global secondary index can be partition key or composite  (partition key and sort key).

  • Global secondary indexes can be created at the same time that you create a table. You can also add a new global secondary index to an existing table, or delete an existing global secondary index

  • A global secondary index lets you query over the entire table, across all partitions.

  • The index partition key and sort key (if present) can be any base table attributes of type string, number, or binary.

  • With global secondary index queries or scans, you can only request the attributes that are projected into the index. DynamoDB will not fetch any attributes from the table.

  • There are no size restrictions for global secondary indexes.

Click through to learn more about these as well as local secondary indexes.

Comments closed

Unique Indexes Versus Unique Constraints

Greg Low argues that you should create unique constraints instead of unique indexes whenever possible:

The CREATE INDEX statement is used to do exactly what its name says, it creates an index. But when you say CREATE UNIQUE INDEX, you are doing more than that; you are enforcing a business rule that involves uniqueness.

I have a simple rule on this. Wherever possible business rules like uniqueness, check values, etc. should be part of the design of the table, and not enforced in an external object like an index.

So, rather than a unique index, I’d rather see a unique constraint on the underlying table.

But that’s where real life steps in. I see two scenarios that lead me to occasionally use CREATE UNIQUE INDEX.

Here’s a third:  creating constraints can cause blocking issues.  If you already have a large table and Enterprise Edition, creating a unique index can be an online operation (unless you have a clustered columnstore index on the table), but a unique constraint is always a blocking activity.

Comments closed

Query Store Indexes

Arthur Daniels shows what you can learn from the indexes on Query Store tables:

It looks like internally Query Store is referred to as plan_persist. That makes sense, thinking about how the Query Store persists query plans to your database’s storage. Let’s take a look at those catalog views vs their clustered and nonclustered indexes. I’ve modified the query a bit at this point, to group together the key columns.

This lets you see how the Query Store authors expected us to use these tables.  Which isn’t always how people use them…

Comments closed

Gotchas When Indexing Partitioned Tables

Andrew Pruski gives us a couple of considerations when creating indexes on partitioned tables in SQL Server:

Looking at that data page, we can see that SQL has added a UNIQUIFIER column. Now this is standard SQL behaviour, SQL does this to all non-unique clustered indexes whether they are on a partitioned table or not.

But also look at the CreatedDate column. It’s after the ID column on the page. If this was a non-partitioned table, we would see that after ColA & ColB (the order the columns are on the table). This has happened because SQL has implicitly added the partitioning key into the index definition, which has changed the physical order of the data on the page.

Read the whole thing.

Comments closed

Ways To Hinder Indexes

Raul Gonzalez shows that even when you have a good index, “clever” developers and fate can find ways to conspire against it:

he benefits of having an index are well known, you can get the same results by reading a smaller amount of data so the improvement in performance can be from several minutes to seconds or even less.

That sounds awesome and it certainly is and there are people out there making a living of it, so it’s a huge deal for sure.

But it’s not always like that, and things can go wrong very easily and make all these shiny indexes just a pile of useless burden.

Let me show you some examples, where we can see our indexes in use, but also how they can be ignored by the query processor and become totally useless. I’m going to use the Microsoft sample database [WideWorldImporters] so you can follow along if you want.

Read on to learn more.

Comments closed

Clustered Indexes And Automatic Sorting

Kendra Little demonstrates that clustered indexes do not give us an automatic sorting of our data:

There is no “default” ordering that a query will fall back on outside of an ORDER BY clause.

  • Results may come back in the order of the clustered index, or they may not
  • Even if results come back in the order of the clustered index on one run of the query, they may not come back in the same order if you run it again

If you need results to come back in a specific order, you must be explicit about it in the ORDER BY clause of the query.

Click through for a demo proving these two points.

Comments closed

Naming Indexes

Monica Rathbun hits one of my hobby horses:

As you can see from above, none of the names gave a complete indication of what the index encompassed. The first one did indicate it was a Non Clustered Index so that was good, but includes the date which to me is not needed. At least I knew it was not a Clustered Index. The second index did note it is a “Covering Index” which gave some indication that many columns could be included I also know it was created with the Data Tuning Advisor due to the dta prefix. The third index was also created with dta but it was left with the default dta naming convention. Like the first one I know the date it was created but instead of the word Cover2, I know there are 16 key columns are noted by the “K#” and regular numbers tell me those are included columns. However, I still have no idea exactly what these numbers denote without looking deeper into the index. The last index is closer to what I like however the name only tells me one column name when in fact it encompasses five key columns and two included columns.

I absolutely love seeing lots and lots of “_dta_” indexes; it’s a sign that I have a long day ahead of me.

Comments closed

Fragmentation Can Affect Execution Plans

Jonathan Kehayias explains how index fragmentation can potentially affect execution plans:

Index fragmentation removal and prevention has long been a part of normal database maintenance operations, not only in SQL Server, but across many platforms. Index fragmentation affects performance for a lot of reasons, and most people talk about the effects of random small blocks of I/O that can happen physically to disk based storage as something to be avoided. The general concern around index fragmentation is that it affects the performance of scans through limiting the size of read-ahead I/Os. It’s based on this limited understanding of the problems that index fragmentation cause that some people have begun circulating the idea that index fragmentation doesn’t matter with Solid State Storage devices (SSDs) and that you can just ignore index fragmentation going forward.

However, that is not the case for a number of reasons. This article will explain and demonstrate one of those reasons: that index fragmentation can adversely impact execution plan choice for queries. This occurs because index fragmentation generally leads to an index having more pages (these extra pages come from page split operations, as described in this post on this site), and so the use of that index is deemed to have a higher cost by SQL Server’s query optimizer.

Let’s look at an example.

Check out the example, but definitely read the comments as there are some good conversations in there.

Comments closed