Press "Enter" to skip to content

Category: Partitioning

Partition Handling In Spark 2.1

Eric Liang, et al, discuss a change to Spark 2.1 which will make certain partitioned table access faster:

In Spark 2.1, we drastically improve the initial latency of queries that touch a small fraction of table partitions. In some cases, queries that took tens of minutes on a fresh Spark cluster now execute in seconds. Our improvements cut down on table memory overheads, and make the SQL experience starting cold comparable to that on a “hot” cluster with table metadata fully cached in memory.

This looks like a nice improvement in Spark.

Comments closed

Quickly Reloading Tables

Kenneth Fisher uses table partitioning to perform fast loads of data:

Now if this table is paritioned you’d use SWITCH and bring in a new partition.

For those that don’t know, when a table is partitioned, you can create a new empty partition, and a new empty table, load the table, make the table exactly match the partition (structure, check constraints, & indexes for example) and you can SWITCH it in. The SWITCH part is a metadata operation and is fast!

But what do you do if the table isn’t partitioned? Well, I was having a conversation with Andy Mallon (b/t) and he reminded me of something.

Read on for the details.  The upshot is that you can take your time loading the second table and once you’re ready to swap out, it’s a quick metadata change.  That’s really useful for ETL scenarios.

Comments closed

Table Partitioning And Performance

Kendra Little has a video on table partitioning:

Learn why SQL Server’s table partitioning feature doesn’t make your queries faster– and may even make them slower.

In this 20 minute video, I’ll show you my favorite articles, bugs, and whitepapers online to explain where table partitioning shines and why you might want to implement it, even though it won’t solve your query performance problems.

Articles discussed are by Gail Shaw, Remus Rusanu, and the SQL Customer Advisory Team (SQLCAT). Scroll down below the video for direct links to each resource.

Check out the video.

Comments closed

Partition Switching

James Anderson gives his favorite feature in SQL Server:

The good news is that the SWITCH command works on regular tables and in any edition. This means I can quickly transfer all of the data from one table to another in Standard Edition.

In reality, I have found few uses for the regular table to regular table switch; the trick to add the IDENTITY property to a column with existing data is the most recent. SWITCH is most useful when partitioned tables are involved. Sorry, Standard Edition users.

Partition switching is a fascinating solution for a difficult technical problem.

Comments closed

Writable Partition Failure

Paul White shows us a scenario in which attempts to update a writable partition could fail:

The where clause is exactly the same as before. The only difference is that we are now (deliberately) setting the partitioning column equal to itself. This will not change the value stored in that column, but it does affect the outcome. The update now succeeds (albeit with a more complex execution plan):

The optimizer has introduced new Split, Sort, and Collapse operators, and added the machinery necessary to maintain each potentially-affected nonclustered index separately (using a wide, or per-index strategy).

Read on for the reason why this happens, as well as a few solutions.

Comments closed

Remember Partition Alignment

Kendra Little shows that truncating partitions in SQL Server 2016 requires all indexes be aligned to the partition:

If you have a non-aligned index on the table, you’ll see an error like this:

Msg 3756, Level 16, State 1, Line 1

TRUNCATE TABLE statement failed. Index ‘ix_FirstNameByBirthDate_pt_BirthYear_FirstNameId_nonaligned’ is not partitioned, but table ‘FirstNameByBirthDate_pt’ uses partition function ‘pf_fnbd’. Index and table must use an equivalent partition function.

This isn’t a bug, and it makes total sense from a logical point of view. “Non-aligned” indexes are not partitioned like the base table is– by definition they are either partitioned differently, or not partitioned at all. The chunk of data that you’re trying to truncate isn’t all in an easily identifiable partition that can be quickly marked as “data non grata”. There’s just no way to do a simple truncate when the data’s scattered all around.

Ideally, all indexes on a partitioned table would be partition-aligned, as it makes maintenance a lot easier.  That’s not always possible, though, so keep this in mind.

Comments closed

DATETIME2 Partition Elimination

Kendra Little shows that DATETIME2 implicit conversion can prevent partition elimination:

SQL Server is implicitly converting my date value to DATETIME2(7). That is a larger, more precise value than the data type I have in the table–FakeBirthDateStamp is DATETIME2(0).

That data type mismatch is preventing partition elimination!

This is a nasty issue to catch in production, especially after you spend a bunch of time arguing with devs that DATETIME2 is the way of the future, that it’s better because of the variable precision, etc.

Comments closed

Table Partitioning

Slava Murygin walks through partitioning:

Usually new Partition Ranges are added to the end of a Function/Schema, but in my example I’ll demonstrate how to split middle Partition.
At first we create new File Group. Then make it next within a schema and then split a Function:

All records, which are Less than 1/1/2015 and more than 1/1/2014, were moved to the new partition.

Partitioning is one of those topics I run into frequently enough to need to know it, but not frequently enough to memorize it; every time it feels like I’m starting from scratch.

Comments closed

Partitioning Thoughts

Kendra Little has a few questions to ask before you set up sliding-window partitioning:

Map this out before you write the code. When will the jobs run, and what should happen if they fail? Should someone be engaged? What tools will they need, and when is the Service Level Agreement for when the process has to be complete? You’ll need lots of details on this to make sure your automation and documentation meet the bar.

Partitioning is an extra layer of complexity.  It can be a very useful extra layer of complexity, but this is a case where it’s best to spend an hour before you begin and walk through potential issues.  Those potential issues will come—automation jobs will fail, external configuration changes will affect your partition strategy, bad data will sneak in and fill up your supposedly-empty edge partitions.

Comments closed

Rebuilding Indexes By Partition

Andrea Allred shows how to rebuild indexes one partition at a time:

We have had an index job that has been failing for a while.  This is one of those things you really don’t want to clean up because no one is complaining, but you know you should.  I had heard that I could rebuild one partition at a time, but where to start?  Today, I worked my way through it, so here it is so that you can do it too.

First you need to find the biggest indexes, there is a good chance those are the ones that live on partitions. I am removing Primary Keys.

This can be a real time-saver if a majority of your partitions either are read-only or at least rarely update.

Comments closed