Press "Enter" to skip to content

Category: Partitioning

The Cost of Empty Partitions in SQL Server

Aaron Bertrand reminds us that TANSTAAFL:

Not too long ago, I came across a table that had 15,000 partitions—all but 4 of them empty. I bet when you have implemented partitioning you, too, have wondered: “Why shouldn’t I create all future partitions now?”

The question is valid: wouldn’t maintenance be easier if you only had to phase out old partitions, without ever worrying about adding partitions to accommodate new data?

Here’s the thing. Microsoft will never tell you this, but empty partitions are not free. I don’t mean you will get an invoice for creating too many, but you will pay for them in other ways.

I think the reason people do this sort of thing is that partition management is harder than it really needs to be in SQL Server. Adopting the partitioning process for dedicated SQL pools in Azure Synapse Analytics would be a good start but consider that partitions are almost always date or numeric intervals (often pseudo-date numbers like 20220601 to represent June 1, 2022) and allow people to create partitions based on the key and have the system manage it. I’m being vague and hand-wavey and not talking about all of the edge cases (like if some dope puts in a date for 22220601 instead of 20220601) but I think some PARTITION RANGE RIGHT ON [DateWK] INTERVAL MONTHLY WITH MAX_FUTURE_PARTITIONS=2, MAX_MAINTAINED_PARTITIONS=48, PARTITION_ARCHIVAL_TABLE=History.MyTable or something like that would cut to the core of what partition management does without writing thousand-line scripts full of dynamic SQL trying to manage these things.

Comments closed

Automated Partitioned Table Management

Eitan Blumin automates creation and deletion of partitions in SQL Server:

Before we begin, there are a few “ground rules” we should understand first:

1 – Partition Functions define the partition ranges

This means that whenever we want to eliminate an old partition range or add a new partition range, the PARTITION FUNCTION is the object that we actually need to modify.

Click through for Eitan’s entire process and a couple of scripts. This is an area that SQL Server could have made a lot easier, especially for periodic processes, by including options like “Daily” or “Monthly” or “Weekly(start on Monday)” for intervals rather than making people specify every partition separately.

Comments closed

Reasons for Partitioning in SQL Server

Erik Darling has opinions:

When I work with clients, nearly every single one has this burning question about partitioning.

“We’ve got this huge table, should we partition it?”

“Do you need to insert or delete data in big chunks?”

“No, it’s all transactional.”

“Do you have last page contention problems?”

“No, but won’t it help performance?”

“No, not unless you’re using clustered column store.”

“…”

Read on to unpack Erik’s argument. I do wish that there were more good cases for partitioning in SQL Server, but they’re almost all in the analytics space—which is part of why partitioning is a lot more useful in Azure Synapse Analytics dedicated SQL pools.

Comments closed

Partition Switching of Staging Data

Aaron Bertrand shares a technique to make table refreshes easier for end users:

So, what is a staging table in SQL? A staging table can be more easily understood using a real-world example: Let’s say you have a table full of vegetables you’re selling at the local farmer’s market. As your vegetables sell and you bring in new inventory:

– When you bring a load of new vegetables, it’s going to take you 20 minutes to clear off the table and replace the remaining stock with the newer product.

– You don’t want customers to sit there and wait 20 minutes for the switch to happen, since most will get their vegetables elsewhere.

Now, what if you had a second empty table where you load the new vegetables, and while you’re doing that, customers can still buy the older vegetables from the first table? (Let’s pretend it’s not because the older vegetables went bad or are otherwise less desirable.)

Read on for some techniques Aaron used for a long time and why he switched to partition switching.

Comments closed

Bug around Parallel Eager Spools and Batch Mode

Paul White digs into a nasty bug:

more accurate description of the issue would be:

This bug can cause wrong results, incorrect error messages, and statement failure when:

– A data-modification statement requires Halloween Protection.
– That protection is provided by a Parallel Eager Spool.
– The spool is on the probe side of a Batch Mode Hash Join.

This issue affects Azure SQL Database and SQL Server 2014 to 2019 inclusive.

Read on for a repro and Paul’s thoughts. As of March 2021, this is an active problem, so it’s worth keeping an eye on in your environment. I’d wager, though, that this probably doesn’t pop up on its own very frequently.

Comments closed

The Benefits of Table Partitioning

Brenda Bentz lays out some of the benefits of table partitioning in SQL Server:

Table partitioning in MS SQL Server is an advanced option that breaks a table into logically smaller chunk (partition) and then stores each chuck to a multiple filegroups.  Each chuck can be accessed and maintained separately. Partitioning is not visible to users; it behaves like one logical table to the end user. Partitioning was introduced in SQL 2005 as an Enterprise edition feature but starting with SQL 2016 SP1 it is available on all the editions.

Table partitioning can work to improve performance in specific circumstances, though you’ll definitely want to do a lot of testing before rolling it out. H/T Amanda White.

Comments closed

Partitioning Tricks

Raul Gonzalez shows us five things you can do with partitioning in SQL Server:

Once we have rebuilt that old data to minimise its footprint and moved it to a cheaper storage tier, if we know no one will have to modify it, it’d be a good idea to make it READ_ONLY.

By making the data READ_ONLY, we can not only prevent accidental deletion or modification, but also reduce the workload required to maintain it, because as we’ve seen before, we can action index maintenance only on the READ_WRITE parts (partitions) of the data where fragmentation might still happen.

Read on for the rest of the tips and note that none of these are directly of the “Make your queries faster” variety, though a couple can have positive performance implications.

Comments closed

Partition Switching to Make Table Changes

Daniel Hutmacher shows a couple things you can change with near-zero downtime using partition switching:

Look, I’m not saying that you’re the type that would make a change in production while users are working.

But suppose that you would want to add an identity column to dbo.Demo, and change the clustered index to include that identity column, and make the index unique? Because it’s the table’s clustered index, you’re effectively talking about rebuilding the table (remember, the clustered index is the table), which involves reorganizing all of the rows into a new b-tree structure. While SQL Server is busy doing that, nobody will be able to read the contents of the table.

Daniel mentions a read-only table, though you could also do this with a read-write table as long as you have triggers to keep the two tables in sync until go time. That adds to the complexity, but it is an option if you need it.

Comments closed

Table Partitioning: WAIT_AT_LOW_PRIORITY on Standard Edition

Michael Bourgon explains what the WAIT_AT_LOW_PRIORITY option does with table partitioning and that it is available in Standard Edition:

But how about WAIT_AT_LOW_PRIORITY?  That was introduced in 2014 to make table partitioning deal better with That-Dude-From-Accounting-Who-Kicks-Off-A-Massive-Query-On-Friday-at-5pm, which causes partitioning to hang on Saturday when you’re trying to add and remove partitions.

Read on for a demo.

Comments closed