Press "Enter" to skip to content

Category: Partitioning

A Primer on Partitioned Views

Erik Darling talks about an old-style way of partitioning in SQL Server:

Erik Darling here with Darling Data. And we’re going to finish off this Friday by talking about partitioned views. And look, there are a lot of things I could say about partitioned views that are great and grand and that have come in handy for me over the years in ways that I’m like, wow, thank you partitioned views. Thank you for not being normal table partitioning. Thank you for existing. 

Read on to see how they work, how you can write into them, things that might prevent you from writing into partitioned views directly, and why you probably don’t want writable partitioned views anyhow.

Leave a Comment

Partitioning and Columnstore Indexes

Erik Darling puts together a great combination for a very large dataset:

 So, today we’re going to talk about partitioning in columnstore because there are important differences between partitioned columnstore tables and partitioned rowstore tables. One of the sort of superpowers that columnstore has is the ability to use metadata about which row groups have which data in them, and it can skip entire segments that do not contain relevant data.

I agree with Erik’s point that you do need around 500 million or so rows before this capability really shines, but if you do pick the right partition key, you get one of those rare and coveted performance improvements from partitioning.

Leave a Comment

Enforcing Constraints across Postgres Partitions

Shaun Thomas explains a rule:

Postgres table partitioning is one of those features that feels like a superpower right up until it isn’t. Just define a partition key, carve up data into manageable chunks, and everything hums along beautifully. And what’s not to love? Partition pruning in query plans, smaller tables, faster maintenance, easy archiving of old data; it’s a smorgasbord of convenience.

Then you try to enforce a unique constraint without including the partition key, and Postgres behaves as if you just asked it to divide by zero. Well… about that.

Click through for an explanation, some workarounds that might work in specific circumstances, and a few closing remarks.

As for SQL Server, the same rule applies. If you want a unique index (which is what a unique key constraint uses under the covers), you must include the partitioning column. If you don’t include it, SQL Server will include it for you rather than giving a hard error.

Comments closed

Vertical Partitioning for Performance

Eran Golan splits out a table:

Not long ago, I worked with a customer who was experiencing persistent blocking and occasional deadlocks in one of their core systems. The application itself wasn’t new, but over the years it had grown significantly. New features had been added, more processes were interacting with the database, and naturally the schema had evolved along the way.

One table in particular stood out. It had gradually grown to contain well over a hundred columns. Originally it had been designed to represent a single business entity in one place, which made the model easy to understand and query. But as more attributes were added over time, the table became increasingly wide.

Frankly, based off of Eran’s description, this sounds like a failure in normalizing the table appropriately. Normalization is not just about “There are many of X to one Y, so make two separate tables for X and Y.” In particular, 5th normal form (keys imply join dependencies) tells us that, if we can break out a table X into X1 and X2, and then join X1 and X2 together without losing any information or generating spurious new information, then 5NF requires we break it out. Eran is describing in narrative exactly that concept, though the description of how the customer broke that data out may or may not have satisfied 5NF.

Comments closed

ALTER TABLE and Partitioned Tables in PostgreSQL

Chao Li classifies a series of commands:

Does an operation propagate to partitions? Does it affect future partitions? Does ONLY do what it claims? Why do some commands work on parents but not on partitions—or vice versa?

Today, PostgreSQL documentation describes individual ALTER TABLE sub-commands well, but it rarely explains their interaction with partitioned tables as a whole. As a result, users often discover the real behavior only through trial and error.

This post summarizes a systematic investigation of ALTER TABLE behavior on partitioned tables, turning scattered rules into a consistent classification model.

Click through for 15 buckets of commands relating to ALTER TABLE in PostgreSQL and see how they handle dealing with partitioned tables.

Comments closed

Function Volatility and PostgreSQL Partition Performance

Deepak Mahto covers how function volatility can affect how queries on partitioned data perform:

In one of our earlier blogs, we explored how improper volatility settings in PL/pgSQL functions — namely using IMMUTABLESTABLE, or VOLATILE — can lead to unexpected behavior and performance issues during migrations.

Today, let’s revisit that topic from a slightly different lens. This time, we’re not talking about your user-defined functions, but the ones PostgreSQL itself provides — and how their volatility can quietly shape your query performance, especially when you’re using partitioned tables.

Click through for one example using date-based partitioning and date functions.

Comments closed

Statistics on Partitioned Tables in PostgreSQL

Laurenz Albe gathers stats:

I recently helped a customer with a slow query. Eventually, an ANALYZE on a partitioned table was enough to fix the problem. This came as a surprise for the customer, since autovacuum was enabled. So I decided to write an article on how PostgreSQL collects partitioned table statistics and how they affect PostgreSQL’s estimates.

Read on to see how it works and how you can generate statistics at the table level and not just the partition level.

Comments closed

Data Archival and Retention in PostgreSQL

Daria Nikolaenko walks through a presentation:

I’ve started talking about something that happens with almost every Postgres database — the slow, steady growth of data. Whether it’s logs, events, or transactions — old rows pile up, performance suffers, and managing it all becomes tricky. My talk was focusing on  practical ways to archive, retain, and clean up data in PostgreSQL, without breaking queries or causing downtime.

Read on to learn more.

Comments closed

Partitioning in PostgreSQL

Umair Shahid takes us into partitioning strategies in PostgreSQL:

My recommended methodology for performance improvement of PostgreSQL starts with query optimization. The second step is architectural improvements, part of which is the partitioning of large tables.

Partitioning in PostgreSQL is one of those advanced features that can be a powerful performance booster. If your PostgreSQL tables are becoming very large and sluggish, partitioning might be the cure. 

It’s interesting to compare this against SQL Server, where partitioning is not a strategy for query performance improvements.

Comments closed

Swap-and-Drop for Partition Management

Rich Benner deals with a troublesome partition:

What are stubborn partitions in SQL Server and how do you delete them? This was an interesting issue I recently had to deal with on a client site that I thought our readers might find interesting.

The tables in use here are partitioned. The partition field is based upon a date field and we have a partition per month. There is a monthly maintenance job which creates our new partitions. The job should also delete the oldest partitions. This job has been failing to delete an old partition as the data file contained within is not empty. It’s stubborn!

If we try to remove this file we get the error “The File cannot be removed because it is not empty,” as you can see:

Read on for some diagnosis of the problem, as well as the solution Rich developed.

Comments closed