Partitioning in PostgreSQL was for many years a feature almost entirely ignored, and even today is often thought of only as a means to increase performance on large tables. While it’s certainly useful for this, it’s also an invaluable tool for solving common data maintenance problems. In this article, we’ll examine the feature from this perspective.
Let’s consider a common situation: a table continually ingesting time series data ordered chronologically: purchase orders, forum posts, time-series data, whatever. Most systems contain more than one table like this. And because we live in a world of Big Data, these tables can contain hundreds of millions, even billions of rows. Business rules require the data to remain active for some fixed period of time, then be deleted, (or archived first, then deleted). Commonly this is a monthly process: every month copy data that has “aged out” to another location, then delete the rows from the live table.
It’s interesting that people consider partitioning a performance-improving technique in the PostgreSQL world, but in the SQL Server world, partitioning is primarily for ease of maintenance and we warn against thinking about it as a performance-improving technique.