Partitioning – Page 2

Partitioning in Oracle versus PostgreSQL

Published 2024-11-07 by Kevin Feasel

Umair Shahid continues a series on migrating from Oracle to PostgreSQL:

Table partitioning is a database design technique that divides a large table into smaller, more manageable sub-tables called partitions. Each partition holds a subset of the data based on specific criteria, such as date ranges, categories, or hash values. While partitioning makes it seem like you’re working with a single large table, behind the scenes, queries and operations are distributed across multiple partitions.

Read on to understand the differences between the two platforms. In this case, there are some pretty significant differences.

Table Partitioning in PostgreSQL

Published 2024-08-19 by Kevin Feasel

Lee Asher talks partitions:

Partitioning in PostgreSQL was for many years a feature almost entirely ignored, and even today is often thought of only as a means to increase performance on large tables. While it’s certainly useful for this, it’s also an invaluable tool for solving common data maintenance problems. In this article, we’ll examine the feature from this perspective.

Let’s consider a common situation: a table continually ingesting data in chronological order, such as purchase orders, forum posts, time-series data, whatever. Most systems contain more than one table like this. And because we live in a world of Big Data, these tables can contain hundreds of millions, even billions of rows. Business rules require the data to remain active for some fixed period of time, then be deleted (or archived first, then deleted). Commonly this is a monthly process: every month copy data that has “aged out” to another location, then delete the rows from the live table.

It’s interesting that people consider partitioning a performance-improving technique in the PostgreSQL world, but in the SQL Server world, partitioning is primarily for ease of maintenance and we warn against thinking about it as a performance-improving technique.
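
As a rough illustration of the maintenance angle, the monthly age-out process described in the excerpt could look something like the following with PostgreSQL declarative partitioning (version 10 or later). This is a minimal sketch and not taken from Lee's article; the table, column, and partition names are hypothetical.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE purchase_orders (
    order_id   bigint        NOT NULL,
    ordered_at timestamptz   NOT NULL,
    amount     numeric(12,2) NOT NULL
) PARTITION BY RANGE (ordered_at);

-- One partition per month; the upper bound is exclusive.
CREATE TABLE purchase_orders_2024_07 PARTITION OF purchase_orders
    FOR VALUES FROM ('2024-07-01') TO ('2024-08-01');
CREATE TABLE purchase_orders_2024_08 PARTITION OF purchase_orders
    FOR VALUES FROM ('2024-08-01') TO ('2024-09-01');

-- Age out July: detach the partition so it becomes a standalone table.
ALTER TABLE purchase_orders DETACH PARTITION purchase_orders_2024_07;

-- Archive the detached table by whatever means the business requires,
-- then drop it. No row-by-row DELETE against the live table is needed.
DROP TABLE purchase_orders_2024_07;
```

The design point is that removing a month of data becomes a metadata operation on one partition rather than a large DELETE against the live table.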

Three Partitioning Options in Postgres

Published 2024-06-13 by Kevin Feasel

Semab Tariq shows how to perform three types of partitioning in PostgreSQL:

PostgreSQL is renowned for its exceptional performance in managing data. One of its standout features is partitioning, a technique that divides large datasets into smaller, more manageable segments. Partitioning provides several benefits, including improved query performance, streamlined data management, and enhanced scalability. By organizing data into partitions, PostgreSQL can execute searches more efficiently and handle tasks with greater ease.

In this blog, we will delve into the details of partitioning in PostgreSQL, exploring its various types, advantages, and drawbacks. We’ll uncover how partitioning can revolutionize data management and decision-making processes in database environments.

Click through for demonstrations of range, list, and hash partitioning.
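
For orientation, the three strategies look roughly like this in PostgreSQL DDL. This is a minimal sketch with hypothetical table and column names, not Semab's demonstrations; hash partitioning and DEFAULT partitions assume PostgreSQL 11 or later.

```sql
-- All object names below are hypothetical.

-- Range: each partition covers a contiguous span of values
-- (lower bound inclusive, upper bound exclusive).
CREATE TABLE measurements (
    recorded_at date    NOT NULL,
    reading     numeric NOT NULL
) PARTITION BY RANGE (recorded_at);

CREATE TABLE measurements_2024 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

-- List: each partition holds an explicit set of key values.
CREATE TABLE customers (
    customer_id bigint NOT NULL,
    region      text   NOT NULL
) PARTITION BY LIST (region);

CREATE TABLE customers_emea PARTITION OF customers
    FOR VALUES IN ('EMEA');
-- Rows whose region matches no listed value land here.
CREATE TABLE customers_other PARTITION OF customers DEFAULT;

-- Hash: rows are spread across partitions by a hash of the key.
CREATE TABLE sessions (
    session_id bigint NOT NULL,
    payload    jsonb
) PARTITION BY HASH (session_id);

CREATE TABLE sessions_p0 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE sessions_p1 PARTITION OF sessions
    FOR VALUES WITH (MODULUS 2, REMAINDER 1);
```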

The Joy of Partitioned Views

Published 2024-05-16 by Kevin Feasel

Rod Edwards talks partitioned views:

This post came around when I was at a loose end one evening, and just started poking at a local sandpit database, and it got me reminiscing and revisiting / testing a few things. The devil makes work for idle thumbs and all that…

Partitioned Views…do they have a place in society anymore?

Rod does a great job of following Betteridge’s Law of Headlines, as well as saving the ‘Yes’ answer for the post itself. Partitioned views come with their own pains, though one use case Rod did not include is using PolyBase and partitioned views to move “cold” data to slower external storage.

Arbitrary Intervals for Partitioning in Postgres

Published 2024-05-10 by Kevin Feasel

Keith Fiske does a bit of interval math:

Whether you are managing a large table or setting up automatic archiving, time based partitioning in Postgres is incredibly powerful. pg_partman’s newest versions support a huge variety of custom time intervals. Marco just published a post on using pg_partman with our new database product for doing analytics with Postgres, Crunchy Bridge for Analytics. So I thought this would be a great time to review the basic and complex options for the time based partitioning.

Read on for a note on how pg_partman works and how it manages intervals, especially for versions earlier than 5.0.

An Overview of Data Partitioning Strategies

Published 2024-03-05 by Kevin Feasel

thanhdoancong (there are spaces in there somewhere but I’d probably guess wrong) talks partitions:

Data partitioning is the magic wand that divides your massive dataset into smaller, organized subsets called partitions. These partitions are based on specific criteria, like date ranges, customer segments, or product categories.

It’s like organizing your overflowing closet by color, season, or type of clothing. Each section becomes easier to browse and manage, making life (and data analysis) much easier.

Read on for a few varieties of partitioning and how they could improve your data estate. There’s no guarantee that partitioning will improve performance (in SQL Server’s case, the partitioning feature often does not improve performance at all, because that isn’t its intent), but this is a good read to get an idea of what strategies are available.

Primer on Indexing and Partitioning in Postgres

Published 2023-11-20 by Kevin Feasel

Salman Ahmed gives us a 10,000-foot view of two topics:

When it comes to managing large and complex databases in PostgreSQL, an important decision you’ll face is how to optimize your data storage and retrieval strategies. Two common techniques for improving database performance and manageability are indexing and partitioning in PostgreSQL.

Read on for a quick overview of each topic, including the variety of index types and partitioning strategies available.

Finding Partitioned Tables in SQL Server

Published 2023-09-28 by Kevin Feasel

Andrea Allred has a script for us:

I recently needed to know which tables in my database were partitioned. I tried a bunch of queries and some got incredibly complex. I finally found one that I like:

Click through for the script and for the assumption Andrea makes (which is a reasonable one).
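
Andrea's script is at the link and is not reproduced here. As a general idea of the approach, one way to list partitioned tables is to look for tables whose heap or clustered index is stored on a partition scheme. The query below is a sketch under that assumption, using only standard catalog views; it is not necessarily the query Andrea settled on.

```sql
-- Sketch only: lists tables whose base structure (heap or clustered index)
-- is placed on a partition scheme. Not the script from the linked post.
SELECT DISTINCT
    s.name  AS schema_name,
    t.name  AS table_name,
    ps.name AS partition_scheme_name
FROM sys.tables AS t
    INNER JOIN sys.schemas AS s
        ON s.schema_id = t.schema_id
    INNER JOIN sys.indexes AS i
        ON i.object_id = t.object_id
        AND i.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index
    INNER JOIN sys.partition_schemes AS ps
        ON ps.data_space_id = i.data_space_id
ORDER BY
    schema_name,
    table_name;
```

Restricting to index IDs 0 and 1 means a table with only a partitioned non-clustered index will not appear in the results.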

ALTER TABLE SWITCH and Errors 4907, 4908, and 4912

Published 2023-08-18 by Kevin Feasel

Eitan Blumin works out some problems:

When it comes to managing tables and indexes in SQL Server, the ALTER TABLE SWITCH statement is a powerful tool for “moving” data swiftly between tables. However, this convenience can sometimes be met with frustrating roadblocks, such as errors 4907 and 4908.

These errors may be confusing about their underlying cause, particularly when the source and target tables have identical partitions, including in non-clustered indexes.

Read on to see what these error messages mean and how you can correct them.

Thoughts on Partitioning in Postgres

Published 2023-08-07 by Kevin Feasel

Ryan Booz splits things out:

For 20+ years of database and application development, time-series data has always been at the heart of the products I work with. Inevitably, as the quantity of data grew over time, management became more difficult and query performance suffered. Over the years, the primary method for managing this growth in data effectively would be to partition it. The problem is, until recently, partitioning wasn’t easy to set up in most OLTP databases like PostgreSQL or SQL Server.

Fortunately, PostgreSQL has significantly improved its ability to partition large data tables over the last 6 years, starting with PostgreSQL 10.

Read on for Ryan’s recommendations around partitioning and a few thoughts on sharding.
