Partitioning – Page 2

Three Partitioning Options in Postgres

Published 2024-06-13 by Kevin Feasel

Semab Tariq shows how to perform three types of partitioning in PostgreSQL:

PostgreSQL is renowned for its exceptional performance in managing data. One of its standout features is partitioning, a technique that divides large datasets into smaller, more manageable segments. Partitioning provides several benefits, including improved query performance, streamlined data management, and enhanced scalability. By organizing data into partitions, PostgreSQL can execute searches more efficiently and handle tasks with greater ease.

In this blog, we will delve into the details of partitioning in PostgreSQL, exploring its various types, advantages, and drawbacks. We’ll uncover how partitioning can revolutionize data management and decision-making processes in database environments.

Click through for demonstrations of range, list, and hash partitioning.

Comments closed

The Joy of Partitioned Views

Published 2024-05-16 by Kevin Feasel

Rod Edwards talks partitioned views:

This post came around when I was at a loose end one evening, and just started poking at a local sandpit database, and it got me reminiscing and revisiting / testing a few things. The devil makes work for idle thumbs and all that…

Partitioned Views…do they have a place in society anymore?

Rod does a great job of following Betteridge’s Law of Headlines, as well as saving the ‘Yes’ answer for the post itself. Partitioned views come with their own pains, though one use case Rod did not include is using PolyBase and partitioned views to move “cold” data to slower external storage.

Comments closed

Arbitrary Intervals for Partitioning in Postgres

Published 2024-05-10 by Kevin Feasel

Keith Fiske does a bit of interval math:

Whether you are managing a large table or setting up automatic archiving, time based partitioning in Postgres is incredibly powerful. pg_partman’s newest versions support a huge variety of custom time internals. Marco just published a post on using pg_partman with our new database product for doing analytics with Postgres, Crunchy Bridge for Analytics. So I thought this would be a great time to review the basic and complex options for the time based partitioning.

Read on for a note of how pg_partman works and interval management, especially for versions earlier than 5.0.

Comments closed

An Overview of Data Partitioning Strategies

Published 2024-03-05 by Kevin Feasel

thanhdoancong (there are spaces in there somewhere but I’d probably guess wrong) talks partitions:

Data partitioning is the magic wand that divides your massive dataset into smaller, organized subsets called partitions. These partitions are based on specific criteria, like date ranges, customer segments, or product categories.

It’s like organizing your overflowing closet by color, season, or type of clothing. Each section becomes easier to browse and manage, making life (and data analysis) much easier.

Read on for a few varieties of partitioning and how they could improve your data estate. There’s no guarantee that partitioning will definitely improve performance—and in SQL Server’s case, the partitioning feature often does not improve performance at all because that isn’t its intent—but this is a good read to get an idea of what strategies are available.

Comments closed

Primer on Indexing and Partitioning in Postgres

Published 2023-11-20 by Kevin Feasel

Salman Ahmed gives us a 10,000 foot view of two topics:

When it comes to managing large and complex databases in PostgreSQL, an important decision you’ll face is how to optimize your data storage and retrieval strategies. Two common techniques for improving database performance and manageability are indexing and partitioning in PostgreSQL.

Read on for a quick overview of each topic, including the variety of index types and partitioning strategies available.

Comments closed

Finding Partitioned Tables in SQL Server

Published 2023-09-28 by Kevin Feasel

Andrea Allred has a script for us:

I recently needed to know which tables in my database were partitioned. I tried a bunch of queries and some got incredibly complex. I finally found one that I like:

Click through for the script and for the assumption Andrea makes (which is a reasonable one).

Comments closed

ALTER TABLE SWITCH and Errors 4907, 4908, and 4912

Published 2023-08-18 by Kevin Feasel

Eitan Blumin works out some problems:

When it comes to managing tables and indexes in SQL Server, the ALTER TABLE SWITCH statement is a powerful tool for “moving” data swiftly between tables. However, this convenience can sometimes be met with frustrating roadblocks, such as errors 4907 and 4908.

These errors may be confusing about their underlying cause, particularly when the source and target tables have identical partitions, including in non-clustered indexes.

Read on to see what these error messages mean and how you can correct them.

Comments closed

Thoughts on Partitioning in Postgres

Published 2023-08-07 by Kevin Feasel

Ryan Booz splits things out:

For 20+ years of database and application development, time-series data has always been at the heart of the products I work with. Inevitably, as the quantity of data grew over time, management became more difficult and query performance suffered. Over the years, the primary method for managing this growth in data effectively would be to partition it. The problem is, until recently, partitioning wasn’t easy to setup in most OLTP databases like PostgreSQL or SQL Server.

Fortunately, PostgreSQL has significantly improved its ability to partition large data tables over the last 6 years, starting with PostgreSQL 10.

Read on for Ryan’s recommendations around partitioning and a few thoughts on sharding.

Comments closed

An Overview of Partitioning and Sharding in Postgres

Published 2023-08-07 by Kevin Feasel

Michael Christofides defines terms:

It has been possible to do partitioning in PostgreSQL for quite a while — splitting what is logically one large table into smaller physical tables. Since version 10, a huge leap was made with the introduction of declarative partitioning, and more improvements have come every year since.

Sharding is a different story — splitting what is logically one large database into smaller physical databases. The primary tool for this in the PostgreSQL ecosystem is the Citus extension. But you can also handle the sharding logic at the application level, as recent posts from the likes of Notion and Figma have described. Somewhat confusingly, some forms of sharding are sometimes referred to as vertical partitioning, including by the team at Figma.

Read on for a few thoughts on when to perform each action and what the costs and benefits are.

Comments closed

Automatic Partition Maintenance in Power BI

Published 2022-11-29 by Kevin Feasel

Shabnam Watson answers an attendee question:

During one of my presentations on Incremental Refresh (IR) in Power BI, someone asked what happens during a Power BI automatic partition maintenance window when Power BI has an opportunity to merge smaller partitions into larger ones. Does Power BI use the data that is already imported into Power BI for the smaller partitions and combine it into a bigger one or does it re-read the data for those smaller partitions again. For example, if a dataset has an IR policy to refresh the last 1 day, and it has read data for all the days in a previous month, one day for each, when the new month arrives, it has an opportunity to merge the smaller day partitions into a month partition for the previous month. Does it re-read the previous month’s data from the source again or does it use what it already has in its memory?

Read on for the answer.

Comments closed

Category: Partitioning