HA / DR – Curated SQL

Barman is a popular tool in the PostgreSQL ecosystem for managing backups, especially in High Availability (HA) environments. It’s known for being easy to set up and for offering multiple types and modes of backups. However, this flexibility can also be a bit overwhelming at first. That’s why I’m writing this blog to break down each backup option in a simple and clear way, so you can choose the one that best fits your business needs.

Click through for the available options, as well as some recommendations.

Choosing a High Availability Solution in PostgreSQL

Published 2025-06-24 by Kevin Feasel

Semab Tariq compares two alternatives:

When designing a highly available PostgreSQL cluster, two popular tools often come into the conversation: Pgpool-II and Patroni. Both are widely used in production environments, offer solid performance, and aim to improve resilience and reduce downtime; however, they take different approaches to achieving this goal.

We often get questions during webinars/talks and customer calls about which tool is better suited for production deployments. So, we decided to put together this blog to help you understand the differences and guide you in choosing the right solution based on your specific use case.

Click through for a primer on the topic, followed by some recommendations.

Split-Brain Scenarios in PostgreSQL Clusters

Published 2025-06-02 by Kevin Feasel

Semab Tariq knows that an application cannot serve two masters:

In this blog post, we will explore a critical failure condition known as a split-brain scenario that can occur in PostgreSQL HA clusters. We will first see what split-brain means, then how it can impact PostgreSQL clusters, and finally discuss how to prevent it through architectural choices and tools available in the PostgreSQL ecosystem.

Click through for an explanation of split-brain and what can cause this problem. Additionally, Semab includes several tips on how to limit the likelihood of a split-brain scenario occurring.

HA/DR in Oracle with Data Guard

Published 2025-05-30 by Kevin Feasel

Kellyn Gorman takes a peek at Oracle Data Guard:

In its traditional (and free) configuration, Oracle Data Guard operates in an active/passive architecture. This incredibly well-designed and valuable solution from Oracle, which comes included with the Enterprise Edition, has as part of its architecture:

A primary database, which is an active, accessible database system.

One or more standby databases, which are passive replicas that continuously receive redo data from the primary.

Click through for an overview of the product.

Database Snapshots in High-Availability Setups

Published 2025-05-14 by Kevin Feasel

Stephen Planck adds one more layer of complexity:

SQL Server’s database-snapshot feature is a wonderfully simple tool: at the instant you create the snapshot, every page in the database is marked “copy-on-write.” Nothing is copied across the wire, no blocking locks appear, and the snapshot opens immediately as a read-only database on the local replica. Queries against the snapshot see the world exactly as it looked at that moment while the live workload keeps changing pages in the primary data files. Because snapshots live only in sparse files on the server that owns them, they are not a replacement for backups—but they are perfect for ad-hoc reporting, quick “before-and-after” comparisons, or a safety net when you want an easy way to back out a risky change that should finish within minutes or hours.

But read on to see how they interact with high-availability features such as transactional replication and availability groups.
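
The copy-on-write behavior described in the quote can be illustrated with a small in-memory model. This is only a sketch of the general idea, not SQL Server's implementation: the `Database` and `Snapshot` classes, their method names, and the use of a dictionary as the "sparse file" are all assumptions made for illustration.

```python
class Snapshot:
    """Read-only view of a database as of the moment it was created (toy model)."""

    def __init__(self, source):
        self.source = source
        self.sparse = {}  # page number -> original page content (hypothetical "sparse file")

    def preserve(self, page_no, original):
        # Keep only the first (pre-change) version of each page.
        self.sparse.setdefault(page_no, original)

    def read_page(self, page_no):
        # Changed pages come from the sparse store; unchanged pages
        # are read straight from the live database.
        if page_no in self.sparse:
            return self.sparse[page_no]
        return self.source.pages[page_no]


class Database:
    """Toy page store standing in for the live data files."""

    def __init__(self, pages):
        self.pages = list(pages)
        self.snapshots = []

    def create_snapshot(self):
        # Creating the snapshot copies nothing up front.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write_page(self, page_no, data):
        # Copy-on-write: save the original page for each snapshot
        # before overwriting it in the live database.
        for snap in self.snapshots:
            snap.preserve(page_no, self.pages[page_no])
        self.pages[page_no] = data


db = Database(["a", "b", "c"])
snap = db.create_snapshot()
db.write_page(1, "B")
db.write_page(1, "BB")
```

After these writes, the live database holds the newest version of page 1 while the snapshot still returns the original, and the snapshot has stored exactly one page. The storage cost of the snapshot grows with the number of distinct pages changed, which fits the quote's framing of snapshots as suited to changes that finish within minutes or hours.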

Failover Groups in Azure SQL Database

Published 2025-02-21 by Kevin Feasel

Mika Sutinen looks at some interesting functionality:

One of the interesting features in Azure SQL Database is Failover Groups. It allows you to manage replication of an Azure SQL database, or a group of databases, to another logical server. The reason I’ve bolded “manage replication” is that the replication itself is handled by active geo-replication, which is also a feature of Azure SQL Database.

Read on to see how these are different and why you might want to use failover groups.

Using Kubernetes with Distributed Availability Groups

Published 2024-11-19 by Kevin Feasel

Andrew Pruski has a guide for us:

A while back I wrote about how to use a Cross Platform (or Clusterless) Availability Group to seed a database from a Windows SQL instance into a pod in Kubernetes.

I was talking with a colleague last week and they asked, “What if the existing Windows instance is already in an Availability Group?”

This is a fair question, as it’s fairly rare (in my experience) to run a standalone SQL instance in production…most instances are in some form of HA setup, be it a Failover Cluster Instance or an Availability Group.

Read on for the tutorial. There are quite a few steps involved.

Cross-Regional Failover Clusters in Google Cloud Platform

Published 2024-10-31 by Kevin Feasel

Dave Bermingham builds a cluster:

I was the principal author of this SIOS whitepaper, which describes how to build a 2-node SQL Server cluster in Google Cloud Platform (GCP) spanning multiple zones. Today, I’ll explain how to extend this cluster by adding a third node in a different GCP region.

Check out the paper and then Dave’s step-by-step instructions.

Online DR from SQL Server 2022 and Azure SQL MI Now Available

Published 2024-10-10 by Kevin Feasel

Djordje Jeremic announces general availability of one of the key selling points from SQL Server 2022:

Today, we are announcing the general availability of the following two major capabilities of the Managed Instance link feature with SQL Server 2022:

Two-way failover between SQL Server 2022 and SQL Managed Instance through the link to unlock true disaster recovery (DR) with Azure

Creating a link from SQL Managed Instance to SQL Server 2022 to unlock off-PaaS data mobility for regulatory and dev/test scenarios

Click through for more detail.

pgBackRest and Standby Server Backups

Published 2024-09-23 by Kevin Feasel

Stefan Fercot does some explaining:

Recently, we’ve received many questions about how to take backups from a standby server using pgBackRest. In this post, I’d like to clarify one of the most frequently asked questions and address a common misconception for new users.

First of all, it’s important to understand that taking a backup exclusively from the standby server is not currently possible. When you trigger a backup from the standby, pgBackRest creates a standby backup that is identical to a backup performed on the primary. It does this by starting/stopping the backup on the primary, copying only files that are replicated from the standby, then copying the remaining few files from the primary.

Read on to learn more and to see an example of how this works.
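
The sequence in the quote (start the backup on the primary, copy replicated files from the standby, copy the rest from the primary, stop the backup on the primary) can be sketched as a simple plan builder. This is not pgBackRest code or its API: `plan_standby_backup`, the `is_replicated` predicate, and the file names are hypothetical and exist only to show the ordering and which host serves each file.

```python
def plan_standby_backup(files, is_replicated):
    """Return an ordered list of (host, action) steps for a standby backup.

    `files` is a list of file names; `is_replicated` is a caller-supplied
    predicate (an assumption of this sketch) telling whether a file is
    available on the standby through replication.
    """
    steps = [("primary", "start backup")]
    # Replicated files are read from the standby to offload the primary.
    steps += [("standby", f"copy {name}") for name in files if is_replicated(name)]
    # The few remaining files still have to come from the primary.
    steps += [("primary", f"copy {name}") for name in files if not is_replicated(name)]
    steps.append(("primary", "stop backup"))
    return steps


# Hypothetical file names, chosen only for illustration.
files = ["data_file_1", "data_file_2", "primary_only_file"]
plan = plan_standby_backup(files, lambda name: name.startswith("data_file"))
```

Even in this toy plan the primary appears at the start, at the end, and for the non-replicated file, which is the point of the quote: the standby offloads most of the copying, but the backup cannot run without the primary.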

Category: HA / DR

Using Barman to Back Up HA-Enabled PostgreSQL Clusters

Choosing a High Availability Solution in PostgreSQL

Split-Brain Scenarios in PostgreSQL Clusters

HA/DR in Oracle with Data Guard

Database Snapshots in High-Availability Setups

Failover Groups in Azure SQL Database

Using Kubernetes with Distributed Availability Groups

Cross-Regional Failover Clusters in Google Cloud Platform

Online DR from SQL Server 2022 and Azure SQL MI Now Available

pgBackRest and Standby Server Backups