Category: HA / DR

Backups Are for DR, Not HA

Kevin Hill gives us a poignant reminder:

Please continue doing your backups!

Backups are Disaster Recovery, yes…but not HA.

Some will argue with this (in the comments most likely), but I broadly define “High Availability” as a system that can recover in seconds or minutes at most. Sometimes that is automatic, sometimes manual.

I agree that backups are for DR, not HA. I’d consider log shipping an option for both HA and DR, albeit one that requires manual failover (or rigging up a script that performs the failover for you).

I disagree about replication as an HA solution. Yes, you do need to make sure that everything can replicate, but if your publisher goes down, the subscriber can continue and your data is still available for use. And if you’re a complete masochist, you can use merge replication to allow writes to continue while the publisher is down. Cleaning up after that is a mess, especially if you end up with a bunch of conflicts, but High Availability doesn’t mean Easy Mode.

Hybrid Failover Rights from SQL Server 2022 to Azure SQL MI

Dani Ljepava explains a new benefit:

Hybrid failover rights is a new benefit that allows you to run a license-free Azure SQL Managed Instance when used as a passive DR replica for your SQL Server 2022 licensed under Software Assurance (SA), or using the pay-as-you-go billing option.

How the Hybrid Failover Rights benefit works

The new Hybrid failover rights licensing benefit is technology agnostic. You can use any technology, such as MI link as the most advanced replication technology using Always On, or perhaps LRS, ADF, transactional replication, backup and restore, or similar to set up replication between SQL Server and Managed Instance. As long as you are using Azure SQL Managed Instance only as a passive replica for your SQL Server 2022, you are eligible to apply the new licensing benefit.

Read on for more details on how you can activate this benefit.

Auto-Failover Groups in Azure SQL DB

Etienne Lopes wraps up a series:

So, first of all, what are auto-failover groups?

The auto-failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can include a group of databases or all user databases in a logical server to be replicated to another logical server. It is a declarative abstraction on top of the active geo-replication feature, designed to simplify deployment and management of geo-replicated databases at scale.

Read on to see some of the benefits of this, as well as how to enable it.

Oracle: RMAN and Non-Synchronizing Standby Database

David Fitzjarrell proffers advice on recovering from a non-synchronizing standby database:

Occasionally the unthinkable can occur and the DBA can be left with a standby database that is no longer synchronizing with the primary. A plethora of “advice” will soon follow that discovery, most of it much like this:

“Well, ya gotta rebuild it.”

Of course the question to ask is “how far out of synch is the standby?” That question is key in determining how to attack this situation. Let’s go through the two most common occurrences of this and see how to address them.

Read on to see David’s advice.

Service Level Agreements (RPO and RTO) and SQL Server

David Klee wants to know how much downtime is acceptable to you:

Database professionals of the world – I have a question. Has your organization defined service level agreements (SLAs) for your data estate? I’m talking specifically the Recovery Point Objective (RPO) and Recovery Time Objective (RTO), and to have these defined not in an arbitrary number of nines, but in minutes or hours. If these aren’t defined from above, your business continuity plan is doomed to fail.

Read on to learn what RPO and RTO mean, how to think in terms of RPO and RTO, and some of David’s recommendations.
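
As a rough illustration of thinking in minutes rather than nines, here is a minimal Python sketch that checks a recovery plan against RPO and RTO targets. The `RecoveryPlan` fields, the function names, and the simplified worst-case model (data loss equals one log backup interval, downtime equals detection time plus restore time) are assumptions made for this example, not something taken from David’s post.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    """Hypothetical description of a backup-based recovery plan (all values in minutes)."""
    log_backup_interval_min: float  # how often transaction log backups run
    detection_time_min: float       # time to notice the outage and decide to act
    restore_time_min: float         # measured time to restore and bring the database online


def worst_case_rpo(plan: RecoveryPlan) -> float:
    # Simplifying assumption: a failure just before the next log backup
    # loses one full interval of changes.
    return plan.log_backup_interval_min


def worst_case_rto(plan: RecoveryPlan) -> float:
    # Simplifying assumption: downtime is detection time plus restore time.
    return plan.detection_time_min + plan.restore_time_min


def meets_sla(plan: RecoveryPlan, rpo_target_min: float, rto_target_min: float) -> bool:
    """Return True when both worst-case figures fit inside the agreed targets."""
    return (worst_case_rpo(plan) <= rpo_target_min
            and worst_case_rto(plan) <= rto_target_min)


if __name__ == "__main__":
    plan = RecoveryPlan(log_backup_interval_min=15, detection_time_min=10, restore_time_min=90)
    print(meets_sla(plan, rpo_target_min=15, rto_target_min=120))
    print(meets_sla(plan, rpo_target_min=5, rto_target_min=120))
```

The point of the sketch is that the targets have to come from the business first; only then can you tell whether a given backup interval or restore time is good enough.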

Trying out Azure Geo-Replication

Etienne Lopes continues a series on Azure SQL DB HA/DR:

So, first of all, what is Active Geo-Replication?

Active geo-replication is a feature that lets you create a continuously synchronized readable secondary database for a primary database. The readable secondary database may be in the same Azure region as the primary, or, more commonly, in a different region. This kind of readable secondary database is also known as a geo-secondary or geo-replica.

Read on to learn more about the topic, including how to set it up and ways to try it out.

Three-Node Postgres HA Cluster with pg_cirrus

Salman Ahmed wants to be highly available:

We are thrilled to announce the release of pg_cirrus! First of all, you might be wondering what “cirrus” means. The term refers to the thin and wispy clouds that are often seen at high altitudes.

pg_cirrus is a simple and automated solution to deploy highly available 3-node PostgreSQL clusters with auto failover. It is built using Ansible, and it uses pgpool to perform auto failover and load balancing.

Read on to see how it works. It’s also licensed under GPLv3, so it’s not only highly available but also freely available.

Data Inconsistency in Postgres HA Clusters

Umair Shahid gives us an overview:

While PostgreSQL is known for its robustness, scalability, and reliability, data inconsistency can occur in PostgreSQL clusters, which can cause issues and impact the overall performance of the system. In this blog, we’ll define data inconsistency in PostgreSQL clusters, discuss the challenges it poses, its causes, and provide some tips on how to prevent and resolve it if it occurs.

Click through for the article.

False Alarms in Highly Available Postgres Clusters

Umair Shahid pulls the alarm:

False alarms can be a significant problem in highly available clusters of PostgreSQL. They can cause unnecessary downtime and disruptions that can impact the performance of the nodes. In this blog post, we will explore the causes, prevention, and resolution of false alarms in PostgreSQL clusters.

It’s a good idea to sit back and think about how complex the problem of high availability is, even if the service (SQL Server, Postgres, or whatever) offers capabilities to simplify a lot of it. The trick is that you want your service to fail over if and only if it needs to, but what tells you if it “needs to” is a noisy signal.
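
One common way to dampen that noise is to require several consecutive failed health checks before acting. The Python sketch below is only an illustration of that idea: the function name, the boolean health-check history, and the default threshold of three are assumptions for the example, and it is not how any particular cluster manager decides to fail over.

```python
from typing import Sequence


def should_fail_over(health_checks: Sequence[bool], required_failures: int = 3) -> bool:
    """Decide whether to fail over based on recent health-check results.

    health_checks: oldest-to-newest results, True meaning the primary responded.
    required_failures: how many consecutive failures we insist on (hypothetical default).
    """
    if required_failures < 1:
        raise ValueError("required_failures must be at least 1")
    # Look only at the most recent window of checks.
    recent = health_checks[-required_failures:]
    # Fail over only when the window is full and every check in it failed,
    # so a single dropped probe does not trigger a false alarm.
    return len(recent) == required_failures and not any(recent)


if __name__ == "__main__":
    print(should_fail_over([True, False, True, False]))   # a blip, not an outage
    print(should_fail_over([True, False, False, False]))  # sustained failure
```

The trade-off is the usual one: a larger threshold means fewer false alarms but a slower reaction to a real outage.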

Creating a Disaster Recovery Plan for Synapse

Freddie Santos talks HA/DR with Synapse:

Many of our customers have been asking about creating a disaster recovery plan for their Synapse Workspace. In a new blog series, we will cover the basics of disaster recovery and business continuity, discussing available options and custom solutions.

In this first post, we’ll review important concepts and questions to answer before building a disaster recovery plan, including the differences between High Availability and Disaster Recovery.

The focus in this post is on the dedicated SQL pool and Azure Data Lake Storage Gen2 (because people still think about Gen1?), though that’s the majority of what you’d need to think about—Spark pools and the serverless SQL pool really drive from the data lake. There’s also Data Explorer pools, which have their own storage and HA/DR capabilities.
