
Category: HA / DR

Tips for Azure Site Recovery

Joey D’Antoni shares a few experiences when using Azure Site Recovery:

I need to blog more. Stupid being busy. Anyway, last week, we were doing a small-scale test for a customer, and it didn’t work the way we were expecting, and for one of the dumbest reasons I’ve ever seen. If you aren’t familiar with Azure Site Recovery, it provides disk-level replication for VMs, and allows you to bring on-premises VMs online in Azure, or in another Azure region if your VMs are in Azure already. It’s not an ideal solution for busy SQL Server VMs with extremely low recovery point objectives; however, if you need a simple DR solution for a group of VMs, and can sustain around 30 minutes of data loss, it is cheap and easy. The other benefit that ASR provides, similar to VMware’s Site Recovery Manager, is the ability to do a test recovery in a bubble environment.

Read on for notes from Joey.

Comments closed

NameNode and Secondary NameNode in Hadoop

The Hadoop in Real World team hit on a naming scheme that I think is bad:

NameNode is the heart of HDFS. NameNode maintains the metadata of HDFS – files, list of blocks, directories, permissions etc. The metadata is persisted on a file named FSIMAGE. During the start up of NameNode, the FSIMAGE file will be read and loaded into memory. 

Any ongoing changes to the files and directories in FSIMAGE will be written to memory and to a temporary log file. NameNode does not save the ongoing changes to FSIMAGE directly, because the FSIMAGE file can be large for a large HDFS, and updating a large file at runtime would be expensive and slow.

Read on to learn what the secondary NameNode does. As a hint, it’s not a secondary NameNode in the sense of high availability. If you’re a new Hadoop administrator, the name can be deceiving, letting you think you have high availability when you really don’t.
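
To make the FSIMAGE-plus-log idea in the excerpt concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the class `ToyNameNode`, the file names `fsimage.json` and `edits.log`, and the JSON record format are all made up for illustration. It only shows the pattern the excerpt describes: load a snapshot at startup, append each change to a small log instead of rewriting the snapshot, and occasionally fold the log back into the snapshot.

```python
import json
from pathlib import Path


class ToyNameNode:
    """Toy model of snapshot + edit log metadata (hypothetical, not the HDFS API)."""

    def __init__(self, directory):
        self.fsimage = Path(directory) / "fsimage.json"  # hypothetical snapshot file
        self.edits = Path(directory) / "edits.log"       # hypothetical append-only log
        self.namespace = {}                              # path -> list of block ids
        self._load()

    def _load(self):
        # Startup: read the snapshot into memory, then replay any logged edits.
        if self.fsimage.exists():
            self.namespace = json.loads(self.fsimage.read_text())
        if self.edits.exists():
            for line in self.edits.read_text().splitlines():
                self._apply(json.loads(line))

    def _apply(self, edit):
        # Apply one change to the in-memory namespace.
        if edit["op"] == "create":
            self.namespace[edit["path"]] = edit["blocks"]
        elif edit["op"] == "delete":
            self.namespace.pop(edit["path"], None)

    def record(self, edit):
        # Runtime: append to the small log instead of rewriting the big snapshot.
        with self.edits.open("a") as log:
            log.write(json.dumps(edit) + "\n")
        self._apply(edit)

    def checkpoint(self):
        # Fold the accumulated edits into a fresh snapshot and empty the log.
        self.fsimage.write_text(json.dumps(self.namespace))
        self.edits.write_text("")
```

In this toy, a restart before `checkpoint()` has to replay every logged edit, so the longer the log grows, the slower startup gets. That periodic merge is the kind of housekeeping at stake in the linked post, and it is a different job from standing by to take over when the NameNode fails.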


Handling Disaster Recovery

Randolph West has a disaster recovery plan:

I’ve had several occasions where hard drives have failed and attempts to recover data from these wonders of mechanical engineering have been mostly fruitless. I’ve experienced profound examples of data loss, in both cases losing years of email and contact details for people I met online.

This is all to say that I care deeply about data loss, and I take it personally when I’m asked to engage with potential customers to recover data in SQL Server.

This post is a high-level overview of how I tackle data recovery, whether personally or for professional consulting reasons.

Click through for the steps.


High Availability Options for DBAs

Pamela Mooney has a list:

In previous articles in this series, I have stated that the job of the DBA is to make the right data available to the right people as quickly as possible.

Here is where we delve more into the word “available” and take it up a notch. SQL Server offers several options for high availability, and understanding the advantages and caveats of each one will give you the best chance of ensuring the availability of data in any scenario. Let’s discuss the options for high availability in general terms and find out where to go to get more information as you need it.

Due to the breadth of this article and in keeping with the idea of just learning the basics, I am not going to cover Azure here except to say that Azure either has compatibility with these features in most of its offerings or uses them in background processes.

Read on for the list.


Mixed MultiSubnetFailover Support on AGs

Andy Mallon continues a line of thought:

In yesterday’s post, I showed how to configure an availability group (AG) to use the RegisterAllProvidersIP=0 when you can’t get clients to connect using the MultiSubnetFailover=true connection string attribute.

I mentioned that you have to make some trade-offs when you set RegisterAllProvidersIP=0, and included this comparison:

But… what if you can eat your cake and have it, too?

In some cases, you’ll have some applications & clients that are not able to use MultiSubnetFailover=true, and other clients that can. Perhaps you’re working on updating a bunch of legacy Java apps to move from old jTDS drivers to the current Microsoft JDBC drivers that properly support MultiSubnetFailover=true. Parts of your codebase have been updated, and you want them to make use of the connection string attribute for fast cross-subnet failover. But other parts of your codebase are still being updated and rely on the RegisterAllProvidersIP cluster parameter to be false. Wouldn’t it be nice to have both?

Read on to learn how.
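
As a rough illustration of the two kinds of client the excerpt describes, the sketch below builds an ODBC-style connection string with or without the multi-subnet attribute. It is not taken from Andy's post: the function name, the listener name `ag-listener.example.com`, the database name, and the driver version are placeholders, and it assumes the Microsoft ODBC driver, whose keyword takes `Yes` where the .NET and JDBC form quoted above takes `true`.

```python
def build_connection_string(listener, database, supports_multisubnet):
    """Build an ODBC-style connection string for an AG listener (illustrative only)."""
    parts = {
        "Driver": "{ODBC Driver 17 for SQL Server}",  # assumed driver version
        "Server": f"tcp:{listener},1433",
        "Database": database,
        "Trusted_Connection": "Yes",
    }
    if supports_multisubnet:
        # Updated clients opt in to fast cross-subnet failover.
        parts["MultiSubnetFailover"] = "Yes"
    # Legacy clients simply omit the attribute and depend on how the
    # listener name is registered in DNS.
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


if __name__ == "__main__":
    # Hypothetical listener and database names.
    print(build_connection_string("ag-listener.example.com", "Sales", True))
    print(build_connection_string("ag-listener.example.com", "Sales", False))
```

The only difference between the two strings is the one attribute, which is why a mixed codebase can migrate one application at a time, provided the cluster side can serve both kinds of client.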


So You Want to Fail Over a SQL Managed Instance

Danimir Ljepava takes us through user-initiated failover of SQL Managed Instances:

In August 2020, we released a new feature, user-initiated manual failover, which lets you manually trigger a failover on SQL Managed Instance using PowerShell or CLI commands, or by invoking an API call.

Manually initiated failover on a managed instance will be an equivalent of the automated failover for high availability and software patches initiated automatically by the service. Manually invoking a failover on MI will help test end-to-end applications for fault resiliency on automatic failovers in case of planned or unplanned events before deploying to production. In addition to testing how failover impacts existing database sessions, it can also help verify if it changes the end-to-end performance due to changes in the network latency. In some cases if performance issues are encountered on SQL MI, manually invoking a failover to a new node can help mitigate the performance issue.

Read on to see how you can perform failover and how you can confirm that it worked.


Azure SQL Database Business Continuity Options

James Serra covers business continuity scenarios with Azure SQL Database:

I have written a number of blogs on the topic of business continuity in SQL Database before (HA/DR for Azure SQL Database, Azure SQL Database high availability, Azure SQL Database disaster recovery), but with a number of new features I felt it was time for a new blog on the subject, focusing on disaster recovery and not high availability.

Business continuity in Azure SQL Database and SQL Managed Instance refers to the mechanisms, policies, and procedures that enable your business to continue operating in the face of disruption, particularly to its computing infrastructure. In most cases, SQL Database and SQL Managed Instance will handle the disruptive events that might happen in the cloud environment and keep your applications and business processes running.

James takes us through options available for Azure SQL Database as well as managed instances.


An Overview of HADR Concepts with SQL Server

Kevin Hill walks through different topics around high availability and disaster recovery:

Replication—a very Special Snowflake:


Not everything in a SQL Server user database CAN be replicated, such as users, or tables with no Primary Key. New objects are not automatically sent from Publisher to Subscriber. System databases are not replicated.

There’s plenty of good information in here, so check it out.


Disaster Recovery for Your Workstation

Randolph West explains that disaster recovery isn’t just for your servers:

I just completed a chapter for another book where I spoke about the Recovery Point Objective (how much data you are prepared to lose) and Recovery Time Objective (how long you have to bring your environment up again) after a disaster, and while I never get tired of repeating myself, that’s SQL Server. What happens if your development environment — or workstation — experiences a catastrophic failure?

Or what if, say, you’re on a cruise ship in the middle of the ocean with Internet access and a phone (but no laptop) and your on-call person just died? (I’ll leave this as an exercise for the reader to decide if this really happened.)

The answer is, if we do a careful bit of planning using the same disaster recovery principles we already know, the impact could be minimal. Note that this post assumes that you have Internet access and are using Microsoft Windows as your environment.

Click through for some useful suggestions.
