Press "Enter" to skip to content

Category: Availability Groups

Automate Availability Group Failover for SSISDB 2012 and 2014

Alex Stuart shows how to fail over SSISDB in SQL Server 2012 or 2014:

Hopefully not many people are still configuring SSIS instances on SQL 2012 or 2014 – especially HA instances – but if you are, this post is for you.

If you’re running SQL Server 2016 or above, having the SSIS catalog function correctly in an AG is supported by built-in functionality to manage the DMK (database master key). In 2012/2014 however there is no such support. Without intervention this makes SSISDB unable to be opened after a failover, because the DMK isn’t open – leading to errors such as “Please create a master key in the database or open the master key in the session before performing this operation.

Read on to see how to resolve this error, and then how to do this automatically.

Comments closed

Availability Groups and Logins

Andrea Allred runs into a post-failover issue:

While doing a planned Availability Group failover, the application stopped talking to the database. After checking the SQL Server log, we found that all the SQL Logins were failing with an “incorrect password” error. The logins were on the server, the users were in the databases, and the passwords were even right, so what was wrong? It all comes down to SID’s (Security Identifiers).

Read on for the cause and the solution. I’d also recommend Sync-DbaAvailabilityGroup as a good dbatools cmdlet to use.

Comments closed

Understanding Long Failover Times for Availability Groups

Sean Gallardy has answers to your Availability Group questions, as long as you ask the specific question in this post:

One of the most common issues I look at from day to day is some variation of the question. “Why did it take a long time for my AG/Database to failover?”. There are many different meanings for this innocuously simple looking statement, for example was it that the failover time was long or was it a long time bringing the database online, or was it that it took a long time because a failover wasn’t possible, and what *exactly* is a long time? Are we talking a long time means 10 seconds, 1 minute, 5 minutes, 30 minutes? To each different business and their needs, “long” dramatically fluctuates. I’d like to go through at a high level, some of the most common reasons that I troubleshoot and if they might apply to your environment. FYI, if you tell me 1 second is a long time then I’m going to point you toward different architectures with multiple layers of caches and front-end servers/services which isn’t going to be cheap, but that’s what you want so you’re _willing_ to pay for it, right? Yeah, I thought not.

Click through for several factors which may affect how long it takes for a failover to occur.

Comments closed

Windows Server Failover Clustering Error Code 5054

Josh Darnell walks us through an error when setting up an Availability Group:

For setting up the environment, I was following this really in-depth guide from former Data Platform MVP and current Microsoft employee Ryan J. Adams: Build a SQL Cluster Lab Part 1

The guide is generally fantastic, and provides a lot of good insight into the non-SQL Server related aspects of setting up an Availability Group. I’d highly recommend checking it out if you’re interested in that sort of thing.

Relevant to this post, he has provided a diagram of how the different networks are configured:

If you’re very experienced with networking, you may already have some idea of what the problem is going to be. Don’t spoil it for everyone else okay?

I’ll admit I did not have an idea of what the problem was.

Comments closed

What to Do if a Database Isn’t Synchronizing

Lee Markup has some advice:

I currently manage 5 AlwaysOn Availability Groups. Two are on SQL Server 2014 and overdue for an upgrade, while three of them are on SQL Server 2017.  From time to time I have run into a couple of different situations that I needed to troubleshoot and I want to tell you where to look and what to check on in these scenarios. I can’t possibly tell you about everything that could go wrong, but I can tell you about my experience with AlwaysOn Availability Groups and let you decide if that experience helps you or not.

Click through for more.

Comments closed

Restoring a Database Formerly in an Availability Group

Jack Vamvas has a process for us:

Steps to restore a database from a backup device that was part of an Always On Availability Group, and now needs to be restored 

Recovery Scenario : Requesting an older database copy previously backed up 

Name of Always on Availability Group = MyAG1

Name of Always On Availability Group db = MyAGDB1

Note: this is a workflow – and there may be some slight variations depending your Availability Group set up 

Read on for rest of the workflow.

Comments closed

A Warning on Using Distributed Network Names

Allan Hirt has a warning for us:

DNNs are supported as of SQL Server 2019 CU2 and require Windows Server 2016 or later. I wrote more about them in my blog post Configure a WSFC in Azure with Windows Server 2019 for AGs and FCIs. Go there if you want to see what they look like and learn more.

Right now, I cannot wholeheartedly recommend the use of DNNs for listeners or FCIs if you are using Enterprise Edition. Why?

Read on to learn why.

Comments closed

Availability Groups and the Shakes

Niko Neugebauer coins a term:

Disclaimer: I am using the word shake by my own initiative and no Microsoft Documentation ever to my knowledge ever mentioned that situation. Those shakes are represented most of the time as health events to the cluster, such as the Lease Timeout resulting in a sudden attempt of Failover.
Why did I choose that word ? I don’t know. Honestly. 🙂

Read on to see it in context around hosts, CPU, and especially I/O.

Comments closed

Preparing an Availability Group for VM-Level Replication

David Klee takes us through an interesting scenario:

If you have a SQL Server Availability Group (AG) and the VMs are being replicated to a disaster recovery site (cloud or on-prem), chances are the networking topology is not the same at the second site. These replication technologies can include VM replication, SAN LUN replication, or replicating server-level backups to the second site. It is quite complex to have the same network subnet existing at both sites, so usually, the secondary site contains a different networking subnet structure. It means that the servers being brought up at the secondary site are going to receive different IP addresses.

The Availability Group architecture, especially with its dependency on the Windows Server Failover Cluster (WSFC) layer, are quite intolerant of having these IP addresses changed. The utilities performing the failover might not even be aware of the WSFC-specific components that need to be adjusted.

Click through to see what you can do.

Comments closed