Press "Enter" to skip to content

Category: Availability Groups

Always On Lease Timeout Monitoring

Yvonne Vanslageren gives us one more thing to check:

SQL Server Always On Availability Groups are a robust solution for achieving high availability and disaster recovery for SQL Server databases. However, simply configuring them is not enough—you also need a solid monitoring strategy to ensure data integrity and system reliability. One key aspect of this monitoring process is keeping an eye on lease timeouts, which can signal larger issues and help prevent potentially catastrophic split-brain scenarios.

In this post, we’ll walk through the various health checks available for Always On Availability Groups, discuss how lease timeouts work, and explore practical methods for monitoring and troubleshooting.

Read on to learn more about the lease timeout concept, as well as where you can get this information and further recommendations around how to deal with the information.

Leave a Comment

Setting RCSI on a Database in an Availability Group

Haripriya Naidu makes a change:

I was working on modifying isolation level of database from Read Committed to Read Committed Snapshot(RCSI) and had to get exclusive access to the database. After letting the application team know about it and having stopped their processes, I tried to set the database to SINGLE_USER but it errored out.

It turns out that you cannot set a database to single user mode if it is in an availability group or part of database mirroring. Nonetheless, there is still a way to make this change. Read on to learn more.

Leave a Comment

Availability Group Seeding and Transient Failure 108

Chad Callihan runs into an error with an availability group:

The availability group in question was unhealthy, and none of the added databases were syncing. By the time I started investigating, the SQL service on the secondary had been restarted. There were also no recent errors in Failover Cluster Manager.

I checked the SQL Server Error Log and found some clues. The SQL Server Error Log was filled with “Always On: DebugTraceVarArgs” errors for each database that included the message:

“Seeding encountered a transient failure ‘108’, retrying…”

Read on to see how Chad fixed this.

Comments closed

Using Kubernetes with Distributed Availability Groups

Andrew Pruski has a guide for us:

A while back I wrote about how to use a Cross Platform (or Clusterless) Availability Group to seed a database from a Windows SQL instance into a pod in Kubernetes.

I was talking with a colleague last week and they asked, “What if the existing Windows instance is already in an Availability Group?”

This is a fair question, as it’s fairly rare (in my experience) to run a standalone SQL instance in production…most instances are in some form of HA setup, be it a Failover Cluster Instance or an Availability Group.

Read on for the tutorial. There are quite a few steps involved.

Comments closed

Contained Availability Groups and Database Mail

Ben Miller points out a gap in functionality:

Call to action for Microsoft. Contained Availability Groups came out in SQL 2022 and they definitely have their use. But there were some artifacts left behind that need some fixing. Namely when you use DBMail while in the Availability Group jobs or operations. Let’s see what there is left.

First, here is the link to the Feedback Item that is out there for voting to get Microsoft to fix this issue. There has already been an issue fixed with the msdb proc to activate dbmail in a Contained AG ([dbo].[sp_sysmail_activate]).

Read on to learn more about the issue.

Comments closed

Resuming Data Movement for an Availability Group

Chad Callihan gets things moving after a few 1s without enough 0s clog up the pipe:

Keeping an Always On Availability Group healthy is crucial, and seeing a non-synchronizing database in an Always On High Availability Group can give you a sinking feeling (pardon the pun). Disregarding the reason for the syncing issue, there are a few ways to resume syncing and get your setup back in the green.

Let’s look at resuming using the SSMS GUI and running a SQL statement.

Read on for the process. I appreciate that Chad also includes the T-SQL operation to do this.

Comments closed

SQL Server AGs and Kubernetes

Andrew Pruski shakes his head:

Say we have a database that we want to migrate a copy of into Kubernetes for test/dev purposes, and we don’t want to backup/restore.

How can it be done?

Well, with cross platform availability groups! We can deploy a pod to our Kubernetes cluster, create the availability group, and then auto-seed our database!

The caveat is, this probably isn’t a good idea. But then again, when has that ever stopped anyone?

Comments closed

Filestream and Availability Groups

Chad Callihan kicks a hornets’ nest:

Migrating databases to new servers and configuring them for availability groups can be a complex process. It can be more complex depending on what type of sneaky features are in use by the databases being migrated. I encountered an interesting issue migrating a database that had been setup to use FILESTREAM and wanted to show the steps I took to identify and resolve it.

Read on for the issue and what happened. It was a small thing overall, but the kind of small thing that can eat up the better part of a day if you’re not aware of it.

Comments closed

Dealing with AG Secondaries Falling Behind

David Fowler troubleshoots a wait type:

Are you struggling with a laggy redo and a build up in the redo queue on your readonly secondaries? Are you suffering with high PARALLEL_REDO_TRAN_TURN waits? Then this magic remedy could cure your ailments.

Read on to learn more about the parallel redo process and one thing you can do if it’s falling too far behind. I wouldn’t permanently switch back to serial redo, but maybe make that switch if there’s some consistent pattern to when the process falls behind.

Comments closed

Limiting Jobs to the Primary Replica of an AG

Chad Callihan doesn’t want jobs running willy-nilly:

Transitioning from a failover cluster configuration to an Availability Group configuration brings with it all kinds of “fun” challenges. One such challenge that you may not have considered is the handling of jobs on whatever server is Primary, along with secondary servers. Let’s briefly discuss a potential challenge and an option to address it.

Click through for the example and a solution. Eitan Blumin has another solution in the comments, so check that one as well.

Comments closed