Category: Availability Groups

Don’t Use sys.dm_hadr_cluster_members for Quorum Info

Sean Gallardy explains a limitation of sys.dm_hadr_cluster_members:

I’ve now run across a few different instances where the monitoring for quorum was done via this DMV. On the surface, it seems like nothing would be wrong with using the “number_of_quorum_votes” column to check on the members of the cluster and see their voting status. However, this isn’t quite the case… you see there are various mechanisms that influence whether or not a member (or witness) has a vote and these continue to be expanded in each version of WSFC.

Click through for a short history lesson as well as some good advice on how to get this information accurately.

Availability Group Offerings in Standard Edition

Guy Glantser notes an issue with Availability Group documentation:

In SQL Server 2017 Microsoft added a new flavor called Read-Scale Availability Groups. This is different, because the goal here is not high availability or disaster recovery, but rather read-scalability. As opposed to the other flavors, in RSAG there is no cluster, and there is also no automatic failover mechanism. But you can set up multiple secondary replicas with read-only access and load balancing, and offload read workloads from the primary replica. This is a great scalability feature, and you can read more about it here.

Now, if you check Microsoft documentation regarding the editions and supported features of SQL Server, you will be happy to see that RSAG is supported in Standard Edition. I was happy to see it too. Unfortunately, if you try to set up a Read-Scale Availability Group on Standard Edition, it will not work. You will only be able to create a Basic Availability Group, as discussed earlier.

Click through for the answer, as well as what you can do in Standard Edition.

Managing Jobs in Availability Groups

Goncalo Cruz has a plan to run SQL Agent jobs on the primary node in an availability group:

In SQL Availability Groups the SQL jobs have to be created in all replicas and you need to add logic at the beginning of each relevant job to make it execute on the primary database. (this only applies when the local replica is the primary for the database)

If you do not add the logic they will execute with success in the primary replica but they will fail in the secondary replica.

Read on for a process which keeps jobs from running except on the primary.
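As a rough illustration of the "logic at the beginning of each job" idea, here is a minimal Python sketch. It is not Goncalo's process. The helper `run_if_primary` and the `check_primary` callable are hypothetical names invented for this example; the only SQL Server piece referenced is the built-in function `sys.fn_hadr_is_primary_replica` (SQL Server 2014 and later), which a real `check_primary` would be expected to query through whatever database driver you use.

```python
from typing import Callable, Optional

# T-SQL a job step could run first. sys.fn_hadr_is_primary_replica exists in
# SQL Server 2014 and later; the "?" placeholder style depends on the driver.
PRIMARY_CHECK_SQL = "SELECT sys.fn_hadr_is_primary_replica(?);"


def run_if_primary(
    database: str,
    check_primary: Callable[[str], Optional[int]],
    job: Callable[[], None],
) -> bool:
    """Run job only when check_primary reports 1 for this database.

    check_primary is a hypothetical callable that would execute
    PRIMARY_CHECK_SQL and return the scalar result. Anything other than 1
    (0, None, ...) is treated as "not primary", so the job is skipped
    instead of being left to fail on a secondary.
    """
    if check_primary(database) == 1:
        job()
        return True
    return False


# Demo with stand-in checks: one replica claims primary, the other does not.
ran = []
run_if_primary("SalesDb", lambda db: 1, lambda: ran.append("primary"))
run_if_primary("SalesDb", lambda db: 0, lambda: ran.append("secondary"))
print(ran)  # ['primary']
```

Skipping quietly on a non-primary replica means the same job definition can be deployed to every replica without generating failures on the secondaries.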

Monitoring Availability Groups

Nisarg Upadhyay gives us some of the low-down on monitoring availability groups:

In my previous articles, I have explained the step-by-step process of deploying an AlwaysOn Availability group on SQL Server 2017. In this article, I am going to explain how to monitor AlwaysOn availability groups.

First, let’s review the configuration of the availability group we had deployed previously. To do that, open SQL Server Management Studio → Expand database engine from the object explorer → Expand “AlwaysOn High Availability” → Expand “Availability Groups.” You can see the availability group named SQLAAG. Under this availability group (SQLAAG), you can see the list of availability replicas, availability databases, and availability group listeners.

Click through for some tooling built into SQL Server Management Studio, as well as relevant Perfmon counters.

Ghost Records and Availability Groups

Aaron Bertrand ran into an interesting problem:

I recently came across an issue in a SQL Server Availability Group scenario where queries against a heavily-used queue table were taking longer and longer over time. The symptoms of the query were that logical reads were increasing rapidly, and we ultimately tracked it down to ghost records (and version ghost records) that were being created as rows from the queue table were consumed and deleted. Because the database was being used in a readable secondary, the ghost cleanup process simply wasn’t able to keep up with the volume of deletes against the table. The first workaround implemented was to suspend the queue consumers and rebuild the table.

Read on to see how Aaron used filtered indexes and soft deletes to mitigate some of the pain.

PARALLEL_REDO_FLOW_CONTROL Waits on Availability Groups

Taryn Pratt goes through a short outage at Stack Overflow:

While I can’t be 100% sure of the trigger, I’m 99.9% sure, because the job was running before the outage, so the timing is right. After looking through our monitoring logs, everything pointed to the job being the cause, so yes, I’m confident it caused it.

We don’t have regular maintenance windows for any of our servers, so we run jobs throughout the week, and if possible, try to schedule them during low-usage times. In this case, the job was an index maintenance job.

Now, before you scream at me about running an index maintenance job, I’m not going to argue the pros and cons of using it or whether or not we should run it — we can do that at another time. For this post, just accept the fact that we were running a job to rebuild/reorganize indexes.

This is an interesting after-analysis of an outage. I have a lot of respect for people who can put these together and make them public—I would have a lot of trouble doing that myself.

High Availability Changes in SQL Server 2016

Martin Surasky looks at what SQL Server 2016 changed with respect to High Availability options:

AlwaysOn Availability Groups, first introduced in SQL Server 2012, is a feature that is conceptually similar to database mirroring. I’m going to assume you already know what AlwaysOn Availability Groups are in general, their main purpose and how they are different (essential aspects at least) from other technologies to provide replication such as Database Mirroring.

In SQL Server 2014, the significant enhancement to availability groups was the increase in the number of supported secondary replicas from three to eight. SQL Server 2016 includes a number of new enhancements.

The biggest thing about it, as I recall, was stability: I wouldn’t have recommended too many places go into production with Availability Groups in 2012, but by 2016, many of the biggest bugs were ironed out.

Creating a Failover Cluster Instance with Shared Storage

Ryan Adams wraps up a video series on setting up a SQL Server lab environment:

You are going to create a SQL Server Failover Cluster Instance in Part 4 of our series on how to build a SQL Cluster Lab. The FCI will only be installed on Node1 and Node2. FCIs require shared storage so you will make your domain controller an iSCSI target. Last you will create your FCI using the iSCSI drives you presented to the cluster. 

Click through for links to the entire series.

Building a SQL Cluster: Availability Groups

Ryan Adams continues a series on building a SQL Server cluster as a lab:

You are going to create a multi-subnet Availability Group in Part 3 of our series on how to build a SQL Cluster Lab. First you give the Cluster Name Object rights in Active Directory and then install your AG. You will also ensure the AG endpoints are using the dedicated network created for them in Part 1. Here are links to the other articles in the series.

Click through for the video.

High Availability Announcements from Microsoft

Allan Hirt looks at a couple announcements from Microsoft:

I’m going to discuss what I feel are the biggest game changers. I knew licensing was changing as I had conversations with Microsoft around this months ago. I was not sure what the final result was going to be, but I’m fairly pleased. Is it perfect? No, but it’s much better than it was.

You’ll definitely want to read Allan’s thoughts on Microsoft’s SQL Server licensing changes, as well as a private preview of Azure Shared Disks.
