Availability Group Agent Alerts

Tracy Boggiano builds a set of SQL Server alerts related to Availability Group happenings and issues:

For Availability Groups we have a few extra error numbers we care about. Error number 1480 tells when a server changes roles, so we can know when a server flips from a secondary to a primary, or from a primary to a secondary. Error number 35264 tells when data movement has suspended on any database. This can occur for many reasons. One I have seen is when you have expanded your mount point on your primary and the data or log file runs out of space on the secondary the data or log file can not expand on the secondary because you forgot to expand the secondary. Error number 35265 tells you when the data movement has resumed on any database.  Error number 41404 let’s you know if your AG is offline which can be bad if you expected an automatic failover.  Error number 41405 let’s you know if an Availability Group can’t automatically failover for any reason.  In the later to cases you will want to look at your SQL Error logs and AlwaysOn Extended Events Health session.

Click through for the alert scripts.

Related Posts

Adding Instance Name To The AlwaysON_health Session

Jonathan Kehayias shows how to add server_instance_name to the AlwaysON_health event session to make Availability Group troubleshooting easier: The AlwaysOn_health event session in Extended Events is intended to make analyzing problems with Availability Groups possible after they have occurred.  While this event session goes a long way towards making it possible to piece together the […]

Read More

Trial And Error With Read-Only Replica Queries

Cody Konior stress tests Availability Group round-robin routing: I’ve been hearing about round-robin read-only routing ever since SQL 2016 came out but whenever I tried to test if it’s working it never seemed to be. But now I know exactly how it works and there’s a few loopholes where it may not trigger, and they’re not the […]

Read More