I logged onto that node and the AG Dashboard looked okay at first glance. But the test was still failing when I re-ran it manually, so I looked deeper.
I logged onto the second node and noticed the AG was completely gone. All the databases were in recovery, but there was no sign of the AG at all. Nothing. Nada. Zip. (I don’t have any other words.) It’s like it was never there.
At first I thought someone must have done something awful. I quickly poured a coffee while checking the default trace, which usually records system-level configuration changes like dropping an entire replica, but in this case nothing relevant showed up.
Read on for the answer, as well as action items to take if you’re actively using Availability Groups.