Whilst I was combing through various combinations of sys.dm_hadr and sys.Availability I stumbled across a couple of gems that I thought would be nice to share.
Read on for usage examples.
Yesterday I blogged about log shipping performance issues and mentioned a performance problem that can be caused by using availability group readable secondaries, and then realized I hadn’t blogged about the problem, only described it in our Insider newsletter. So here’s a post about it!
Availability groups (AGs) are pretty cool, and one of the most useful features of them is the ability to read directly from one of the secondary replicas. Before, with database mirroring, the only way to access the mirror database was through the creation of a database snapshot, which only gave a single, static view of the data. Readable secondaries are constantly updated from the primary so are far more versatile as a reporting or non-production querying platform.
But I bet you didn’t know that using this feature can cause performance problems on your primary replica?
Definitely read the whole thing.
Microsoft says that turning on TDE (Transparent Data Encryption) for a database will result in a 2-4% performance penalty, which is actually not too bad given the benefits of having your data more secure. There is even more of a performance hit when enabling cell level or column level encryption. When encrypting any of your databases, keep in mind that the tempdb database will also be encrypted. This could have a performance impact on your other non-encrypted databases on the same instance.
In a previous post I demonstrated how to add an encrypted database to an AlwaysOn group in SQL2016. In this article I will demonstrate the performance effects of having an encrypted database in your AlwaysOn Group compared to the same database not-encrypted.
The results aren’t surprising, though the magnitude of the results might be.
Now that the extended events session is setup we can use some queries to query information about our AGs and error messages happening on our servers. First to query when the server failover and becomes your primary server query the event availability_replica_state_change. The first 10 lines just read in the files for the extended event session so you don’t have to identify the exact filenames. Then we parse the xml event for get the timestamp, previous state, current state, repliace name, and group name for the event FROM the filenames we collected before. In the WHERE clause were are looking for when the state has changed to PRIMARY_NORMAL indicating a failover to the server.
Read on for scripts showing what she extends and also how to query this data.
Don’t panic, this is still going as planned as this is totally expected and this is the reason why:
We are now in a situation where we have the Primary server running 2017 but one (for us) or possibly more than one for you running on 2016 , its not possible for the 2016 server to synchronize as its databases have not been upgraded yet, they will therefore be stuck in recovery but we are about to fix that very soon.
This is a viable upgrade option: we did it when upgrading from 2014 to 2016. There are a lot of steps, but in the end, it worked fine.
There is an awesome set of PowerShell cmdlets out there written by MVP Chrissy LeMaire. This method is my personal choice. It works great and is easy to automate. You can run it with SQLAgent or you can just use Scheduled Tasks in the OS. The scheduled tasks method is a little cleaner, but you don’t get to see it in SQL Server. Also if you are on a cluster and running Windows 2012 you can cluster the task scheduler as an added benefit.
Chrissy wrote this with the intent of making migrations easier, and she succeeded. In fact, I made it a point to thank her at MVP Summit last year because it made my life insanely easier. The advantage here is that you can automate a lot more than than just logins. In fact you can migrate and automate pretty much anything at the server level. Here is the link that I guarantee you are going to bookmark followed by a video demo where I show how to install and automate the syncing of logins using both the SQLAgent method and the Scheduled Tasks method.
DBATools would be my preference in this situation as well, but click through to see four other methods, as well as code.
The default, and the way my AG was configured, was Prefer Secondary. As the image shows, this means backups will be made on the secondary, unless the secondary is unavailable, in which case, they will be made on the primary.
There are a couple of things to note when you use this setting:
Full backups made on the secondary are Copy Only backups. This means they won’t reset the differential bitmap and your differentials will continue to increase in size until a full backup is made on the primary.
Differential backups cannot be made on the secondary.
Transaction log backups can be made on the secondary and they do clear the log, so your log file will not continue to grow.
Read on for more details.
In my lab, I decided to play around with the automatic seeding functionality that is part of Availability Groups. This was sparked by my last post about putting SSISDB into an AG. I wanted to see how it would work for a regular database. When I attempted to do so, I received the following error:
Cannot alter the availability group ‘Group1’, because it does not exist or you do not have permission. (Microsoft SQL Server, Error: 15151)
Read on for the answer; it turns out automatic seeding itself was not the culprit.
For Availability Groups we have a few extra error numbers we care about. Error number 1480 tells when a server changes roles, so we can know when a server flips from a secondary to a primary, or from a primary to a secondary. Error number 35264 tells when data movement has suspended on any database. This can occur for many reasons. One I have seen is when you have expanded your mount point on your primary and the data or log file runs out of space on the secondary the data or log file can not expand on the secondary because you forgot to expand the secondary. Error number 35265 tells you when the data movement has resumed on any database. Error number 41404 let’s you know if your AG is offline which can be bad if you expected an automatic failover. Error number 41405 let’s you know if an Availability Group can’t automatically failover for any reason. In the later to cases you will want to look at your SQL Error logs and AlwaysOn Extended Events Health session.
Click through for the alert scripts.
3. Use MultiSubnetFailover=true
The Availability Group Listener is technically an optional component of an Availability Group. However, in my opinion it is necessary. By default, your listener will register all IP addresses as DNS A records and it will have multiple IP addresses when your cluster crosses subnets, most commonly when you have disaster recovery between data centers. Using the MultiSubnetFailover=true parameter in your client connection strings will attempt to connect to all IP addresses and completes the connection on the first thread to succeed. The listener ensures that only one IP address is online at a time, therefore you always connect to correct node.
This feature effectively bypasses the limitations of your DNS cache. Traditionally, you would cache the IP address for a DNS record. When you needed the client to connect to a different IP address using the same virtual network name, you would have to wait for the time to live setting to expire. This would delay your recovery time. With the MultiSubnetFailover setting, you can still cache your IP addresses but without the delay that they could induce.
There’s some good reading here.