Changing AG Endpoint Ports

Matt Gordon walks through changing endpoint ports:

Since a communication failure within an AG is usually a “hair on fire” kind of event, a quick fix may be desired. The quickest fix I’ve found is to change the port on which the AG endpoint is listening. While the Microsoft help pages on this contain enough information to put together the right script eventually, the first time this happened to me I really would have liked a blog post specific to this issue that gave me the right script to use.

In keeping with this T-SQL Tuesday’s theme, below is my SQL Server present. Please note that I used 5023 as an example but your choice can be flexible depending on what else is consuming ports on your machine. Hopefully this helps somebody (or me if I manage to travel back in time and encounter this same issue)

In an ideal world, you’d want a bit more control over what’s running on various ports to prevent this scenario, but that’s probably wishful thinking.

Buffer Pools And Availability Groups

Joey D’antoni answers the question, what happens to pages in the buffer pool when you have an AG failover?

Recently at SQL Saturday Philadelphia, we started discussing failover  as it relates to mirroring and Always On Availability Groups. Specifically, we were wondering what would happen if you had a relatively busy readable secondary replica (which would have a lot of pages in the buffer pool on the secondary instance) and if those pages would be flushed from cache or anything like that. So I reached out to the product group and Kevin Farlee from Microsoft was extremely helpful

The answer is not what I was expecting.

Auto-Seeding Availability Groups

John Sterrett is the latest smart person to take a look at automatic seeding of Availability Groups:

The 600 GB databases took about 66 minutes to seed across the network from a primary replica to the secondary replica.  I noticed 1.4 Gbps of consistent throughput during the seeding process. This makes a lot of sense as it caps out around what the storage system can deliver in this environment.

The first thing I would look at for benchmarking throughput for network activity would be the bytes sent per second from the primary replica and bytes received per second on the secondary replicas.

John also includes an extended event session statement to track seeding.  Great read.

More On Direct Seeding AGs

Erik Darling discusses direct seeding of Availability Groups:

This isn’t in the GUI (yet?), so don’t open it up and expect magic mushrooms and smiley-face pills to pour out at you on a rainbow. If you want to use Direct Seeding, you’ll have to script things. But it’s pretty easy! If I can do it, anyone can.

I’m not going to go through setting up a Domain Controller or Clustering or installing SQL here. I assume you’re already lonely enough to know how to do all that.

The script itself is simple, though. I’m going to create my Availability Group for my three lovingly named test databases, and add a listener. The important part to notice is SEEDING_MODE = AUTOMATIC. This will create an Availability Group called SQLAG01, with one synchronous, and one asynchronous Replica.

Mike Fal wrote about this as well.

Direct Seeding Availability Group Replicas

Mike Fal is excited about direct seeding of Availability Groups in SQL Server 2016:

In SQL Server 2012 and 2014, creating an Availability Group could take a significant amount of work. One of the more tedious tasks is setting up your replica databases. This is because that you need to restore your database to your replica node in a state close enough to the primary to allow synchronization to happen. It can take several log backup restores to hit that magic window where you can join the database on the secondary node. Then, you get to do it again on the next replica!

Enter direct seeding in 2016. With this feature you no longer have to do any of the restores. You simply create your replicas with direct seeding enabled, then when you add a database to the AG, SQL Server will directly seed the database to your replica nodes. It’s surprisingly simple.

This sounds pretty interesting.

Dueling Log Backup Jobs

Robert Davis ran into HADR_WORK_QUEUE waits recently:

Our 3rd party monitoring solution collects blocking information, but not for system threads. There was no additional information available for this blocking incident, but I could see that the system thread was a background process with the command “UNKNOWN TOKEN” and was sitting in a wait type of “HADR_WORK_QUEUE”. It was clearly the worker thread for the AG of a specific database.

A little later, we had blocking again involving that same thread, but this time, the AG worker thread was blocking the log backup thread. Seemed logical that if the worker thread could block the log backup, then the log backup could have also blocked the worker thread, but still it did not make sense to me.

This is one of those cases in which the answer makes perfect sense after the fact, but can be maddening until then.

Distributed Transactions With Always-On Availability Groups

Dave Bermingham looks at distributed transactions within Always-On Availability Groups in SQL Server 2016:

In SQL Server 2016, Distributed Transactions are only supported if the transaction is distributed across multiple instances of SQL Server. It is NOT supported if the transaction is distributed between different databases within the same instance of SQL Server. So in the picture above, if the databases are on separate SQL instances it will work, but not if the databases reside on the same instance which is more likely.

This seems like a half-finished job.  We’ll see if Microsoft improves on this later.

Introducing Timing Delays

Andrea Allred introduces a timing delay to solve a log growth problem:

We have a lovely Availability Group that holds A LOT of data that is broken into partitions.  We have 42 partitions and they are usually moving information around daily between them.   The index rebuilds on them were making our logs HUGE because the the Availability Group was taking too long to catch up, we tried both Synchronous and Asynchronous mode.  We would see all kinds of errors.  We were doing horrible things like auto shrinking our transaction log after the indexing finished and  ignoring alarms during the time the database was rebuilding.  We had requested more and more space from our storage team and sometimes the job wouldn’t even finish because it ran out of space.  Our first idea was to split out the index rebuilds so that we could do one partition at a time.

I’d call this a bit of a hack, but if it’s documented and it does the job better than any known alternative, it’s a good decision.

Availability Group Backup Failures

James Anderson had a recent experience in which a database in an Availability Group failed to back up properly:

Last week, I received an alert that the percentage of transaction log in use on one of our production databases was increasing more than it should have been. I’ll refer to this database as DB1 from here on in. It had reached 18%, which is above the normal 5-10% we like to see it at. DB1 is in an Always On Availability Group so if the log usage jumps we jump.

The first thing I checked was that the log backups were running. We use Ola Hallengren’s maintenance solution to manage our backups. The backup jobs were running without error.

The next thing to check was the log_reuse_wait_desc column in sys.databases.

The highlighted row is the row for DB1. A log_reuse_wait of 2 means there were no changes to backup since the last log backup. If that were the case then why was my transaction log slowly filling like an unattended bath?

Read on to learn more.  James also has a link to a (closed/won’t-fix) Connect item.

Availability Group Changes

Allan Hirt digs into SQL Server 2016 Availability Group changes:

What is a distributed availability group? Distributed availability groups allows you to create two different AGs on different Windows Server failover clusters (WSFCs) but join them together – if you will, an AG of AGs. This is great for disaster recovery scenarios where you do not want to worry (more than you should) about things like voters and quorum in a  single WSFC which could make the configuration more complex. This also allows for different versions of Windows Server (one WSFC is Windows Server 2012 R2 and another Windows Server 2016). Heck, it could even facilitate migrations to new hardware/the public cloud/virtualization assuming the same major version of SQL Server. It’s a cool feature. You can only manually fail over the AG between the WSFCs, but that’s OK, and you do need a listener for each AG. If you do not plan on using a listener, you cannot create a distributed AG.

Side note:  when I read DAG, I think directed acyclical graph.  Maybe I’m just weird that way…

Main note:  the idea of “seeding” an Availability Group sounds wonderful.

Categories

January 2019
MTWTFSS
« Dec  
 123456
78910111213
14151617181920
21222324252627
28293031