The secondary nodes may be configured using asynchronous or synchronous commit. With asynchronous commit, transaction considered to be committed and all locks were released when COMMIT log record is hardened on the primary node. SQL Server sends COMMIT record to secondary node; however, it does not wait for the confirmation that the record had been hardened in the log there.
This behavior changes when you use synchronous commit as shown in Figure 1. In this mode, SQL Server does not consider transaction to be committed until it receives the confirmation that COMMIT log record is hardened in the log on the secondary node. The transaction on primary will remain active with all locks held in place until this confirmation is received. The session on primary is suspended with HADR_SYNC_COMMIT wait type.
Click through for the full story.
What about your SQL Agent jobs? Have you got jobs that perform actions on your data? If you have, do those jobs exist on the new primary? If they don’t then I’m happy to bet that whatever function that they were playing probably isn’t happening anymore.
One thing that I always want to make sure before I failover is, do I have all the relevant jobs ready to roll on the secondary server?
But what about the jobs that you’ve got on the old primary? There’s a fair chance that, if they’re doing any sort of data manipulation, they’re going to be failing.
Click through for additional considerations.
The -Kreadonly switch is your key to success here but remember to also specify the database using -d. When not set (and with an initial catalog of master for my login), I found I always got the primary instance back during my check. This simple omission cost me hours of troubleshooting work, because I was convinced my listener wasn’t working correctly. In fact, I just wasn’t testing it correctly.
There’s some good information in here for sqlcmd and for SQL Server Management Studio.
This has always worked super, and has been a go-to query for me for years. But when looking at a SQL Server 2017 distributor with the distribution database in an Availability Group, that wasn’t working. All the publications had a
publisher_idof 1, but in
server_id1 was some random linked server, and definitely not the publisher. But replication was working great. Maybe replication was set up on the other AG replica, and
server_id1 came from there.
Nope. On the other replica, it was the same story.
Server_id1 was a random linked server, and nothing to do with replication at all, let alone the publisher. But replication was working perfectly. A teammate fooling around with it in dev confirmed that if he updated the
publisher_idto match the
server_idwe thought it should join to, replication stopped working. So, that
publisher_idof 1 was correct. Or special. But also definitely different than what I’ve seen in prior versions of SQL Server.
Read on to see what Andy learned.
Jonathan was working with a client recently who experienced a CLR assembly failure after an AG failover and needed to figure out why. They’d been testing their AG disaster recovery strategy and ran into an unexpected problem with their application which relies heavily on SQLCLR and an UNSAFE assembly that calls a web service from inside SQL Server. When they failed over their AG to their DR server, the CLR assembly failed with the following error:
An error occurred in the Microsoft .NET Framework while trying to load assembly id 65546. The server may be running out of resources, or the assembly may not be trusted with PERMISSION_SET = EXTERNAL_ACCESS or UNSAFE. Run the query again, or check documentation to see how to solve the assembly trust issues. For more information about this error: System.IO.FileLoadException: Could not load file or assembly ‘sqlclr_assemblyname, Version=126.96.36.199, Culture=neutral, PublicKeyToken=fa39443c11b12591’ or one of its dependencies. Exception from HRESULT: 0x80FC80F1
Read on to see the root cause and what you can do to correct it.
There is a constant rumble among Oracle DBAs- either all-in for Oracle Real Application Cluster, (RAC) or a desire to use it for the tool it was technically intended for. Oracle RAC can be very enticing- complex and feature rich, its the standard for engineered systems, such as Oracle Exadata and even the Oracle Data Appliance, (ODA). Newer implementation features, such as Oracle RAC One-Node offered even greater flexibility in the design of Oracle environments, but we need to also discuss what it isn’t- Oracle RAC is not a Disaster Recovery solution.
Click through for a good high-level contrast, as these are quite different products.
Recently, I had to use Azure Data Studio to access a application intent read only secondary replica. I had to use Azure Data Studio because I was using a Mac. I usually use SSMS on my Windows machines. If you want to connect with the “applicationintent=readonly” property via SQL Server Management Studio, you do so by typing it out in the “Additional Connection Parameters” as shown in the screenshot below:
Since I am fairly new to Azure Data Studio I was fumbling my way around to find the equivalent setting. And I finally found it…
Read on to see how you can set this in Azure Data Studio.
What does this mean? To have a supported WSFC-based configuration (doesn’t matter what you are running on it – could be something non-SQL Server), you need to pass validation. xFailOverCluster does not allow this to be run. You can create the WSFC, you just can’t validate it. The point from a support view is that the WSFC has to be vetted before you create it. Could you run it after? Sure, but you still have no proof you had a valid configuration to start with which is what matters. This is a crucial step for all AGs and FCIs, especially since AGs do not check this whereas the installation process for FCIs does.
If you look at MSFT_xCluster, you’ll see what I am saying is true. It builds the WSFC without a whiff of Test-Cluster. To be fair, this can be done in non-Azure environments, too, but Microsoft givs you warnings not to do that for good reason. I understand why Microsoft did it this way. There is currently no tool, parser, or cmdlet to examine the output of Test-Cluster results. This goes back to why building WSFCs is *very* hard to automate.
I’m not sure how easy some of these fixes would be, but they’d definitely be nice.
The primary replica is constantly monitoring the state of it’s secondaries. With the use of a continuous ping, the primary node always knows if the secondaries are up or down.
It’s when SQL detects that one of it’s synchronous replicas goes offline is when interesting things can happen.
So here’s the discussion that came up, if a synchronous replica goes offline for whatever reason, SQL won’t be able to commit any transactions and that means we can be confident that the secondary is up to date, right?
Read on to learn the answer. Which is “no.” But David explains why, so you should read that instead of just having me say it.
Last week there was a question on #sqlhelp on Twitter about the status of Query Store for a database in an Availability Group. I’ve written about Query Store and Availability Groups before so if you’re not familiar with QS behavior in an AG, check out that post first. But this question was, I think, specific to the values that shows on a read-only replica and how there were different query store settings between a primary and secondary. Let’s set it up and take a look.
Click through to learn why this may be and why you shouldn’t panic.