Sometimes you know that a problem occurred, but the tools aren't giving you the right information. If you've ever looked at Failover Cluster Manager for a Windows cluster, you know that can happen: the user interface won't show you any errors, but you KNOW there was an issue. This is when knowing how to use other tools to extract information from the cluster log becomes useful.
You can use either PowerShell or a command at the command prompt. I tend to lean towards PowerShell: I find it easier to use, and it gives me the most flexibility.
Click through for an example, including a method which filters out everything but error messages.
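As a rough sketch of what that kind of filtering might look like (the destination path and the "ERR" level marker are assumptions on my part; the linked post has the author's actual approach), assuming the FailoverClusters module is available:

```powershell
# Generate the cluster log on every node; this writes a
# <NodeName>_cluster.log file per node to the destination folder
# (the path here is just an example).
Get-ClusterLog -Destination C:\Temp -TimeSpan 60   # last 60 minutes only

# Filter the generated logs down to error-level entries.
Get-ChildItem C:\Temp\*_cluster.log |
    Select-String -Pattern '\bERR\b' |
    Select-Object -First 20
```

The `-TimeSpan` parameter keeps the log small by limiting it to recent minutes, which makes the error hunt much faster on a busy cluster.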
I’m very excited to hear that Windows Server 2019 will include a few new features regarding the File Share Witness for the Failover Cluster quorum. The feature that many of my customers have been asking about for years is finally arriving…File Share Witness on a USB stick!
Okay, they didn’t really ask for that specifically, but many of my customers wanted to deploy a simple 2-node cluster in each store location, branch office, etc. They didn’t want the added expense of a SAN just to have a Disk Witness, and they weren’t too keen on, or just didn’t have the connectivity for, a Cloud Witness in Azure. Many of these customers simply decided to forgo clustering, or they used an alternative clustering solution like the SIOS Protection Suite.
Now they have a viable alternative coming in Windows Server 2019. By leveraging a supported router, a USB disk inserted into the router can be configured with a file share that can be used as the witness. This eliminates the need for a 3rd server or internet connectivity.
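Configuring a file share witness is the same regardless of where the share lives; a minimal sketch (the UNC path is a placeholder for the share on the router's USB disk):

```powershell
# Point the cluster quorum at a file share witness. The share path
# is an example; in this scenario it would live on the router's
# USB disk rather than a third server.
Set-ClusterQuorum -FileShareWitness '\\router\witness'
```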
I can’t see this being extremely useful in most scenarios, though that could be a lack of imagination on my part.
Well, that is easy to fix, right? Let’s just spin up a VM in Azure, and host the FSW on that machine. Problem solved! Technically yes, that is a viable option. But, let’s consider the cost of that scenario in the breakdown below:
- VM with OS licensed and Disk space allocated for FSW
- NSG/Firewall to protect said resource from outside
Also, you have to figure in the man-hours to configure all of those things (let’s say 4 hours total; insert your hourly rate here: rate × 4 = setup fee for the VM in Azure).
Now, here is where Cloud Witness saves the day! The Cloud Witness WSFC Quorum type will utilize BLOB Storage in Azure to act as the point of arbitration. Not sure what that means?
There’s a good walkthrough in the post, and it does look quite easy to do; a simple blob is going to be a lot cheaper than a VM.
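For reference, the configuration side is a one-liner; a sketch with placeholder values (the storage account name and key below are stand-ins for your own):

```powershell
# Configure a Cloud Witness backed by an Azure storage account.
# The account name and access key are placeholders; the cluster
# stores its arbitration blob in that account.
Set-ClusterQuorum -CloudWitness -AccountName 'mystorageacct' -AccessKey '<storage-account-key>'
```

No VM, no OS license, no NSG to babysit: just a storage account.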
Joey D’Antoni has some notes on clustering a SQL Server instance running on Linux. First, some quick notes:
One other thing that wasn’t in BOL, and that I had to troubleshoot my way through, is that just like a cluster on Windows, you have a cluster identifier and floating IP address. I had to add that to /etc/hosts on each of my nodes to get it to resolve. The article mentions turning off fencing for non-prod environments; I had to do that in order to get failover working correctly in my environment.
Then some more notes:
It was faster than building a Windows cluster
It took me a while, I laughed, I cried, I cursed a lot, but if I look at the time it took for me to actually build the cluster and install SQL Server, it was a much faster process. Much of this comes down to the efficiency of the SQL Server installation process on Linux, which is as simple as running yum install mssql-server (mostly). Which leads me to my next point…
As Joey notes, SQL Server clustering on Linux is in its infancy. It’s nice that it works right now, but expect improvements over the next version or two.
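As a hedged sketch of the pieces Joey mentions (the hostname and IP address are placeholders, and the Pacemaker command assumes a pcs-based distribution like RHEL/CentOS):

```shell
# Install SQL Server on Linux (RHEL/CentOS family, per the post).
sudo yum install -y mssql-server

# Make the floating cluster IP resolvable on every node, since it
# may not be in DNS (address and name are placeholders).
echo '192.168.1.50  sqlcluster' | sudo tee -a /etc/hosts

# Non-production only: disable fencing (STONITH) in Pacemaker,
# as the article suggests, to get failover working.
sudo pcs property set stonith-enabled=false
```

Leaving fencing off in production is a bad idea; it exists precisely to stop a misbehaving node from corrupting shared state.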
CSVs still use the concept of a shared disk, but with an added layer of abstraction. Instead of the shared disk being accessible to only one WSFC node at a time, all of the WSFC nodes have their own logical paths to it. This is made possible through the Server Message Block (SMB) protocol. Since the WSFC nodes are connected to each other through the heartbeat network (another reason to have a dedicated network for inter-node communication), CSV can take advantage of this route to the shared storage to send I/O commands.
Don’t be confused. While all of the WSFC nodes have both physical and logical paths to the shared disk, only the node that owns the SQL Server clustered resource owns the disk, just like when using traditional shared disks. In other words, the owner node dictates how to send the I/O commands: either directly through its own access path or through the heartbeat network. This is why you won’t see any direct dependency on the shared disk if you look at the Dependency Report. It also means that the failover process will be faster with CSVs.
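A minimal sketch of turning an ordinary clustered disk into a CSV (the disk name is a placeholder for whatever Failover Cluster Manager shows):

```powershell
# Convert an existing clustered disk into a Cluster Shared Volume;
# it then appears under C:\ClusterStorage on every node.
Add-ClusterSharedVolume -Name 'Cluster Disk 1'
```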
If you’re using SQL Server 2014 or later and have built a cluster, check this feature out and see if it fits in your environment.
You know what’s scary as hell? When one node of an important cluster loses its trust relationship with the domain and you see the error “the trust relationship between this workstation and the primary domain failed”. That happened to me late last year with one of my SQL Server 2008 R2 nodes. The scary part was that I just didn’t know what to expect. The fix could be simple, or it could require a node rebuild.
Trust no one.
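In the simple case, the broken secure channel can sometimes be repaired in place rather than rebuilding the node; a hedged sketch of the first things to try:

```powershell
# Check whether this machine's secure channel to the domain is healthy.
Test-ComputerSecureChannel -Verbose

# Attempt an in-place repair of the secure channel; prompts for
# credentials of an account allowed to reset the computer password.
Test-ComputerSecureChannel -Repair -Credential (Get-Credential)
```

Whether that works on a cluster node without further cleanup depends on the situation, which is exactly why the error is scary.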
You will also notice this is the end of the road for this installation: without a pass on this rule, the ‘Next’ button stays grayed out.
Sure, you could go back and just create a new instance in the process to install SSRS on this node, but there has to be another way, right? If only we could skip this rule in the installation…
We can, using the command prompt!
Knowing how to install SQL Server from the command line (or PowerShell) is important; this is just one reason why.
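As a hedged sketch (the rule name below is the one documented for installing standalone features on a clustered node, but verify it against the rule that actually failed in your setup log; the linked post shows the author's exact command):

```shell
:: Run SQL Server setup from the command prompt, skipping the
:: blocking setup rule by name.
setup.exe /ACTION=Install /SKIPRULES=StandaloneInstall_HasClusteredOrPreparedInstanceCheck
```

`/SKIPRULES` takes a space-separated list, so multiple rules can be bypassed at once if needed.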
Next, it’s a simple matter of updating the registry value. I launched Registry Editor (RegEdit.exe) and navigated to HKLM > Software > Microsoft > Microsoft SQL Server > MSSQL12.MSSQLSERVER > Cluster.
Note: MSSQL12.MSSQLSERVER will vary based on SQL version (SQL 2012 is “MSSQL11”) and instance name (“MSSQLSERVER” is the default instance and named instances are the actual instance name instead of “MSSQLSERVER”).
Within the Cluster key, I updated the value of ClusterName to match what was specified in Failover Cluster Manager.
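The same change can be scripted instead of done in RegEdit; a sketch assuming the default-instance path from above (the cluster name is a placeholder for your own):

```shell
:: Update the ClusterName value from the command prompt; the data
:: value is a placeholder for the name shown in Failover Cluster Manager.
reg add "HKLM\Software\Microsoft\Microsoft SQL Server\MSSQL12.MSSQLSERVER\Cluster" ^
    /v ClusterName /t REG_SZ /d "MYCLUSTER" /f
```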
This is a good diagnosis and resolution of an issue.