Press "Enter" to skip to content

Category: Clustering

Interpreting V$ and GV$ Views in Oracle RAC

Kellyn Gorman continues a series on Oracle Real Application Clusters:

Furthering on our Oracle Real Application Clusters (RAC) knowledge, we’re going to go deeper into what we watch for a RAC database that may be different than a single instance.  RAC is built for scale and instance resilience, distributing workloads across multiple nodes.  At the same time, what gives it strength introduces monitoring complexity, especially when you’re not just watching a single instance but multiple, interconnected ones. To manage performance effectively in RAC, you need to understand the difference between V$ and GV$ views, what they show you, and how to interpret cluster-level wait events.  Along with performance, the overall health of the RAC cluster and interconnect must be known, too.

Click through for Kellyn’s explanation.

Leave a Comment

Monitoring Node Health in Oracle RAC

Kellyn Gorman continues a series:

After my last blog post on Oracle Real Application Clusters (RAC) I was asked to talk about both health and how performance impact can affect a RAC database. Its architecture enables failover, workload distribution, and offers an option to scale performance, but only when all nodes play well together. When one node drags behind or becomes unstable, RAC has no choice but to protect the rest of the cluster- so help me, Oracle Gods. This protection can come in the form of node eviction, which can be both disruptive and at times avoidable with proactive monitoring and intervention.

Click through to learn how Oracle monitors node health, the types of issues you might run into, and how to prevent node eviction.

Leave a Comment

The Basics of Oracle RAC

Kellyn Gorman gives us a primer:

Oracle Real Application Clusters (RAC) is still one of the most robust instance high-availability and scalability solutions, designed to provide resilience, performance, and continuous service for many Oracle enterprise workloads. Whether deployed on two nodes or a complex multi-node setup, RAC ensures your database infrastructure is both fault-tolerant and responsive under increasing demand.  RAC is an essential part of the Maximum Availability Architecture (MAA) recommended practices, (and in my experience) found in about 40% of small to medium Oracle environments, 98% of large enterprise environments and 100% of Exadata engineered systems. 

In this post, we’ll walk through the architectural foundation of RAC, configuration essentials, and a real-world transactional scenario that highlights the importance of its shared and synchronized architecture.

Read on to learn more.

Leave a Comment

Creating a Postgres Cluster on AWS with pg_cirrus

Salman Ahmed builds a cluster:

pg_cirrus is a simple and automated solution to deploy highly available 3-node PostgreSQL clusters with auto failover. It is built using Ansible and to perform auto failover and load balancing we are using pgpool.

We understand that setting up 3-node HA cluster using pg_cirrus on cloud environment isn’t as simple as setting it up on VMs. In this blog we will guide you in setting up a 3-node HA cluster using pg_cirrus on AWS EC2 instances.

Read on for the step-by-step instructions.

Comments closed

Getting Cluster File Share Witness Sharepath via Powershell

Tom Collins needs some information:

I normally use the Windows reg utility to get the  Cluster File Share Witness sharepath information. This is an example command line on a server returning the File Share Witness details

reg query \\myCluster\HKEY_LOCAL_MACHINE\Cluster\Resources /s /f sharepath

I want to start integrating these sorts of tasks into automated information gathering procedures and migrate this task into the Powershell library. Could you share some Powershell code to get the Cluster File Share Witness sharepath  information

Click through for a Powershell one-liner which gets this information.

Comments closed

A Note on Distributed Network Names

Allan Hirt provides an explanation around Distributed Network Names when building Windows Server Failover Clusters on Windows Server 2019:

The new Windows Server 2019 DNN functionality does have a side effect that does affect Azure-based configurations. When creating a WSFC, Windows Server 2019 detects that the VM is running in Azure and will use a DNN for the WSFC name. This is the default behavior.

I clipped this paragraph specifically because Allan uses both “affect” and “effect” correctly, and I wanted to call that out. Do read the rest of it as well.

Comments closed

Add-ClusterNode Error: Keyset Does Not Exist

Jonathan Kehayias troubleshoots a Windows Server clustering problem:

While working on a video recording for Paul this week I ran into an interesting problem with one of my Windows Server 2016 clusters. While attempting to add a new node to the cluster I ran into an exception calling Add-ClusterNode:

The server ‘SQL2K16-AG03.SQLskillsDemos.com’ could not be added to the cluster.
An error occurred while adding node ‘SQL2K16-AG03.SQLskillsDemos.com’ to cluster ‘SQL2K16-WSFC’.

Keyset does not exist

The windows account I was using was the domain administrator account and I had just recently made modifications that involved the certificate store on this specific VM, so I decided to take a backup of the VMDK and then revert to a snapshot to try again, and this time it worked.  So needless to say I was intrigued as to what I could have done that would be causing this error to happen.

Read on to see what the root cause was and how you can fix it.

Comments closed

Rolling Windows Upgrades with AGs + WSFC

Allan Hirt shows how you can combine Availability Groups with Windows Server Failover Clusters and upgrade the operating system version while keeping your SQL Servers running:

The configuration for a cluster rolling upgrade allows for mixed Windows Server versions to coexist in the same WSFC. This is NOT a deployment method. It is an upgrade method. DO NOT use this for normal use. Unfortunately, Microsoft did not put a time limit on how long you can run in this condition, so you could be stupid and do something like have a mixed Windows Server 2012 R2/2016 WSFC. Fire, ready, aim. The WSFC knows about this and you’ll see a warning with an Event ID of 1548.

Read on for a summary of what Allan has learned in doing this.

Comments closed

Pacemaker Changes Affecting SQL on Linux

Randolph West has an important message if you’re running SQL Server on Linux:

Heads up for SQL Server on Linux folks using availability groups and Pacemaker. Pacemaker 1.1.18 has been out for a while now, but it’s worth mentioning that there was a behaviour change in how it fails-over a cluster. While the new behaviour is considered “correct”, it may affect you if you’ve configured availability groups on a previous version (specifically 1.1.16).

Click through for more details and what you can do about this.

Comments closed

Get Windows Failover Cluster Errors

John Morehouse walks us through the Get-ClusterLog cmdlet in Powershell:

Sometimes you know that a problem occurred, but the tools are not giving you the right information.  If you ever look at the Cluster Failover Manager for a Windows Cluster, sometimes that can happen.  The user interface won’t show you any errors, but you KNOW there was an issue.  This is when knowing how to use other tools to extract information from the cluster log becomes useful.
You can choose to use either Powershell or a command at the command prompt.  I tend to lean towards Powershell. I find it easier to utilize and gives me the most flexibility.

Click through for an example, including of a method which filters out everything but error messages.

Comments closed