Oracle Solaris Cluster : Understanding quorum votes and quorum devices (How to avoid Failure Fencing and Amnesia)

by admin

Need for Quorum Voting

A quorum device is a shared storage device or quorum server that is shared by two or more nodes and that contributes votes used to establish a quorum. Clusters operate only when a quorum of votes is available.

Describing Quorum Votes and Quorum Devices

The cluster membership subsystem of the Oracle Solaris Cluster software framework operates on a voting system as follows:

  • Each node is assigned exactly one vote.
  • Certain devices can be identified as quorum devices and are assigned votes. The following are types of quorum devices:
    • Directly attached multiported disks: Disks are the traditional type of quorum device and have been supported in all versions of Solaris Cluster.
    • NAS quorum devices
    • Quorum servers
  • There must be a majority (more than 50 percent of all possible votes present) to form a cluster or remain in a cluster.
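From any active cluster node you can check the configured and currently present votes; a minimal illustration (run as root; the exact output format varies by release):

# Solaris Cluster 3.2 and later: show quorum devices, nodes, and vote counts
clquorum status

# Older Sun Cluster 3.x equivalent
scstat -q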

Benefits of Quorum Voting

Given the rules for quorum voting, a simple two-node cluster makes it clear why you need extra quorum device votes. If a two-node cluster had only node votes, both nodes would have to be booted for the cluster to run. This defeats one of the major goals of the cluster, which is the ability to survive node failure. But why have quorum voting at all? If there were no quorum rules, you could run as many nodes in the cluster as were able to boot at any point in time. However, quorum votes and quorum devices solve the following two major problems:

  • Failure fencing
  • Amnesia prevention

These are two distinct problems that are solved by the quorum mechanism in the Solaris Cluster software.

Failure Fencing

If interconnect communication between nodes ceases, either because of a complete interconnect failure or a node crashing, each node must assume that the other is still functional. This is called split-brain operation or split-brain syndrome. Two separate clusters cannot be allowed to exist because of the potential for data corruption. Each node tries to establish a cluster by gaining another quorum vote. Both nodes attempt to reserve the designated quorum device. The first node to reserve the quorum device establishes a majority and remains as a cluster member. The node that fails the race to reserve the quorum device aborts the Oracle Solaris Cluster software because it does not have a majority of votes.

[Figure: Failure Fencing and Amnesia in Oracle Solaris Cluster]

Amnesia Prevention

If it were allowed to happen, a cluster amnesia scenario would involve one or more nodes forming a cluster (booting first in the cluster) with a stale copy of the cluster configuration. Consider the following scenario:

  1. In a two-node cluster (Node 1 and Node 2), Node 2 is halted for maintenance or crashes.
  2. Cluster configuration changes are made on Node 1.
  3. Node 1 is shut down.
  4. You try to boot Node 2 to form a new cluster. If this were allowed, the new cluster would run with a stale configuration and lose the changes made in step 2. The quorum mechanism prevents this: a node that was not a member of the most recent cluster cannot form a cluster by itself, so Node 2 must wait until Node 1 joins.

Quorum Device Rules

  • A quorum device must be available to both nodes in a two-node cluster.
  • Quorum device information is maintained in the CCR (Cluster Configuration Repository), which is replicated locally on each node.
  • A disk quorum device can also contain user data.
  • The maximum and optimal number of votes for a quorum device is N – 1, where N is the number of nodes connected to it.
  • A quorum device is not required if there are more than two nodes, but it is still recommended.
  • A single-disk quorum device can be configured automatically by scinstall for a two-node cluster.
  • All other quorum devices are configured manually.
  • Disk quorum devices are configured (specified) by using DID devices, as shown in the sketch after this list.
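As a sketch of the manual procedure, a shared disk can be added as a quorum device by its DID name. Here d4 is an assumed name for a multiported disk that both nodes can see:

# List DID devices and the nodes they are attached to
cldevice list -v

# Add the shared disk d4 as a quorum device (d4 is an assumed DID name)
clquorum add d4

# Verify the new device, its votes, and the overall quorum status
clquorum list -v
clquorum status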

Quorum Mathematics and Consequences

When the cluster is running, it is always aware of the following:

  • The total possible quorum votes (number of nodes plus the number of disk quorum votes defined in the cluster)
  • The total present quorum votes (number of nodes booted in the cluster plus the number of disk quorum votes physically accessible by those nodes)
  • The total needed quorum votes (more than 50 percent of all possible votes)
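For example, a two-node cluster with one disk quorum device has 2 + 1 = 3 possible votes, so 2 votes are needed to run. The majority rule is simple integer arithmetic, as this small shell sketch shows:

# Needed votes are the smallest majority of all possible votes
possible=3                      # 2 node votes + 1 quorum device vote
needed=$(( possible / 2 + 1 ))  # floor(possible / 2) + 1
echo "needed votes: $needed"    # prints: needed votes: 2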

The consequences are the following:

  • A node that boots and cannot find the needed number of votes will freeze, waiting for other nodes to join the cluster until the needed vote count is reached.
  • A node that is booted in the cluster will kernel panic if it can no longer find the needed number of votes.
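If a node freezes at boot because the needed vote count cannot be reached, you can either boot enough other nodes to reach quorum or, for maintenance, boot the node outside the cluster entirely. On SPARC systems this is done from the OpenBoot prompt:

ok boot -x        (boots the node in non-cluster mode)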
