The Geek Diary

Oracle RAC : understanding split brain

By admin

What is a “Split Brain”?

Split brain describes the situation in which two or more nodes in a cluster lose connectivity with one another but continue to operate independently, acquiring logical or physical resources under the incorrect assumption that the other nodes are no longer running or no longer using those resources. In simple terms, “split brain” means that there are two or more distinct sets of nodes, or “cohorts”, with no communication between the cohorts.

For example:
Suppose a three-node cluster is in the following situation.
1. Nodes 1 and 2 can talk to each other.
2. Neither node 1 nor node 2 can talk to node 3, and vice versa.
Then there are two cohorts: {1, 2} and {3}.
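
The cohorts in this example can be derived mechanically from the list of links that still work. The sketch below is only an illustration of the idea, not how Oracle Clusterware computes membership; the function name find_cohorts and the representation of links as pairs of node numbers are assumptions made for this example.

```python
from collections import defaultdict

def find_cohorts(nodes, links):
    """Group nodes into cohorts: sets of nodes that can still reach each other.

    nodes: iterable of node numbers.
    links: iterable of (a, b) pairs for node pairs that can still communicate.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    cohorts, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Walk every node reachable from `start` over working links.
        cohort, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in cohort:
                continue
            cohort.add(node)
            stack.extend(neighbours[node] - cohort)
        seen |= cohort
        cohorts.append(cohort)
    return cohorts

# Nodes 1 and 2 can talk to each other; node 3 is isolated.
print(find_cohorts({1, 2, 3}, [(1, 2)]))  # [{1, 2}, {3}]
```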

[Figure: Oracle RAC split brain]

Why is this a problem?

The biggest risk following a split-brain event is corruption of system state. There are three typical causes:
1. Processes that were co-operating before the split-brain event independently modify the same logically shared state, leading to conflicting views of that state. This is often called the “multi-master problem”.
2. New requests are accepted after the split-brain event and are performed on potentially corrupted state, which can corrupt it even further.
3. When the processes of the distributed system rejoin, they may hold conflicting views of system state or resource ownership. Information may be lost or corrupted while those conflicts are resolved.

In simpler terms, a split-brain situation leaves two (or more) separate clusters working on the same shared storage, which can corrupt data.
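
A toy simulation makes the multi-master problem concrete. It is a deliberately simplified model and says nothing about Oracle internals: the Cohort class, the "balance" value and the amounts are all hypothetical. Each cohort keeps the view of the shared data it had before the split and writes back without knowing about the other cohort.

```python
class Cohort:
    """A group of nodes that caches shared state and writes it back blindly."""

    def __init__(self, name, storage):
        self.name = name
        self.storage = storage
        self.cache = dict(storage)  # view of shared state taken before the split

    def deposit(self, amount):
        # Update the stale local view, then overwrite shared storage with it.
        self.cache["balance"] += amount
        self.storage["balance"] = self.cache["balance"]


def simulate_split_brain():
    storage = {"balance": 100}   # the shared storage both cohorts can reach
    a = Cohort("A", storage)
    b = Cohort("B", storage)
    a.deposit(50)                # storage now holds 150
    b.deposit(20)                # B still believes 100, so storage becomes 120
    return storage["balance"]


print(simulate_split_brain())    # 120: cohort A's update has been lost
```

Had the two cohorts been coordinating, the result would have been 170. The lost update is the kind of corruption that eviction is meant to prevent.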

How does clusterware resolve a “split brain” situation?

In a split-brain situation, the voting disk is used to determine which node(s) survive and which node(s) are evicted. The usual voting outcomes are:

a. The cohort with more cluster nodes survives.
b. If the cohorts contain the same number of nodes, the cohort containing the lowest-numbered node survives.
c. Improvements have been made so that the node(s) with lower load survive when the eviction is caused by high system load.
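
Rules (a) and (b) can be written as a single ordering over cohorts. This is a minimal sketch of those two rules only, under the assumption that cohorts are given as sets of node numbers; the function name choose_surviving_cohort is made up for this example, and the load-based behaviour in (c) is not modelled.

```python
def choose_surviving_cohort(cohorts):
    """Pick the surviving cohort from a list of sets of node numbers.

    Rule (a): the cohort with more nodes wins.
    Rule (b): on a tie, the cohort holding the lowest node number wins.
    """
    # Larger size sorts higher; for equal sizes, a smaller minimum node
    # number gives a larger -min(c), so that cohort is preferred.
    return max(cohorts, key=lambda c: (len(c), -min(c)))


print(choose_surviving_cohort([{1, 2}, {3}]))     # {1, 2}
print(choose_surviving_cohort([{3, 4}, {1, 2}]))  # {1, 2}
```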

When split brain happens, messages similar to the following usually appear in ocssd.log:

[ CSSD]2015-01-12 23:23:08.090 [1262557536] >TRACE: clssnmCheckDskInfo: Checking disk info...
[ CSSD]2015-01-12 23:23:08.090 [1262557536] >ERROR: clssnmCheckDskInfo: Aborting local node to avoid splitbrain.
[ CSSD]2015-01-12 23:23:08.090 [1262557536] >ERROR: : my node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(2)
[ CSSD]2015-01-12 23:23:08.090 [1262557536] >ERROR: 
###################################
[ CSSD]2015-01-12 23:23:08.090 [1262557536] >ERROR: clssscExit: CSSD aborting
###################################

The messages above show that communication from node 2 to node 1 is not working. Node 2 therefore sees only one node (itself, Size(1)), while node 1 is working normally and still sees two nodes in the cluster (Size(2)). To avoid split brain, node 2 aborted itself.
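
When scanning many ocssd.log files it can help to pull the cohort sizes out of the "my node ... VS Node ..." line automatically. The regular expression below is an assumption fitted to the sample line shown above; the exact message format may differ between Clusterware versions, and parse_split_line is a hypothetical helper name.

```python
import re

# Pattern fitted to the sample ocssd.log line in this article (an assumption).
SPLIT_LINE = re.compile(
    r"my node\((?P<my_node>\d+)\), Leader\((?P<my_leader>\d+)\), Size\((?P<my_size>\d+)\)"
    r"\s+VS\s+"
    r"Node\((?P<other_node>\d+)\), Leader\((?P<other_leader>\d+)\), Size\((?P<other_size>\d+)\)"
)

def parse_split_line(line):
    """Return the node, leader and size of both cohorts, or None if no match."""
    match = SPLIT_LINE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}

line = ("[ CSSD]2015-01-12 23:23:08.090 [1262557536] >ERROR: : "
        "my node(2), Leader(2), Size(1) VS Node(1), Leader(1), Size(2)")
info = parse_split_line(line)
print(info["my_size"], info["other_size"])  # 1 2
```

Here the local cohort has size 1 and the other cohort has size 2, which is why the local node is the one that aborts.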

To ensure data consistency, each instance of a RAC database must maintain a heartbeat with the other instances. The heartbeat is maintained by background processes such as LMON, LMD, LMS and LCK. If any of these processes experiences an IPC Send timeout, a communications reconfiguration and instance eviction follow to avoid split brain. At the database layer the controlfile plays a role similar to that of the voting disk at the clusterware layer: it is used to determine which instance(s) survive and which are evicted. The voting outcome follows rules similar to those of the clusterware vote, and as a result one or more instances are evicted.

Common messages in the instance alert log are similar to:

alert log of instance 1:
---------
Mon Dec 07 19:43:05 2011
IPC Send timeout detected.Sender: ospid 26318
Receiver: inst 2 binc 554466600 ospid 29940
IPC Send timeout to 2.0 inc 8 for msg type 65521 from opid 20
Mon Dec 07 19:43:07 2011
Communications reconfiguration: instance_number 2
Mon Dec 07 19:43:07 2011
Trace dumping is performing id=[cdmp_20091207194307]
Waiting for clusterware split-brain resolution
Mon Dec 07 19:53:07 2011
Evicting instance 2 from cluster
Waiting for instances to leave: 
2 
...
alert log of instance 2:
---------
Mon Dec 07 19:42:18 2011
IPC Send timeout detected. Receiver ospid 29940
Mon Dec 07 19:42:18 2011
Errors in file 
/u01/app/oracle/diag/rdbms/bd/BD2/trace/BD2_lmd0_29940.trc:
Trace dumping is performing id=[cdmp_20091207194307]
Mon Dec 07 19:42:20 2011
Waiting for clusterware split-brain resolution
Mon Dec 07 19:44:45 2011
ERROR: LMS0 (ospid: 29942) detects an idle connection to instance 1
Mon Dec 07 19:44:51 2011
ERROR: LMD0 (ospid: 29940) detects an idle connection to instance 1
Mon Dec 07 19:45:38 2011
ERROR: LMS1 (ospid: 29954) detects an idle connection to instance 1
Mon Dec 07 19:52:27 2011
Errors in file 
/u01/app/oracle/diag/rdbms/bd/BD2/trace/PVBD2_lmon_29938.trc  
(incident=90153):
ORA-29740: evicted by member 0, group incarnation 10
Incident details in: 
/u01/app/oracle/diag/rdbms/bd/BD2/incident/incdir_90153/BD2_lmon_29938_i90153.trc

In the example above, the LMD0 process of instance 2 (ospid 29940) is the receiver in the IPC Send timeout.
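
The receiver and the evicted instance can also be extracted from an alert log with a short script. This is a rough sketch, assuming the message wording matches the excerpt above; the patterns and the name summarize_alert_log are illustrative, and real alert logs may word these lines differently.

```python
import re

# Patterns fitted to the alert log excerpt in this article (an assumption).
RECEIVER = re.compile(r"Receiver: inst (?P<inst>\d+) binc (?P<binc>\d+) ospid (?P<ospid>\d+)")
EVICTING = re.compile(r"Evicting instance (?P<inst>\d+) from cluster")

def summarize_alert_log(text):
    """Collect the IPC Send timeout receiver and any evicted instances."""
    summary = {"receiver_inst": None, "receiver_ospid": None, "evicted": []}
    for line in text.splitlines():
        found = RECEIVER.search(line)
        if found:
            summary["receiver_inst"] = int(found.group("inst"))
            summary["receiver_ospid"] = int(found.group("ospid"))
            continue
        found = EVICTING.search(line)
        if found:
            summary["evicted"].append(int(found.group("inst")))
    return summary

SAMPLE = """\
IPC Send timeout detected.Sender: ospid 26318
Receiver: inst 2 binc 554466600 ospid 29940
Communications reconfiguration: instance_number 2
Waiting for clusterware split-brain resolution
Evicting instance 2 from cluster
"""

print(summarize_alert_log(SAMPLE))
# {'receiver_inst': 2, 'receiver_ospid': 29940, 'evicted': [2]}
```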

Filed Under: oracle, RAC Tagged With: RAC, split brain


© 2021 · The Geek Diary