The Problem
A 12.1 RAC database fails to start via srvctl. The alert log shows the instance starting up and then going down again:
alert_ORCL2.log inside [oracle base]/diag/rdbms/[db name]/[SID name]/trace:
Fri Nov 03 15:06:25 2017
Adjusting the default value of parameter parallel_max_servers
from 960 to 486 due to the value of parameter processes (600)
Starting ORACLE instance (normal) (OS id: 19684)
.
.
Fri Nov 03 15:10:45 2017
Process startup failed, error stack:
Fri Nov 03 15:10:45 2017
Errors in file /u01/app/oracle/diag/rdbms/ORCL/ORCL2/trace/ORCL2_psp0_19706.trc:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
Fri Nov 03 15:10:46 2017
Shutting down instance (abort)
License high water mark = 2
Fri Nov 03 15:10:46 2017
USER (ospid: 22067): terminating the instance
Fri Nov 03 15:10:47 2017
Instance terminated by USER, pid = 22067
Fri Nov 03 15:10:47 2017
Instance shutdown complete
Errors in file //u01/app/oracle/diag/rdbms/ORCL/ORCL2/trace/ORCL2_mmon_19853.trc (incident=14681):
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
Incident details in: /u01/app/oracle/diag/rdbms/ORCL/ORCL2/incident/incdir_14681/ORCL2_mmon_19853_i14681.trc
Use ADRCI or Support Workbench to package the incident.
See Note 411.1 at My Oracle Support for error and packaging details.
ORCL2_psp0_19706.trc:
*** 2017-11-03 15:10:44.989
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
*** 2017-11-03 15:10:45.993
Process startup failed, error stack:
ORA-27300: OS system dependent operation:fork failed with status: 11
ORA-27301: OS failure message: Resource temporarily unavailable
ORA-27302: failure occurred at: skgpspawn3
ORCL2_mmon_19853.trc:
*** 2017-11-03 15:09:57.908
***KELR Apply Log: unable to schedule MMON Slave, error 3
*** 2017-11-03 15:10:47.847
Incident 14681 created, dump file: /u01/app/oracle/diag/rdbms/ORCL/ORCL2/incident/incdir_14681/ORCL2_mmon_19853_i14681.trc
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
KEBM: MMON action policy violation. 'PQ: Adjust Slave Pool' viol=0; err=600
error 0 detected in background process
kgxgnsdr: clssgsshdereg: warning: return status 26 (-558242808 )
OPIRIP: Uncaught error 447. Error stack:
ORA-00447: fatal error in background process
ORA-00600: internal error code, arguments: [KSLGES_3], [], [], [], [], [], [], [], [], [], [], []
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
kgxgnsdr: clssgsshdereg: warning: return status 26 (-813323584 )
The following error is reported in the OS logs at the same time:
messages:
2017-11-03T15:10:11.690393+01:00 rachost1 Oracle Audit[7888]: LENGTH : '200' ACTION :[52] 'ALTER DATABASE MOUNT /* db agent *//* {2:39656:2} */' DATABASE USER:[1] '/' PRIVILEGE :[6] 'SYSDBA' CLIENT USER:[6] 'oracle' CLIENT TERMINAL:[0] '' STATUS:[1] '0' DBID:[10] '2949004148'
2017-11-03T15:10:47.700094+01:00 rachost1 kernel: [ 173.788859] cgroup: fork rejected by pids controller in /system.slice/ohasd.service <<<<<<<<<<<<<<<<<<<<<
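The kernel message is the key symptom, so it is worth searching the system log for it directly. The following is a minimal sketch: the helper name `find_pids_rejections` is hypothetical, and the default path /var/log/messages is an assumption based on the "messages" file name above.

```shell
# Hypothetical helper: print every cgroup fork rejection found in a syslog file.
# $1: path to the log file (defaults to /var/log/messages, which is an assumption).
find_pids_rejections() {
    # -F treats the pattern as a fixed string rather than a regular expression
    grep -F 'fork rejected by pids controller' "${1:-/var/log/messages}"
}

# Typical use on the affected node (not executed here):
#   find_pids_rejections /var/log/messages
```

Each matching line ends with the cgroup path of the unit that hit its limit, here /system.slice/ohasd.service.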
The database can still be started manually with sqlplus, which points to a limit that applies only to the Clusterware-managed startup.
The Solution
This is due to the PIDs cgroup controller introduced with SLES 12.
"To control the default TasksMax= setting for services and scopes running on the system, use the system.conf setting DefaultTasksMax=. This setting defaults to 512, which means services that are not explicitly configured otherwise will only be able to create 512 processes or threads at maximum.
For thread- or process-heavy services, you may need to set a higher TasksMax value. In such cases, set TasksMax directly in the specific unit files. Either choose a numeric value or even infinity."
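To see how close a unit is to its limit, `systemctl show` can print the `TasksMax` and `TasksCurrent` properties as Key=Value lines. The sketch below assumes the unit is ohasd.service, as named in the kernel message; the helper `report_tasks` is hypothetical and only formats that output.

```shell
# Hypothetical helper: read "Key=Value" lines (as printed by `systemctl show`)
# on stdin and report the current task count against the limit.
report_tasks() {
    awk -F= '
        $1 == "TasksMax"     { max = $2 }
        $1 == "TasksCurrent" { cur = $2 }
        END { printf "tasks: %s of %s\n", cur, max }
    '
}

# Typical use on the affected node (not executed here):
#   systemctl show ohasd.service -p TasksMax -p TasksCurrent | report_tasks
```

A current count close to the limit during startup is consistent with the fork failures in the trace files.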
From SLES 12 onwards, systemd is used instead of SysV init, so the ohasd service and every process started beneath it are limited to 512 tasks (processes or threads) by default. Either set DefaultTasksMax to 65535 in /etc/systemd/system.conf, or set a suitable TasksMax value for the ohasd systemd service only.
For example:
# cat /etc/systemd/system/ohasd.service.d/lunar.conf
[Service]
TasksMax=16384
# systemctl status ohasd
● ohasd.service - LSB: Start and Stop Oracle High Availability Service
   Loaded: loaded (/etc/init.d/ohasd; bad; vendor preset: disabled)
  Drop-In: /etc/systemd/system/ohasd.service.d
           └─lunar.conf
   Active: active (exited) since Mon 2017-11-13 14:29:23 CET; 3h 5min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 4876 ExecStart=/etc/init.d/ohasd start (code=exited, status=0/SUCCESS)
    Tasks: 612 (limit: 16384)   <<<<<< the limit shown here was 512 before the change
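A drop-in like the one shown above could be created along the following lines. This is a sketch under stated assumptions: the helper `write_tasksmax_dropin` and the file name tasksmax.conf are hypothetical (the example above uses lunar.conf; any name ending in .conf is read by systemd), and whether the Clusterware stack has to be restarted for the new limit to apply is not stated in the original.

```shell
# Hypothetical helper: write a TasksMax drop-in file for a systemd unit.
# $1: drop-in directory, e.g. /etc/systemd/system/ohasd.service.d
# $2: task limit, e.g. 16384
write_tasksmax_dropin() {
    dropin_dir=$1
    limit=$2
    mkdir -p "$dropin_dir" &&
        printf '[Service]\nTasksMax=%s\n' "$limit" > "$dropin_dir/tasksmax.conf"
}

# Typical use as root (not executed here):
#   write_tasksmax_dropin /etc/systemd/system/ohasd.service.d 16384
#   systemctl daemon-reload    # make systemd re-read unit files and drop-ins
#   systemctl status ohasd     # the Tasks line should show the new limit
# Assumption: a restart of the stack or the node may be needed afterwards.
```

Setting the limit on the unit itself keeps the change local to ohasd, whereas DefaultTasksMax in system.conf raises the limit for every service on the host.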