Troubleshooting failed database startup after GRID Out Of Place (OOP) rollback

May 16, 2019 12:00:00 AM

Successful GRID Out-of-Place (OOP) Patching to 18.6

GRID Out Of Place (OOP) patching completed successfully for 18.6.0.0.0.

  • GRID_HOME: /u01/18.3.0.0/grid_2
  • ORACLE_HOME: /u01/app/oracle/12.1.0.1/db1

Here is an example of the inventory after patching:

+ /u01/18.3.0.0/grid_2/OPatch/opatch lspatches
29302264;OCW RELEASE UPDATE 18.6.0.0.0 (29302264)
29301643;ACFS RELEASE UPDATE 18.6.0.0.0 (29301643)
29301631;Database Release Update : 18.6.0.0.190416 (29301631)
28547619;TOMCAT RELEASE UPDATE 18.0.0.0.0 (28547619)
28435192;DBWLM RELEASE UPDATE 18.0.0.0.0 (28435192)
27908644;UPDATE 18.3 DATABASE CLIENT JDK IN ORACLE HOME TO JDK8U171
27923415;OJVM RELEASE UPDATE: 18.3.0.0.180717 (27923415)

+ /u01/app/oracle/12.1.0.1/db1/OPatch/opatch lspatches
28731800;Database Bundle Patch : 12.1.0.2.190115 (28731800)
28729213;OCW PATCH SET UPDATE 12.1.0.2.190115 (28729213)

The cluvfy post-check was also successful.

[oracle@racnode-dc1-1 ~]$ cluvfy stage -post crsinst -n racnode-dc1-1,racnode-dc1-2 -verbose
Post-check for cluster services setup was successful.
CVU operation performed: stage -post crsinst
Date: Apr 30, 2019 8:17:49 PM
CVU home: /u01/18.3.0.0/grid_2/
User: oracle

Executing the GRID OOP Rollback via Switch-Clone

GRID OOP Rollback Patching completed successfully for node1.

[root@racnode-dc1-1 ~]# crsctl check cluster -all
**************************************************************
racnode-dc1-1:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************
racnode-dc1-2:
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
**************************************************************

[root@racnode-dc1-1 ~]# echo $GRID_HOME
/u01/18.3.0.0/grid_2

[root@racnode-dc1-1 ~]# $GRID_HOME/OPatch/opatchauto rollback -switch-clone -logLevel FINEST
...
Confirm that all resources have been started from home /u01/18.3.0.0/grid.
All resources have been started successfully from home /u01/18.3.0.0/grid.
OPatchAuto successful.
...

[root@racnode-dc1-1 ~]# /media/patch/findhomes.sh
PID    NAME             ORACLE_HOME
10486  asm_pmon_+asm1   /u01/18.3.0.0/grid/
10833  apx_pmon_+apx1   /u01/18.3.0.0/grid/

GRID OOP Rollback Patching completed successfully for node2.

[root@racnode-dc1-2 ~]# $GRID_HOME/OPatch/opatchauto rollback -switch-clone -logLevel FINEST
...
OPatchauto session completed at Fri May 3 01:40:51 2019
Time taken to complete the session 19 minutes, 12 seconds

Post-Rollback Validation and Discovery of the Outage

GRID OOP Rollback completed successfully for 18.5.0.0.0.

  • GRID_HOME: /u01/18.3.0.0/grid
  • ORACLE_HOME: /u01/app/oracle/12.1.0.1/db1

Validation shows that the database is OFFLINE on both nodes:

+ crsctl stat res -w '((TARGET != ONLINE) or (STATE != ONLINE))' -t
--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.GHCHKPT.advm
               OFFLINE OFFLINE      racnode-dc1-1            STABLE
               OFFLINE OFFLINE      racnode-dc1-2            STABLE
ora.crs.ghchkpt.acfs
               OFFLINE OFFLINE      racnode-dc1-1            STABLE
               OFFLINE OFFLINE      racnode-dc1-2            STABLE
...
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.hawk.db
      1        ONLINE  OFFLINE                               Instance Shutdown,STABLE
      2        ONLINE  OFFLINE                               Instance Shutdown,STABLE

Troubleshooting the Database Startup Failure (ORA-01078)

Starting the database failed:

[oracle@racnode-dc1-2 ~]$ srvctl start database -d $ORACLE_UNQNAME
PRCR-1079 : Failed to start resource ora.hawk.db
CRS-5017: The resource action "ora.hawk.db start" encountered the following error:
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+DATA/hawk/spfilehawk.ora'
ORA-17503: ksfdopn:10 Failed to open file +DATA/hawk/spfilehawk.ora
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operation not permitted
ORA-27302: failure occurred at: skgpwinit6
ORA-27303: additional information: startup egid = 54321 (oinstall), current egid = 54322 (dba)
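The last line of the error stack, ORA-27303, is the most useful one: it names two effective group IDs that do not match. As a minimal sketch (the function name `egid_mismatch` is hypothetical and not part of any Oracle tooling), the two group names can be pulled out of a saved error stack like this:

```shell
#!/bin/sh
# Hypothetical helper: read an Oracle error stack on stdin and print the
# startup and current group names reported by an ORA-27303 line.
# Prints nothing if no such line is present.
egid_mismatch() {
    sed -n 's/.*startup egid = \([0-9][0-9]*\) (\([^)]*\)), current egid = \([0-9][0-9]*\) (\([^)]*\)).*/startup=\2 current=\4/p'
}
```

Fed the ORA-27303 line above, this prints `startup=oinstall current=dba`, which points toward a group or permission problem on one of the oracle binaries rather than a problem with the spfile itself.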

Root Cause Identified: Binary Permission Mismatch

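The cause was incorrect permissions on the oracle binary in the GRID home. After the rollback, /u01/18.3.0.0/grid/bin/oracle was missing the setuid and setgid bits (-rwxr-x--x instead of -rwsr-s--x). Restoring them with chmod 6751 $GRID_HOME/bin/oracle, then stopping and starting CRS, resolved the failure.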

[oracle@racnode-dc1-1 dbs]$ ls -lhrt $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oracle dba 314M Apr 20 16:06 /u01/app/oracle/12.1.0.1/db1/bin/oracle

[oracle@racnode-dc1-1 dbs]$ ls -lhrt /u01/18.3.0.0/grid/bin/oracle
-rwxr-x--x 1 oracle oinstall 396M Apr 20 19:21 /u01/18.3.0.0/grid/bin/oracle

[oracle@racnode-dc1-1 bin]$ chmod 6751 oracle
[oracle@racnode-dc1-1 bin]$ ls -lhrt /u01/18.3.0.0/grid/bin/oracle
-rwsr-s--x 1 oracle oinstall 396M Apr 20 19:21 /u01/18.3.0.0/grid/bin/oracle
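The same check can be scripted instead of read off an ls listing. The following is a rough sketch only: the function names `has_setuid_setgid` and `check_oracle_binary` are hypothetical, and it relies on nothing beyond the standard `-u` (setuid) and `-g` (setgid) file tests of the shell.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists and has both the
# setuid and setgid bits set (the "s" flags in -rwsr-s--x).
has_setuid_setgid() {
    [ -u "$1" ] && [ -g "$1" ]
}

# Hypothetical helper: report the state of one binary and return
# non-zero when either bit is missing.
check_oracle_binary() {
    if has_setuid_setgid "$1"; then
        echo "OK: $1"
    else
        echo "MISSING setuid/setgid: $1"
        return 1
    fi
}
```

Calling `check_oracle_binary "$GRID_HOME/bin/oracle"` on each node after the rollback would have flagged the binary before any attempt to start the database.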

Reference: RAC Database Can't Start: ORA-01565, ORA-17503: ksfdopn:10 Failed to open file +DATA/BPBL/spfileBPBL.ora (Doc ID 2316088.1)

Conclusion: Lessons Learned from a "Successful" Rollback

Although the GRID rollback itself completed successfully, the database could not start afterwards, which resulted in an outage. It is prudent to check the permissions on the oracle binary in each home before patching or rolling back, so that a mismatch is caught before it takes the database down.
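One way to make that check routine is to record the permissions before the change and compare them afterwards. This is a sketch under assumptions, not a tested procedure for any particular environment: the function names and snapshot file paths are hypothetical, and the mode, owner, and group are simply taken from the first, third, and fourth fields of `ls -ld`.

```shell
#!/bin/sh
# Hypothetical helper: print "mode owner group path" for each file given,
# using the fields reported by ls -ld.
snapshot_perms() {
    for f in "$@"; do
        ls -ld -- "$f" | awk -v p="$f" '{ print $1, $3, $4, p }'
    done
}

# Hypothetical helper: compare two snapshot files. diff prints any
# differences; the function returns non-zero when the snapshots differ.
compare_perms() {
    if diff "$1" "$2"; then
        echo "permissions unchanged"
    else
        echo "permissions changed (see diff above)" >&2
        return 1
    fi
}

# Example usage (paths are illustrative):
#   snapshot_perms "$GRID_HOME/bin/oracle" "$ORACLE_HOME/bin/oracle" > /tmp/perms.before
#   ... patch or roll back ...
#   snapshot_perms "$GRID_HOME/bin/oracle" "$ORACLE_HOME/bin/oracle" > /tmp/perms.after
#   compare_perms /tmp/perms.before /tmp/perms.after
```

Because an out-of-place rollback switches to a different GRID home, the snapshot should cover the binary in the home that will be active after the change, not only the one that is active before it.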
