The best times to apply Oracle quarterly patches

Nov 15, 2018 12:00:00 AM

When is the best time to apply an Oracle quarterly patch? I suppose the answer is, "It depends." As it turns out, we started applying the October 2018 Grid Infrastructure PSU for AIX the day after it was released, and it was a disaster. There were library corruptions for GI and DB in an Oracle Restart configuration. Oracle's only suggestion was to roll back, which was easier said than done, since even simple commands such as crsctl config has would fail. Pretty ugly.

The Disaster: Library Corruptions on AIX

After applying the patch, we encountered immediate failures. Commands that should be routine began throwing fatal errors:

# crsctl config has
exec(): 0509-036 Cannot load program crsctl.bin because of the following errors:
rtld: 0712-001 Symbol ztca_Shutdown was referenced
  from module crsctl.bin(), but a runtime definition
  of the symbol was not found.
rtld: 0712-001 Symbol ztpk_SetKeyInfo was referenced
  from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
  of the symbol was not found.
rtld: 0712-001 Symbol ztpk_DestroyKey was referenced
  from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
  of the symbol was not found.
rtld: 0712-001 Symbol ztpk_Sign was referenced
  from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
  of the symbol was not found.
rtld: 0712-001 Symbol ztpk_Verify was referenced
  from module /u01/11.2.0/grid/lib/libhasgen11.so(), but a runtime definition
  of the symbol was not found.
rtld: 0712-002 fatal error: exiting.

Challenges with Oracle Support and Patch Integrity

To make matters worse, there was a breakdown in communication among Oracle's support engineers. I recall one engineer mentioning that there were issues with the current patch release; unfortunately, this was not documented in the SR.

Identifying Potential Points of Failure

Another possible contributor to the failure was a corrupted patch download: the checksum did not match, and we had not verified it before patching. So perhaps there were no issues with the patch itself after all. Fast forward: we tried patching the environment again, only to have it fail a second time. Finally, the same Oracle support engineer who did not think there were issues with the patch release confirmed that there were.
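Checking the download before patching is cheap insurance. Here is a minimal sketch of how that check could look in Python, assuming the digest published alongside the patch is available as a hex string. The function names, the default SHA-256 algorithm, and the file name in the comment are illustrative assumptions, not part of our actual procedure.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex, algorithm="sha256"):
    """Compare the computed digest with the published one, ignoring case and whitespace."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


# Example call (hypothetical file name; paste the digest from the download page):
# if not checksum_matches("patch_download.zip", "<published digest>"):
#     raise SystemExit("Checksum mismatch: download the patch again before applying it.")
```

Pass a different algorithm name, such as "sha1", if that is the digest type published with the download. Had a check like this been run first, a corrupted download could have been ruled in or out before the first patching attempt.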

Lessons Learned and Best Practices for Patching

This experience served as a significant learning opportunity for our team. Here are the core takeaways:

  1. Don't be too hasty to patch: Waiting a few weeks for the community to "smoke test" a new release can save you from being the first to find a critical bug.
  2. Document everything: Always get the name of the engineer you are speaking with and ensure the conversation is documented in the SR.
  3. Trust, but verify: Oracle engineers are human too, and mistakes can happen.
  4. Always have a backup: This was our saving grace. How many backup GI and DB homes should you have before patching? As many as it takes to ensure a quick recovery.

Happy patching and stay tuned! In the next post, I will share how we managed to recover from the disaster.
