[root@exa1cel01 dbserver_patch_19.190104]# ./patchmgr -dbnodes ~/dbs_group -precheck -iso_repo /tmp/SAVE/p29181093_*_Linux-x86-64.zip -target_version 18.1.12.0.0.190111 -allow_active_network_mounts

2019-08-24 00:57:32 -0400 :Working: DO: dbnodeupdate.sh running a precheck on node(s).
2019-08-24 01:00:31 -0400 :ERROR : dbnodeupdate.sh precheck failed on one or more nodes

SUMMARY OF WARNINGS AND ERRORS FOR exa1db01:

exa1db01: # The following file lists the commands that would have been executed for removing rpms when specifying -M flag. #
exa1db01: # File: /var/log/cellos/nomodify_results.240819004658.sh. #
exa1db01: ERROR: Found dependency issues during pre-check. Packages failing:
exa1db01: ERROR: Package: 1:dbus-1.2.24-8.0.1.el6_6.x86_64
exa1db01: ERROR: Package: exadata-sun-computenode-exact-18.1.12.0.0.190111-1.noarch (Fails because of required removal of Exadata rpms)
exa1db01: ERROR: Package: glib2-devel-2.28.8-9.el6.x86_64 (Custom rpm fails)
exa1db01: ERROR: Package: gnutls-devel-2.12.23-21.el6.x86_64 (Custom rpm fails)
exa1db01: ERROR: Package: oracle-ofed-release-1.0.0-31.el6.x86_64 (Fails because of required removal of Exadata rpms)
exa1db01: ERROR: Consult file exa1db01:/var/log/cellos/minimum_conflict_report.240819004658.txt for more information on the dependencies failing and for next steps.
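Both files named in the summary are worth reading before removing anything; they explain which dependencies fail and what the removal would do (paths taken from the precheck output above):

# Review the dependency conflict report and the generated removal script
less /var/log/cellos/minimum_conflict_report.240819004658.txt
less /var/log/cellos/nomodify_results.240819004658.sh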
To clear the dbus dependency failure, the conflicting package was removed together with the packages that depend on it:

[root@exa1db01 ~]# rpm -e dbus-1.2.24-8.0.1.el6_6.x86_64 pm-utils-1.2.5-11.el6.x86_64 hal-0.5.14-14.el6.x86_64
Stopping system message bus:                               [  OK  ]
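Before removing packages like this, plain rpm queries can show what depends on them and dry-run the removal; a generic sketch, nothing Exadata-specific:

# List packages that require the dbus capability
rpm -q --whatrequires dbus
# Dry-run the removal to surface dependency errors without changing the system
rpm -e --test dbus-1.2.24-8.0.1.el6_6.x86_64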
************************************************************************************************************
NOTE    patchmgr release: 19.190104 (always check MOS 1553103.1 for the latest release of dbserver.patch.zip)
NOTE
WARNING Do not interrupt the patchmgr session.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot database nodes during update or rollback.
WARNING Do not open logfiles in write mode and do not try to alter them.
************************************************************************************************************
2019-08-24 01:53:48 -0400 :Working: DO: Initiate precheck on 1 node(s)
2019-08-24 02:00:50 -0400 :Working: DO: Check free space and verify SSH equivalence for the root user to exa1db01
2019-08-24 02:03:17 -0400 :SUCCESS: DONE: Check free space and verify SSH equivalence for the root user to exa1db01
2019-08-24 02:04:18 -0400 :Working: DO: dbnodeupdate.sh running a precheck on node(s).
2019-08-24 02:06:57 -0400 :SUCCESS: DONE: Initiate precheck on node(s).

[root@exa1cel01 dbserver_patch_19.190104]# nohup ./patchmgr -dbnodes ~/dbs_group -upgrade -iso_repo /tmp/SAVE/p29181093_*_Linux-x86-64.zip -target_version 18.1.12.0.0.190111 -allow_active_network_mounts -rolling &
[root@exa1cel01 dbserver_patch_19.190104]# tail -f nohup.out

NOTE    patchmgr release: 19.190104 (always check MOS 1553103.1 for the latest release of dbserver.patch.zip)
NOTE
NOTE    Database nodes will reboot during the update process.
NOTE
WARNING Do not interrupt the patchmgr session.
WARNING Do not resize the screen. It may disturb the screen layout.
WARNING Do not reboot database nodes during update or rollback.
WARNING Do not open logfiles in write mode and do not try to alter them.
************************************************************************************************************
2019-08-24 02:07:39 -0400 :Working: DO: Initiate prepare steps on node(s).
2019-08-24 02:07:52 -0400 :Working: DO: Check free space and verify SSH equivalence for the root user to exa1db01
2019-08-24 02:08:39 -0400 :SUCCESS: DONE: Check free space and verify SSH equivalence for the root user to exa1db01
2019-08-24 02:10:08 -0400 :SUCCESS: DONE: Initiate prepare steps on node(s).
2019-08-24 02:10:08 -0400 :Working: DO: Initiate update on 1 node(s).
2019-08-24 02:10:08 -0400 :Working: DO: dbnodeupdate.sh running a backup on 1 node(s).
2019-08-24 02:16:04 -0400 :SUCCESS: DONE: dbnodeupdate.sh running a backup on 1 node(s).
2019-08-24 02:16:04 -0400 :Working: DO: Initiate update on exa1db01
2019-08-24 02:16:04 -0400 :Working: DO: Get information about any required OS upgrades from exa1db01.
2019-08-24 02:16:14 -0400 :SUCCESS: DONE: Get information about any required OS upgrades from exa1db01.
2019-08-24 02:16:19 -0400 :Working: DO: dbnodeupdate.sh running an update step on exa1db01.
2019-08-24 02:28:13 -0400 :INFO : exa1db01 is ready to reboot.
2019-08-24 02:28:13 -0400 :SUCCESS: DONE: dbnodeupdate.sh running an update step on exa1db01.
2019-08-24 02:28:29 -0400 :Working: DO: Initiate reboot on exa1db01.
2019-08-24 02:28:56 -0400 :SUCCESS: DONE: Initiate reboot on exa1db01.
2019-08-24 02:28:56 -0400 :Working: DO: Waiting to ensure exa1db01 is down before reboot.
2019-08-24 02:30:08 -0400 :SUCCESS: DONE: Waiting to ensure exa1db01 is down before reboot.
2019-08-24 02:30:08 -0400 :Working: DO: Waiting to ensure exa1db01 is up after reboot.
2019-08-24 02:36:44 -0400 :SUCCESS: DONE: Waiting to ensure exa1db01 is up after reboot.
2019-08-24 02:36:44 -0400 :Working: DO: Waiting to connect to exa1db01 with SSH. During Linux upgrades this can take some time.
2019-08-24 02:57:50 -0400 :SUCCESS: DONE: Waiting to connect to exa1db01 with SSH. During Linux upgrades this can take some time.
2019-08-24 02:57:50 -0400 :Working: DO: Wait for exa1db01 is ready for the completion step of update.
2019-08-24 02:59:02 -0400 :SUCCESS: DONE: Wait for exa1db01 is ready for the completion step of update.
2019-08-24 02:59:08 -0400 :Working: DO: Initiate completion step from dbnodeupdate.sh on exa1db01
2019-08-24 03:10:05 -0400 :SUCCESS: DONE: Initiate completion step from dbnodeupdate.sh on exa1db01.
2019-08-24 03:10:49 -0400 :SUCCESS: DONE: Initiate update on exa1db01.
2019-08-24 03:10:55 -0400 :SUCCESS: DONE: Initiate update on 0 node(s)

Note that the removed packages may cause some features to stop working. In our case, the only problem was with LDAP authentication: to fix it, the removed packages had to be re-installed and re-configured after the patch. It is therefore highly advisable to review the whole list of custom RPM packages during the prerequisites phase to detect any features that might break during Exadata patching.
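Since custom packages can disappear without much fanfare, a before/after snapshot of the RPM inventory makes it easy to see exactly what the update dropped (a sketch using standard tools; the file names are arbitrary):

# Before the patch
rpm -qa | sort > /tmp/rpms_before.txt
# After the patch
rpm -qa | sort > /tmp/rpms_after.txt
# Packages present before but not after: candidates for re-install and re-configuration
comm -23 /tmp/rpms_before.txt /tmp/rpms_after.txt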
[root@exa1cel01 dbserver_patch_19.190104]# cat ~/dbs_group
exa1db03

[root@exa1cel01 dbserver_patch_19.190104]# nohup ./patchmgr -dbnodes ~/dbs_group -upgrade -iso_repo /tmp/SAVE/p29181093_*_Linux-x86-64.zip -target_version 18.1.12.0.0.190111 -allow_active_network_mounts -rolling &
[root@exa1cel01 dbserver_patch_19.190104]# tail -f nohup.out
...
2019-08-25 05:49:54 -0400 :Working: DO: dbnodeupdate.sh running an update step on exa1db03.
2019-08-25 06:02:10 -0400 :INFO : exa1db03 is ready to reboot.
2019-08-25 06:02:10 -0400 :SUCCESS: DONE: dbnodeupdate.sh running an update step on exa1db03.
2019-08-25 06:02:22 -0400 :Working: DO: Initiate reboot on exa1db03.
2019-08-25 06:02:48 -0400 :SUCCESS: DONE: Initiate reboot on exa1db03.
2019-08-25 06:02:48 -0400 :Working: DO: Waiting to ensure exa1db03 is down before reboot.
2019-08-25 06:04:10 -0400 :SUCCESS: DONE: Waiting to ensure exa1db03 is down before reboot.
2019-08-25 06:04:10 -0400 :Working: DO: Waiting to ensure exa1db03 is up after reboot.
2019-08-25 06:10:44 -0400 :SUCCESS: DONE: Waiting to ensure exa1db03 is up after reboot.
2019-08-25 06:10:44 -0400 :Working: DO: Waiting to connect to exa1db03 with SSH. During Linux upgrades this can take some time.
2019-08-25 06:31:06 -0400 :SUCCESS: DONE: Waiting to connect to exa1db03 with SSH. During Linux upgrades this can take some time.
2019-08-25 06:31:06 -0400 :Working: DO: Wait for exa1db03 is ready for the completion step of update.
2019-08-25 06:32:10 -0400 :SUCCESS: DONE: Wait for exa1db03 is ready for the completion step of update.
2019-08-25 06:32:16 -0400 :Working: DO: Initiate completion step from dbnodeupdate.sh on exa1db03

SUMMARY OF ERRORS FOR exa1db03:

2019-08-25 06:58:43 -0400 :ERROR : There was an error during the completion step on exa1db03.
2019-08-25 06:58:43 -0400 :ERROR : Please correct the error and run "/u01/dbnodeupdate.patchmgr/dbnodeupdate.sh -c" on exa1db03 to complete the update.
...
- Checked the password file and it did not include the "CRSUSER__ASM_001" user:

[oracle@exa1db03 pythian]# . oraenv <<< +ASM3
[oracle@exa1db03 pythian]# asmcmd
ASMCMD> lspwusr
Username sysdba sysoper sysasm
     SYS   TRUE    TRUE   TRUE

- Re-created the password file:

[oracle@exa1db03 pythian]# asmcmd pwget --asm
+DATA/orapwASM
ASMCMD> pwcopy +DATA/orapwASM /tmp/asm.pwd
copying +DATA/orapwASM -> /tmp/asm.pwd
ASMCMD> pwcreate --asm +DATA/orapwASMnew 'welcome@1' -f
ASMCMD> pwget --asm
+DATA/orapwasmnew
ASMCMD> lspwusr
Username sysdba sysoper sysasm
     SYS   TRUE    TRUE  FALSE
ASMCMD> orapwusr --grant sysasm SYS
ASMCMD> orapwusr --add ASMSNMP
Enter password: *********          <<< welcome@1
ASMCMD> orapwusr --grant sysdba ASMSNMP
ASMCMD> lspwusr
Username sysdba sysoper sysasm
     SYS   TRUE    TRUE   TRUE
 ASMSNMP   TRUE   FALSE  FALSE

- Find out the user name and password that CRSD uses to connect:

[oracle@exa1db03 pythian]# crsctl query credmaint -path ASM/Self
Path  Credtype  ID  Attrs

Note: credmaint is an internal, undocumented option; it is used by internal scripts when configuring various services.

- Dump the OCR contents as below:

[oracle@exa1db03 pythian]# $GRID_HOME/bin/ocrdump /tmp/ocr.dmp
PROT-310: Not all keys were dumped due to permissions.
[oracle@exa1db03 pythian]# vi /tmp/ocr.dmp

- Search for the entry below:

[SYSTEM.ASM.CREDENTIALS.USERS.CRSUSER__ASM_001]
ORATEXT : 3889b62c95b64f9bffae7aa8eaa6001d:oracle          <<< this is the password

ASMCMD> orapwusr --add CRSUSER__ASM_001
Enter password: *****************************          <<< the ORATEXT value from above
ASMCMD> lspwusr
        Username sysdba sysoper sysasm
             SYS   TRUE    TRUE   TRUE
         ASMSNMP   TRUE   FALSE  FALSE
CRSUSER__ASM_001  FALSE   FALSE  FALSE
ASMCMD> orapwusr --grant sysdba CRSUSER__ASM_001
ASMCMD> orapwusr --grant sysasm CRSUSER__ASM_001
ASMCMD> lspwusr
        Username sysdba sysoper sysasm
             SYS   TRUE    TRUE   TRUE
         ASMSNMP   TRUE   FALSE  FALSE
CRSUSER__ASM_001   TRUE   FALSE   TRUE

[oracle@exa1db03 pythian]# srvctl config asm
ASM home:
Password file: +DATA/orapwasmnew
Backup of Password file:
ASM listener: LISTENER
ASM instance count: 3
Cluster ASM listener: ASMNET1LSNR_ASM

NOTE: Type the password obtained from the OCR dump step above; copy and paste won't work.

In a Flex ASM environment, we should also add the ORACLE_001 user after recreating the password file (the ORATEXT of SYSTEM.ASM.CREDENTIALS.USERS.oracle_001 from ocrdump):

[oracle@exa1db03 pythian]# crsctl get credmaint -path /ASM/Self/aa2cab13f990ef21bfaa8db7080b462d/oracle -credtype userpass -id 0 -attr user
oracle_001
[oracle@exa1db03 pythian]# crsctl get credmaint -path /ASM/Self/aa2cab13f990ef21bfaa8db7080b462d/oracle -credtype userpass -id 0 -attr passwd
sx29ipxWu8MIPENqral1znxE94VtD
[oracle@exa1db03 pythian]# asmcmd lspwusr
        Username sysdba sysoper sysasm
             SYS   TRUE    TRUE   TRUE
         ASMSNMP   TRUE   FALSE  FALSE
CRSUSER__ASM_001   TRUE   FALSE   TRUE
[oracle@exa1db03 pythian]# asmcmd orapwusr --add oracle_001
Enter password: *****************************

Note: In this case the password is sx29ipxWu8MIPENqral1znxE94VtD, so it has to be typed at the "Enter password:" prompt above.

[oracle@exa1db03 pythian]# asmcmd orapwusr --grant sysdba oracle_001
[oracle@exa1db03 pythian]# asmcmd lspwusr
        Username sysdba sysoper sysasm
             SYS   TRUE    TRUE   TRUE
         ASMSNMP   TRUE   FALSE  FALSE
CRSUSER__ASM_001   TRUE   FALSE   TRUE
      ORACLE_001   TRUE   FALSE  FALSE

The ORACLE_001 user provides the authentication for the database instances in a Flex ASM configuration.
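For reference, the credential entry can be pulled straight out of the OCR dump generated above without paging through it in vi (a grep sketch; /tmp/ocr.dmp is the file produced by the ocrdump step):

# Print the CRSUSER__ASM_001 key and the ORATEXT line that follows it
grep -A1 'SYSTEM\.ASM\.CREDENTIALS\.USERS\.CRSUSER__ASM_001' /tmp/ocr.dmp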
[root@exa1db03 pythian]# /u01/dbnodeupdate.patchmgr/dbnodeupdate.sh -c -s
  (*) 2019-08-25 07:12:50: Initializing logfile /var/log/cellos/dbnodeupdate.log
...
Continue ? [y/n] y
  (*) 2019-08-25 07:13:09: Unzipping helpers (/u01/dbnodeupdate.patchmgr/dbupdate-helpers.zip) to /opt/oracle.SupportTools/dbnodeupdate_helpers
  (*) 2019-08-25 07:13:12: Collecting system configuration settings. This may take a while...

Active Image version   : 18.1.12.0.0.190111
Active Kernel version  : 4.1.12-94.8.10.el6uek
Active LVM Name        : /dev/mapper/VGExaDb-LVDbSys1
Inactive Image version : 12.1.2.3.5.170418
Inactive LVM Name      : /dev/mapper/VGExaDb-LVDbSys2
Current user id        : root
Action                 : finish-post (validate image status, fix known issues, cleanup, relink and enable crs to auto-start)
Shutdown stack         : Yes (Currently stack is up)
Logfile                : /var/log/cellos/dbnodeupdate.log (runid: 250819071250)
Diagfile               : /var/log/cellos/dbnodeupdate.250819071250.diag
Server model           : ORACLE SERVER X6-2
dbnodeupdate.sh rel.   : 19.190104 (always check MOS 1553103.1 for the latest release of dbnodeupdate.sh)

The following known issues will be checked for but require manual follow-up:
  (*) - Yum rolling update requires fix for 11768055 when Grid Infrastructure is below 11.2.0.2 BP12

Continue ? [y/n] y
  (*) 2019-08-25 07:15:43: Verifying GI and DB's are shutdown
  (*) 2019-08-25 07:15:44: Shutting down GI and db
...
  (*) 2019-08-25 07:24:16: All post steps are finished.
[root@exa1db03 pythian]#
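After the completion step, the node's image status is worth a quick sanity check with the standard Exadata utilities (shown here as a suggestion; this is not part of dbnodeupdate.sh):

# Confirm the active image version and status on the node
imageinfo
# Review the history of image updates applied to the node
imagehistory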
Error: "kfod op=cellconfig Died at crsutils.pm line 15183":
2019/08/25 23:40:25 CLSRSC-595: Executing upgrade step 4 of 19: 'GenSiteGUIDs'.
2019/08/25 23:40:25 CLSRSC-180: An error occurred while executing the command '/u01/app/18.1.0.0/grid/bin/kfod op=cellconfig'
Died at /u01/app/18.1.0.0/grid/crs/install/crsutils.pm line 15183.
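The detailed trace in the next block was located by searching the Grid Infrastructure installation logs for the failing command (a sketch; the exact log names vary by node and timestamp):

# Find which GI install logs mention the failing kfod invocation
grep -rl "kfod op=cellconfig" /u01/app/18.1.0.0/grid/install/ 2>/dev/null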
> CLSRSC-595: Executing upgrade step 4 of 19: 'GenSiteGUIDs'.
>End Command output
2019-08-25 23:40:25: CLSRSC-595: Executing upgrade step 4 of 19: 'GenSiteGUIDs'.
2019-08-25 23:40:25: Site name for Cluster: exa1-cluster
2019-08-25 23:40:25: It is non-extended cluster. Get node list from NODE_NAME_LIST, and site from cluster name.
2019-08-25 23:40:25: NODE_NAME_LIST: exa1db01,exa1db02,exa1db03,exa1db04
2019-08-25 23:40:25: The site for node exa1db01 is: exa1-cluster
2019-08-25 23:40:25: The site for node exa1db02 is: exa1-cluster
2019-08-25 23:40:25: The site for node exa1db03 is: exa1-cluster
2019-08-25 23:40:25: The site for node exa1db04 is: exa1-cluster
2019-08-25 23:40:25: leftVersion=12.1.0.2.0; rightVersion=12.2.0.0.0
2019-08-25 23:40:25: [12.1.0.2.0] is lower than [12.2.0.0.0]
2019-08-25 23:40:25: ORACLE_HOME = /u01/app/18.1.0.0/grid
2019-08-25 23:40:25: Running as user grid: /u01/app/18.1.0.0/grid/bin/kfod op=cellconfig
2019-08-25 23:40:25: Removing file /tmp/XD5jvGA_2v
2019-08-25 23:40:25: Successfully removed file: /tmp/XD5jvGA_2v
2019-08-25 23:40:25: pipe exit code: 256
2019-08-25 23:40:25: /bin/su exited with rc=1
2019-08-25 23:40:25: kfod op=cellconfig rc: 1
2019-08-25 23:40:25: execute 'kfod op=cellconfig' failed with error: Error 49802 initializing ADR
ERROR!!! could not initialize the diag context
2019-08-25 23:40:25: Executing cmd: /u01/app/18.1.0.0/grid/bin/clsecho -p has -f clsrsc -m 180 '/u01/app/18.1.0.0/grid/bin/kfod op=cellconfig'
2019-08-25 23:40:25: Executing cmd: /u01/app/18.1.0.0/grid/bin/clsecho -p has -f clsrsc -m 180 '/u01/app/18.1.0.0/grid/bin/kfod op=cellconfig'
2019-08-25 23:40:25: Command output:
> CLSRSC-180: An error occurred while executing the command '/u01/app/18.1.0.0/grid/bin/kfod op=cellconfig'
The error could be reproduced manually by running kfod as the grid user:

[grid@exa1db04 pythian]$ /u01/app/18.1.0.0/grid/bin/kfod op=cellconfig
Error 49802 initializing ADR
ERROR!!! could not initialize the diag context
Tracing the binary showed where it probes for its trace and diagnostic directories before initializing ADR:

[grid@exa1db04 pythian]$ strace /u01/app/18.1.0.0/grid/bin/kfod
execve("/u01/app/18.1.0.0/grid/bin/kfod", ["/u01/app/18.1.0.0/grid/bin/kfod"], [/* 21 vars */]) = 0
brk(0) = 0x15b9000
....
access("/u01/app/grid/crsdata/debug", W_OK) = 0
stat("/u01/app/grid/crsdata/debug/kfod_trace_needed.txt", 0x7ffecd968190) = -1 ENOENT (No such file or directory)
The kfod trace and diagnostic files on this node live under the following locations:

/u01/app/grid/crsdata/debug/kfod*
/u01/app/grid/diag/kfod/exa1db04/kfod/log/*
/u01/app/18.1.0.0/grid/log/diag/
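A first check is whether the grid user can actually write to those locations, since kfod runs as grid (an inspection sketch; adjust the paths per node):

# Check ownership and permissions on the trace/diag locations kfod writes to
ls -ld /u01/app/grid/crsdata/debug /u01/app/18.1.0.0/grid/log/diag
ls -l /u01/app/grid/diag/kfod/exa1db04/kfod/log/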
Once the diagnostic locations were sorted out, kfod returned the cell configuration again:

[grid@exa1db04 grid]$ /u01/app/18.1.0.0/grid/bin/kfod op=cellconfig
cell_data_ip192.168.10.54;192.168.10.55cell_nameexa1cel07cell_management_ip10.108.106.12cell_id1710NM781Vcell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.52;192.168.10.53cell_nameexa1cel06cell_management_ip10.108.106.11cell_id1710NM781Gcell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.50;192.168.10.51cell_nameexa1cel05cell_management_ip10.108.106.10cell_id1710NM781Jcell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.48;192.168.10.49cell_nameexa1cel04cell_management_ip10.108.106.9cell_id1710NM7826cell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.46;192.168.10.47cell_nameexa1cel03cell_management_ip10.108.106.8cell_id1710NM780Mcell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.44;192.168.10.45cell_nameexa1cel02cell_management_ip10.108.106.7cell_id1710NM781Dcell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name
cell_data_ip192.168.10.42;192.168.10.43cell_nameexa1cel01cell_management_ip10.108.106.6cell_id1710NM7815cell_versionOSS_18.1.12.0.0_LINUX.X64_190111cell_make_modelOracle Corporation ORACLE SERVER X6-2L_EXTREME_FLASHdiscovery_statusreachablecell_site_id00000000-0000-0000-0000-000000000000cell_site_namecell_rack_id00000000-0000-0000-0000-000000000000cell_rack_name

With kfod working, rootupgrade.sh was re-run and completed, and the Clusterware versions confirmed the upgrade:

[root@exa1db04 ]# /u01/app/18.1.0.0/grid/rootupgrade.sh
Check /u01/app/18.1.0.0/grid/install/root_exa1db04.example.com_2019-08-26_00-50-48-539460928.log for the output of root script

[root@exa1db04 ~]# sudo su -
[grid@exa1db04 ~]$ . oraenv <<< +ASM4
[grid@exa1db04 ~]$ crsctl query crs softwareversion
Oracle Clusterware version on node [exa1db04] is [18.0.0.0.0]
[grid@exa1db04 ~]$ crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [18.0.0.0.0]

Patching an Exadata is hardly what it used to be. Even so, issues still happen, and most of the errors described here were fixed by analyzing the log files generated by "patchmgr" and "dbnodeupdate.sh". So even when something goes wrong, there is enough information available to troubleshoot the problem and proceed with the patching.