Oracle Silent Mode, Part 6: Removing a Node From a 10.2 RAC

This sixth post describes how to remove a node from a 10.2 RAC cluster in silent mode. It differs from the associated documentation in that it shows how to remove a node even if it has become unavailable for any reason, including an error by a DBA or an SA.
Here is the complete series agenda:
- Installation of 10.2 And 11.1 Databases
- Patches of 10.2 And 11.1 databases
- Cloning Software and databases
- Install a 10.2 RAC Database
- Add a Node to a 10.2 RAC database
- Remove a Node from a 10.2 RAC database (this post!)
- Install a 11.1 RAC Database
- Add a Node to a 11.1 RAC database
- Remove a Node from a 11.1 RAC database
- A ton of other stuff you should know
Now for the substance of this part.
Back Up the Voting Disk and the OCR
A good way to start is usually to think about the worst thing that could happen and how to back out if you mess something up. There is probably less risk when you remove a node than when you add a new one. However, make sure you have a backup of the voting disk and the OCR. To proceed, run the dd and ocrconfig commands as below:
rac-server5$ cd $ORA_CRS_HOME/bin
rac-server5$ ./crsctl query css votedisk
rac-server5$ mkdir -p /home/oracle/backup
rac-server5$ dd if=/dev/sdb5 \
             of=/home/oracle/backup/votedisk.bak \
             bs=4k
rac-server5$ ./ocrconfig -showbackup
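The clusterware also takes automatic OCR backups, which ocrconfig -showbackup lists. If you prefer to keep a manual copy alongside the voting disk backup, you can export the OCR as root; this is a minimal sketch assuming the clusterware home used in this series (/u01/app/crs), and the target file name is just an example:
rac-server5$ su -
rac-server5# cd /u01/app/crs/bin
rac-server5# ./ocrconfig -export /home/oracle/backup/ocr.bak
rac-server5# exit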
Remove the Services from the Instance You Plan to Delete
Using srvctl modify, update the services so that they don't contain the instance you plan to remove:
rac-server5$ # The command below lists the services
rac-server5$ srvctl config service -d ORCL
rac-server5$ # The command below modifies the OLTP service
rac-server5$ srvctl modify service -d ORCL \
             -s OLTP -n \
             -i "ORCL1,ORCL2,ORCL3,ORCL4"
Remove the Instance from the Node with DBCA
DBCA is probably the easiest and fastest way to delete an instance from a RAC database. If we assume you want to remove rac-server5 from the RAC configuration we've built in post 4 and post 5, you'll start by running the command below from any of the servers:
rac-server5$ . oraenv
ORCL
rac-server5$ dbca -silent -deleteInstance \
             -gdbName ORCL \
             -instanceName ORCL5 \
             -sysDBAPassword xxx
You need to use the SYS password, or at least know the password of a SYSDBA user.
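Once DBCA completes, a quick way to double-check that the instance is gone is to look at the database configuration and its status; both commands below can be run from any remaining node:
rac-server1$ srvctl config database -d ORCL
rac-server1$ srvctl status database -d ORCL
ORCL5 should no longer appear in either output.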
Manually Remove the Instance from the Node
In some situations, it’s useful to know how to manually delete an instance from a database. In that case, the steps to follow are these.
Step 1: Stop the instance you want to remove *
Use srvctl as below. If the server is gone, it's very likely you won't have to stop the instance.
srvctl stop instance -d ORCL -i ORCL5
Step 2: Remove the instance from the clusterware
Once the instance is stopped, you can delete it from the database's list of instances by running the command below from any one of the cluster nodes:
rac-server1$ srvctl remove instance -d ORCL -i ORCL5
Step 3: Remove the init.ora and password file *
Connect to the server you are removing and delete those two files:
rac-server5$ cd $ORACLE_HOME/dbs
rac-server5$ rm initORCL5.ora
rac-server5$ rm orapwORCL5
Step 4: Remove the parameters from the spfile
For the instance you’ve stopped, display the parameters that are set at the instance level with the query below on any of the remaining instances:
SQL> col name format a30
SQL> col value format a78
SQL> set lines 120
SQL> select name, value from v$spparameter where sid='ORCL5';
Reset all those parameters from the spfile as below:
SQL> alter system reset thread scope=spfile sid='ORCL5';
SQL> alter system reset instance_number scope=spfile sid='ORCL5';
SQL> alter system reset local_listener scope=spfile sid='ORCL5';
SQL> alter system reset undo_tablespace scope=spfile sid='ORCL5';
Step 5: Remove the UNDO tablespace
From Step 4, you should have figured out which UNDO tablespace the instance was using. You can check the tablespace and drop it:
SQL> drop tablespace UNDOTBS5 including contents and datafiles;
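If you want to be sure before running the drop above, you can confirm that no remaining instance still references that tablespace; the queries below assume the naming used in this series (UNDOTBS5 for ORCL5):
SQL> select sid, value from v$spparameter where name='undo_tablespace';
SQL> select tablespace_name, contents from dba_tablespaces where tablespace_name='UNDOTBS5';
None of the remaining instances should point to UNDOTBS5.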
Step 6: Drop the redo log thread
From Step 4, you should have figured out what the instance thread was. You can disable it and drop the associated redo log groups:
SQL> alter database disable thread 5;
SQL> select group# from v$log where thread#=5;
SQL> alter database drop logfile group 13;
SQL> alter database drop logfile group 14;
SQL> alter database drop logfile group 15;
Step 7: Change the TNS aliases
The easiest way to manage the network files is to have the exact same files on each one of the servers. The entries you'd want to have in the tnsnames.ora file are:
- LISTENER_<server_name>, one for each of the servers. These aliases point to the VIP end of each server's listener and are used in the local_listener parameter of each instance.
- LISTENERS_<gDbName>, an alias that points to all the listener VIPs, and is used by the remote_listener parameter.
- <gDbName>, an alias that points to all the listener VIPs to connect to the database.
- <Instance_Name>, aliases that point to the local listener and specify the instance_name parameter to force the connection to a specific instance.
Edit the tnsnames.ora files on all the nodes and remove the rac-server5 VIP entry from the ORCL and LISTENERS_ORCL aliases. Also remove the ORCL5 and LISTENER_ORCL5 aliases.
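As an illustration, here is what the two shared aliases could look like once the rac-server5 entries are gone. The VIP host names and the port are assumptions based on the configuration built in the previous posts; adjust them to your own naming:
LISTENERS_ORCL =
  (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server2-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server3-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server4-vip)(PORT = 1521))
  )

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server1-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server2-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server3-vip)(PORT = 1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST = rac-server4-vip)(PORT = 1521))
    (LOAD_BALANCE = yes)
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = ORCL)
    )
  )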
Step 8: Delete the administration directories *
Locate the various administration directories for the instance you are removing, and remove them from the server:
rac-server5$ cd /u01/app/oracle/admin
rac-server5$ rm -rf ORCL
Step 9: Remove the instance prefix *
Edit the oratab file and delete the entry for the RAC database on the node you are removing.
Remove ASM from the Node
None of the assistants will give you a hand with this, but deleting an ASM instance is straightforward. It consists of (1) stopping the ASM instance*; (2) removing the ASM instance from the OCR; and (3) deleting the ASM init.ora file*. Execute the commands below:
rac-server1$ srvctl stop asm -n rac-server5
rac-server1$ su -
rac-server1# srvctl remove asm -n rac-server5
rac-server1# exit
rac-server1$ ssh rac-server5
rac-server5$ rm $ORACLE_HOME/dbs/init+ASM5.ora
Remove the Listener from the Node
The only supported way to remove the listener with 10g is to use NETCA. If the server is still part of the cluster, up and running, you can just run the command below:
export DISPLAY=:1
rac-server5$ netca /silent /deinst /nodeinfo rac-server5
If you don't have all the nodes up and running, this command will fail, and the only way to remove the listener with 10.2, even if not supported, will be to run crs_unregister as below from one of the remaining nodes. (Does anybody want to comment on that practice?):
rac-server1$ cd $ORA_CRS_HOME/bin
rac-server1$ ./crs_stat | grep lsnr
rac-server1$ ./crs_unregister \
             ora.rac-server5.LISTENER_RAC-SERVER5.lsnr
Be careful! It works with the listeners, but it won't work with any other Oracle resource. If you run that command and it fails for any reason, you'll have to restore the OCR.
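For the record, a restore would rely on the automatic backups listed by ocrconfig -showbackup. A minimal sketch, assuming the clusterware home used in this series; the backup file name below is only an example and must be replaced with one reported by -showbackup:
rac-server1$ su -
rac-server1# cd /u01/app/crs/bin
rac-server1# ./crsctl stop crs              # stop the clusterware on every node first
rac-server1# ./ocrconfig -showbackup
rac-server1# ./ocrconfig -restore /u01/app/crs/cdata/crs/backup00.ocr   # use a file from -showbackup
rac-server1# ./crsctl start crs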
Remove the Database Software
When removing the software, there are two separate things to do: (1) update the Oracle Inventory on all the remaining nodes so that the installer no longer associates the software with the removed server; and (2) remove the software from that node itself*. This second operation is required only if the node is to be reused.
Update the inventory of the remaining nodes
Oracle Universal Installer (OUI) can do this from any of the remaining nodes. What you'll declare in that specific case is that only rac-server1, rac-server2, rac-server3, and rac-server4 will still be part of the clustered installation. In order to update the inventories of these nodes, run:
rac-server1$ export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
rac-server1$ cd $ORACLE_HOME/oui/bin
rac-server1$ ./runInstaller -silent -updateNodeList \
             ORACLE_HOME=$ORACLE_HOME \
             "CLUSTER_NODES={rac-server1,rac-server2,rac-server3,rac-server4}"
Once you’ve updated the inventory, OUI will never again prompt you for rac-server5 when used from any of these four nodes.
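If you want to verify, the node list stored in the central inventory can be checked directly; the inventory location below is the one used throughout this series:
rac-server1$ cat /etc/oraInst.loc
rac-server1$ grep "NODE NAME" /u01/app/oraInventory/ContentsXML/inventory.xml
rac-server5 should no longer be listed for the database home.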
Delete the Database Software from the node to be removed *
One of the limits of the OUI with RAC is that you cannot use it for one node only. In a way that's probably a good thing, as it prevents you from applying a Patch Set on a single node and messing everything up. Unfortunately, when you do want to remove the software from one server only, you have to work around that limit. The way to do it is to update the inventory of that node only, so that it thinks it is the only node of the cluster. Once you've done that, you'll be able to run the OUI with various syntaxes such as detachHome or deinstall. To change the inventory on the node to be removed, connect to it and run:
rac-server5$ export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
rac-server5$ cd $ORACLE_HOME/oui/bin
rac-server5$ ./runInstaller -silent -updateNodeList \
             ORACLE_HOME=$ORACLE_HOME \
             "CLUSTER_NODES={rac-server5}" \
             -local
Once the inventory is updated, every runInstaller command you run will affect only the local ORACLE_HOME (assuming it is not shared). You can then, as described in the documentation, remove the ORACLE_HOME you want and withdraw it from the cluster:
rac-server5$ cat /etc/oraInst.loc
rac-server5$ cd /u01/app/oraInventory/ContentsXML
rac-server5$ grep NAME inventory.xml
rac-server5$ export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
rac-server5$ cd $ORACLE_HOME/oui/bin
rac-server5$ ./runInstaller -silent -deinstall -removeallfiles \
             "REMOVE_HOMES={/u01/app/oracle/product/10.2.0/db_1}"
You can also detach the ORACLE_HOME from the inventory with the -detachHome syntax as below:
rac-server5$ cat /etc/oraInst.loc
rac-server5$ cd /u01/app/oraInventory/ContentsXML
rac-server5$ grep NAME inventory.xml
rac-server5$ export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
rac-server5$ cd $ORACLE_HOME/oui/bin
rac-server5$ ./runInstaller -silent -detachHome \
             ORACLE_HOME="/u01/app/oracle/product/10.2.0/db_1" \
             ORACLE_HOME_NAME="OraDB102Home1"
This second approach allows you to keep the ORACLE_HOME and delete its contents only if and when you want:
rac-server5$ rm -rf /u01/app/oracle/product/10.2.0/db_1
If that's the last database software installed on the server, you can delete the oratab file too:
rac-server5$ rm /etc/oratab
Remove the ONS Configuration
In order to remove the ONS subscription from the server, you can first query the ons.config file as below:
rac-server5$ cd $ORA_CRS_HOME/opmn/conf
rac-server5$ grep remoteport ons.config
If you cannot access the server because it is not available anymore, you can also dump the OCR with ocrdump and look at the ONS configuration in that file (a sketch follows the command below). Once you know which port has to be deleted from the configuration, remove it from the cluster registry from any of the nodes:
rac-server1$ cd $ORA_CRS_HOME/bin
rac-server1$ ./racgons remove_config rac-server5:6200
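If you had to go through ocrdump because the server is gone, here is a minimal sketch; the dump file name is arbitrary, and since the exact ONS key names vary, grepping for the server name is usually the quickest way to spot the port:
rac-server1$ su -
rac-server1# cd /u01/app/crs/bin
rac-server1# ./ocrdump /tmp/ocr.dmp
rac-server1# grep -i rac-server5 /tmp/ocr.dmp
rac-server1# exit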
Remove the NodeApps
The nodeapps include the GSD, the ONS, and the VIP. You can simply remove them from any of the nodes with the srvctl remove nodeapps command:
rac-server1$ srvctl stop nodeapps -n rac-server5
rac-server1$ su -
rac-server1# srvctl remove nodeapps -n rac-server5
You can check that the nodeapps have been removed by querying their status with srvctl or by checking the resources named ora.rac-server5 as below:
rac-server1$ cd $ORA_CRS_HOME/bin
rac-server1$ ./crs_stat | grep "ora.rac-server5"
Remove the Clusterware Software
Removing the clusterware software is very similar to removing the database software. There are two separate things to do: (1) update the Oracle Inventory on all the nodes that remain; and (2) remove the clusterware from that node*. This second operation is required only if the node is to be reused.
Update the inventory of the remaining nodes
The syntax is identical to that of the database removal, except you have to add the CRS=TRUE directive as below:
rac-server1$ export ORA_CRS_HOME=/u01/app/crs
rac-server1$ cd $ORA_CRS_HOME/oui/bin
rac-server1$ ./runInstaller -silent -updateNodeList \
             ORACLE_HOME=$ORA_CRS_HOME \
             "CLUSTER_NODES={rac-server1,rac-server2,rac-server3,rac-server4}" \
             CRS=TRUE
Once you’ve updated the inventory, OUI will never again prompt you for rac-server5 when used from any of those four nodes.
Delete the Clusterware Software from the node to be removed *
To delete the clusterware from the node you want to remove, you must first stop and disable it, if you haven't already:
rac-server5$ su -
rac-server5# cd /u01/app/crs/bin
rac-server5# ./crsctl stop crs
rac-server5# ./crsctl disable crs
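If you want to make sure the stack is really down before touching the inventory, crsctl check crs, run from the same directory, should report that the CSS, CRS, and EVM daemons can no longer be reached:
rac-server5# ./crsctl check crs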
Then, update the inventory as below:
rac-server5$ export ORA_CRS_HOME=/u01/app/crs
rac-server5$ cd $ORA_CRS_HOME/oui/bin
rac-server5$ ./runInstaller -silent -updateNodeList \
             ORACLE_HOME=$ORA_CRS_HOME \
             "CLUSTER_NODES={rac-server5}" \
             CRS=TRUE \
             -local
Finally, run the OUI with the deinstall and CRS=TRUE directives as below:
rac-server5$ cat /etc/oraInst.loc
rac-server5$ cd /u01/app/oraInventory/ContentsXML
rac-server5$ grep NAME inventory.xml
rac-server5$ export ORA_CRS_HOME=/u01/app/crs
rac-server5$ cd $ORA_CRS_HOME/oui/bin
rac-server5$ ./runInstaller -silent -deinstall -removeallfiles \
             "REMOVE_HOMES={/u01/app/crs}" \
             CRS=TRUE
Additional cleanup *
If the node you are removing is still accessible and you plan to reuse it (say, for another cluster), there is some additional cleanup to do; a sketch of the corresponding commands follows the list:
- delete any files that remain in the Clusterware Home
- delete any specific entries in the oracle user's .profile file
- delete any specific entries in the crontab
- delete the oraInst.loc file and the inventory
- replace the inittab with the one backed up before the clusterware install: inittab.no_crs
- delete the /var/tmp/.oracle directory
- delete the startup/shutdown services, i.e., with Oracle or Red Hat Enterprise Linux, all the /etc/init.d/init* and /etc/rc?.d/*init.crs files
- delete the clusterware and ocr.loc files, i.e., with Oracle or Red Hat Enterprise Linux, the /etc/oracle directory
- delete the storage-specific configuration files to prevent altering the shared storage from that node (/etc/fstab for NFS, udev rules for ASM or raw devices)
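Here is a sketch of what that cleanup could look like on Oracle or Red Hat Enterprise Linux, using the paths from this series and assuming the inittab backup sits next to /etc/inittab. The .profile, crontab, and storage files are better reviewed by hand, so they are only hinted at in a comment; double-check every path before running any of this as root:
rac-server5# rm -rf /u01/app/crs                      # leftover files in the Clusterware Home
rac-server5# rm -rf /u01/app/oraInventory             # central inventory
rac-server5# rm -f /etc/oraInst.loc
rac-server5# cp /etc/inittab.no_crs /etc/inittab      # inittab backed up before the CRS install
rac-server5# rm -rf /var/tmp/.oracle
rac-server5# rm -f /etc/init.d/init.crs /etc/init.d/init.crsd \
             /etc/init.d/init.cssd /etc/init.d/init.evmd
rac-server5# rm -f /etc/rc?.d/*init.crs
rac-server5# rm -rf /etc/oracle                       # clusterware files and ocr.loc
rac-server5# # edit ~oracle/.profile, the oracle crontab, /etc/fstab and the udev rules by hand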
Remove the Node from the Cluster Configuration
Everything has been removed, but if you connect to any of the remaining nodes and run olsnodes, you'll see the server is still registered in the OCR:
rac-server1$ cd /u01/app/crs/bin
rac-server1$ ./olsnodes -n -i
rac-server1 1 rac-server1-priv rac-server1-vip
rac-server2 2 rac-server2-priv rac-server2-vip
rac-server3 3 rac-server3-priv rac-server3-vip
rac-server4 4 rac-server4-priv rac-server4-vip
rac-server5 5 rac-server5-priv
In order to remove that server from the OCR, connect as root on any of the remaining nodes and use its name and number with the rootdeletenode.sh script as below:
rac-server1$ su -
rac-server1# cd /u01/app/crs/install
rac-server1# ./rootdeletenode.sh rac-server5,5
rac-server1# exit
rac-server1$ cd /u01/app/crs/bin
rac-server1$ ./olsnodes -n -i
More to come
This is it! In these last three posts, you've installed a 10.2 RAC, added one node, and removed one node without any X display. Any comments so far? I hope you'll agree that it's pretty easy once you're used to it. You'll probably start (if it wasn't the case before) to leverage RAC's ability to scale up and down according to your needs.
In the parts to follow, we’ll do the same with an 11.1 RAC.
* If you cannot access the server you are removing, don't run this step.