Oracle RAC on the Cloud, Part 3

Nov 13, 2013 / By Marc Fielding

This is part 3 of a multi-part series on getting Oracle RAC running in a cloud environment. In part 1, we set up an NFS server for shared storage. In part 2, we set up the OS components for each RAC server. Now we finish up the OS configuration and move on to the Oracle grid infrastructure.

Passwordless SSH, take two

Now that we have oracle users on both rac01 and rac02, we need to configure passwordless SSH between them. (It’s also possible to have the installer set this up, but I prefer to do it myself.)

On rac01-pub, as oracle:

cd ~/.ssh
scp rac02-pub:$PWD/id_rsa.pub rac02.pub
(enter the oracle user's password, and confirm the host key addition)
cat rac02.pub >> authorized_keys

And on rac02-pub, again as oracle:

cd ~/.ssh
scp rac01-pub:$PWD/id_rsa.pub rac01.pub
(there shouldn't be a password prompt, but you may need to confirm the host key addition)
cat rac01.pub >> authorized_keys
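
To verify that passwordless SSH works in both directions before the installer checks it for us, a quick loop from either node should print both hostnames without any password prompts:

for host in rac01-pub rac02-pub; do ssh $host hostname; done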

Getting RAM for the install

Before we run the Oracle installer, we should expand the physical RAM for each machine. This can be done from the Gandi control panel for each server. When I first tried this I got a quota error, and had to raise a support ticket (and wait for a response) to get the quota raised. A second issue with the RAM is that the VM doesn’t see the full amount allocated: when I tried firing up a 4GB instance, Linux only saw 3667716k available, and the Oracle installer promptly complained about insufficient memory.

So instead of 4096MB of memory, we’re going to adjust rac01 and rac02 to have 4800MB. After adjusting in the control panel, the operation may show as complete within a minute or so, but in my experience the server didn’t consistently pick up the extra memory. So, while logged onto rac01 as oracle, have a look:

for host in rac01-pub rac02-pub; do echo $host; ssh $host free; done

If each node shows 4388612k of total memory, you’re golden. Otherwise, reboot the nodes.
(And yes, 700MB seems like an awful lot of memory to simply not be available to the OS; I’m still wondering what’s using the space.)
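
If you're curious where the missing memory went, one place to look (no guarantees on a Gandi kernel) is the kernel's boot messages, which report how much memory the kernel sees and how much it reserves for itself:

dmesg | grep -i memory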

Getting ready for the installer

By now the Oracle software download should be complete, and we need to give the downloaded files .zip extensions and install an unzip utility to handle them. (Note to Oracle packagers: unzip isn’t _all_ that common in the Linux world, and gzip provides better compression anyway. Why not tarballs?)

Back on rac01:

cd /srv/datadisk01/dl
for file in *zip?AuthParam*[0-9a-f]; do sudo mv "$file" "$file".zip; done
sudo yum -y install unzip
for file in *.zip; do sudo unzip "$file"; done

Now we set up a VNC server so that we can actually run the graphical Oracle installer, along with a firewall rule to let us connect:

sudo yum -y install tigervnc-server xterm twm
sudo iptables -I INPUT 4 -p tcp --dport 5901 -j ACCEPT
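
If you want that firewall rule to survive a reboot, and assuming these instances use the standard Red Hat-style iptables init scripts (an assumption on my part), you can save the running rules:

sudo service iptables save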

Starting the server; you’ll need to supply a password the first time around (still logged in as oracle):

vncpasswd
vncserver :1

Now we start a VNC viewer locally. If you don’t have one already, you can download one from www.realvnc.com.
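
With RealVNC or TigerVNC installed locally, the connection looks something like this (substitute your server's actual external IP address):

vncviewer <server-external-ip>:1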

Grid Infrastructure Install

Connecting to display :1 on the server’s external IP, you should get an xterm window if all went well. From that xterm, run the installer:

cd /srv/datadisk01/dl/grid
./runInstaller

Skipping software updates, and doing a cluster install using a standard cluster. Doing an advanced install, and using the default language. Under “Grid Plug and Play” we need to set up the node naming: using cluster name “rac-cluster”, and SCAN name “rac-cluster” as defined in /etc/hosts earlier. On the cluster node screen, we should see that rac01-pub has already been detected. Adding rac02-pub too, with rac02-pub-vip as its VIP address.
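
As a reminder, all of this naming relies on the /etc/hosts entries set up earlier in the series. They look something like the following; the addresses here are placeholders, and I'm assuming rac01's VIP was named rac01-pub-vip to match rac02's:

10.0.0.11   rac01-pub
10.0.0.12   rac02-pub
10.0.0.21   rac01-pub-vip
10.0.0.22   rac02-pub-vip
10.0.0.31   rac-cluster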

Now comes the validation, where we learn whether SSH, naming, etc. were properly set up. If all goes well, you’ll make it to the network interface usage screen. Here we need to make changes: eth0 shouldn’t be used, eth1 is public, and eth2 is private. The management repository is a choice: it takes up memory and install time, but it does allow us to use features such as QoS Management, and it can only be created at install time. I chose to skip it.
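
If the validation does flag problems, the cluster verification utility shipped with the grid media can help pinpoint them; for example, re-running the pre-install checks from the command line:

cd /srv/datadisk01/dl/grid
./runcluvfy.sh stage -pre crsinst -n rac01-pub,rac02-pub -verbose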

For storage, we’re using a shared file system: the NFS we created. Using external redundancy since it’s a single disk anyway. Doing the same for the voting disk.

Not using IPMI. We’ll also leave the ASM oper group blank, and accept the warning.

Using the default “/u01/app/oracle” and “/u01/app/12.1.0/grid” directories for ORACLE_BASE and the grid home, and /u01/app/oraInventory for the oraInventory. You can either run the root scripts yourself or let the installer run them via sudo; I like to run them myself to have more control over re-running and deconfiguration if required.

Now it’s time for the prerequisite checks. If all previous steps have succeeded, you shouldn’t see any warnings at all.

Saving the response file and kicking off the install itself.

Running orainstRoot.sh from the oraInventory, plus root.sh from the grid home, on rac01 first.
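
With the paths chosen on the previous screens, that amounts to the following (on rac01 first, then rac02 when prompted):

sudo /u01/app/oraInventory/orainstRoot.sh
sudo /u01/app/12.1.0/grid/root.sh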

At this point I got errors starting ASM:

PRCR-1079 : Failed to start resource ora.asm
CRS-2672: Attempting to start 'ora.asm' on 'rac01-pub'
CRS-5017: The resource action "ora.asm start" encountered the following error:
ORA-00600: internal error code, arguments: [SKGMHASH], [1], [18446744073549507196], [0], [0], [], [], [], [], [], [], []
. For details refer to "(:CLSN00107:)" in "/u01/app/12.1.0/grid/log/rac01-pub/agent/ohasd/oraagent_oracle/oraagent_oracle.log".

CRS-2674: Start of 'ora.asm' on 'rac01-pub' failed
CRS-2679: Attempting to clean 'ora.asm' on 'rac01-pub'
CRS-2681: Clean of 'ora.asm' on 'rac01-pub' succeeded

CRS-2674: Start of 'ora.asm' on 'rac01-pub' failed
2013/11/05 21:28:33 CLSRSC-113: Start of ASM instance failed

Preparing packages for installation...
cvuqdisk-1.0.9-1
2013/11/05 21:28:54 CLSRSC-325: Configure Oracle Grid Infrastructure for a Cluster ... succeeded

But we’re not using ASM here (our shared storage is NFS), so I’m ignoring the error.

On rac02, it didn’t even try starting ASM with root.sh.

Running orainstRoot.sh and root.sh on rac02 as well, since we have sudo available there. root.sh does take some time to run, as the grid infrastructure is shut down and started up a few times.
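
To reassure yourself that the clusterware itself came up healthy on both nodes despite the ASM noise, crsctl can check cluster status and list the cluster resources:

/u01/app/12.1.0/grid/bin/crsctl check cluster -all
/u01/app/12.1.0/grid/bin/crsctl stat res -t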

Database install

Now that the grid infrastructure is in place, we can move on to the actual database install, re-using the same VNC session:
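Assuming the database media unzipped into a database directory alongside grid:

cd /srv/datadisk01/dl/database
./runInstaller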

Skipping software updates and skipping the DB creation too (software only). Picking a RAC install. At this point we should see both nodes detected. Using the default language.

Installing enterprise edition with the default home locations. In the group selection, it won’t let me select dba, even though the group was created by the preinstall RPM. For now I’ll select oinstall.

The rest of the screens are left at their defaults.

Running root.sh on each node, which this time is very short.
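
With the default database home, that's the following on each node:

sudo /u01/app/oracle/product/12.1.0/dbhome_1/root.sh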

Database creation assistant

With a database home in place, we can run the database creation assistant. But first, I wanted to set up a hugepage configuration. /proc/meminfo is missing the HugePages lines entirely, and it does look like, regrettably, the supplied kernel does not support hugepages:

[oracle@rac01-pub ~]$ zgrep HUGETLB /proc/config.gz
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set

And a quick web search seems to show that, while custom kernel support has been a long-standing user request at Gandi, it’s still not available.

So onto the install. From the same VNC session:

/u01/app/oracle/product/12.1.0/dbhome_1/bin/dbca

Creating a new database. Using the advanced install with:

  • RAC database (default)
  • Admin-managed
  • General purpose/transaction processing
  • DB name: racdb
  • non-PDB
  • Selecting both nodes to run on
  • Configuring EM express
  • Running CVU periodically
  • Picking a password
  • File system storage
  • /srv/datadisk01/oradata/oradata – default
  • Default FRA, using the default size of 5G
  • Archiving disabled
  • Skipping sample schemas and Database Vault
  • Unselecting automatic memory management
  • Leaving the remaining parameters default

And we’re installed and have a database. It can be tested via SQL*Plus:

export ORACLE_SID=racdb    # database name, so oraenv can locate the Oracle home
export ORAENV_ASK=NO
. oraenv
export ORACLE_SID=racdb1   # then switch to this node's instance name
sqlplus "/ as sysdba"

If all went well, you should see a SQL prompt:

[oracle@rac01-pub ~]$ sqlplus "/ as sysdba"

SQL*Plus: Release 12.1.0.1.0 Production on Wed Nov 6 23:38:51 2013

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Advanced Analytics
and Real Application Testing options

SQL>
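
Beyond SQL*Plus, srvctl gives a cluster-wide view; assuming the database was registered under the name racdb as above, this should report an instance running on each node:

srvctl status database -d racdb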

And that’s it for the series. I made it through with 50,000 credits remaining in my Gandi account to play with.

Feel free to ping me in case of issues getting things running. Some of these steps are a combination of several iterations as bugs were worked out, so it’s likely that there are some gremlins still lurking; I’ll try to incorporate fixes as issues are discovered.

Lessons Learned

  • Yes, Oracle RAC can be installed cleanly on a cloud environment, and at $17, the price is right
  • True shared storage from a cloud provider is still hard to come by, limiting the high-availability potential
  • There are quite a few extra steps required to satisfy the RAC installer and its prerequisite checks
  • In the Gandi environment, you need to overallocate RAM as not all of it is visible to the OS
  • The lack of hugepage support in the Gandi kernel (and the complete lack of custom kernel support) further increases memory requirements
  • A dummy oracle-release RPM is all we need to keep the OS prerequisite checks happy
