OS Migration on Cloud
Recently a client asked us to upgrade the operating system under their Oracle EBS 12.1 environment, hosted on Amazon Cloud, from RHEL6 to RHEL7. On the surface, it didn’t look like a job for an Application DBA but rather for a SysAdmin or CloudOps team. In collaboration with CloudOps, we managed to create an effective solution. Here’s how we did it.
The cloud offers a wide range of flexibility in all aspects, from instance type to storage type, network configuration and more. Oracle E-Business Suite (EBS) feels like a relic compared to modern cloud applications because it doesn’t use any fancy features. However, that basic infrastructure setup made the job rather interesting and easy.
Choosing the right method
In the past, as part of in-house solutions, we would evaluate two possible scenarios:
- In-place OS upgrade: this solution allows you to keep existing hardware and configurations, but doesn’t leave much room for a contingency plan.
- Migration to new hardware: this is a good solution if the existing hardware is close to the end of its lifecycle, and it takes care of the contingency plan because the old system stays intact. However, it introduces the complexity of reconfiguring existing integrations, and basic rapid cloning might need a long downtime window.
In this case, running on Amazon Cloud, we found storage snapshots to be spot-on. By design, those Linux machines kept all application data separate from the root filesystem, which is what made this approach possible. The database server was running Oracle Grid Infrastructure to support Oracle Automatic Storage Management (ASM), with multiple Amazon EBS (Elastic Block Store) volumes serving as ASM disks. On top of that, there was an Oracle Data Guard configuration. Overall, the Oracle EBS configuration isn’t complicated, but it still requires some attention to detail.
The whole migration can be split into two parts: the preparation and the downtime itself.
The tasks didn’t involve advanced techniques that would require significant downtime, so we had room to work comfortably. We also observed that it can be hard to estimate how long some cloud activities will take. For example, if a full RMAN backup takes one hour, the transfer another hour and the restore another hour, it’s easy to estimate that the whole operation will take approximately three hours. With cloud snapshots, there’s no comparable metric for calculating the duration. However, we found it important to take snapshots in advance: Amazon EBS snapshots are incremental, so once earlier snapshots exist, the last snapshot only has to capture the blocks changed since then, and the rest of the timing becomes more predictable. This migration also gave us the opportunity to move from gp2 to Amazon EBS gp3 volumes, gaining throughput at a lower cost.
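To illustrate why gp3 can be an improvement, here is a minimal sketch comparing baseline IOPS for the two volume types. It assumes the figures AWS has published for these types as we understand them (gp2: 3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000; gp3: a 3,000 IOPS and 125 MiB/s baseline regardless of size), so check them against the current AWS documentation. The volume sizes are hypothetical and not taken from the client system.

```python
# Baseline figures as assumed above; verify against current AWS documentation.
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP3_BASELINE_IOPS = 3_000
GP3_BASELINE_THROUGHPUT_MIBPS = 125


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume, which scales with its size."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, size_gib * GP2_IOPS_PER_GIB))


def compare(size_gib: int) -> dict:
    """Compare gp2 and gp3 baseline IOPS for a volume of the given size."""
    gp2 = gp2_baseline_iops(size_gib)
    return {
        "size_gib": size_gib,
        "gp2_iops": gp2,
        "gp3_iops": GP3_BASELINE_IOPS,
        "gp3_gain": GP3_BASELINE_IOPS - gp2,
    }


if __name__ == "__main__":
    # Hypothetical ASM disk sizes, for illustration only.
    for size in (100, 500, 2000):
        print(compare(size))
```

Under these assumptions, volumes smaller than 1,000 GiB get a higher baseline on gp3, while larger volumes would need extra provisioned IOPS on gp3 to match their gp2 baseline.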
In the preparation phase, we performed several tasks, including instance provisioning, installing OS requirements and setting kernel parameters. Ansible is very handy in this phase because it applies the same configuration consistently across the dev, UAT and prod environments.
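The Ansible playbooks themselves aren’t shown here. As a stand-in, the following Python sketch shows one way to double-check kernel parameters on a freshly provisioned host by parsing `sysctl -a`-style output. The `REQUIRED` values are illustrative placeholders and the function names are our own; the real minimums should come from the Oracle installation guide for your release.

```python
# Illustrative minimums only; take the real values from Oracle's documentation.
REQUIRED = {
    "kernel.shmmni": [4096],
    "fs.file-max": [6815744],
    "kernel.sem": [250, 32000, 100, 128],
}


def parse_sysctl(text: str) -> dict:
    """Parse 'key = v1 v2 ...' lines into {key: [int, ...]}, skipping non-numeric values."""
    params = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        fields = value.split()
        if fields and all(field.isdigit() for field in fields):
            params[key.strip()] = [int(field) for field in fields]
    return params


def find_shortfalls(actual: dict, required: dict) -> list:
    """Return a human-readable line for every parameter that is missing or too low."""
    problems = []
    for key, minimums in required.items():
        current = actual.get(key)
        if current is None:
            problems.append(f"{key}: not set")
        elif len(current) != len(minimums) or any(
            c < m for c, m in zip(current, minimums)
        ):
            problems.append(f"{key}: {current} does not meet minimum {minimums}")
    return problems


if __name__ == "__main__":
    import subprocess

    output = subprocess.run(
        ["sysctl", "-a"], capture_output=True, text=True, check=False
    ).stdout
    for problem in find_shortfalls(parse_sysctl(output), REQUIRED):
        print(problem)
```

Running the same check on dev, UAT and prod hosts gives a quick confirmation that the configuration was applied the same way everywhere.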
Once a host is available and the Ansible playbooks have run, the systems are practically ready to start the Oracle application. Any Oracle Apps DBA knows how sensitive Oracle EBS is to the machine hostname, so it’s important to keep the same hostnames and network configuration on the new machines. That way, once the system starts on the new hosts, there’s no need to reconfigure the Oracle applications from scratch.
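A simple pre-flight check can confirm that a new machine really carries the hostname recorded from the old one. This is only a sketch: the hostnames are hypothetical, and the comparison rule (full names when both sides are fully qualified, short names otherwise) is our own assumption about what counts as a match.

```python
import socket
from typing import Optional


def short_name(hostname: str) -> str:
    """Return the lowercased first label of a hostname."""
    return hostname.split(".", 1)[0].lower()


def hostname_matches(expected: str, actual: Optional[str] = None) -> bool:
    """Check the local (or given) hostname against the one recorded on the source host."""
    if actual is None:
        actual = socket.gethostname()
    if "." in expected and "." in actual:
        # Both fully qualified: the domain part has to match as well.
        return expected.lower() == actual.lower()
    return short_name(expected) == short_name(actual)


if __name__ == "__main__":
    # Hypothetical hostname recorded from the RHEL6 source machine.
    expected = "ebsapp01.example.com"
    print("hostname OK" if hostname_matches(expected) else "hostname MISMATCH")
```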
Another benefit of using Amazon is security groups. Networking and firewall rules are usually attached to them, so once the new host has all the required security groups, it’s ready to take over from the old one. Some configuration pieces still have to be transferred manually from source to destination. For example, FTP-based interfaces usually rely on SSH keys, which are stored under /home on the root filesystem and therefore aren’t carried over with the data volumes.
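To make sure no security group is forgotten, the group lists of the old and new instances can be compared. The sketch below works on dictionaries shaped like the EC2 `describe_instances` response (`Reservations` → `Instances` → `SecurityGroups` → `GroupId`). The helper names and the sample IDs are hypothetical.

```python
def security_group_ids(describe_instances_response: dict) -> set:
    """Collect all security group IDs found in a describe_instances-style response."""
    return {
        group["GroupId"]
        for reservation in describe_instances_response.get("Reservations", [])
        for instance in reservation.get("Instances", [])
        for group in instance.get("SecurityGroups", [])
    }


def missing_security_groups(source_response: dict, target_response: dict) -> list:
    """Return the group IDs attached to the source instance but not to the target."""
    return sorted(
        security_group_ids(source_response) - security_group_ids(target_response)
    )


if __name__ == "__main__":
    # Hypothetical responses, trimmed to the fields the helpers read.
    old_host = {"Reservations": [{"Instances": [{"SecurityGroups": [
        {"GroupId": "sg-app"}, {"GroupId": "sg-db"}, {"GroupId": "sg-admin"},
    ]}]}]}
    new_host = {"Reservations": [{"Instances": [{"SecurityGroups": [
        {"GroupId": "sg-app"}, {"GroupId": "sg-admin"},
    ]}]}]}
    print(missing_security_groups(old_host, new_host))
```

With boto3, the two inputs would come from `boto3.client("ec2").describe_instances(InstanceIds=[...])` calls for the old and new instance.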
Cutover – Downtime
Here’s the interesting part. Once downtime started, our general approach was to stop the system, create snapshots of all volumes holding the Oracle EBS application and database filesystems, and create new gp3 volumes from those snapshots. The next step was to attach the newly created volumes to the new Linux instances and start up the applications.
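The storage part of the cutover can be sketched in a few lines of Python. This is not the exact tooling we used. It assumes an EC2 client object exposing the boto3 methods `create_snapshot`, `create_volume`, `attach_volume` and `get_waiter`; the `VolumeMove` class, the function name, and all IDs and device names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeMove:
    source_volume_id: str  # volume attached to the old RHEL6 instance
    device: str            # device name to use on the new instance


def migrate_volumes(ec2, moves, target_instance_id, availability_zone):
    """Snapshot each source volume, create a gp3 volume from it, attach it to the target.

    `ec2` is expected to behave like boto3.client("ec2"). The application and
    database are assumed to be stopped before this function is called.
    """
    # 1. Snapshot every source volume, then wait for all snapshots to complete.
    snapshot_ids = []
    for move in moves:
        snapshot = ec2.create_snapshot(
            VolumeId=move.source_volume_id,
            Description=f"OS migration cutover of {move.source_volume_id}",
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    ec2.get_waiter("snapshot_completed").wait(SnapshotIds=snapshot_ids)

    # 2. Create a gp3 volume from each snapshot in the target's availability zone.
    new_volume_ids = []
    for snapshot_id in snapshot_ids:
        volume = ec2.create_volume(
            SnapshotId=snapshot_id,
            AvailabilityZone=availability_zone,
            VolumeType="gp3",
        )
        new_volume_ids.append(volume["VolumeId"])
    ec2.get_waiter("volume_available").wait(VolumeIds=new_volume_ids)

    # 3. Attach each new volume to the new instance under its device name.
    for move, volume_id in zip(moves, new_volume_ids):
        ec2.attach_volume(
            InstanceId=target_instance_id, VolumeId=volume_id, Device=move.device
        )

    # Map old volume IDs to new ones for the migration log.
    return {m.source_volume_id: v for m, v in zip(moves, new_volume_ids)}
```

In a real run, `ec2` would be `boto3.client("ec2")`. The default waiter timeouts may be too short for large volumes, so passing a `WaiterConfig` with a higher `MaxAttempts` to `wait` is probably worth considering.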
From an Oracle perspective, the trickiest part is Oracle Clusterware, as it has to be deconfigured and configured again on the new host. However, this doesn’t take much time, and Clusterware is smart enough to map the ASM disks correctly. Worth noting: after the deconfig, all resources managed through Server Control (srvctl), such as ASM and the listener, need to be registered again.
Since the same virtual hostnames were used, from an Oracle EBS perspective only AutoConfig had to be run. After that, the system was ready to start, with a few tweaks at the Amazon load balancer to redirect traffic to the newly created instances.
This isn’t the definitive “minimal downtime” approach, but it is one of the most efficient from a cost and labor perspective. Throughout the entire migration, the original Oracle EBS instance remained untouched and stopped, ready to serve as a contingency plan if something went south. On top of that, we managed to migrate to cheaper and slightly improved storage. From a cost perspective, two production copies existed at the same time, but only briefly: after running the new systems for a week and feeling comfortable (with a full regular backup cycle in place), the original Linux systems could be dropped, along with all the snapshots.
It’s always worth knowing which tools are available and picking those that best suit the job. In this case, I think we took advantage of many of the cloud’s capabilities, though they come at the cost of a learning curve.