ODA Re-imaging Could Take Anywhere Between 20 and 120 Minutes

May 21, 2013 / By Yury Velikanov

20 mins vs 2 hours

Recently, I noticed that the re-imaging process on the second Oracle Database Appliance node took significantly less time than on the first node. The difference was so large that I started to suspect that either something was wrong with that particular set of hardware or some of the re-imaging steps had failed on the second node. On the first node the process completed in 120 minutes, but on the second it took only around 20 minutes.

I spent quite a bit of time trying to understand what exactly was happening. But before I tell you, can I ask what theoretical explanations you would have come up with, given the behavior I just described? Please share them with me in the comment section below. :)

Any mystery can be solved

The question is: Are we ready to pay for it? Sometimes it takes quite a bit of effort to get to the truth, and very often we don’t have the time, interest, or budget to find it. In this particular case, I was so curious that I spent a good part of my weekend looking for a clue. Along the way, I had to learn a bit about “Anaconda (installer)”, the SquashFS file system, how to rebuild an ISO image, and how the ODA re-imaging process works. The purpose of this paragraph is to encourage you to be curious and not leave mysteries unresolved. Invest some time, and you will learn a lot on the way :)

NOTE: I will try to share the way I troubleshot this problem in my future blog posts.

Bug in the “post-install” script

It appears that the problem is in the way the ISO:/Extras/setupodaovm.sh post-install script checks whether the software RAID has finished re-synchronizing the 4 internal HDD partitions (md devices) between the 2 physical disks. The following check is at the very end of the script:
mdadm --wait /dev/md1
mdadm --wait /dev/md2
mdadm --wait /dev/md3

Each of these lines is meant to check whether the software RAID has finished synchronizing one md device (partition). The following is part of the man page for the mdadm utility:

       -W, --wait
              For each md device given, wait for any resync, recovery, or reshape
              activity to finish before returning. mdadm will return with success
              if it actually waited for every device listed, otherwise it will
              return failure.

During the re-imaging process, all 4 volumes are rebuilt and have to be synchronized by the software RAID. It is worth mentioning that software RAID on ODA is configured to re-synchronize one device at a time. The other devices just sit and wait their turn with the status DELAYED. The problem is that if a device is in the state resync=DELAYED, the “mdadm --wait” check does not stop and wait for it. Therefore, only one of the mdadm checks actually waits until a re-synchronization finishes. The others pass successfully even if the device isn’t synchronized yet (resync=DELAYED). Now let’s have a look at the device sizes and the associated synchronization times:
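To make the DELAYED state easier to spot, here is a minimal sketch that lists every md device that /proc/mdstat still reports as resyncing, recovering, reshaping, or queued with resync=DELAYED. It is not taken from the ODA scripts: the function name pending_md_devices is hypothetical, and the sketch assumes the usual /proc/mdstat layout, in which the status lines of a device follow its "mdN :" line.

```shell
#!/bin/sh
# Hypothetical helper, not taken from setupodaovm.sh.
# Prints the name of every md device whose /proc/mdstat entry still mentions
# a resync, recovery, or reshape. A queued device ("resync=DELAYED") is
# reported too, because its status line also contains the word "resync".
pending_md_devices() {
    # Optional first argument: an alternative file to parse (handy for testing).
    awk '
        /^md[0-9]+ :/             { dev = $1 }   # remember the current device
        /resync|recovery|reshape/ { if (dev != "") { print dev; dev = "" } }
    ' "${1:-/proc/mdstat}"
}
```

On a node where one device is syncing and another is queued, calling pending_md_devices should print both names, which is exactly the situation that a per-device “mdadm --wait” check can miss.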

Name  Size  Function  Sync time
md0   60M   /boot     few seconds
md1   17G   /         10 mins
md2   217G  /OVS      90 mins
md3   4G    swap      ~2 mins

Just to make life a bit more interesting, the software RAID picks the next device to re-synchronize randomly. This means that luck decides which device is being processed when the mdadm checks run. If it is the md1 device (17GB), the whole re-imaging process takes about 20 minutes. However, if the software RAID is synchronizing the md2 device (217GB) during the execution of the mdadm check, the re-imaging process takes about 120 minutes.

A way to fix the problem

I am not a great expert in Linux system administration (I am an Oracle DBA after all), and would rather let the Oracle folks make the final call. However, it seems to me that in order to make sure that all 4 devices are re-synchronized before the re-imaging process finishes, the check should look like the following:

mdadm --wait /dev/md0 /dev/md1 /dev/md2 /dev/md3
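Whether a single “mdadm --wait” call with several devices handles the DELAYED state correctly is not confirmed here. As a hedged alternative, the sketch below polls /proc/mdstat until no resync, recovery, or reshape (including resync=DELAYED) is reported for any device. The function name wait_for_md_sync and the 30-second polling interval are assumptions made for illustration, not part of the ODA post-install script.

```shell
#!/bin/sh
# Hypothetical helper, not part of the ODA post-install script.
# Blocks until /proc/mdstat no longer mentions a resync, recovery, or reshape.
# "resync=DELAYED" also matches, so queued devices keep the loop waiting.
wait_for_md_sync() {
    mdstat_file="${1:-/proc/mdstat}"   # file to poll (overridable for testing)
    poll_seconds="${2:-30}"            # assumed polling interval
    while grep -Eq 'resync|recovery|reshape' "$mdstat_file"; do
        sleep "$poll_seconds"
    done
}
```

Called without arguments, wait_for_md_sync returns only once every md device is idle, at the cost of the re-imaging process always taking the longer time.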

Conclusion

To conclude, until the issue is fixed, be aware that:

  1. Re-imaging times may differ significantly between ODA nodes.
  2. To be on the safe side, you should check that the re-synchronization of all md devices has finished by running the “cat /proc/mdstat” command before running any business-critical processes on your ODA.

Yury

PS “Stay Hungry Stay Foolish” - Steve Jobs

2 Responses to “ODA Re-imaging Could Take Anywhere Between 20 and 120 Minutes”

  • Hi Yury,

    I’m sat here looking at a completed node 2 re-image, and a “seemingly hung” node 1 re-image. Nice to know it will finish (and has actually finished while typing this!).

    Thanks.

  • [...] After having waited for all the patches it was yet time to wait again. This time for the reimaging process to the virtualized ODA environment. This basically reinstalls the base OS with Dom0 and the management tools. I don’t have much to complain here, this works quite well but be prepared to sit in front of a blue screen that says “running post-install scripts” for up to two hours. You can spend that time reading Yury’s analysis of what is happening during that time. [...]
