Total Recall by DUDE: Extracting Oracle Data After RAID Corruption
Jul 13, 2007 / By Alex Gorbachev
Last Friday, a customer paged me to say that they could not open an Oracle 8.0.5 database after the RAID array crashed. Before continuing the story, I should mention that this database wasn’t under our Pythian umbrella and my team had no idea of its existence — some customers choose to let us manage everything, others try to find some balance. Well, it’s their choice. But I digress.
It turned out that this database was in NOARCHIVELOG mode and there was not even a cold backup. Furthermore, they didn’t even have a database export. It’s always Friday when stuff like this happens!
First of all, initial crash recovery failed with corrupted online redo logs. Then the client tried manual recovery and that didn’t help. Unfortunately, they didn’t make any cold backups right after the crash, so I couldn’t be sure that all the attempts didn’t do some harm.
I tried to recovery by slowly incrementing the SCN, trying different redo log members and few other tricks. However, I got a stuck recovery with ORA-600 , a.k.a. “missed writes”.
OPEN RESETLOGS with
_allow_resetlogs_corruption succeeded reseting the log files, but the database stayed unopened, returning fancy errors messages — some objects not found, ORA-600, etc. I wish this were as pleasant as tasting ORA-600 chocolates (photo courtesy of Doug Burns):
In a nutshell, it seemed like the SYSTEM tablespace was not in good enough shape to open the database.
I had a few more ideas, but it was definitely the time to recall Kurt Van Meerbeeck and his all-time-hit tool — the DUDE. DUDE stands for Database Unloading by Data Extraction. What it does is extract data from a database that cannot be opened, including in cases of database corruption. Of course, it depends how much your database is corrupted, but let me tell you — it worked like magic in our case.
About a year ago, Doug Burns did an excellent post, “DUDE, Where’s My Data?“. You can find a lot of details there, including an example of the data extraction and its capabilities. I had a pleasure of watching “Mr. DUDE” himself presenting during MSDBF 07, and if you have a chance to see his presentation somewhere, by all means, don’t miss it. For more details, refer to Kurt’s whitepaper — the DUDE Primer.
If you or your customers have landed in such unlucky circumstances that they need this tool, you can find more details on www.ora600.be. Basically, you can either get a specially crafted version working on your database for seven days, or use the data extraction service. Since I knew already how to use DUDE, we went with the first method .
It took me only few hours to collect some data for Kurt to produce a demo version. Our customer first wanted to make sure that the tool worked before shelling out for it, even though the price for such a tool is more than affordable than what Oracle consulting would charge.
With the demo version I was able to extract all tables but only the first 10 rows of data. After final confirmation that everything worked, the client got the full seven-days license. It took me only few more minutes to re-run the extraction commands, and then DUDE magically produced dump files for all tables. After that I only had to copy them over to the another machine and wrap up a small shell script to import them all.
I should add that DUDE also generated all sequences and PL/SQL objects. I could probably have extracted indexes and constraints by dumping a few SYS objects, but since the client had another database with the same data model (but different data), it was easier to extract the definitions from there.
The client was very impressed by how we were able to recover from a deadly situation like that, and it was only possible thanks to the DUDE and his creator: thanks a lot Kurt — for helping make Friday a relatively short working day!
One Response to “Total Recall by DUDE: Extracting Oracle Data After RAID Corruption”
Leave a Reply