Metalink Note on Datafile Recovery Will Corrupt Database

Nov 8, 2007 / By Christo Kutrovsky


Thinking I had something new, I wrote this article about recovering deleted files. However, it turns out Frits Hoogland had already blogged about recovery of deleted files on Linux, as he pointed out in a comment on my blog, where he also mentioned a Metalink note on the matter.

The note ID is 444749.1, “Retrieve deleted files on Unix / Linux using File Descriptors”. I went and looked into the procedure it describes.

Although it does explain how to recover the deleted file, this procedure will leave the database in an inconsistent state. It will corrupt your database. Queries will produce the wrong results randomly, depending on cache usage, how busy the database is, et cetera.

(Before you read the details, I would like to point out that this Metalink note has not been fully reviewed, as it states at the very beginning of the note:)

“This document is being delivered to you via Oracle Support’s Rapid Visibility (RaV) process, and therefore has not been subject to an independent technical review.”

The procedure outlined in the note describes how to recover the deleted file and write it back to its original location. The problem is that it does not include offlining and onlining the datafile, so the database ends up with two distinct copies of the file:

  1. The deleted file is still held open by dbwriter and by any user processes (connections) that had already read from it.
  2. The new, recovered file is opened by all new sessions, and by existing sessions that had not yet read from it.

The problem with this state is that all write operations go through dbwriter, and thus end up in the deleted file. They also land in the Oracle buffer cache, so they remain readable from all other sessions for as long as they stay cached. However, once they age out of the buffer cache, “new” sessions will read the recovered file and thus will not see these changes.
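The same split can be reproduced with plain files, outside of Oracle. In this Linux-only sketch, a held-open descriptor stands in for dbwriter’s handle (the path and file names are made up for the demo; they are not from the article):

```shell
# Linux-only sketch: an open descriptor keeps a deleted file alive,
# and writes through it never reach a file recreated at the same path.
tmpdir=$(mktemp -d)
printf 'original\n' > "$tmpdir/data.dbf"

exec 3<> "$tmpdir/data.dbf"   # stand-in for dbwriter's open handle
rm "$tmpdir/data.dbf"         # deleted, but the inode lives on via fd 3

# The note's "recovery": copies the contents into a brand-new file.
cat /proc/$$/fd/3 > "$tmpdir/data.dbf"

printf 'new change\n' >&3     # this write lands in the OLD, deleted inode
cat "$tmpdir/data.dbf"        # still prints "original": the change is lost
```

The recreated path never sees the write made through the old descriptor, which is exactly the situation the database ends up in.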

Here’s the test case with inline explanations:

/ra5a/orabkp/test/TEST/datafile> sqlplus test/test

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Nov 8 12:45:40 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Release 10.2.0.3.0 - 64bit Production

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL> select file_name from dba_data_files where tablespace_name='LOST';

FILE_NAME
---------------------------------------------------
/ra5a/orabkp/test/TEST/datafile/lost.dbf

That was the setup. We know which file to delete.

This is a freshly started database. We will need multiple sessions to best show the effect. We will call the current session “SESSION A”.

Running this query opens the file lost.dbf in the current session, and loads the buffer cache:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL> !rm /ra5a/orabkp/test/TEST/datafile/lost.dbf

We can still query the table as it is in the Oracle buffer cache:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

We can still query the table after a buffer cache flush, as the current session has the file handle open (just as dbwriter does).

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL>

Now let’s start another session and see what happens. We will call this one “SESSION B”.

/app/oracle> sqlplus test/test

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Nov 8 12:50:17 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Release 10.2.0.3.0 - 64bit Production

The file is currently removed, but we can still query the table from the Oracle buffer cache:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

However, flushing the cache, we can no longer query the table, as this session cannot find the file we deleted earlier:

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from lost_table;
select count(*) from lost_table
*
ERROR at line 1:
ORA-01116: error in opening database file 6
ORA-01110: data file 6: '/ra5a/orabkp/test/TEST/datafile/lost.dbf'
ORA-27041: unable to open file
Linux-x86_64 Error: 2: No such file or directory
Additional information: 3

SQL>

Just for fun, we’re back in “SESSION A”, which still has the file handle open. We can read it and load the buffer cache:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL>

And now we can read it again from SESSION B (which cannot open the file):

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

Now we are going to implement the fix as described in Metalink Note 444749.1:

/app/oracle> cat /proc/23937/fd/23 > /ra5a/orabkp/test/TEST/datafile/lost.dbf

/app/oracle> sqlplus test/test                 

SQL*Plus: Release 10.2.0.3.0 - Production on Thu Nov 8 12:54:10 2007

Copyright (c) 1982, 2006, Oracle.  All Rights Reserved.

Connected to:
Oracle Database 10g Release 10.2.0.3.0 - 64bit Production

Connecting to a brand new session, we can query the table (still from buffer cache):

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

At this point we should be okay, right? The table is readable from both new and old sessions, even after flushing the cache.

Wrong. The database is in an inconsistent state. Here’s a test.

Back in SESSION A (the one with the open file handle):

SQL> 
SQL> insert into lost_table select * From lost_table where rownum <=1;

1 row created.

SQL> commit;

Commit complete.

We inserted a row, now we count it and see it:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50071

In “SESSION B”, we can also see it (through the buffer cache).

SQL> 
SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50071

However if we flush the cache:

SQL>  alter system flush buffer_cache;

System altered.

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

…we see the old data, unchanged. That is because the dbwriter process wrote to the file that we deleted. When we “restored” it, we created a new copy of that file: we duplicated its current contents into a new file. The old file, however, still exists, and that is where the changes went.
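The “two entities” can be made visible by comparing inode numbers: the still-open deleted file and the recreated path are simply two different files. A Linux-only sketch (names invented for the demo):

```shell
# The recreated path and the still-open deleted file are two distinct
# inodes, i.e. two different files on disk.
d=$(mktemp -d)
echo data > "$d/lost.dbf"
exec 4<> "$d/lost.dbf"                 # stand-in for dbwriter's handle
rm "$d/lost.dbf"
cat /proc/$$/fd/4 > "$d/lost.dbf"      # the note's "recovery"

old_inode=$(stat -Lc %i /proc/$$/fd/4) # inode of the deleted original
new_inode=$(stat -c %i "$d/lost.dbf")  # inode of the recovered copy
[ "$old_inode" != "$new_inode" ] && echo "two distinct files"
```

Writes that go through fd 4 modify the old inode; readers opening the path get the new one.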

In my earlier post on this, my solution was to create a symbolic link to the still-open deleted file, so that the database kept using the same file.

To further show how wrong this is: even though SESSION A still has the correct file open, querying the table there first reads from the buffer cache, which was populated by the session that had the incorrect file open:

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50070

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from lost_table;

  COUNT(*)
----------
     50071

Flushing the cache makes the session read the “correct” data from its open file handle again.

Even if you insert the data from “SESSION B” (the one with the “new” file), it will still be written to the old file, as the dbwriter process performs all datafile writes.

These are very simple cases to illustrate the mess this will create. Now consider transactions, undo data for non-existing data, and partial results from new and old file. I won’t even mention direct path reads and writes.

The critical step that’s missing is offlining the file, recovering it, and onlining it again. Offlining the file makes dbwriter (and every other process) release its handle, so the deleted copy finally ceases to exist. Recovering the file then applies all the changes that were missed.
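A minimal sketch of that sequence, using the datafile from the example above. This is a sketch based on the reasoning in this article, not a tested recovery procedure; exact syntax and requirements (archivelog mode, privileges) depend on your version and configuration:

```sql
-- 1. First copy the still-open deleted file back into place
--    (the cat from /proc/<pid>/fd/<n>, as in the note), while
--    dbwriter still holds the old copy open.

-- 2. Take the datafile offline so dbwriter and all sessions
--    release the old, deleted copy:
ALTER DATABASE DATAFILE '/ra5a/orabkp/test/TEST/datafile/lost.dbf' OFFLINE;

-- 3. Apply the redo generated after the copy was taken:
RECOVER DATAFILE '/ra5a/orabkp/test/TEST/datafile/lost.dbf';

-- 4. Bring the file back online once media recovery completes:
ALTER DATABASE DATAFILE '/ra5a/orabkp/test/TEST/datafile/lost.dbf' ONLINE;
```

Media recovery here is what reconciles the copy with the writes that went to the deleted inode after the copy was taken, since those changes are also in the redo stream.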

4 Responses to “Metalink Note on Datafile Recovery Will Corrupt Database”

  • Christo Kutrovsky says:

    I’ve been thinking a bit more about this.

    You would also need to do “begin backup” before the cat and “end backup” after the cat. Otherwise you may get fractured blocks.

  • Frits Hoogland says:

    Excellent follow up chris!

    In the examples in my blog post (using the online redologfiles) I’ve already addressed the ‘open filedescriptors trap’ by shutting down the database during that example.

    This just shows once again that careful planning of a recovery and knowing your stuff pays off, and that taking a shortcut could cost much more in the long run!

  • David Russell says:

    You’ve written a good article. I currently am working in a MS shop and only have two instances on Linux and do not plan on messing with them anytime soon. I still found this article interesting and the concepts should apply to other platforms, as well, even if not written up on Metalink.

    What I would like to see, to complete the article, is for it not to be left in the state of “the critical step that’s missing is…” and see a definition of how to apply the critical step. From your article I cannot tell whether it is too late at this point to take those steps, or whether a totally different approach would be required because you “can’t get there from here”, or what?

  • Alex Gorbachev says:

    And if that doesn’t work, Pythian can always help to recover data from a database that cannot be opened!
