What Happens When You Leave an Oracle Database in Backup Mode
While I was at that fine conference in Scotland, one of our clients did some maintenance on their Windows server where several databases were running. We had just begun supporting this machine, and hadn't had a chance to test a reboot. And for some reason, backup/recovery wasn't -- until recently -- a DBA responsibility at that organization, so that wasn't under our supervision.
The Ghost in the Windows Server
So one fine evening, there was a scheduled maintenance, and one of the databases didn't shut down cleanly (thanks to misconfigured Windows services, if I recall correctly). Consequently, the database crashed and later didn't come back up.
When Crash Recovery Fails
That's a bit odd -- crash recovery should have worked with no problems, but instead the database required media recovery. My teammate Neil tried recovery and found that the database was requesting two-week-old archivelogs. Weird. We tried to restore them from tape, but you know how it goes when someone else does the backups and knows how the tape manager is configured, and all those details. After a while it was clear that rushing didn't make any sense in the middle of the night, and the storage people were not available until the morning.
Investigating the SCN Discrepancy
When I looked at it in the morning, the error message rang a bell: "Datafile 1 needs media recovery" in combination with the request for very old archivelogs. My immediate guess -- the database had been started with an old copy of the controlfile (I had seen that happen before, after someone messed around with relocations and screwed up init.ora).
Identifying the "Frozen" Headers
On closer examination, we figured out that the controlfile SCN was actually current while the SCNs of the datafiles were way off. There were no copies of the datafiles on the server, so it seemed like someone had restored the datafiles. Weird… After more detailed investigation, Alex Fatkulin figured out that the database had been put into backup mode two weeks ago and, bingo, the datafile headers were frozen. While a datafile is in backup mode, Oracle stops advancing the checkpoint SCN in its header, so recovery has to start from the point where backup mode began. That explains the request for two-week-old archivelogs. (By the way, Alex has just joined Pythian and started on my team. A great addition, I should say!)
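To make that comparison concrete, here is a minimal Python sketch. It assumes you have already collected the numbers yourself, for example from `v$database`, `v$datafile_header` and `v$backup` on a mounted instance. The function name `explain_scn_gap` and the sample values are made up for illustration; this is not the exact procedure we followed.

```python
from typing import Dict, List, Tuple

# Queries one might run to gather the inputs (shown for context only):
#   SELECT checkpoint_change# FROM v$database;
#   SELECT file#, checkpoint_change# FROM v$datafile_header;
#   SELECT file#, status FROM v$backup;


def explain_scn_gap(
    controlfile_scn: int,
    header_scns: Dict[int, int],
    backup_status: Dict[int, str],
) -> List[Tuple[int, str]]:
    """Return (file#, reason) for every datafile whose header SCN lags the controlfile."""
    findings = []
    for file_no, header_scn in sorted(header_scns.items()):
        if header_scn >= controlfile_scn:
            continue  # header is up to date, nothing to report
        if backup_status.get(file_no) == "ACTIVE":
            reason = "header frozen by backup mode"
        else:
            reason = "header behind controlfile, not in backup mode"
        findings.append((file_no, reason))
    return findings


if __name__ == "__main__":
    # Hypothetical numbers: file 1 is stuck in backup mode, file 2 is current.
    print(explain_scn_gap(5000, {1: 1000, 2: 5000}, {1: "ACTIVE", 2: "NOT ACTIVE"}))
```

A lagging header with status `ACTIVE` points at backup mode rather than at an old controlfile or a restored datafile, which is the distinction that took us a while to reach.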
An attempt to restore all archivelogs failed: a few gaps couldn't be restored from tape. What a surprise! Anyway, to this day, we can't fully explain why that happened, or what was going on with backups. But, at least the responsibility for backup/recovery is moving to the DBA team. Who would have thought of that? ;-)
The Moral of the Story: Ownership and Monitoring
The moral of the story: do not leave datafiles in backup mode. If you use hot backups outside of RMAN, such as snapshot technologies, take care to implement monitoring so that the database doesn't stay in backup mode for long. We usually set up this check in our monitoring tool when backup mode is used.
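As a rough illustration of such a check, the sketch below flags datafiles that have been in backup mode longer than a threshold. It assumes the rows come from `SELECT file#, status, time FROM v$backup`, fetched by whatever client your monitoring tool uses. The function name `files_stuck_in_backup_mode` and the six-hour default are hypothetical choices, not what our tool actually does.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple

# One row per datafile: (file#, status, time backup mode began or None)
BackupRow = Tuple[int, str, Optional[datetime]]


def files_stuck_in_backup_mode(
    rows: Iterable[BackupRow],
    now: datetime,
    max_age: timedelta = timedelta(hours=6),  # assumed threshold, tune per site
) -> List[Tuple[int, timedelta]]:
    """Return (file#, time spent in backup mode) for files over the threshold."""
    stuck = []
    for file_no, status, began_at in rows:
        if status != "ACTIVE" or began_at is None:
            continue  # file is not in backup mode
        age = now - began_at
        if age > max_age:
            stuck.append((file_no, age))
    return stuck


if __name__ == "__main__":
    # Hypothetical sample: file 1 was left in backup mode two weeks ago.
    now = datetime(2024, 1, 15, 8, 0)
    rows = [
        (1, "ACTIVE", datetime(2024, 1, 1, 2, 0)),
        (2, "NOT ACTIVE", None),
        (3, "ACTIVE", datetime(2024, 1, 15, 7, 30)),
    ]
    for file_no, age in files_stuck_in_backup_mode(rows, now):
        print(f"datafile {file_no} has been in backup mode for {age}")
```

The threshold should sit comfortably above the longest legitimate backup window, so that a normal snapshot run doesn't page anyone but a forgotten `BEGIN BACKUP` does.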
Another moral: let everyone do his job. Database backup/recovery is part of the DBA's responsibilities. Another interesting story is how someone lost 5 databases, but that might be a good topic for another post.