RMAN Recipes: How to Switch Oracle Logs Automatically

Feb 27, 2007 / By Paul Moen

Tags: ,

So you have swallowed the standby bait. You have used RMAN duplicate to create a sparkling new standby and things are looking rosy. Then, when you check the lag between the primary and standby (in MAXIMUM PERFORMANCE mode), you discover the standby is miles behind.

You were expecting a solution that kept the standby as close to the primary as possible without the expense of making the primary wait until all changes are applied on the standby, as happens in MAXIMUM PROTECTION mode or MAXIMUM AVAILABILITY mode.

Oracle (at least from 9iG) rides to the rescue again. The parameter ARCHIVE_LAG_TARGET tells Oracle to make sure to switch a log every n seconds. It can be dynamically set using ALTER SYSTEM, e.g.:

ALTER SYSTEM SET ARCHIVE_LAG_TARGET = 600 SCOPE=BOTH;

This sets the maximum lag to 10 mins.

No more stuffing around with shell scripts to run essentially: ALTER SYSTEM SWITCH LOGFILE;. Oracle does it for you.

Feedback for Oracle: make the default something other than 0, maybe 1800. I am a crusty old DBA (well, not that crusty compared to some) and nowadays I expect stuff to work like this out-of-the-box. DWIM!

Have Fun!

Paul.

Update: Thanks to Howard for the correction, this has been around since 9i

Links: More info about Oracle data protection modes

9 Responses to “RMAN Recipes: How to Switch Oracle Logs Automatically”

  • JASON ARNEIL says:

    If you really, really want your standby as up-to-date as possible you can use real time apply:

    sql> alter database recover managed standby database using current logfile disconnect from session;

    I am using this and consistently see an apply lag of under 20 seconds.

    Of course setting the standby so up-to-date ain’t gonna help with human error as this is going to just be blindly applied downstream on the standby, so maybe an archive lag of several hours could save you more pain.

    jason.

  • Paul says:

    Thanks Jason for the hint.

    Here is the link to the doco.

    You can also set a delay on the LOG_ARCHIVE_DEST parameter so that the standby can be delayed in applying the redo, to give you time to stop human error.

  • Howard Rogers says:

    Just wanted to remind people that ARCHIVE_LAG_TARGET was a 9i invention and that it can be used regardless of whether you use standby databases or not. In the absence of a standby, the parameter simply forces a log switch at a specified interval, and that can be very useful.

    My main production database, for example, was log switching sometimes every 60 seconds and sometimes not for 45 minutes or more. Performance was all over the place, and recovery issues were clouded (just how much ‘work time’ was in that current log that’s just been lost?). Well, I simply created all logs to be 2GB (rather larger than the 45+ minute logs had ever represented), and set archive_lag_target to be 1800. I now have a predictable, constant rate of log switching which more or less rules out the redo subsytem from consideration when there’s a performance problem or a recovery question from management… In short, more people than just standby database users should be using this very handy parameter.

  • Jerry Lee says:

    Thanks for sharing this Howard – this is quite a neat trick.
    I see you said “more or less rules out …” – i would say less :)
    Since most performance issues that arises involving the redo subsystem comes from too much redo (contention on redo allocation/copy latches) and consequently lgwr not being able to keep up with the volume of redo records that it needs to write to the redo logs. Performance issue could only be compounded by the “stall” in processes writing in the redo buffer (since processes are prevented from writing to the log buffer in the event of a log switch). But it’s just a small piece of the pie, at least in my experience – and the performance issue arising from the stall when a big burst of write requests to the log buffer after a lot of processes queue up to write is just a consequence of too much redo. Or maybe it’s just me :)

    But what your technique does to make recovery time more or less easy to estimate based only on how much archived logs are generated is abolutely cool…

  • Rob says:

    Does this give you a way to set things up (in 10gR2) so that you can combine maximum_performance in DataGuard with real time apply? I want to apply the logs to the physical replica as soon as possible, but I don’t want to wait until the commit is actually made (as in maximum_availaiblity or maximum_protection) as that would involve too much delay overhead. Basically I need to keep the replica within 1 minute of the primary. So a log switch setting of 50 seconds would do it, but if there’s a way to just use real-time apply that would be even better. Is there a way to do that?

  • Rob,
    If you have instance crash or failover to standby in the middle of uncommitted transactions, they will be rolled back. Why would you want to guarantee that redo is being applied synchronously before commit occurs?

  • Rob says:

    Alex,

    The point is that the primary database should be feeding the failover database as often as possible, and not pausing to guarantee a sync on commit (as with max protection or max availability). That way the primary does not suffer decreased performance waiting for the remote system(s) to handshake. Is this possible by using max performance with the ASYNC and LGWR options? Is it a supported configuration? Because its not really one of the three dataguard modes mentioned in all of the documentation, but it seems to be implied by the options available.

    Real-Time apply doesn’t really matter here, as I don’t care if the failover is kept in sync in real time. Just that the data is being transfered to the standby logs as quickly as possible.

  • Stephen Skarupa says:

    Can anyone reply to Rob’s last question?
    I am curious about a response to Rob. Thanks.

    Stephen

  • PaulM says:

    The documentation which explains maximum performance lists ASYNC and LGWR are options.

    http://download.oracle.com/docs/cd/B19306_01/server.102/b14239/log_transport.htm#sthref559

    Rob would have the changes in the standby redo logs as quickly as Oracle can send them. But given it is ASYNC, there is no guarantee.

    I wouldn’t use ARCHIVE_LAG_TARGET as a method to switch the logs every 1 minute or less. Use standby logs and real-time apply and Oracle will keep you in sync as best it can.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>