Standby Log Apply Elapsed Time

This has been discussed before by my colleague Paul Moen in his article on Oracle standby recovery rate monitoring, but I recently made a discovery that makes it easier to generate both statistics on log apply performance, and more useful stats too.
First, let me say that this discovery is based on my observations and has not been verified with Oracle Support nor by any insider. If you know one who can confirm this, please ask him or her.
The discovery is a hidden column in the x$kcclh
table, which is the underlying table for gv$log_history
(and therefore v$log_history
). It’s the only column not exposed in the fixed views. Why? I don’t know but it sure would be nice to have it exposed and clearly (officially) defined. The column is x$kcclh.lhtsm
. It stores the timestamp of when the log began to be applied. The same timestamp as the one printed in the alert log for the “Media Recovery Log…” message and the same timestamp as the one recorded in the v$dataguard_status.message
text column.
The advantage of accessing this information from the x$kcclh
table is that you get access to more historical data (the number of days will vary based on your redo log switch rate and the size of the control files and possibly on other factors).
The time difference between two consecutive logs, say 1 and 2, gives us the time spent applying the log 1. At least that’s the theory. Some factors that we can’t really take into account are the log delay set in the log_archive_dest_n
parameter on the primary db (it’s an optional setting), and any unexpected events, such as an MRP0 process getting hung or a FAL server having trouble to send a log, etc.
Nevertheless, I feel it is very useful information. Any anomaly can be easily identified as it would come up with an unusual elapse time.
Here’s the query I use to get the log apply elapse time:
set linesize 80 select b.lhseq sequence#, b.lhtsm fromtime, a.lhtsm totime, ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') - to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply from sys.x$kcclh a, sys.x$kcclh b where a.lhrid = b.lhrid + 1 order by a.lhrid ;
Sample output:
SEQUENCE# FROMTIME TOTIME MINUTES_TO_APPLY --------- ------------------- ------------------- ---------------- 97077 09/18/2008 03:28:32 09/18/2008 03:43:31 14.9833333 97078 09/18/2008 03:43:31 09/18/2008 03:58:31 15 97079 09/18/2008 03:58:31 09/18/2008 04:13:31 15 97080 09/18/2008 04:13:31 09/18/2008 04:28:31 15 97081 09/18/2008 04:28:31 09/18/2008 04:43:33 15.0333333 97082 09/18/2008 04:43:33 09/18/2008 04:58:33 15 97083 09/18/2008 04:58:33 09/18/2008 05:13:36 15.05 97084 09/18/2008 05:13:36 09/18/2008 05:28:33 14.95 97085 09/18/2008 05:28:33 09/18/2008 05:43:31 14.9666667 97086 09/18/2008 05:43:31 09/18/2008 05:58:31 15
Here’s a query to show the min, max and average elapse time:
set linesize 140 col min_log_date newline col max_log_date newline col max_log_apply_duration for a30 newline col min_log_apply_duration for a30 newline col avg_log_apply_duration for a30 newline select min(lhlot) min_log_date, max(lhlot) max_log_date, numtodsinterval(max(to_number(a_lhtsm - b_lhtsm)),'DAY') max_log_apply_duration, numtodsinterval(min(to_number(a_lhtsm - b_lhtsm)),'DAY') min_log_apply_duration, numtodsinterval(avg(to_number(a_lhtsm - b_lhtsm)),'DAY') avg_log_apply_duration from ( select a.lhlot lhlot, to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') a_lhtsm, to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS') b_lhtsm, a.lhrid from sys.x$kcclh a, sys.x$kcclh b where a.lhrid = b.lhrid + 1 ) order by lhrid;
Sample output:
MIN_LOG_DATE -------------------- MAX_LOG_DATE -------------------- MAX_LOG_APPLY_DURATION ------------------------------ MIN_LOG_APPLY_DURATION ------------------------------ AVG_LOG_APPLY_DURATION ------------------------------ 09/18/2008 04:28:30 10/21/2008 15:06:56 +000000000 04:16:55.999999999 +000000000 00:00:05.999999999 +000000000 00:11:47.666911225
The following query shows logs that took over 30 minutes to apply:
select a.lhseq, b.lhtsm fromtime, a.lhtsm totime, ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') - to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) minutes_to_apply from sys.x$kcclh a, sys.x$kcclh b where to_date(a.lhlot,'MM/DD/RR HH24:MI:SS') > sysdate - 60 and a.lhrid = b.lhrid + 1 and ((to_date(a.lhtsm,'MM/DD/RR HH24:MI:SS') - to_date(b.lhtsm,'MM/DD/RR HH24:MI:SS')) * 1440) > 30 order by a.lhrid ;
Sample output:
LHSEQ FROMTIME TOTIME MINUTES_TO_APPLY ------ -------------------- -------------------- ---------------- 98276 09/28/2008 08:04:41 09/28/2008 11:04:41 180 98278 09/28/2008 11:25:24 09/28/2008 12:07:06 41.7 98280 09/28/2008 12:15:35 09/28/2008 14:23:01 127.433333 98294 09/28/2008 14:33:29 09/28/2008 15:13:33 40.0666667 98297 09/28/2008 15:51:18 09/28/2008 16:31:42 40.4 98423 09/28/2008 19:36:32 09/28/2008 20:20:56 44.4 98533 09/28/2008 22:16:22 09/28/2008 22:54:11 37.8166667 98662 09/29/2008 02:29:11 09/29/2008 03:37:19 68.1333333 98663 09/29/2008 03:37:19 09/29/2008 04:13:37 36.3 98664 09/29/2008 04:13:37 09/29/2008 04:44:14 30.6166667 98669 09/29/2008 05:54:33 09/29/2008 07:19:25 84.8666667 100220 10/13/2008 07:50:41 10/13/2008 08:50:27 59.7666667 100422 10/15/2008 06:10:58 10/15/2008 10:27:54 256.933333 100537 10/16/2008 08:40:24 10/16/2008 10:05:32 85.1333333
I hope you find these queries useful for monitoring your standby database log apply rate and efficiency.
Enjoy!
Marc.