Oracle GoldenGate Extract Internals, Part II

Jan 26, 2010 / By Alex Fatkulin

Tags: , ,

Today we continue looking at various aspects of how the Oracle GoldenGate extract process works.

One of the follow up questions to part I was about the way the Extract process reads from ASM storage. I’ve provided the answer, however, today we’re going get a detailed look at how the Extract process interacts with an ASM instance and what kind of implications may result.

Let’s take a look at what changes in the Extract process loop when it reads from ASM storage:

nanosleep({1, 0}, NULL)                 = 0
...
write(18, "\37\333\0\0\6\0\0\0\0\0\3^\v)\4\4\0\1\0\0\0\376\377\377\377\377\377\377\377C\0\0"..., 8155) = 8155
write(18, "\37\333\0\0\6\0\0\0\0\0\1\1\0Z\7\n\0\0\0\0\0\1\2\0\0W\7\n\0\237\0\0"..., 8155) = 8155
write(18, "\37\333\0\0\6\0\0\0\0\0\1\0\4\0\0\0\237\0\0\1u\7\n\0\0\0\0\0\1\2\235\336"..., 8155) = 8155
write(18, "\22\367\0\0\6\0\0\0\0\0\0.\1\300\0\204\0\n\0\0\200\0\0\217\7\n\0\237\0\0\1\232"..., 4855) = 4855
read(18, "\37\333\0\0\6\0\0\0\0\0\v\1\5\276\4\0\0\0\0\0\1\0\0\0\0\0\0\0\260[\211\312"..., 8208) = 8208
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 8155
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 8155
read(18, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8155) = 4540
...
nanosleep({1, 0}, NULL)                 = 0

As before, to make the loop a bit more understandable, I’ve removed a bunch of syscalls. Let’s take a look at the file descriptor 18:

[root@gg1 fd]# ls -l 18
lrwx------ 1 oracle oinstall 64 Jan 24 20:56 18 -> socket:[24439]

Let’s find out more about this socket…

[root@gg1 fd]# lsof -p 7725 | grep 18u
extract 7725 oracle   18u  IPv6              24439                TCP [::1]:16428->[::1]:ncube-lm (ESTABLISHED)
[root@gg1 fd]# lsof -i :16428
COMMAND  PID   USER   FD   TYPE DEVICE SIZE NODE NAME
extract 7725 oracle   18u  IPv6  24439       TCP [::1]:16428->[::1]:ncube-lm (ESTABLISHED)
oracle  7753 oracle   16u  IPv6  24440       TCP [::1]:ncube-lm->[::1]:16428 (ESTABLISHED)
[root@gg1 fd]# ps -f -p 7753
UID        PID  PPID  C STIME TTY          TIME CMD
oracle    7753     1  0 20:56 ?        00:00:00 oracle+ASM (LOCAL=NO)

In place of the read syscalls there is now some sort of communication going on with ASM instance.

Alas, when it comes to tracing, an ASM instance is not the most friendly thing you can work with. But there are other ways. Enabling SQL*Net trace is enough to see what replaced the read syscall:

BEGIN dbms_diskgroup.read(:handle, :offset, :length, :buffer); END;

This is nothing other than one of the ASM internal packages which can be used to read directly from ASM disk group. How is the :handle being obtained? If we go a little bit up the trace, we can find the answer:

BEGIN dbms_diskgroup.getfileattr('+REDO/gg1/onlinelog/group_2.2547.708989191', :filetype, :filesize, :lblksize); END

The above code is used to get file attributes which are necessary to call the dbms_diskgroup.open procedure:

BEGIN dbms_diskgroup.open('+REDO/gg1/onlinelog/group_2.257.708989191', 'r', :filetype, :lblksize, :handle, :pblksize, :filesize); END;

This call will return a :handle which can later be used in dbms_diskgroup.read. The name of the online redo log file to read is being fetched by querying the RDBMS instance:

SELECT member,        DECODE(status, 'CURRENT', 1, 0),        DECODE(archived, 'YES', 1, 0)   FROM (select lf.member,        l.status,        1.archived        from v$logfile lf, v$log l        WHERE lf.group# = l.group# AND        l.thread# = :ora_thread AND        l.sequence# = :ora_seq_no AND        (lf.status NOT IN        ('INVALID','INCOMPLETE','STALE') OR        lf.status is null)        order by lf.member DESC )        where rownum = 1

Note that this query is a bit different form the one used when fetching the log file name on a file system. Both dbms_diskgroup.getfileattr and dbms_diskgroup.open calls happen when a new log file needs to be accessed (during startup or after a log file switch, for example), i.e. we do not execute these as part of a main loop listed above.

The next thing to find out is what happens in the ASM dedicated process as a result of dbms_diskgroup.read call:

read(16, "\37\333\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 8155
read(16, "\37\333\0\0\6\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 8208
read(16, "\1\0\0\0\0\0\0\0\0\0\0\v\1\33\5\0\0\31\2\2\r\0\0\0\0\0\0\372\0\300\0M"..., 8155) = 4803
pread(15, "\1\"\0\0\22 \0\0\21\0\0\0H\200\305\365\5\24\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 28672, 136324096) = 28672
write(16, "\37\333\0\0\6\0\0\0\0\0\v\1\5\326\4\0\0\0\0\0\1\0\0\0\0\0\0\0\240/V\232"..., 8155) = 8155
write(16, "\37\333\0\0\6\0\0\0\0\0\v\5\1\0\4\0\0\0\237\0\0\1g\251\v\0\0\0b\251\1\2"..., 8155) = 8155
write(16, "\37\333\0\0\6\0\0\0\0\0\0\2\0\3\0\4\0\5\0\6\0\7\0\10\0\t\0\n\0\v\0\f"..., 8155) = 8155

If you take a loot at the lsof output above, the file descriptor 16 is a socket back to the Extract process so these syscalls are related to SQL*Net messages. What we’re really interested in is a call on line# 4 with a pread syscall from the file descriptor 15:

[root@gg1 fd]# ls -l 15
lrwx------ 1 oracle oinstall 64 Jan 24 20:58 15 -> /dev/sdb1

Of course, /dev/sdb1 is a device under out REDO diskgroup:

SQL> select d.path, g.name
        from v$asm_disk d, v$asm_diskgroup g
        where d.group_number = g.group_number
                and d.path='/dev/sdb1';  2    3    4

PATH            NAME
--------------- ----------------
/dev/sdb1       REDO

What else can we say? First of all, the read size is significantly smaller, just 28672 bytes, compared to 1000K read size when the log has been located on a cooked file system. If you get such a small read size and the fact that the data needs to go through the network and SQL*Net stacks… I would say that I expect this to be much less efficient compared to how online redo logs are being read from a file system. One of the immediate things to realize is that, in case you’re running the Extract process on the same machine as your ASM instance, it probably makes total sense to configure the connection string to ASM instance (the one which is being specified in the Extract process parameters) using bequeath protocol so the traffic can go through a pipe instead of a socket which should provide better performance.


Share this article



6 Responses to “Oracle GoldenGate Extract Internals, Part II”

  • Vit Spinka says:

    I guess the reason for much smaller read in ASM comes from the fact that the :buffer parameter passed to dbms_diskgroup.read is a RAW, and PL/SQL does not support more than 32k (not even for LONG RAW).
    And trying using BLOB ends in “ORA-01220: file based sort illegal before database is open”.
    Perhaps they could get around this limitations using x$kffxp, but were not brave enough to handle ongoing metadata changes due to rebalances, failover to secondary extents etc.

    On a bit different note – GoldenGate does not support ASM for archivelogs, at least per documentation. Any idea, why? dbms_diskgroup seems to work on them fine…

  • Vit,

    32K limit sounds like a reasonable assumption as an upper limit for a single call. The stacks when reading from a file system vs ASM are quite different and a small read call is just part of the story as now the Extract have to rely on a dedicated server process to do part of the lifting. There is still an option to sent more than 32K of data by using multiple buffers but this may not yield any tangible improvements unless we’re using network and the latency is high. Alas, the highest value of ASMBUFSIZE is 28672.

    Archivelogs are not supported in ALO mode only. Technology-wise I don’t see why there would be any show stoppers in doing this. They also do not support DDL replication in ALO mode so I wonder whether these limitations resulted from a different codepaths used and will be lifted eventually.

  • […] This is the third post in Oracle GoldenGate Extract Internals series (links to part I and part II). […]

  • […] 4-Internals of goldengate extract process when ASM is used Alex Fatkulin-Oracle GoldenGate Extract Internals, Part II […]

  • hayata says:

    I want to read offline archive log contents through Golden Gate. Will it display same contents like oracle logminor. i want to read individual archive log file contents. what are the steps. if you could reply to my email.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>