Oracle GoldenGate Extract Internals, Part III

Feb 3, 2010 / By Alex Fatkulin

Tags: , ,

This is the third post in Oracle GoldenGate Extract Internals series (links to part I and part II).

In this post, we’re going to take a closer look at various queries that the Extract process uses against the database. As before, we will start by examining the strace output:

nanosleep({1, 0}, NULL)                 = 0
...
read(20, "\1\"\0\0\255\1\0\0\217\0\0\0H\200\366\256\5\24\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024000) = 1024000
...
write(16, "\0$\0\0\6\0\0\0\0\0\3N'\7\0\0\0\2\0\0\0`\0\0\0\0\0\0\0\7\0011"..., 36) = 36
read(17, "\0\351\0\0\6\0\0\0\0\0\6\1\"\375\2\0\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 233
write(16, "\0 \0\0\6\0\0\0\0\0\3N(\10\0\0\0\2\0\0\0`\0\0\0\0\0\0\0\7\0011", 32) = 32
read(17, "\0\343\0\0\6\0\0\0\0\0\6\1\"\7\1\0\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 227
write(16, "\0K\0\0\6\0\0\0\0\0\3N)\t\0\0\0\2\0\0\0`\0\0\0\0\0\0\0\7,/"..., 75) = 75
read(17, "\0\341\0\0\6\0\0\0\0\0\6\1\"\375\1\0\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0"..., 8208) = 225
write(16, "\0Q\0\0\6\0\0\0\0\0\3N*\n\0\0\0\2\0\0\0`\0\0\0\0\0\0\0\7,/"..., 81) = 81
read(17, "\0\254\0\0\6\0\0\0\0\0\4\1\0\0\0)\0\1\0\0\0\0{\5\0\0\0\0\n\0\0\0"..., 8208) = 172
lseek(20, 227328, SEEK_SET)             = 227328
nanosleep({1, 0}, NULL)                 = 0

File descriptors 16 and 17 are the pipes for one of the bequeath connections we have with the database. There are four queries being submitted each cycle (following the same order as those being sent to a database):

SELECT DECODE(archived, 'YES', 1, 0), status  FROM v$log WHERE thread# = :ora_thread AND sequence# = :ora_seq_no

SELECT MAX(sequence#)  FROM v$log WHERE thread# = :ora_thread AND status in ('INVALIDATED', 'CURRENT', 'ACTIVE')

SELECT DECODE(status, 'STALE', 1, 0) FROM v$logfile WHERE member = :log_name

SELECT 1  FROM V$LOGFILE WHERE(STATUS NOT IN ('STALE', 'INVALID') OR STATUS IS NULL) AND MEMBER <> :log_name AND EXISTS ( SELECT 1 FROM V$LOG WHERE GROUP#  = V$LOGFILE.GROUP# AND THREAD# = :ora_thread AND SEQUENCE# = :ora_seq_no ) AND ROWNUM = 1

The purpose of these statements is to constantly keep an eye on what’s happening inside the database by regularly polling the contents of the above views. What’s worth mentioning about the above queries is that all of them will cause extra I/O to the controlfile. On my test database, that equaled 640KB each cycle (40 I/O requests, 16KB each). In most cases, this is nothing to worry about–just keep the additional I/O in mind in case your controfile is already a hot spot.

The redo log stores object identifiers (a number), which means that when the Extract process encounters a supported operation, it needs a way to find out more details. This is achieved by a couple of statements against the data dictionary. The following statement will be issued first:

SELECT u.name, o.name, o.dataobj#, o.type#, (SELECT bitand(t.property, 1) FROM sys.tab$ t WHERE t.obj# = :ora_object_id) FROM sys.obj$ o, sys.user$ u WHERE o.obj# = :ora_object_id  AND decode(bitand(o.flags, 128), 128, 'YES', 'NO') = 'NO'  AND o.owner# = u.user# AND decode(bitand(o.flags, 16), 0, 'N', 16, 'Y', 'N') = 'N' AND (o.type# in (1, 19, 20, 34) OR EXISTS (SELECT 'x' FROM sys.tab$ t WHERE t.obj# = :ora_object_id))

In case the object turns out to be a table, it will be checked whether it is an overflow segment for an IOT:

SELECT nvl(iot_name, 'NULL')   FROM all_tables WHERE owner = :owner AND table_name = :object_name

This allows the Extract process to figure out whether it needs to process changes, in case the overflow segment belongs to an IOT from which we’re capturing the data. In case the object in question turns out to be an index, a corresponding check will be made to see whether it’s an underlying index for an IOT:

SELECT table_owner, table_name FROM all_indexes WHERE index_name = :object_name AND         owner = :owner AND index_type = 'IOT - TOP'

This is required so that the changes made to an IOT can be captured, in case it belongs to an interested tables list.

The above queries will be executed regardless of whether or not you’re interested in capturing changes from the particular object, because the queries are required before you can make that decision. In case this is something we’re interested in, additional information will be requested:

select object_type, object_name, subobject_name from dba_objects where object_id = :ora_object_id

The above statement is necessarily in case we’re dealing with the partitioned object and, depending on the result, one of the following two statements will be executed:

select ts.bigfile from dba_tablespaces ts,  all_tables t  where t.table_name = :ora_object_name and  t.tablespace_name = ts.tablespace_name and rownum = 1

select t.bigfile from dba_tablespaces t,  all_tab_partitions p  where p.partition_name = :ora_subobject_name and   p.tablespace_name = t.tablespace_name and rownum=1

There is obviously an issue with the above two statements. Neither of them specify the object owner and, in case you have two (or more) objects with the same name but in different schemas, the above statements may return incorrect information, if these objects are located in different tablespace types.

It’s interesting how the issue is shoved away using rownum = 1 condition. What could the potential impact be? One thought that immediately comes to mind is that the way ROWIDs are organized is different between small- and big-file tablespaces (the part being used for a relative data file number in a small file tablespace is used for a block number in case of a big file tablespace), so some functionality that potentially relies on that could be affected. I’ve got a couple of ideas, but I’ll hold these until I do some testing.

What else is interesting? The big thing is that none of the information on columns is being resolved. All I’m going to say right now is that column information will be fetched by the Replicat process using the destination system’s data dictionary. Combine this with the fact that an online data dictionary does not store historical information about an object’s metadata, and you have a perfect recipe to get yourself into various nasty situations (which is exactly the reason why Oracle Streams relies on MVDD instead of the online data dictionary). But more on that when we get to the Replicat process internals series.

6 Responses to “Oracle GoldenGate Extract Internals, Part III”

  • Narayana Rao says:

    Hi

    question on , golden gate extract process.

    how do i fornce/ensure the extract process reads from the archvies only.

    Thanks
    NR

  • Haluk Guncer says:

    Hi,

    I get high consistent gets for

    SELECT nvl(iot_name, ‘NULL’) FROM all_tables WHERE owner = :owner AND table_name = :object_name

    Buffer gets ( in 30 mins): 199,569,171 blocks.
    Executions (in 30 Mins) :134,267 blocks.
    Gets/Exec (in 30 Mins) : 1,486.36

    becoming the most top query. Replicated Table is only 1 table. And is not IOT but partitioned. How can I get rid of this high cost?

    I have about 20 more tables to be defined for replication.

    Thanks.

  • Haluk,

    1. How many tables do you have in your database?
    2. How often you restart the Extract process?

  • Jayanta says:

    I have a schema namely CPF1234 in server with IP xx.xx.xx.33
    I have a schema namely CPF1234 in server with IP xx.xx.xx.89

    I have installed Golden Gate in both of them and configured for bi-directional
    replication.

    I have created a table with primary key in both the servers –

    create table EMPDATA (
    empid number(10) primary key,
    ename varchar2(30),
    age number(2,0),
    salary number(7,2)
    );

    I have inserted a record in .33 server as –
    insert into EMPDATA values ( 100, ‘ABC’, 35, 25000 ); commit;
    data got inserted in .33 and also got replicated in .89 server

    I have inserted a record in .89 server as –
    insert into EMPDATA values ( 101, ‘XYZ’, 49, 36000 ); commit;
    data got inserted in .89 and also got replicated in .33 server

    I have updated in .33 as –
    update EMPDATA set salary = 25999 where empid = 100; commit;

    Now my pump process in .33 server got slow and got a lag of 10 minutes
    it got updated in .33 but DID NOT REPLICATED in .89 due to the lag

    In .89 server I made an update –
    update EMPDATA set salary = 30000 where empid = 100; commit;
    It got updated in .89 and got instantly replicated in .33 server
    So the salary for empid 100 in both .33 and 89 is currently 30000

    Now as soon as the lag resolved to zero in pump process in .33 server
    the salary in .89 changed to 25999 where as the salary in .33 server is 30000

    Can this be prevented ? I want the latest update to be reflected in both
    which should be 30000 in both .33 and .89 server

    KIndly help

  • Pavan says:

    Hi

    How do we configure when we want golden gate to be read it from archive log only?

    what query it uses to query from v$achived_log?

    What will be the impact of I/O on control file read ?

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>