Concurrent Processing Issues with Distributed Transactions Glitch

May 8, 2014 / By Raivis Saldabols

Tags: , , ,

This is my first post in Pythian blog, and I wanted to share interesting use-cases that we’ve dealt with recently. I believe someone may see similar cases in the future, so these are the troubleshooting steps to get to the source and apply workaround to fix it. So, the root problem was that concurrent processing doesn’t start up, and Internal Manger log file reports error (same error also throwing Oracle forms trying to access Concurrent -> Administer form)

CP-GENERIC ORACLE ERROR (ERROR=ORA-01591: lock held by in-dou) (ROUTINE=get_lk_handle)


ORA-01591 error clearly identifies lock held by in-doubt distributed transaction string; however, DBA views DBA_2PC_PENDING and DBA_2PC_NEIGHBORS return no rows (also sys.pending_trans$ and sys.pending_sessions$ have no rows). If any of these views return any row, then most probably these transactions can be committed, rolled back or purged and the problem would be resolved. However, this wasn’t the case – so this must be some kind of glitch.

The first document to review is Master Note for Troubleshooting Oracle Managed Distributed Transactions (Doc ID 100664.1), but this gives no valuable information, and neither does the much detailed Manually Resolving In-Doubt Transactions: Different Scenarios (Doc ID 126069.1). Next step is to find out where ORA-01591 is coming from. Here, a database trace is very useful tool. I was lucky this time as error is thrown in forms, so it’s 2 clicks away from getting trace. If it’s not so easy in your case, the SERV_MOD_ACT_TRACE_ENABLE procedure can be used to get trace. In my case it was:


=====================
PARSING IN CURSOR #139830878745608 len=66 dep=1 uid=0 oct=3 lid=0 tim=1398844275853403 hv=1493897133 ad='1c121bb90' sqlid='19x1189chq3xd'
SELECT LOCKID FROM DBMS_LOCK_ALLOCATED WHERE NAME = :B1 FOR UPDATE
END OF STMT
PARSE #139830878745608:c=0,e=137,p=0,cr=0,cu=0,mis=1,r=0,dep=1,og=1,plh=0,tim=1398844275853402
BINDS #139830878745608:
Bind#0
oacdty=01 mxl=128(128) mxlc=00 mal=00 scl=00 pre=00
oacflg=03 fl2=1206001 frm=01 csi=31 siz=128 off=0
kxsbbbfp=7f2ce9db2d80 bln=128 avl=11 flg=05
value="FNDCPLK_ICM"
...
ERROR #139830878745608:err=1591 tim=1398844275855101
...
=====================

Next, trying to execute the same SQL manually in sqlplus I got the following error:

SQL> select * from DBMS_LOCK_ALLOCATED;
select * from DBMS_LOCK_ALLOCATED
*
ERROR at line 1:
ORA-01591: lock held by in-doubt distributed transaction 35.15.13775

So now I have some details on ORA-01591 error. I know which transaction is holding it – but it’s not displayed in any of DBA_2PC _ views. Also trying to purge, commit or rollback this transaction will end in error as there is no full data about this transaction:

SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY ('35.15.13775');
BEGIN DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY ('35.15.13775'); END;
*
ERROR at line 1:
ORA-06510: PL/SQL: unhandled user-defined exception
ORA-06512: at "SYS.DBMS_TRANSACTION", line 96
ORA-06512: at line 1
SQL> COMMIT FORCE '35.15.13775';
COMMIT FORCE '35.15.13775'
*
ERROR at line 1:
ORA-02058: no prepared transaction found with ID 35.15.13775
SQL> rollback force '35.15.13775';
rollback force '35.15.13775'
*
ERROR at line 1:
ORA-02058: no prepared transaction found with ID 35.15.13775
SQL>

Trying to dig deeper, MoS note DBMS_LOGSTDBY.BUILD Seems to Hang And Does Not Return To SQL prompt. (Doc ID 747495.1) got my attention. Standby database is not used in this configuration, but the fact that x$ktuxe internal view contains information about transactions and query against it soon gave some results:

SELECT KTUXEUSN, KTUXESLT, KTUXESQN, KTUXESTA Status, KTUXECFL Flags FROM x$ktuxe WHERE ktuxesta!='INACTIVE';
KTUXEUSN KTUXESLT KTUXESQN STATUS FLAGS
---------- ---------- ---------- ---------------- ------------------------
35 15 13775 PREPARED SCO|COL|REV|DEAD

This is exact transaction where we identified earlier with trace (ORA-01591: lock held by in-doubt distributed transaction 35.15.13775). Once transaction is identified there is a way to get rid of it, and there are already a few options:

Using the same technique, I inserted dummy data into pending_trans$   and pending_sessions$ tables. Then purge_lost_db_entry solved the case, and the dummy entry was removed, along with references in x$ktuxe table (see examples in above sources I mentioned).  According to the oracle-L comments, this is Oracle Support provided workaround to deal with a particular SR case – so most probably this is not a supported way to deal with such cases.

In my case, we didn’t know what created the half-purged transaction, and this test system was needed ASAP for training, so this method really saved the day.

Leave a Reply

  • (will not be published)

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>