DBAsupport.com Forums - Powered by vBulletin
Results 1 to 6 of 6

Thread: !Please HELP! lock held by in doubt distributed transaction!PLEASE HELP!

  1. #1
    Join Date
    Jul 2001
    Posts
    10

    Exclamation

    Hello , I need help despartly.

    I was running a stored preceedure and it came back with an error then I ran a commit by accident. Then the error I got was ORA-01591 lock held by in doubt distributed transaction.
    So I check my error books and it told me to match the transaction number in the mesasge with the GLOBAL_TRAN_ID column of the dba_2pc_pending table, so I look and there was only one row, the transaction that it had locked by me. Our DBA is gone on vaction and stupid me delete that row from the dba_2pc_pending table thinking that would fix my problem, but it never I still get the same error and now I have nothing i the dba_2pc_pending table to even try to fix, I tried a rollback but that did not work. Does Anybody have any ideas??? I am really desparate.

    Scott

  2. #2
    Join Date
    Aug 2000
    Location
    Belgium
    Posts
    342

    Lightbulb

    Take a look a note 100664.1 on Metalink.

    Tittle : HOW TO TROUBLESHOOT DISTRIBUTED TRANSACTIONS

    Hope this helps
    Gert

  3. #3
    Join Date
    Jul 2001
    Posts
    10
    Our DBA is the only person who knows the login info, he is gone and I can't get a hold of him, I found some info on forcing a commit but I think where I delete the record form the dba_2pc_pending table, that it does not find the transaction to force the commit on. I think I am gonning up the river without a paddle.


    I have shut down the DB and started it back up but the lock is still there.

    Got any more ideas??

    Thanks
    Scott

  4. #4
    Join Date
    Mar 2001
    Location
    Ireland/Dublin
    Posts
    688
    Somethink interseting.
    Do you have another server?
    Type OS?

  5. #5
    Join Date
    Aug 2000
    Location
    Belgium
    Posts
    342

    Wink

    Here it is :

    HOW TO TROUBLESHOOT DISTRIBUTED TRANSACTIONS
    =============================================

    You need to resolve a distributed transaction erorr because of
    errors received. But before you try to implement any other steps,
    please diagnose that it is or it is not a REAL distributed
    transaction.


    STEPS TO TROUBLESHOOT DISTRIBUTED TRANSACTIONS.

    1. Get errors from alert.log.


    2. Make sure that network is up and all db_links are valid (V$DBLINK AND
    GV$DBLINK show the db_links currently used by distributed transactions)

    SVRMGR> desc v$dblink
    Column Name Null? Type
    ------------------------------ -------- ----
    DB_LINK VARCHAR2(128)
    OWNER_ID NUMBER
    LOGGED_ON VARCHAR2(3)
    HETEROGENEOUS VARCHAR2(3)
    PROTOCOL VARCHAR2(6)
    OPEN_CURSORS NUMBER
    IN_TRANSACTION VARCHAR2(3)
    UPDATE_SENT VARCHAR2(3)
    COMMIT_POINT_STRENGTH NUMBER

    SVRMGR> desc gv$dblink
    Column Name Null? Type
    ------------------------------ -------- ----
    INST_ID NUMBER
    DB_LINK VARCHAR2(128)
    OWNER_ID NUMBER
    LOGGED_ON VARCHAR2(3)
    HETEROGENEOUS VARCHAR2(3)
    PROTOCOL VARCHAR2(6)
    OPEN_CURSORS NUMBER
    IN_TRANSACTION VARCHAR2(3)
    UPDATE_SENT VARCHAR2(3)
    COMMIT_POINT_STRENGTH NUMBER


    3. Run query against DBA_2PC_PENDING:

    SELECT
    LOCAL_TRAN_ID, GLOBAL_TRAN_ID, STATE, MIXED, HOST, COMMIT#
    FROM
    DBA_2PC_PENDING
    WHERE
    LOCAL _TRAN_ID = '??.';

    LOCAL _TRAN_ID is Local transactionID (number that is given when error is
    reported). If LOCAL _TRAN_ID = GLOBAL_TRAN_ID that means that it means this
    site is Global Coordinator, i.e. where transaction originated from.

    You can also get LOCAL _TRAN_ID from the alert.log.


    4. Run query against DBA_2PC_NEIGHBORS view on all Nodes:

    SELECT LOCAL_TRAN_ID, IN_OUT, DATABASE, INTERFACE
    FROM DBA_2PC_NEIGHBORS;


    5. Get values for COMMIT_POINT_STRENGTH init.ora parameter on all Nodes.
    If you are not aware of what other Nodes are participating in the distributed
    transaction, query DBA_2PC_NEIGHBORS.
    The Node with COMMIT_POINT_STRENGTH that has highest value in init.ora
    has to commit first.


    6. Look for LOCAL_TRAN_ID and GLOBAL_TRAN_ID from DBA_2PC_PENDING.
    If LOCAL_TRAN_ID = GLOBAL_TRAN_ID it means this site is Global Coordinator,
    i.e. where transaction originated from. Please note this site for recovery
    issues.


    7. Check STATE column from DBA_2PC_PENDING.
    If STATE column states COMMIT, it means local database has successfully
    committed.

    That also means you most likely do not have to force anything on that Node.
    Or if other Nodes are not commited and this particular Node happens to be a
    Global Coordinator - you need to note COMMIT# i.e. SCN number from
    DBA_2PC_PENDING view as it is THE SCN# you will need to use if you need to
    perform COMMIT FORCE.


    8. Take the GLOBAL_TRAN_ID and COMMIT# (SCN number from DBA_2PC_PENDING) and
    compare it to the value on the other nodes.
    On Nodes where there are no entries for this COMMIT# and transactionID in
    DBA_2PC_PENDING, RECO process has resolved the problem and we don't have to
    do anything.

    Note: RECO is always automatically started whenever Distributed Transactions
    are enabled.

    If you issue a UNIX command
    ps - ef | grep -i reco

    and see RECO running, distributed transactions
    were correctly enabled.

    If you combine this output with values of init.ora parameter
    DISTRIBUTED_TRANSACTONS (which should be higher than 0 for distributed
    environment), you may safely say that customer is running in a distributed
    environment even if your particular customer has not set it up.

    Database links have to also be valid and operational. If they seem not to be
    operational, please check V$DBLINK AND GV$DBLINK (if active transaction is
    currently using db_links) because you are operating in a distributed
    environment we have to apply distributed environment principles
    to troubleshoot the problem.

    9. If the STATE (from DBA_2PC_PENDING view) is PREPARED then this Node has not
    completed transaction.

    Take GLOBAL_TRAN_ID (from DBA_2PC_PENDING) and COMMIT# (SCN number) and
    compare it to the value on the other nodes.

    You need to also check DBA_2PC_NEIGHBORS and see whether there are any other
    Nodes that reference same transactionID in their DBA_2PC_PENDING.
    If you see it referenced, that means there are children and they need to be
    resolved (either committed or rollback) as well. Please take 'children' i.e.
    Nodes that other Nodes (not Global Coordinator) into account.

    Note: for clarification on terminilogy please see · Note:13229.1

    10. If there are no children, it is safe to manually interfere and either
    commit force or rollback force. For global intergrity, use the same COMMIT#
    (SCN) when you force the transaction. To force the transaction please see
    below.

    11. After transaction is forced, make sure that both DBA_2PC_PENDING and
    DBA_2PC_NEIGHBORS data dictionary views are empty.

    If for some reason it is necessary to clean those views, use packaged
    procedure DBMS_TRANSACTION.purge_lost_db_entry.
    See [NOTE:1012842.102] ORA-2019 ORA-2058 ORA-2068 ORA-2050: FAILED
    DISTRIBUTED TRANSACTIONS

    Note: Be very careful with that procedure as it is always more reliable and
    more consistent to let RECO resolve distributed transaction conflict.
    Please attempt to use DBMS_TRANSACTION.purge_lost_db_entry only as a last
    resort after every other measure was implemented including restarting the
    database to resolve locking conflict.

    12. If even then there are entries in the views and errors still appear, please
    check for bug 684157.



    SYNTAX NOTES:
    ==============

    COMMIT FORCE command
    --------------------
    Example of syntax when highest committed SCN is 88123887
    (from dba_2pc_pending) and local transaction ID is 1.13.5197
    (from either dba_2pc_pending or alert.log):

    SVRMGR> COMMIT FORCE 'your local transactionID on this node',
    'highest SCN from already committed site';

    SVRMGR> COMMIT FORCE '1.13.5197', '88123887';


    ROLLBACK FORCE command
    ----------------------
    Example of syntax local transaction ID is 1.13.5197
    from either dba_2pc_pending or alert.log):

    SVRMGR> ROLLBACK FORCE 'your local transactionID on this node';

    SVRMGR> ROLLBACK FORCE '1.13.5197';


    PURGING VIEWS:
    --------------
    Example of syntax for lost local transaction ID 1.13.5197 (from either
    dba_2pc_pending or alert.log):

    SVRMGR> Execute
    DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY ('1.13.5197');
    NOTE: Run as SYS where 1.13.5197 is the local_tran_id

    Example of syntax for mixed local transaction ID 1.13.5197
    (from either dba_2pc_pending or alert.log):

    SVRMGR> Execute
    DBMS_TRANSACTION.PURGE_MIXED ('1.13.5197');

    NOTE: Run as SYS

    Run PURGE_MIXED Procedure only if significant reconfiguration has occurred so
    that automatic recover (RECO) can not resolve transaction.
    · Example of appropriate cases when it can be used:
    · Total loss of the remote database, reconfiguration of software resulting in
    loss of two-phase commit capability.


    SCN RECOVERY STEPS
    ------------------omplete Recovery:

    1. At the down site recover completely if possible (treat as regular recovery).

    2. SCN will appear in alert.log after recover is done on the crashed Node.

    Incomplete Recovery:
    1. If time-based or cancel-based recovery was used on crashed node, other
    sites must be placed back to the same point in time for global consistency.
    Get last SCN from crashed node alert.log.

    2. At each node shut down normal or immediate.

    3. Perform a cold backup.

    4. Restore control file if necessary.

    5. Restore last backup of all datafiles along with archived redo logs.

    6. Choose which tool to use to perform SCN recovery - either Server Manager
    line mode or RMAN.

    7. Connect internal, startup mount, select * from v$datafile to make sure all
    datafiles are online. Alter database datafile '?/?/?' online; is several
    datafiles are offline.

    8. Issue the following command using latest SCN from alert.log on the Node
    that had to be recovered:
    RECOVER DATABASE UNTIL CHANGE '1.13.5197';

    NOTE: If for some reason (e.g.when issuing commit force command) automatic
    recovery (RECO process) needs to be disabled, use the following command:

    ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;
    (will put RECO to sleep).

    Make sure to wake RECO up after that:
    ALTER SYSTEM ENABLE DISTRIBUTED RECOVERY;



  6. #6
    Join Date
    Jul 2001
    Posts
    10
    I have tried all the above, but I guess where I deleted the file from the dba_2pc_pending table there svrmgr does see any pending jobs. I have also restarted the entire unix box, and then when I go to do a dbstart, it starts the instances and says 'instance name' warm start. Would a differrent kind of start work, like a cold start if so how do you do this. Every time I run the commands you told me it says did not find any pending job with the local tan id that I speciifed, there is nothing in my dba_2pc_pending table, thus it can not force a commit on somehting that it can't see.

    What do I do?
    I am starting to get worried now!

    THanks
    Scott

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width