RMAN - A "delete archivelog.." incorrectly got rid of an archivelog?!
DBAsupport.com Forums

  1. #1
    Join Date
    Oct 2002
    Posts
    807

    RMAN - A "delete archivelog.." incorrectly got rid of an archivelog?!

    One of my "delete archivelog all backed up 2 times to sbt" scripts suddenly misbehaved. It appears to have deleted an archivelog file (1_3884.dbf) that had never been backed up!

    Here's the background -
    I've got 2 scripts scheduled via cron. One backs up archivelogs to tape. The other deletes archivelogs that have already been backed up twice. In this instance, the latter incorrectly deleted an archivelog file that was NEVER backed up to tape!

    Note: I messed up the scheduling, so both scripts ran at the SAME time (at 10am today). However, that doesn't explain why the latter script would delete an archivelog (1_3384.dbf) that had NEVER been backed up. The relevant archivelog 1_3384 was generated at 10am as well.

    I've attached the relevant logfiles in a text file.

    1) Script 1 (DWDEV_periodic_arch_bkup.rman) is simply

    run
    {
    allocate channel t1 type 'sbt_tape' parms
    'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/DWDEV/tdpo.opt)';
    allocate channel t2 type 'sbt_tape' parms
    'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/DWDEV/tdpo.opt)';
    BACKUP
    format 'dwdev_periodic_arch_bkup_%t_%s_%p'
    tag='DWDEV periodic arch log bkup'
    ARCHIVELOG all not backed up 2 times;
    release channel t1;
    release channel t2;
    }
    ==========================================================
    2) Script 2 (rm_bkedup_arch_log.rman) is

    run {
    delete noprompt archivelog all backed up 2 times to sbt;
    }
    ==================================

    I've opened a TAR. It has been in open status for the past hour and a half and hasn't even been assigned to an analyst! I give up on them.

  2. #2
    Join Date
    Oct 2002
    Posts
    807
    Update - Oracle recognizes this as a bug. They've not seen it before and need me to generate bucket loads of diagnostic information.

    Just great! I get a warm and fuzzy feeling about my backups now.

  3. #3
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    And you've just become their beta tester.

    I've had 6 TARs in the last 4 weeks and had to do all sorts of things for them, spending my work time (and yes, the customer is paying for it) working for Oracle.

    Low level peeps

  4. #4
    Join Date
    Oct 2002
    Posts
    807
    I hear you. At least the TAR got assigned to a reasonably good analyst this time! He's a reviewer of some Oracle books, and right off the bat he admitted that it was a bug.

    I was expecting them to close the TAR with an "unable to reproduce issue inhouse" message. But that wasn't the case. I was pleasantly surprised this time.

  5. #5
    Join Date
    Oct 2000
    Location
    Saskatoon, SK, Canada
    Posts
    3,925
    Axr2,

    You described the problem but forgot to mention the database version and the OS. That information would be very helpful for anyone who runs into the same issue at their site.

    Thanx,
    Sam

    Life is a journey, not a destination!


  6. #6
    Join Date
    Oct 2002
    Posts
    807
    target - 9.2.0.4
    target OS - SunOS 5.8

    catalog - 9.2.0.5
    catalog OS - AIX 5.1

    FYI -
    I'm guessing the issue had something to do with the timing or scheduling of the 2 scripts. For now, as a workaround, I've incorporated both "backup archivelog" and "delete archivelog all backed up 2 times" into a SINGLE rman script (instead of splitting them into 2 separate jobs). That will hopefully prevent it from occurring on production at least.
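The single-script workaround described in this post might look roughly like the sketch below. It runs the two `run` blocks from the first post back to back in one command file, so the delete cannot start until the backup has finished. The shell wrapper, the function name `emit_combined_rman_script`, and the idea of printing the command file for review are assumptions for illustration, not the poster's actual job. This is also only the poster's guess at a workaround, not a confirmed fix.

```shell
#!/bin/sh
# Hypothetical wrapper: prints one RMAN command file that first backs up
# archivelogs to tape and only then deletes those backed up twice.
# The RMAN statements are copied from the two scripts in the first post;
# the function name and wrapper layout are illustrative assumptions.
emit_combined_rman_script() {
cat <<'EOF'
run
{
allocate channel t1 type 'sbt_tape' parms
'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/DWDEV/tdpo.opt)';
allocate channel t2 type 'sbt_tape' parms
'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin64/DWDEV/tdpo.opt)';
BACKUP
format 'dwdev_periodic_arch_bkup_%t_%s_%p'
tag='DWDEV periodic arch log bkup'
ARCHIVELOG all not backed up 2 times;
release channel t1;
release channel t2;
}
run {
delete noprompt archivelog all backed up 2 times to sbt;
}
EOF
}

# Print the command file so it can be reviewed before use.
emit_combined_rman_script
```

To run it rather than print it, the output could be piped into an RMAN session started with your usual target and catalog connection arguments, scheduled as a single cron job in place of the two separate ones.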

    I've left things 'as is' (with rman debug) on DEV to simulate the failure again.

    Btw, the debug feature is quite enlightening..for those that haven't tried it - give it a shot, and look at the tracefile generated. It makes interesting reading. You get to see the different rman procedures being called, along with the bind variables. I hadn't seen this before. It seems neat.

    Debugging is typically enabled like this -

    rman trace=file_name
    connect target /
    connect catalog .....
    run
    {
    debug on;
    allocate channel c1 type ...... debug=5

    debug off;
    }
    Last edited by Axr2; 08-10-2004 at 05:23 PM.

  7. #7
    Join Date
    Oct 2000
    Location
    Saskatoon, SK, Canada
    Posts
    3,925
    Yes, I have tried it in the past and it was really good to see what exactly is going on behind the scenes.

    Thanx,
    Sam


  8. #8
    Join Date
    Oct 2002
    Posts
    807
    An update :
    Bug 3844804 has been filed. It is supposedly fixed in 10.2.


    Per Oracle development :
    "Thanks for providing a good debugging traces. We have found the problem.

    We found that this problem can happen with DELETE command using option BACKED UP .. TIMES.

    For eg. DELETE ARCHIVELOG ALL BACKED UP 1 TIMES TO DISK

    can delete archivelog that aren't backed up when an archivelog is created during this command.

    Workaround
    ==========
    Use COMPLETED BEFORE option with BACKED UP option. For eg.

    DELETE ARCHIVELOG ALL BACKED UP 1 TIMES TO DISK COMPLETED BEFORE 'SYSDATE-1';

    This will make sure that RMAN doesn't delete the archivelog that was created during past one day.

    Bug is fixed in Oracle 10g (10.2)"
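Applied to the delete script from the first post, the workaround quoted above might look like the sketch below. It assumes the `COMPLETED BEFORE` clause combines with `to sbt` the same way it does with `TO DISK` in Oracle's example, which the thread does not confirm. The function name `emit_guarded_delete` and its two parameters are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints an RMAN delete command that carries the
# COMPLETED BEFORE workaround. Arguments (both assumptions for illustration):
#   $1 = required number of backups (default 2, as in the original script)
#   $2 = minimum age in days of archivelogs eligible for deletion (default 1)
emit_guarded_delete() {
    times="${1:-2}"
    days="${2:-1}"
    printf "run {\n"
    printf "delete noprompt archivelog all backed up %s times to sbt completed before 'SYSDATE-%s';\n" "$times" "$days"
    printf "}\n"
}

# Print the command with the defaults so it can be reviewed before use.
emit_guarded_delete
```

The one-day cutoff mirrors Oracle's example. The trade-off is that archivelogs stay on disk for at least that long, so the archive destination needs enough space to hold a day's worth of logs.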

    ================================================

    ME :
    As for the workaround - has the proposed solution actually been "tested"? I know what the new command/syntax means. But has it been tested? Did development prepare a test case that caused the original command to fail and the new command to work as expected? Or did they figure it out intuitively from the debug files? If so, I'd like to see the piece of code that helped them nail the issue. I just want to make sure that the proposed solution actually works. I don't want to be in a situation where after a couple of months, I run into the same thing again (despite the new rman command).

    ORACLE SUPPORT :
    Neither Development nor I were able to create a reproducible test case. The problem was found from the debug information you provided.

    As for the 'piece of code' they found to be the issue, this is Oracle proprietary information and can not be shared.
