Control File Question
This might be a dumb question, but I can't find explicit information about it.
The question is this:
I have my controlfiles multiplexed across different disks (or in this case across a number of mounted NFS volumes). If one of those volumes became unavailable and made one of the controlfile copies unavailable, would it cause the database to crash?
I have never had an experience where one of the volumes hosting a copy of the controlfile became unavailable. I was always under the assumption that multiplexing controlfiles, redo, and archive would prevent a crash as long as there was one copy of each file available.
Can someone confirm?
Your database will not be available, but you can recover easily from a lost controlfile if they are multiplexed. You can either remove the unavailable controlfile from your pfile/spfile and start up the database, or you can shut down the database, copy a surviving controlfile to another location, register the change in your pfile/spfile, and then start up the database.
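A sketch of the first option, assuming an spfile is in use and the lost copy lived on /u02 (all paths here are hypothetical, not from the original post); run from SQL*Plus as SYSDBA with the instance down:

```sql
-- Option 1: drop the missing copy from the spfile, then start up.
-- Paths are hypothetical examples.
ALTER SYSTEM SET control_files =
  '/u01/oradata/control01.ctl',
  '/u03/oradata/control03.ctl'
  SCOPE = SPFILE;
STARTUP;
```

For the second option, you would instead copy a surviving controlfile (at the OS level, while the database is down) to a healthy location, update control_files to point at the new path, and then STARTUP.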
Agree with Nuushona. To summarize: yes, your database will go down.
OK... so you guys are telling me that...
"...once any copy of the (multiplexed) control file goes missin'... your db will come down!"
So... I've tried to put this to the test. On a sandbox, I started the instance. It has 3 redo log groups, each with 3 members multiplexed across 3 different volumes. Its controlfile is multiplexed 3-way across 3 different volumes.
I deleted ALL the datafiles, one of the controlfiles, and one redo member from each group.
The database is still running.
I've done some DMLs since then. I've also done a logfile switch and forced a checkpoint. Then I did some more DMLs, but the database is still running.
What am I not simulating correctly to prove your point that the database will come down?
Also, I see errors in the alert log about missing redo logs when I do a logfile switch, but I see no errors when I force a checkpoint. Shouldn't I see errors regarding the missing controlfile from one or both of these commands?
Any feedback would really help...
BTW: It's a 9.2.0.6 database.
Your database will come down eventually, don't worry about that. It will happen when the database closes and then reopens the file.
As for redo logs, you can lose one member (as long as they are multiplexed) without the database going down.
I found (I think) my answer at Metalink...
It has something to do with being on a Unix box: Oracle still holds the inode open even after I "rm" one of the controlfiles.
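That inode behavior can be demonstrated without Oracle at all. A minimal shell sketch (hypothetical file name): as long as some process holds the file open, "rm" only removes the directory entry, not the data.

```shell
# Why "rm" doesn't take a controlfile away from a running instance:
# the inode survives as long as a process holds an open descriptor on it.
echo "controlfile contents" > ctl_demo.dat
exec 3< ctl_demo.dat   # open a read descriptor, as Oracle keeps its files open
rm ctl_demo.dat        # the directory entry is gone...
cat <&3                # ...but the data is still readable through descriptor 3
exec 3<&-              # closing the last descriptor finally frees the inode
```

This is why the database keeps running after the "rm" test above: it only notices the loss when it closes and tries to reopen the file.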
I guess my real question in this thread is to find a way to provide point-of-failure recovery using new hardware I just received. And I think I found my answer, but please confirm...
I thought I needed the following to be able to have complete (point-of-failure) recovery in the event of media failure:
- Data Files (from last backup)
- Archived Redo Logs (since last backup)
- At least one member of any of the online redo log group
- AND... a copy of the controlfile
But as it turns out (when I tested this), I don't need a copy of the controlfile at all. As long as I have a trace of the controlfile that reflects all the current datafiles, I can run a CREATE CONTROLFILE script to recreate it, and having the online redo logs (along with the archived ones) allows the database to recover to the point of failure.
How I tested this:
/u01 volume contains: datafiles, controlA, redologA
/u02 volume contains: redologB
/u03 volume contains: backup, archived_redo_logs
I ran DMLs to insert 5 records.
I removed all files in /u01.
I aborted the database.
I restored the datafiles in /u01 from the backup in /u03.
Ran the CREATE CONTROLFILE script using the controlfile trace taken with the previous backup.
Opened the database.
All five previously-inserted records are still in the database.
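The recreate-and-recover step above can be sketched as follows. The trace script is generated ahead of time with ALTER DATABASE BACKUP CONTROLFILE TO TRACE; the names, sizes, and paths below are hypothetical stand-ins for what the trace would actually contain, and NORESETLOGS applies because the online redo logs survived.

```sql
-- Generate the trace script ahead of time (written to user_dump_dest):
ALTER DATABASE BACKUP CONTROLFILE TO TRACE;

-- After the failure, recreate the controlfile from the (edited) trace.
-- Database name, file paths, and sizes are hypothetical examples:
STARTUP NOMOUNT;
CREATE CONTROLFILE REUSE DATABASE "SANDBOX" NORESETLOGS ARCHIVELOG
  LOGFILE
    GROUP 1 ('/u01/oradata/redo01a.log', '/u02/oradata/redo01b.log') SIZE 50M,
    GROUP 2 ('/u01/oradata/redo02a.log', '/u02/oradata/redo02b.log') SIZE 50M
  DATAFILE
    '/u01/oradata/system01.dbf',
    '/u01/oradata/users01.dbf';
RECOVER DATABASE;      -- applies archived + online redo up to the point of failure
ALTER DATABASE OPEN;
```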
If all this is true, then I guess the controlfile matters less when you have a copy of the online redo logs.
Again, please confirm...
Thanks in advance.
What if you added a datafile after the last controlfile-to-trace backup? Added more redo groups, or changed any paths?
Always multiplex the controlfile; then if you lose one, just copy a good one over the top of where it should be.
Protect the disks with RAID 1 as well, so if a disk dies it is still mirrored.
Always have more than one member in each log group, for the same reasons.
Understood. Those changes wouldn't be captured in the trace of the controlfile, but they're not impossible to overcome. And this is really why one should perform a backup after any major change such as these.
At any rate, I think my point stands: controlfiles aren't a critical factor in point-of-failure recovery... at least not if you have the other components I listed previously.
This thread really started because I thought I needed to multiplex controlfiles to a remote location to ensure point-of-failure recoverability. But there could be connectivity issues with that remote location. I wanted to multiplex the controlfile to another building, but didn't want any hiccups in the network to cause a database outage due to a missing controlfile.
But now I think I've confirmed that I don't need to multiplex the controlfile to the remote location. I mean, I still need to multiplex it, but only on local storage where connectivity is not an issue. I only need to multiplex the online redo logs and the archive destination to the remote location. Should a connectivity issue arise, the database can tolerate not finding one member of a redo group or one of the archive destinations.
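A sketch of that remote setup (mount point and file names are hypothetical): a second archive destination marked OPTIONAL so an outage there doesn't stall the database, plus one extra redo member per group on the remote volume.

```sql
-- Second archive destination on the remote volume; OPTIONAL means archiving
-- to it can fail without halting the database, REOPEN retries after 300s:
ALTER SYSTEM SET log_archive_dest_2 = 'LOCATION=/remote/arch OPTIONAL REOPEN=300';

-- Multiplex one redo member per group onto the remote volume:
ALTER DATABASE ADD LOGFILE MEMBER '/remote/redo/redo01c.log' TO GROUP 1;
ALTER DATABASE ADD LOGFILE MEMBER '/remote/redo/redo02c.log' TO GROUP 2;
```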
Correct. It sounds like you could use Data Guard, though: have a second database to which your primary database sends archive logs.
Might be overkill for your requirements, though.
Yeah... I think the complexity introduced would be unjustifiable. Besides, I have existing infrastructure I can take advantage of that meets the same requirements with a much simpler implementation.
Hey! Thanks for all the feedback....