-
How to achieve 99.99% uptime?
We need to design a web/application/database server architecture that can achieve so-called 99.99% uptime.
With redundant web/application servers, it was proposed to use clustered hardware such as an HP ProLiant DL380 running Windows 2000 Advanced Server with Oracle Enterprise 8.1.7 to ensure the uptime. However, my understanding of a hardware cluster is that it has two identical "systems" connected to the same storage. The problem is that if, say, an Oracle datafile is corrupted, the DB on the whole cluster is temporarily unavailable. Correct?
As an alternative, I could have two identical web/application servers with two identical DB servers. Each application server connects to its own DB server, and two-way replication runs between the two DB servers. In this scenario, unless the replication is almost real time, I can't guarantee that user A (via the first application/DB server pair) will see what user B (via the second pair) just modified in the DB. Correct?
Finally, I could have two identical web/application servers connect to the same DB server A, which does one-way replication to DB server B (initially no application server connects to it). When DB server A fails, I switch the application servers to point to DB server B. This method does not really guarantee 99.99% uptime, though.
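For what it's worth, the manual "switch the application servers" step in this last option can be pushed into the client connect string with Oracle Net's connect-time failover. A minimal sketch of a tnsnames.ora entry, assuming hypothetical hostnames dbserva and dbservb and a hypothetical service name proddb:

```
PRODDB =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (FAILOVER = ON)
      (LOAD_BALANCE = OFF)
      (ADDRESS = (PROTOCOL = TCP)(HOST = dbserva)(PORT = 1521))
      (ADDRESS = (PROTOCOL = TCP)(HOST = dbservb)(PORT = 1521))
    )
    (CONNECT_DATA = (SERVICE_NAME = proddb))
  )
```

Note that connect-time failover only helps new connections: sessions already open against server A still break when it fails, and server B must actually be current enough (via replication) to serve the data.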
Does anyone have an even better idea on how to achieve this 99.99% uptime?
-
Re: How to achieve 99.99% uptime?
Originally posted by a128
The problem is if say an Oracle datafile is corrupted, the DB on this whole cluster is temporarily unavailable. Correct?
Define "corrupted". Do you mean by a hardware failure? Do you mean by the Oracle software?
Jeff Hunter
-
And do you really mean 99.99% uptime? There are 525,600 minutes in a year, so you're allowed to be down almost 53 minutes for the entire year.
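The arithmetic behind that figure, as a quick sketch:

```python
# Allowed downtime per year at a given availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes (non-leap year)

def allowed_downtime_minutes(availability: float) -> float:
    """Minutes of downtime per year permitted at this availability."""
    return round(MINUTES_PER_YEAR * (1.0 - availability), 2)

print(allowed_downtime_minutes(0.9999))  # 99.99% -> 52.56 minutes/year
print(allowed_downtime_minutes(0.999))   # 99.9%  -> 525.6 minutes/year
```

That 52-minute budget has to cover planned maintenance as well as failures, which is why each extra "nine" gets so much more expensive.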
-
I meant corruption by the Oracle software, or any possibility that the file is no longer accessible (not due to a disk failing completely), such as the sectors where the datafile resides being corrupted.
Assuming there is some RAID protection like RAID 0+1 in place, the chance of having more than two disks down at the same time is very small. This makes me wonder: what happens if one of the mirrored volumes has corrupted sectors? Could Oracle read from the volume with the bad sectors? Or can the volume manager detect the bad sectors right away and prevent Oracle from accessing them?
Thanks,
-
Originally posted by a128
I meant corruption by the Oracle software, or any possibility that the file is no longer accessible (not due to a disk failing completely), such as the sectors where the datafile resides being corrupted.
In that case your "corruption" would be replicated to the other system as well.
Assuming there is some RAID protection like RAID 0+1 in place, the chance of having more than two disks down at the same time is very small. This makes me wonder: what happens if one of the mirrored volumes has corrupted sectors? Could Oracle read from the volume with the bad sectors? Or can the volume manager detect the bad sectors right away and prevent Oracle from accessing them?
Yes, your volume manager should take care of this. Oracle doesn't even have to know there has been a failure.
What I am trying to get at is that the risk of corruption by the Oracle software is very low. If you want to plan for every possible scenario, you are talking about a pretty hefty price tag. For a high degree of availability (probably more than most will need), go with a two- or three-node cluster. If you need a greater degree of uptime, go with two two-node clusters and use log-based replication (something like Quest SharePlex) to keep the two in sync.
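The trade-off described here follows the usual reliability arithmetic: redundant components multiply their failure probabilities, while components in series multiply their availabilities. A sketch with illustrative numbers (the 99.5% per-node figure is an assumption for the example, not a measurement):

```python
def parallel(*avail: float) -> float:
    """Availability of redundant components: all must fail for an outage."""
    p_all_down = 1.0
    for a in avail:
        p_all_down *= (1.0 - a)
    return 1.0 - p_all_down

def serial(*avail: float) -> float:
    """Availability of a chain: any one component down takes the service down."""
    total = 1.0
    for a in avail:
        total *= a
    return total

# Hypothetical: each node is 99.5% available on its own.
node = 0.995
two_node_cluster = parallel(node, node)  # 1 - 0.005**2 = 0.999975
# Web tier and DB tier in series, each tier a two-node cluster:
stack = serial(two_node_cluster, two_node_cluster)
print(f"{stack:.6f}")  # roughly 0.99995, i.e. about four nines
```

This also shows why a single shared component (storage, network, the replication link itself) caps the whole stack: it sits in series with everything else.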
Jeff Hunter