DBAsupport.com Forums - Powered by vBulletin

Thread: How to achieve 99.99% uptime?

  1. #1
    Join Date
    Apr 2001
    Posts
    257

    How to achieve 99.99% uptime?

    We need to design a web/application/DB server architecture that can achieve so-called 99.99% uptime.

    With redundant web/application servers, it was proposed to use clustered hardware such as HP ProLiant DL380 servers running Windows 2000 Advanced Server with Oracle Enterprise 8.1.7 to ensure uptime. However, my understanding of a hardware cluster is that it has two identical "systems" connected to the same storage. The problem is that if, say, an Oracle datafile is corrupted, the DB on this whole cluster is temporarily unavailable. Correct?

    As an alternative, I could have two identical web/application servers with two identical DB servers. Each application server connects to one DB server, and two-way replication runs between the two DB servers. In this scenario, unless the replication is almost real time, I can't guarantee that user A (via the first application/DB server pair) will see what user B (via the second application/DB server pair) just modified in the DB. Correct?

    Finally, I could have two identical web/application servers connect to the same DB server A, which does one-way replication to DB server B (initially no application server connects to it). When DB server A fails, I switch the application servers to point to DB server B. This method does not really guarantee 99.99% uptime, though.

    Does anyone have an even better idea on how to achieve this 99.99% uptime?

  2. #2
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092

    Re: How to achieve 99.99% uptime?

    Originally posted by a128
    The problem is that if, say, an Oracle datafile is corrupted, the DB on this whole cluster is temporarily unavailable. Correct?
    Define "corrupted". Do you mean by a hardware failure? Do you mean by the Oracle software?
    Jeff Hunter

  3. #3
    Join Date
    May 2002
    Posts
    2,645
    And do you really mean 99.99% uptime? There are 525,600 minutes in a year, so you're allowed to be down for almost 53 minutes (52.56) in the entire year.
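
    The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `allowed_downtime_minutes` is made up for this example, and it assumes a 365-day year and that every minute of the year counts toward the target (no excluded maintenance windows).

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime percentage."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare a few common availability targets.
    for pct in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(pct)
        print(f"{pct}% uptime -> {minutes:.2f} minutes of downtime per year")
```

    For 99.99% this works out to 52.56 minutes per year, so a single manual failover or restore that takes an hour would already use up the whole budget.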

  4. #4
    Join Date
    Apr 2001
    Posts
    257
    I meant corruption by the Oracle software, or any situation where the file is no longer accessible (not because a disk has failed completely), such as corruption of the sectors where the datafile resides.

    Assume some RAID protection such as RAID 0+1 is in place, so the chance of having more than 2 disks down at the same time is very small. This makes me wonder: what happens if one of the mirrored volumes has corrupted sectors? Could Oracle read from the volume with the bad sectors? Or can the volume manager detect the bad sectors right away and thus prevent Oracle from accessing it?

    Thanks,

  5. #5
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    Originally posted by a128
    I meant corruption by the Oracle software, or any situation where the file is no longer accessible (not because a disk has failed completely), such as corruption of the sectors where the datafile resides.

    In that case your "corruption" would be duplicated on another system as well.


    Assume some RAID protection such as RAID 0+1 is in place, so the chance of having more than 2 disks down at the same time is very small. This makes me wonder: what happens if one of the mirrored volumes has corrupted sectors? Could Oracle read from the volume with the bad sectors? Or can the volume manager detect the bad sectors right away and thus prevent Oracle from accessing it?
    Yes, your volume manager should take care of this. Oracle doesn't even have to know there has been a failure.

    What I am trying to get at is that the risk of corruption by the Oracle software is very low. If you want to plan for every possible scenario, then you are talking about a pretty hefty price tag. For a high degree of availability (probably more than most will need), go with a two- or three-node cluster. If you need a greater degree of uptime, go with two two-node clusters and use log-based replication (something like Quest SharePlex) to keep the two in sync.
    Jeff Hunter
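
    To get a rough feel for why adding nodes or a second cluster helps, here is a back-of-the-envelope sketch. It is not a sizing method from this thread: the function name `combined_availability` and the 99% per-node figure are assumptions for illustration, and the formula 1 - (1 - a)^n only holds if the copies fail independently and any one surviving copy can carry the load.

```python
def combined_availability(single: float, copies: int) -> float:
    """Availability of `copies` redundant units, each available a fraction
    `single` of the time, when one surviving unit is enough.

    Assumes failures are independent, which is optimistic in practice.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be a fraction between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The service is down only when every copy is down at the same time.
    return 1 - (1 - single) ** copies


if __name__ == "__main__":
    per_node = 0.99  # hypothetical availability of one node
    for n in (1, 2, 3):
        print(f"{n} node(s): {combined_availability(per_node, n):.6f}")
```

    The independence assumption is exactly what the points above call into question: nodes in one cluster share the same storage, and corruption that gets replicated shows up on the second system too, so real-world numbers will be lower than this estimate.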
