DBAsupport.com Forums - Powered by vBulletin
Thread: What will be Zero Down Time Setup?

  1. #1
    Join Date
    Nov 1999
    Location
    Kuwait
    Posts
    122

    What will be Zero Down Time Setup?

    Hello Friends,

    I came across a setup at a friend's workplace and was wondering whether they are doing it right. They now also want to implement a zero-downtime topology, so I thought I would give it a try and draw up a couple of scenarios for this site. I would definitely need your valuable comments and advice before working out a final plan for him. Here are the details:

    The current setup is as follows:

    2 clustered Red Hat nodes, each with:
    2 x Intel Dual Core 2.67 GHz with 667 MHz bus speed
    4 GB RAM

    Node1, called Oracle1, runs OVPI and OVSD as 2 Oracle 9i instances and databases, with the following file systems mounted: /U02 and /U03. Each Oracle instance has one user: OVPI in one instance and OVSD in the other.

    Node2, called Oracle2, runs OVO (OpenView) and OVNNM as 2 Oracle 10g instances and databases, with the following file systems mounted: /U04 and /U05. Each database has 1 user: openview and ovnnm.

    Shared storage with file systems /u01, /u02, /u03, /u04, /u05:
    /U01 => Oracle 9i and 10g Binaries
    /U02 => Data files for OVPI Database
    /U03 => Data files for OVSD Database
    /U04 => Data files for OVO Database
    /U05 => Data files for OVNNM Database

    At any given time each node runs 2 Oracle databases. If one node goes down, its databases are relocated to the other node, which means a single node will then be running 4 Oracle databases: 2 on 9i and 2 on 10g.

    Now, what are the possibilities for making this setup better and implementing a ZERO downtime topology?

    Node1 Oracle1 (9i) will have only 1 database with 2 users, OVPI and OVSD, and Node2 Oracle2 (10g) will have only 1 database with 2 users, OVO and OVNNM. Use RAC? Or Data Guard? What will be the failover effect, and what could be done to have no single point of failure?

    I can think of 3 scenarios:

    a) Node1 Oracle1 (9i) will have only 1 database with 2 users, OVPI and OVSD, and Node2 Oracle2 (10g) will have only 1 database with 2 users, OVO and OVNNM. Use RAC for load balancing and availability; if one node goes down, the database moves to the available node.

    b) Have Node1 run the 10g and 9i databases, and have Node2 host the replica databases, which will be kept up to date through Data Guard.

    c) What else is possible here?


    I really appreciate your time and ideas.


    Sincerely,
    NK
    ====================================================
    Stand up for your principles even if you stand alone!
    ====================================================

  2. #2
    Join Date
    Jun 2006
    Posts
    259
    You can minimize your downtime, but not completely eliminate it.

    1. Locate the servers in disparate geographic locations. At a minimum the two systems should be in different data centers, and even better if they are located a good distance apart.
    Reasoning: avoid system downtime due to fire, water, power failure, natural disaster or human error.

    2. If you agree that item 1 is reasonable, then RAC is eliminated as an option. That leaves you with DG, Streams, Advanced Replication or another vendor's replication solution, such as Ixion's, GoldenGate, or Quest.

    Each solution comes with its own set of advantages, disadvantages and costs.

  3. #3
    Join Date
    Nov 1999
    Location
    Kuwait
    Posts
    122
    I'm not mainly concerned about a DR site right now; I just wanted to brainstorm what good options we have to get somewhat closer to MAA.

  4. #4
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    There's no such thing as zero downtime.

    To limit your downtime, you have to limit your single points of failure. The closest thing to zero downtime is redundant nodes using RAC. In your case you could have one instance or two and separate them out to their primary use. However, then you're not really taking advantage of the scalability of RAC, you just have a passive cluster. However, RAC, by definition, needs shared storage. Your single points of failure here are storage subsystem, environmental, and maybe network.

    Or, you could set up a DG in maximum protection mode where each node participates in the transaction. Each node would have a dedicated storage subsystem. This is still a passive setup, as only one node can actually do the work and failover is probably not automatic. Your single points of failure here are environmental and maybe network. You can limit your environmental exposure by locating the standby node in another geographic location, but then you are more dependent on the network.
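
    To make the "single point of failure" argument concrete, here is a rough sketch that compares the two layouts described so far: two nodes sharing one storage array (the RAC case) versus two nodes that each have dedicated storage (the Data Guard case). The availability figures and the helper names `series` and `parallel` are hypothetical, and the model assumes independent failures and instant failover, which real clusters do not give you. It also ignores the network and environmental risks mentioned above.

```python
# Rough availability model. All figures are made-up assumptions,
# not measurements from the setup discussed in this thread.

def series(*availabilities):
    """All components must be up (a chain of single points of failure)."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def parallel(*availabilities):
    """At least one redundant component must be up."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down

NODE = 0.99      # assumed availability of one server
STORAGE = 0.999  # assumed availability of one storage subsystem

# RAC-style: redundant nodes, but one shared storage subsystem.
rac = series(parallel(NODE, NODE), STORAGE)

# Data Guard-style: each node has its own dedicated storage.
dg = parallel(series(NODE, STORAGE), series(NODE, STORAGE))

print(f"shared storage (RAC-style):   {rac:.7f}")
print(f"dedicated storage (DG-style): {dg:.7f}")
```

    Under these assumptions the shared-storage layout can never be more available than the storage array itself, which is the point being made about RAC. The comparison says nothing about failover time, which is where a passive standby loses ground.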

    Or, you could have a Multi-master replication setup where updates happen on both nodes in different geographic locations. Your single point of failure in this case may be the network, or it may be nothing. However, multi-master replication is a complex topic not suited for the everyday DBA.

    Or, you could use a combination of the above to achieve your goals. For example, you have two RAC Clusters in two geographic locations. Each RAC Cluster participates in multi-master replication. Each RAC has two standby dbs; one local and one in yet another geographic location.

    The question is: are you happy with 99% uptime or 99.9999% uptime, and how much do you want to spend to get it?
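
    For a sense of scale, the two uptime figures translate into very different downtime budgets. A minimal sketch of the arithmetic, assuming a 365-day year (the function name is only illustrative):

```python
# Convert an uptime percentage into the downtime it allows per year.
# Assumes a 365-day year; leap years are ignored.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def downtime_seconds_per_year(uptime_percent):
    """Seconds of downtime per year permitted by the given uptime percentage."""
    return SECONDS_PER_YEAR * (1.0 - uptime_percent / 100.0)

for pct in (99.0, 99.9999):
    secs = downtime_seconds_per_year(pct)
    print(f"{pct}% uptime allows about {secs:,.1f} s/year ({secs / 3600:.2f} hours)")
```

    So 99% leaves room for roughly 87.6 hours of outage a year, while 99.9999% leaves about half a minute.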
    Jeff Hunter

  5. #5
    Join Date
    Mar 2007
    Location
    Ft. Lauderdale, FL
    Posts
    3,555
    Excellent post Jeff, you keep raising the bar.

    Given the business requirements and the money to pay for it, I would go with your last option: local RAC replicating to a remote Disaster Avoidance location.
    Pablo (Paul) Berzukov

    Author of Understanding Database Administration, available at Amazon and other bookstores.

    Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment, the advisor should not be held liable or responsible for any actions taken based on the given advice.

  6. #6
    Join Date
    Nov 1999
    Location
    Kuwait
    Posts
    122
    Quote Originally Posted by marist89
    There's no such thing as zero downtime.

    I agree, but there are of course combinations, each with its own cost, by which we can eliminate every single point of failure, right?

    If we remove "storage subsystem, environmental, and maybe network", then what can we enhance in the current environment? If we have Node1 and Node2 each running 2 DBs, should we put them on RAC? Or have Node1 run all 4 DBs and have Node2 replicated by Data Guard? Do you think this will help?

    I was thinking of merging these existing databases into one. I guess there is a script shipped with Oracle that can be used to check compatibility, right?

    But does replication at the storage level mean an OS thing or something from Oracle? Also, can the hot backup standby technique be achieved with Data Guard? True?

    It's an open-ended discussion, and I really wanted input from friends around the globe on this kind of topic.

  7. #7
    Join Date
    Nov 2000
    Location
    greenwich.ct.us
    Posts
    9,092
    Quote Originally Posted by nabeel
    If we remove then what is there we can enhance in current environment? If we have Node1 and Node2 each running 2 DB's then have them on RAC? or have Node1 running all 4 DB's and have Node2 replicated by Data Guard? Do you think this will help?
    First off, if you're using RAC, you have a single point of failure - the storage subsystem. Many of today's high-end storage subsystems have redundant components, but if that storage array goes away because of a botched firmware upgrade or something else, you can kiss your db goodbye.

    In your scenario, you've got one active node (node1) and two inactive nodes (node2 and standby-node1). Seems like a waste to me. Why not have node1 and node2 active all the time and have standby-node1 as your standby database in a different location?

    I was thinking of merging these existing databases into one
    Sounds like a smart plan to me if you're going to use RAC.

    But does replication at the storage level mean an OS thing or something from Oracle?
    OS replication is where your storage subsystem copies changed blocks to another server. Not sure what the point is since the technology comes bundled with Oracle.

    Also, can the hot backup standby technique be achieved with Data Guard? True?
    All I can tell you to do is read. They're two separate things.
    Jeff Hunter

  8. #8
    Join Date
    Jun 2006
    Posts
    259
    Jeff, I mostly agree. Great first post by the way.

    Quote Originally Posted by marist89
    First off, if you're using RAC, you have a single point of failure - the storage subsystem. Many of today's high-end storage subsystems have redundant components, but if that storage array goes away because of a botched firmware upgrade or something else, you can kiss your db goodbye.
    I totally agree with this! In fact, I've had shared storage subsystem failures: some took several hours to restart, and others were so horribly botched that some DB files had to undergo recovery.

    Quote Originally Posted by marist89
    Sounds like a smart plan to me if you're going to use RAC.
    Just be careful and test. Although Oracle touts RAC as a scalable solution, there are pitfalls to RAC as well. One thing that comes to mind is global locking.

    Also, what happens in the RAC configuration when you hit a bug? Typically both sides of the cluster must be restarted, especially if the bug is in the lock management layer... So much for zero downtime with RAC.
