Introduction:
Today's Database managers are more challenged than ever to
find the appropriate tools and technologies for their mission-critical
databases. When these databases are very large it requires a tremendous amount
of effort to create a High Availability architecture that provides a
continuously available solution that is robust and that works even during upgrades
and maintenance with zero downtime. One of the
challenges in designing a highly available Database infrastructure is examining
and addressing all of the possible causes of downtime. Many of the world's critical
databases that support mind-boggling transactions, whether it is EBay or Amazon,
employ replicated database features, such that the failure of any component of
a single database, including DISK, CPU, RAM or NETWORK, will not have a
catastrophic failure on the entire system. In addition to replication, systems
are developed to allow in-flight transactions to be automatically reconnected
to another fully replicated database management system, which is most often
located at an entirely different geographical location than the primary
database. However, replication involves much more than just setup, as marriage
involves much more than a honeymoon to survive, and besides designing the
applications for in-flight transactions requires meticulous planning. This is
where third party software will come into play. In this article, let us examine
available replication options that are viable for your critical database.
Oracle's Replication:
Replication is the
process of creating and maintaining a replicated version of database objects. Replication
also improves performance by accessing a local database, rather than a remote
server, to reduce the network traffic. Replication improves high availability
of an application to function if parts of the distributed databases are down, since
replicas of data might be accessible. Oracle server supports two forms of
replication: Basic and Advanced replication. Basic replication is implemented using the standard
“create materialized view” statement, which replicates only data; copies of
this replicated data are read only. Oracle's advanced
replication supports various configurations of updatable snapshot and update anywhere
replication, but it is more difficult to configure. Advanced replication allows
database objects like indexes and procedures, triggers, views and synonyms to
be replicated; however, objects like sequences and data types like LOBS can't
be replicated. In an asynchronous replicated environment, local updates are
stored in a deferred transaction queue until the remote site becomes available
and the transactions are applied when the remote site comes back up. Oracle
also replicates the replication catalog or metadata that determines which
objects are being replicated and how often their changes are propagated. This
will ensure that there is no single point of failure in a replicated
environment. If two sites change the same data within the data replication interval,
you will have an update conflict. Oracle resolves the conflict by using the
conflict resolution method configured when you first replicated that object. In
addition, a DBA can resolve the conflict using DBMS_REPCAT package with
procedures like ADD_UPDATE_RESOLUTION.
Oracle's Multi-master replication allows
multiple sites, acting as peers, to maintain replicated master copies of
objects. Each replicated site maintains a complete updateable copy of the
replicated object. A DBA can configure participating sites as either
Synchronous, or Non-Synchronous. These benefits do not come without a price. Configuring
multimaster replication is a complex process. Its complexity is compounded by
the need to implement conflict resolution processes, especially in the case of
asynchronous propagation of changes. Larger installations could require a DBA to
spend hundreds of hours configuring replication and may require a dedicated DBA
to manage the environment.
Though the concept of Oracle replication is a good fit for
certain applications (for example, to replicate a limited number of tables that
are small in size), allowing many applications to depend on replication could
be costly, due to less efficiency when large tables are replicated. Oracle's replication is
relatively simple to setup, avoids 3rd party vendor complexities and
it is free. The main disadvantage is that it does not support all data types, such
as LOBs. Its complete refresh configuration for large tables (multi-million
rows) is unacceptable. When
configured for Fast Refresh, there is an increased risk of replication failure
with large refresh volumes.
Therefore, in conclusion, Oracle's Replication may be a viable option only if used in a limited way, for a relatively smaller number of transactions. For Very Large Databases (VLDB) that require millions of records to be replicated in order to maintain a High Availability (HA) architecture, this may not be the ideal configuration. Let us discuss in the next phase if there any other ways to for a large number of tables to be replicated.
Oracle's Streams Replication:
Oracle 9i introduced a more flexible
and efficient way of implementing replication using Streams. Contrary to what happens
with Advanced Replication, there is no a Master Definition Site and all the
sites play the same role. Every database involved in the Streams replication is
in charge of capturing those changes involved in the replication and
propagating them to the other sites. Oracle's
Streams is a flexible, unified solution for information sharing across databases.
Oracle Streams has three basic elements; capture, staging and consumption. These
elements can be configured in different ways, depending on business
requirements, so that replication of data takes place from one database to one
or more databases even if the databases have different structure. One flexibility
of streams is that it can handle database variations in hardware platforms or
database releases or even charactesets easily.
In a nutshell, replication using Streams is implemented in the following way.
- A background capture process is configured to capture changes made to tables, schemas, or the entire database. The capture process captures Changes from the redo log and formats each captured change into a logical Change record (LCR).
- The capture process uses logminer to mine the redo/archive logs to Format LCRs.
- The capture process enqueues LCR events into a queue that is specified. This queue is scheduled to propagate events from one queue to another in a target database. A background apply process dequeues the events and applies them at the destination database.
Another advantage of using streams for replication is that streams provide the ability to transform the stream. Oracle provides some easy to use declarative transformations, for the most common transformations such as changing the name of table at the destination. Besides, the data at the destination site can be subsetted based on the content. For example, you could implement a rule that only changes to the employee table, based on the department identifier column, be applied to particular table. Oracle Streams automatically manages these changes. However, main disadvantage of Streams is that it does not support all data types. If table has any single column with LONG or BLOB or ROWID datatypes, then Streams can't capture changes from that table. As a workaround, you can try creating a shadow table without the unsupported column at the source and replicate it. The DBA can populate it via triggers on the original table. Then set up Streams Replication between the shadow table and the destination table, possibly using rule-based transformations or DML handlers to resolve naming issues. Restrictions on Streams do not end there. There are many commands that Streams is not able to capture. Commands such as “ALTER DATABASE”, “ALTER SYSTEM”, “SET ROLE”, “SET TRANSACTION”,” SET CONSTRAINT” and some other commands that make changes to index-organized tables, changes involving user-defined types, are not captured. Oracle plans to extend the list of data types supported by Streams with each new release. Having discussed these limits in implementing Streams for Replication, it is worth discussing other third party replication products.
Veritas Database Replication:
With its focus on High Availability,
Veritas Software makes storage management products that improve the
availability of Oracle databases. One important approach Veritas employs for creating
high availability systems is File System and volume-level replication. Veritas
Software's replication products are VERITAS Storage Replicator for Volume
Manager (SRVM) and the VERITAS Database Replication
Option for Oracle (DRO). Although there are some functional differences
between these products, they differ in both design and intent, and as such are
suited for different replication applications. Let us discuss them briefly here,
as implementation and troubleshooting of these products are not in the scope of
this article. Please refer to Veritas knowledge base for installation.
The Database Replication Option for Oracle (DRO): It is used to replicate production
data periodically to a secondary system at file system level. This replicated Database
system can then be used for off-host processing, such as backups, data mining,
reporting and decision support database systems. The Database Replication
Option requires Veritas Database Edition for Oracle software packages to be
installed on the primary and target (replicated) database servers. After an
initial, full replication, DRO uses VERITAS' unique Block Level Incremental
(BLI) technology to track and replicate only changed file system blocks between
the primary and the secondary system. These replication operations occur at
periodic intervals designated by the System Administrator On the secondary
(replicated) system, database administrators can use Storage Checkpoint and
Storage Rollback operations to make changes to the replicated data and then
“undo” the changes. This means that you cannot only read the replicated data,
you can write to it. This is very helpful for testing any production changes
before we rollout them into the primary database server. Resynching the data is
quick and does not downgrade the production Database server's performance. DRO
supports occasional replication - it may be as frequent as several times a day.
Another important aspect of this product is the DBA can update the replicated
data (for testing or training purposes) then roll back to the previous image.
The Storage Replicator for Volume Manager (SRVM): This Veritas product is a
real-time, ongoing data replication solution at Logical Volume level. This
means that replication is completely transparent to the applications involved. Storage
Replicator for Volume Manager is implemented as a Veritas Volume Manager (VxVM)
module. Whenever data is written to the replicated volume, SRVM automatically
sends it to one or more sites (depending on the configuration). SRVM works over any IP network, LAN or WAN, supporting
up to 32 nodes in a replicated network. One of the advantages of SRVM is that the integrity of data on
all systems is maintained through a variety of techniques, including write
order fidelity and In-Band Control.
It is an ideal solution for applications requiring
up-to-the-minute data replication.
SharePlex Replication:
Quest software has been
providing an Oracle Data replication product since 1997 and it is called
SharePlex. Many enterprises deploy this product. SharePlex uses the transaction
details in the redo log files for the data replication. It does not use the
database engine to replicate, there by reducing the load on the primary
database server. As soon as the transactions are written to redo logs, the SharePlex
module extracts them. Thus, it can update the changes at the destination site
very quickly. Therefore, it is a real-time solution that replicates
transactions as they occur without waiting for the commit. SharePlex software provides
high-speed, log-based replication between Oracle instances. As stated earlier, SharePlex updates
the target database in real-time, providing a reliable copy of the production
instance that can be used for reporting, queries, extracts, backups, and high
availability. If a transaction on the primary
database is cancelled, Share-Plex replicates the rollback so that the target
database is an accurate representation of the source database.
As changes are made to primary database, SharePlex
continuously replicates them to an Oracle
Instance on the target (replicated) system. When target data
is identical to source data, taking into account varying differences caused by
delays in transport over a network, the two sets of data are synchronized. A
source system can have many target systems. In addition, a source system can
simultaneously serve as a target system, receiving data from other source systems.
Likewise, a target system can receive data from many source systems, and it can
simultaneously serve as a source system, sending data to other target systems.
The advantage of SharePlex for Oracle is its tolerance for outages.
If the target Oracle database is unavailable, SharePlex queues data on the
target system, allowing transactions to accumulate until an Oracle connection can
be re-established. If the target system itself is unavailable, or if there are
network problems, SharePlex stores the transactions on the source system until
operations are restored. The DBA can control when the
transactions are applied to a target system or the DBA can stop the post process.
When ready for the transactions to be applied, the DBA can start the posting
process again.
SharePlex enables a DBA to customize replication to specific
needs. For example, the DBA can replicate an entire table, or a subset of its
data (columns) beyond a firewall while protecting other, more sensitive data. The
DBA can replicate different records to different locations and can configure
SharePlex to interact with PL/SQL procedures that transform data before, or
instead of, posting it to a target database.
How SharePlex works
SharePlex replicates data using a set of replication processes;
Capture, Read, Export, Import and post. These processes start
automatically, as they are needed, but they can be controlled with commands as
well. Let us discuss these processes to understand how SharePlex replication actually
works.
Capture Process: The Capture process reads the redo logs or archive logs on
the source systems for changes to objects. Capture writes the data to the
capture queue where it accumulates until the next
SharePlex process is ready for it. There is a separate Capture process for each
data source being replicated.
Read Process: The
Read process operates on the source system to read data from the capture queue
and add routing information to the data; after processing the data it sends it
to the next queue. There is a separate Read process for each data source.
Export Process: The Export
process operates on the source system to read data from the export queue and
send it across the network to the target system. By default, a source system
has one Export process for each of its target systems. If there are two target
systems, there are two Export processes.
Import Process: The Import process is the second half of the Export/Import transport
pair, operating on the target system to receive data and build a post queue. If
there are two source systems replicating to a target system, there are two Import
processes
Post Process:
The Post process
operates on the target system to read the
post queue, construct SQL statements for replicated
operations, and apply them to
target objects. There is a Post process for each post queue
on a target system, and
multiple Post processes can operate simultaneously on a
system.
SharePlex determines which transactions/objects to replicate
based on the information provided in the configuration file. SharePlex creates
one or more messages that are sent from Capture to the Read process and from
Read to all of the other replication processes in sequence. A message can
reflect a SQL operation like INSERT, UPDATE, DELETE, and COMMIT, TRUNCATE. Large
operations, like those on LONG or LOB columns, can require more than one
message because a message has a size limitation. Other operations, such as
array inserts of small records, have the inverse effect: There could be one
record for numerous operations. For example, an array insert of 100,000 rows
might be accommodated with only 1000 messages, depending on the data. On the
target system, the Post process receives the messages and constructs standard SQL
statements to post replicated changes to the target database.
Much of this process is controlled by internal tables that
were created during the installation of SharePlex. The installation process of
SharePlex on the source system and target system takes an hour or two and it is
relatively simple. It is easy to administer and troubleshoot.
Customers looking for high-performance database replication
for Oracle Database systems, requiring support for thousands of rows per second
for load distribution, disaster recovery, migrations or for off-loading
reporting should consider SharePlex for their mission critical database that
support intensive transactions.
In Conclusion:
This article discusses various Replication
technologies. IT managers need to plan how to use
replication over the next several years before choosing a tool or developing
their own tool using database features like streams.
Some factors to consider: How are
replication needs expected to grow? Is your company considering using
replication for situations such as customer service, running reports or
database testing or Disaster Recovery? How often and how fast will you need to
refresh the data in these alternate applications? Do your users agree to long down
times while you upgrade the database server or
migrate to different platform?
Issues to consider when choosing third
party tools:
Are
the tools application-aware? If the tool can integrate with different Enterprise applications like SAP
or Remedy, companies can save time and money by using it to manage replication
across multiple applications.
Will
the tools work cross-platform? This helps give DBAs more control of their
environment and enables them to manage replication through policies and
procedures implemented and automated companywide.
Are
you looking for ease of use? Some tools like SharePlex allow the DBA to
offload replication tasks during after hours to the people who are responsible for
monitoring the servers – operators/ administrators, for example. In this case,
ease of use is extremely valuable.
Replication is more than just backup, and
replication management software is becoming more necessary in a world that's
increasingly driven by information. Companies should examine whether this
technology is the right choice for them.
The wise one once said:
Replicate early and often... Infrastructure crashes can happen at any time for
any reason!
Back to DBAsupport.com