Steve Callan offers an overview of Oracle Streams and GoldenGate, comparing topology similarities and differences.
Replication in any database system, not just Oracle,
provides functionality that can fill many needs. One of the main benefits of replicating
data in Oracle is reducing (and practically eliminating) the load on a
production server. It’s not just DML and transactions taking place on a server.
The flip side of inserting, updating and deleting all that data is reporting on
it. With replication, you can offload reporting to another server. Remote sites
may not need to be exact replicas of the source system either, so replicating
part of the source may be more efficient than duplicating it via a full-scale
replication or use of a standby system.
Let’s start with a one over the world view of Streams and
Data Guard. These two Oracle features are closely related.
Replication is not entirely separate from protecting data
via a standby database – a logical standby, that is. Streams, for example, use
some of the same architecture used to establish a logical standby. In Data
Guard, you can apply SQL statements, or you can apply redo logs. With Streams,
you can use pieces of Data Guard SQL Apply. SQL Apply equates to logical
standby, the other being a physical standby, which is based on Redo Apply. To
some degree, a limitation in each type of standby is an advantage in
replication. In a physical standby, the database is closed for queries (with
the exception of Active Data Guard, where the physical standby is read only).
In a logical standby, the target database is open for read-write, but (and it’s
a big one) not all data types are supported.
This is what Streams offers: the ability for you to make
modifications on the remote or downstream database. Given that the downstream
database is used for reporting purposes, being able to add indexes and
materialized views, plus filter or transform data along the way (between the
source and the target), your being able to do things that make queries run faster
is of obvious value. What Streams captures is granulated, that is, it can
capture changes made to tables, schemas, or the entire database. Coming back to
that idea of filtering or transforming data along the way: think of ETL and
data warehousing too.
Overall, Streams can be used for the following needs:
- Data Replication
- Date Warehouse Loading
- Database Availability (upgrade and maintenance)
- Message Queueing
- Event Management and Notification
- Data Protection
Topologies
Without getting into too much detail about Streams (and
believe me, there is more than enough to make your head hurt), the key
processes are Capture, Propagate, and Apply. Grab the data of interest,
transport it to the remote server, and then apply it aptly describes CPA.
With CPA in mind, there are quite a few ways Streams can be
architected. The simplest version is to replicate on one server. Only capture
and apply matter, as there is no need to propagate over a network. However, the
most common version, conceptually speaking, is shown below in Figure 1.
Figure 1 – Unidirectional system
A slightly extended version of the unidirectional system is the,
well, extended unidirectional version (cascaded or directed, to use other
terms). Figure 2 depicts this arrangement. The forwarding database can live on
the source’s server, the target’s server, or an entirely separate server.
Figure 2 – Cascaded/directed system
Then, in no time at all, the topology gets much more
complicated, and the terms N-way, Master-to-Master, Hub-and-spoke, Bi-directional
and Hybrid, to name a few, come into play. And oh yeah, so far we’re only
talking about Oracle to Oracle configurations. Streams supports heterogeneous
configurations as well. Figure 3 is what SQL Server’s Publisher/Subscriber
replication model is like (to use a simple example).
Figure 3 – Hub-and-spoke system
Figure 4 illustrates how multiple sources can feed into one
target. The target could be a data warehouse, being fed by multiple data
sources.
Figure 4 – Unidirectional spokes-to-hub system
Finally, an analogy for Figure 5 can be found in Active
Directory, where every domain controller (DC) is a primary domain controller
(in the past, there would be a primary DC and secondary DCs). In the Streams
case, every database can be a database of record.
Figure 5 – N-way/Master-to-master system
How does GoldenGate compare to Streams in terms of supported
topologies? Figure 6 (from the Administrator’s Guide) shows that GoldenGate is
just as robust as Streams in terms of your source and target options. Some of
the names or terms are different, but function-wise, they are the same.
Figure 6 – GoldenGate topologies
(from Oracle® GoldenGate Windows and UNIX Administrator's Guide 11g Release 1 (11.1.1)
So one question about GoldenGate is this: how is data replicated?
We know that data goes from source to target, but how is the data being
applied? Is it more like SQL Apply (logical) or SQL Redo (physical)? If not
physical, then can we assume that GoldenGate is also constrained by data types?
GoldenGate operates much like what takes place in SQL Apply,
so GoldenGate target databases are akin to logical standby databases (assuming
you’re doing database level replication; otherwise, tables could be of
interest). And yes, there are data type limitations, but they’re not that bad
or extensive.
|
GoldenGate Non-supported Datatypes
|
- ORDDICOM
- ANYDATA
- ANYDATASET
- ANYTYPE
- BFILE
- MLSLABEL
- TIMEZONE_ABBR
- TIMEZONE_REGION
- URITYPE
- UROWID
|
The main difference, aside from the extra licensing costs associated
with GoldenGate, is the ease of setup, installation and configuration.
GoldenGate is geared more towards a heterogeneous environment, so it doesn’t
depend on Oracle-specific items. Well, it does in one sense in terms of
applying your own SQL statements in a call to SQLEXEC(). However, as far as
relying on being able to access online redo logs, standby redo logs, or
archived redo logs, no big deal. GoldenGate uses its own proprietary trail
files (one set on the source and one on the target). It doesn’t care about
Oracle Net Services either when it comes time to transport data from the source
to the target.
GoldenGate is not without other limitations. There are
situations where things go bump in the middle of the night, but what it does
have is an easier interface (100% command line if you prefer) in which to
explore what took place. There is no need to decipher scores of
V$STREAMS_WHATEVER data dictionary views.
Another way to compare Streams and GoldenGate is to think of
coupling. Obviously, Streams would be considered to be tightly coupled with
Oracle (duh, where else it is used but within Oracle?). GoldenGate, on the
other hand, would be considered as being loosely coupled. Given that GoldenGate
is Oracle Corporation’s strategic solution for at least replication and data
integration, you can be sure that whatever is loose today won’t be nearly as
much in the future.
In the next article on GoldenGate, we’ll take a look at
setting up a simple unidirectional replicated system.
Related Articles
Oracle GoldenGate Fundamentals
Oracle Buys GoldenGate for Data Integration
Oracle Aims for Real-Time Business Intelligence
Back to DBAsupport.com