Oracle CDC

From HaFrWiki
Jump to: navigation, search

Introduction

Oracle Change Data Capture (CDC) [1] was introduced with Oracle 9i, and provide:

  • ability to track changes to tables
  • store them in a change table, for further consumption by an ETL process.

Oracle 9i Change Data Capture worked by creating triggers on the source tables, transferring data synchronously but creating a processing overhead and requiring access to the structure of the source tables. Because of the effect that the triggers had on the underlying tables, many warehouse projects did without change data capture and used other methods to capture changes.

Oracle 10g introduces Asynchronous Change Data Capture (ACDC), which instead of using triggers uses the database log files to capture changes and apply them to collection tables. Asynchronous Change Data Capture therefore doesn't require changes to the table structure and doesn't impact on database performance.


Oracle Streams

Oracle9i Database introduced the Change Data Capture (CDC) feature. Oracle Streams Change Data Capture captures all the inserts, updates, and deletes made to user tables (CUD). These changes are stored in a database object called a change table, and the change data is made available to applications in a controlled way through what is called a subscriber view.

Prior to Oracle 10g, the change data was synchronously fed from the source. In Oracle Database 10g, it becomes possible to asynchronously feed the change data, and there are multiple options for how this feed is obtained and managed. Asynchronous data capture reads the changed data from Oracle redo log files. As a result of this, dependence on the transaction activity is avoided and performance overhead is reduced. Oracle Streams Methodology is used to capture change data from redo log files.

Asynchronous change data capture requires a streams configuration for each CDC change set. These configurations include the streams capture and apply processes that read change data from redo log files and inserts it into change data capture change tables.

Asynchronous change data capture requires both the source and staging databases to be Oracle Database 10g.

ACDC

Asynchronous Change Data Capture (ACDC) is now adopted, in which change data is extracted from the redo logs without any negative performance implication on the source database. Furthermore, asynchronous change data capture can be described as a lightweight technology targeted towards change extraction and propagation in a data warehousing system and in which changes to the source tables are viewed as relational data for onward consumption by subscribers.

When a SQL statement that performs a DML operation is committed, change data is captured by taking it from the redo log files. Using this technique, the change data is not captured as part of the modifying transaction, and therefore has no dilatory effect on that transaction. Asynchronous Change Data Capture is available with Oracle Database 10g Enterprise Edition only.

Asynchronous Change Data Capture is based on, and provides a relational interface to, Oracle Streams. The following are the two methods of capturing change data asynchronously:

  • HotLog Mode
  • AutoLog Mode

HotLog Mode

In the HotLog mode, change data is captured from the current active redo log file on the source database. The change tables are populated automatically as new committed transactions arrive. No data is recorded until the transaction commits. There will be a minor delay between the commit of the source table transactions and the arrival of change data to the change table.

In the HotLog mode, there is a single, predefined HotLog change source, identified as the hotlog_source, that represents the location of the current redo log files of the source database. There is only one source for the HotLog change.

Nota Bene: Once defined, this cannot be altered or dropped without a database restart.

AutoLog Mode

In the AutoLog mode, change data is captured from a set of redo log files managed by the log transport services of Data Guard. The publisher of the change data (source database) must configure the Data Guard log transport services to copy the filled redo log files from the source database system to the staging database system, and then to automatically register the redo log files. Change sets are then populated automatically as new redo log files arrive. The amount of delay depends on the frequency of redo log switches in the source database.

There is no requirement for a predefined AutoLog change source. The publisher form the source database provides information about the source database to create the AutoLog change source.

More details on Oracle Streams Asynchronous Change Data Capture can be found in the book Oracle Streams [2], which also details improvements to the transportable tablespaces feature in Oracle Database 10g.

Examples

Streams Data Change Capture Example

BEGIN
 iscn := DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER();
 dbms_output.put_line ('Instantiation SCN : ' || iscn ) ;
DBMS_APPLY_ADM.SET_SCHEMA_INSTANTIATION_SCN@DNYOIP20.world ( 
    source_schema_name    => 'ny2',
    source_database_name  => 'DNYTST10.world',
    instantiation_scn     => iscn ,
    recursive => true
    );
END;
/

Start both the Streams Change Apply and Capture processes at the destination and source databases, respectively:

connect STRMADM/STRMADM@dnyoip20

PROMPT Connected to Destination (DNYOIP20)
PROMPT Starting the Apply Process at Destination
begin
dbms_apply_adm.start_apply (apply_name => 'LN2_APPLY ' ) ;
end ;

See also

top

References

top

  1. Inside Oracle Streams Configuration Change Data Capture, Article by Burleson Consulting, May 18, 2008.
  2. Oracle Streams, Books 2004 2 Streams.