The Functioning of Oracle Change Data Capture – A Comprehensive Guide

Let’s start with a brief description of Change Data Capture (CDC) in the Oracle database management system.

Change Data Capture (CDC) is a software design pattern widely used to track changes to a database so that organizations can act on those changes. As a data integration process, CDC identifies and captures the changes made to enterprise source databases and delivers them to downstream systems.

Oracle CDC (Change Data Capture) serves several purposes. Not only does it facilitate real-time data integration across enterprises, but it also speeds up data warehousing and improves the performance and availability of databases. Several replication activities can also be carried out with Oracle CDC without affecting database performance. These include migrating databases to the cloud without any downtime and offloading queries from production databases to data warehouses or other analytical platforms.

Oracle CDC is commonly used to extract incremental data (changes made at the source after the initial load or migration) and transfer it to a data warehouse. Capturing and preserving the state of the data, primarily in a data warehouse environment, is one of its critical functions. Developers can implement this capability at different layers of the system, from application logic down to physical storage, or in a combination of layers.

Development of the Oracle Change Data Capture Technology

The evolution of Oracle CDC technology started with the 9i version of Oracle, in which the feature was available out of the box. It tracked and recorded all changes to user tables in a database and stored them in dedicated change tables for use by ETL applications. After being processed and formatted, this data was loaded into databases and data warehouses.

The launch version of Oracle CDC relied on triggers on the source tables, but this method was considered very invasive and DBAs were not receptive to it. In response to the feedback, Oracle introduced a less intrusive, asynchronous form of CDC with its 10g version. This form read the redo logs of the source database and was built on Oracle Streams. It could detect change data and transfer it to a target data storage system without lowering the performance of the source database.

Even though Oracle CDC in this form was very well received, Oracle surprisingly decided to retire it. With the 12c version, the built-in CDC feature was dropped and Oracle Streams was deprecated. Users had to either pay for Oracle GoldenGate, which offers the feature, or seek another matching solution.

The Present Form of Oracle CDC

Now, consider the basic concept of CDC. It is a process in which one system records the data that has changed and another system takes some action based on those changes. The first system is the source of the data, and the second, where the data is to be transferred, is commonly called the target database. In some cases the source and the target may be the same database, and Oracle CDC works no less efficiently there. It is also not uncommon to find several CDC solutions present in the same system.
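
The source-and-target idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Oracle's implementation: the `ChangeRecord` class, its field names, and the `apply_changes` helper are all hypothetical, and the target is a plain dictionary standing in for a table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeRecord:
    """One captured change. The field names are illustrative assumptions."""
    op: str                      # "I" (insert), "U" (update) or "D" (delete)
    key: int                     # primary key of the affected row
    row: Optional[dict] = None   # new column values; None for deletes


def apply_changes(target: dict, changes: list) -> dict:
    """Replay captured changes, in order, against a target keyed by primary key."""
    for change in changes:
        if change.op in ("I", "U"):
            target[change.key] = dict(change.row)
        elif change.op == "D":
            target.pop(change.key, None)
        else:
            raise ValueError(f"unknown operation: {change.op!r}")
    return target


# Changes recorded at the source, in the order they happened.
source_changes = [
    ChangeRecord("I", 1, {"name": "Ada", "city": "London"}),
    ChangeRecord("I", 2, {"name": "Grace", "city": "New York"}),
    ChangeRecord("U", 1, {"name": "Ada", "city": "Paris"}),
    ChangeRecord("D", 2),
]

target_table = apply_changes({}, source_changes)
print(target_table)
```

Only the changes travel from source to target, so the target ends up in the same state as the source without the full table ever being copied again.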

Changes made to the source data that other applications need can be identified by the CDC framework of Oracle Data Integrator (ODI). ODI supports two journalizing modes. The first is the Simple Journalizing mode, which tracks changes made to individual datastores. The second is the Consistent Set Journalizing mode, which tracks changes made to a group of datastores while taking into account the referential integrity between them.
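
To see why referential integrity matters when a group of datastores is tracked together, consider the rough Python sketch below. It is a conceptual illustration only and does not reproduce how Oracle Data Integrator stores or reads its journals: the journal layout, the sequence numbers, and the `consume_consistent_set` function are assumptions made for this example.

```python
def consume_consistent_set(journals, parents_first, cutoff):
    """Return the changes of every datastore in the set up to one shared cutoff.

    journals      -- dict mapping datastore name to a list of (sequence, key) entries
    parents_first -- datastore names ordered so that parents precede children
    cutoff        -- highest sequence number included in this batch
    """
    batch = []
    for datastore in parents_first:
        for sequence, key in journals[datastore]:
            if sequence <= cutoff:
                batch.append((datastore, key))
    return batch


# Hypothetical journals for a parent table and its child table.
journals = {
    "ORDER_LINES": [(2, "order 100 / line 1"), (4, "order 101 / line 1")],
    "ORDERS": [(1, "order 100"), (3, "order 101")],
}

batch = consume_consistent_set(journals, ["ORDERS", "ORDER_LINES"], cutoff=3)
for datastore, key in batch:
    print(datastore, key)
```

Because both journals are cut at the same point and parents are listed first, every order line in the batch has its parent order in the same batch. Reading each journal independently, as in a simple per-datastore approach, gives no such guarantee.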

Enabling CDC on an Oracle Data Integrator model is not a difficult or complex task.

Types of Oracle Change Data Capture

Businesses have two types of Oracle CDC to choose from depending on their specific needs. 

# Synchronous Change Data Capture

In this form of CDC, triggers on the source tables insert records into a change table whenever data is modified. The triggers fire as part of the transaction that makes the change, so the change is captured immediately.
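
The trigger mechanism can be tried out with Python's built-in `sqlite3` module. Note that this sketch uses SQLite as a stand-in and is not Oracle syntax or Oracle's own CDC code; the table names `customers` and `customers_ct` and the trigger names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT);

-- Change table: one row per captured modification.
CREATE TABLE customers_ct (
    change_id INTEGER PRIMARY KEY AUTOINCREMENT,
    operation TEXT NOT NULL,
    id        INTEGER NOT NULL,
    city      TEXT
);

CREATE TRIGGER customers_ai AFTER INSERT ON customers
BEGIN
    INSERT INTO customers_ct (operation, id, city) VALUES ('I', NEW.id, NEW.city);
END;

CREATE TRIGGER customers_au AFTER UPDATE ON customers
BEGIN
    INSERT INTO customers_ct (operation, id, city) VALUES ('U', NEW.id, NEW.city);
END;

CREATE TRIGGER customers_ad AFTER DELETE ON customers
BEGIN
    INSERT INTO customers_ct (operation, id, city) VALUES ('D', OLD.id, OLD.city);
END;
""")

# Ordinary DML against the source table; the triggers do the capturing.
with conn:
    conn.execute("INSERT INTO customers VALUES (1, 'London')")
    conn.execute("UPDATE customers SET city = 'Paris' WHERE id = 1")
    conn.execute("DELETE FROM customers WHERE id = 1")

captured_rows = conn.execute(
    "SELECT operation, id, city FROM customers_ct ORDER BY change_id"
).fetchall()
print(captured_rows)
```

Each statement against `customers` produces one extra write to `customers_ct` inside the same transaction, which is where the overhead of the trigger-based approach comes from.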

How does Synchronous Change Data Capture work?

Here, a user acts as the change data publisher, with access to the source tables whose changes have to be tracked and captured. The publisher creates a change set and the change tables that hold the captured changes, and subscribers then subscribe to those changes. To move the changes onward, a script is needed that reads the captured records and adds the data to the intended target database.
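
The publisher and subscriber roles can be modeled as follows. This is a simplified sketch under stated assumptions, not Oracle's packages: the `ChangeTable` and `Subscription` classes are hypothetical, and the method names `extend_window` and `purge_window` are borrowed loosely from the idea of a subscriber moving a window over the published changes.

```python
class ChangeTable:
    """Publisher side: an append-only list of (change_id, operation, row) entries."""

    def __init__(self):
        self._rows = []
        self._next_id = 1

    def publish(self, operation, row):
        self._rows.append((self._next_id, operation, row))
        self._next_id += 1

    def rows_between(self, low, high):
        """Entries with low < change_id <= high."""
        return [entry for entry in self._rows if low < entry[0] <= high]

    @property
    def last_id(self):
        return self._next_id - 1


class Subscription:
    """Subscriber side: a movable window over the published changes."""

    def __init__(self, change_table):
        self._change_table = change_table
        self._low = 0    # everything up to here has been consumed
        self._high = 0   # everything up to here is visible

    def extend_window(self):
        """Make all changes published so far visible."""
        self._high = self._change_table.last_id

    def view(self):
        """Changes currently inside the window."""
        return self._change_table.rows_between(self._low, self._high)

    def purge_window(self):
        """Mark the visible changes as consumed."""
        self._low = self._high


change_table = ChangeTable()
subscription = Subscription(change_table)

change_table.publish("I", {"id": 1, "city": "London"})
change_table.publish("U", {"id": 1, "city": "Paris"})
subscription.extend_window()
first_batch = subscription.view()

change_table.publish("D", {"id": 1})   # published after the window was extended
unchanged_batch = subscription.view()  # the delete is not visible yet

subscription.purge_window()
subscription.extend_window()
second_batch = subscription.view()
print(first_batch, second_batch)
```

The window gives the subscriber a stable batch to copy to the target, even while the publisher keeps adding new changes.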

The downside of Synchronous Change Data Capture is that the triggers run as part of every transaction on the source tables and therefore adversely impact the performance of the database.

# Asynchronous Change Data Capture

In this type of Oracle CDC, changes are read from the redo log files, and they are captured only after the SQL statement performing the DML activity has been committed. CDC does not have any effect on the transaction, as the modified data is not captured as part of the transaction that changed the source table. There are three modes of Asynchronous Change Data Capture: HotLog, Distributed HotLog, and AutoLog. This form of Oracle CDC offers a relational interface and is built on Oracle Streams.
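
The principle of capturing changes from a log only once they are committed can be illustrated with a short Python sketch. The log format here is invented for the example and has nothing to do with the actual layout of Oracle redo logs; `mine_committed_changes` is a hypothetical helper.

```python
from collections import defaultdict


def mine_committed_changes(redo_entries):
    """Yield the changes of each transaction only once its COMMIT entry is read."""
    pending = defaultdict(list)   # transaction id -> changes seen so far
    for txn_id, kind, payload in redo_entries:
        if kind == "DML":
            pending[txn_id].append(payload)
        elif kind == "COMMIT":
            yield from pending.pop(txn_id, [])
        elif kind == "ROLLBACK":
            pending.pop(txn_id, None)   # rolled-back work is never captured
        else:
            raise ValueError(f"unknown log entry kind: {kind!r}")


# A made-up log in which two transactions are interleaved.
redo_log = [
    (10, "DML", "INSERT customers id=1"),
    (11, "DML", "UPDATE orders id=7"),
    (10, "DML", "UPDATE customers id=1"),
    (11, "ROLLBACK", None),
    (10, "COMMIT", None),
]

captured_changes = list(mine_committed_changes(redo_log))
print(captured_changes)
```

The reader works from the log after the fact, so the transactions that wrote the log entries never wait for the capture step.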

Summing up, Oracle CDC has played a major role over the years in making database administration, migration, and replication far more efficient. Paying for this feature is worth the investment.