Real-Time Data Synchronization from Oracle to Singdata Lakehouse via Bluepipe
Solution Introduction
For over 30 years, Oracle has held a significant position in the relational database and data warehouse space. With the introduction of integrated systems such as Exadata, Exalytics, Exalogic, SuperCluster, and 12c Database, the tight integration of storage and compute enables faster processing of large amounts of data using on-premises infrastructure. However, the volume, velocity, and variety of data have dramatically increased, and the cloud brings more possibilities for modern data analytics. For example, by separating compute from storage, Singdata Lakehouse enables a new generation of cloud data platforms with automatic, instant scaling of compute and storage through a share-nothing data architecture.

Solution Advantages
Bluepipe supports real-time data synchronization from Oracle to Singdata Lakehouse. Particularly in complex database environments with multiple instances and tables, Bluepipe achieves automated synchronization, greatly reducing the complexity and workload of manually configuring sync jobs. For source databases with tens of thousands of tables, Bluepipe can automatically configure synchronization tasks.
Out-of-the-box, Ready in 10 Minutes
Bluepipecan run on virtually anyLinuxsystem, supportingx86andarmchips; common rack-mounted servers, laptops, and even Raspberry Pi can be used for deployment;- Minimal configuration process; default parameters achieve optimal performance.
Full and Incremental Integration, Zero Ops Intervention
- Deep synergy between full and incremental sync, with almost no routine maintenance operations required;
- Efficient data comparison and hot-fix technology, always ensuring data consistency;
- Highly robust
Schema Evolution.
Push Data, Not Expose Ports
Bluepipeis deployed together with your database in your internal network, with no need to expose ports externally;- Elastic
buffer sizetechnology, automatically balancing betweenThroughputandlatency.
Unique Advantages of the Oracle Pipeline
Bluepipe captures change data based on Oracle LogMiner. At the same time, it has been deeply optimized in the following areas:
Deep DDL Compatibility
Under the default LogMiner strategy, after a DDL operation occurs, subsequent DML operations on the related table cannot be correctly parsed, resulting in an inability to correctly capture changes.
Bluepipe maintains an automatic dictionary file build strategy to ensure that correct incremental data can still be captured after table structure changes.
Large Transaction Optimization
Oracle Redo Log records the complete transaction process, but from a business perspective, only data after commit is typically desired. Therefore, a buffer is needed during transmission to temporarily store change records that have not yet been committed.
Based on unique memory management technology, Bluepipe can easily handle transactions involving tens of millions of records even on a single node.
Starting from Oracle 12.2, the maximum length for table names and column names is supported up to 128 bytes. However, for various reasons, LogMiner for instances without an OGG LICENSE still does not support DML parsing for tables containing long names. For details, refer to the official documentation.
Based on efficient stream-batch fusion technology, Bluepipe fully supports incremental data capture and delivery in such scenarios.
RAC Architecture Support
Synchronization Results
LAG: approximately 10 seconds
Sync speed: 20,000 rows/second

Implementation Steps
Install and Deploy Bluepipe
If you have not yet completed the installation and deployment of Bluepipe, please contact Singdata or Bluepipe.
Configure Sync Data Sources in Bluepipe
Configure Oracle Data Source

Configuration Item Description
| Item Name | Description |
|---|---|
| Connection String | The connection method for the data source, format: IP:PORT:SID, e.g., 127.0.0.1:1521:XE |
| Username | The username for connecting to the database, e.g., C##CDC_USER |
| Password | The password corresponding to the database username, e.g., userpassword |
| Connection Name | A custom name for the data source, convenient for future management, e.g., Local Test Instance |
| Allow Batch Extraction | Read data tables by query, supports row-level filtering, enabled by default |
| Allow Streaming Extraction | Capture database changes in real time via CDC, enabled by default |
| Allow Data Writing | Can serve as a target data source, enabled by default |
Basic Features
| Feature | Description |
|---|---|
| Schema Migration | If the target does not have the selected table, automatically generate and execute the creation statement based on source metadata and mapping |
| Full Data Migration | Logical migration, sequentially scanning table data and writing data in batches to the target database |
| Incremental Real-Time Sync | Supports INSERT, UPDATE, DELETE common DML sync |
Configure Singdata Lakehouse as Target

Configuration Item Description
| Item Name | Description |
|---|---|
| Connection String | The connection method for the data source, format: {instance}.{domain}/{workspace}, e.g., abcdef.cn-shanghai-alicloud.api.singdata.com/quick_start |
| Virtual Cluster | Set the running virtual cluster, default value is default |
| Username | The username for connecting to the database, e.g., username |
| Password | The password corresponding to the database username, e.g., userpassword |
| Connection Name | A custom name for the data source, convenient for future management, e.g., Local Test Instance |
| Allow Batch Extraction | Read data tables by query, supports row-level filtering, extraction not currently supported |
| Allow Streaming Extraction | Capture database changes in real time via CDC, extraction not currently supported |
| Allow Data Writing | Can serve as a target data source, enabled by default |
Create Tables in Oracle Database
Create a New Sync Job in Bluepipe
Please note:
- When selecting the source, the
{namespace}in the target table name{namespace}/{table}can be replaced with your desired schema name, such asbluepipe_oracle_staging. - In the design phase, select "Use CDC technology for real-time replication" for incremental replication.
After successfully creating the sync job, the effect is as shown below:
The new job will automatically start and begin full data synchronization, followed by ongoing incremental real-time synchronization.
Data Generation
Run the following Python code to insert data into the Oracle source table employees in real time:
Observe Source and Target Data via Metabase
Please refer to: Metabase Installation and Deployment
