Synchronize data from a self-managed Oracle database to DataHub - Data Transmission Service

Use Data Transmission Service (DTS) to continuously synchronize change data from a self-managed Oracle database to a DataHub project. DTS supports schema synchronization, full data synchronization, and incremental data synchronization, so you can start streaming Oracle changes to DataHub without downtime.

What DTS supports for this scenario

Capability	Supported	Notes
Schema synchronization	Yes	Tables only; triggers are not supported
Full data synchronization	Yes	Free of charge
Incremental data synchronization (CDC)	Yes	Charged; see Billing
DML: INSERT, UPDATE, DELETE	Yes
DDL: CREATE/ALTER/DROP TABLE, VIEW, PROCEDURE, FUNCTION, TRIGGER, INDEX; RENAME TABLE; TRUNCATE TABLE	Yes
One-to-one synchronization	Yes
One-to-many synchronization	Yes
Many-to-one synchronization	Yes
Cascade synchronization	Yes
Foreign key synchronization	No	Cascade and delete operations on the source are not propagated to the destination

Prerequisites

Before you begin, make sure you have:

A self-managed Oracle database at version 9i, 10g, 11g, 12c, 18c, or 19c
The Oracle database running in ARCHIVELOG mode, with archived log files accessible and a suitable retention period configured. See Managing Archived Redo Log Files
Supplemental logging enabled on the Oracle database, with SUPPLEMENTAL_LOG_DATA_PK and SUPPLEMENTAL_LOG_DATA_UI both set to Yesalert notification settings. See Supplemental Logging
A DataHub project created to receive the synchronized data. See Get started with DataHub and Manage projects
A database account on the Oracle source with fine-grained permissions. See Prepare a database account
Familiarity with the capabilities and limits of DTS for Oracle synchronization. Advanced Database & Application Migration (ADAM) is used for database evaluation to help you smoothly synchronize data to the cloud. See Prepare an Oracle database and Overview
The following information ready before starting the task configuration:
- Oracle host IP or ECS instance ID, port (default: 1521), and SID or service name
- Oracle database account credentials
- DataHub project name and region

Important

To synchronize incremental data from Oracle, you must enable both archive logging and supplemental logging. See the Configure an Oracle database section of the "Prepare an Oracle database" topic.

Billing

Synchronization type	Fee
Schema synchronization and full data synchronization	Free
Incremental data synchronization	Charged. See Billing overview

Limitations

DTS does not synchronize foreign keys from the source database. Cascade and delete operations on the source are not propagated to the destination.

General limitations

Tables must have a PRIMARY KEY or UNIQUE constraint, with all fields unique. Without this, the destination may contain duplicate records.
For Oracle 12c or later, table names cannot exceed 30 bytes.
If you select individual tables as synchronization objects and plan to rename them or their columns in the destination, a single task supports up to 1,000 tables. For more than 1,000 tables, configure multiple tasks or synchronize the entire database instead.
Schema synchronization supports only tables. DTS does not support schema synchronization for triggers.
Warning
Delete any triggers from the source database before starting the task to prevent data inconsistency. See Configure a data synchronization or migration task for a source database that contains a trigger.
Only tables can be selected as synchronization objects.
A single string in the destination DataHub project cannot exceed 2 MB.
Write data to the destination only through DTS during synchronization. Writing through other tools may cause data inconsistency or loss, particularly when DMS online DDL operations are in use.
DTS calculates synchronization latency based on the timestamp of the latest synchronized record versus the current source timestamp. If no DML operations occur on the source for an extended period, the reported latency may be inaccurate. To update the latency, run a DML operation on the source. If you synchronize an entire database, create a heartbeat table that receives data every second.

Limitations for Oracle RAC databases

If the source is an Oracle Real Application Clusters (RAC) database connected over Express Connect, specify a Virtual IP (VIP) when configuring the task.
Use only a VIP — not a Single Client Access Name (SCAN) IP — when configuring the task. After specifying a VIP, node failover is not supported.
Do not perform a primary/secondary switchover while the task is running. Doing so causes the task to fail.

Limitations for incremental data synchronization

Redo logging and archive logging must be enabled on the source, with the following minimum retention periods: If logs are unavailable when DTS needs them, the task may fail or produce data inconsistency or data loss. After full data synchronization completes, you can reduce the retention period to more than 24 hours.
- Incremental synchronization only: retain logs for more than 24 hours.
- Full data synchronization followed by incremental synchronization: retain logs for at least 7 days.
Log retention requirements are part of the DTS service level agreement (SLA). If you do not meet them, DTS cannot guarantee service reliability or performance.
Do not use Oracle Data Pump to write data to the source during incremental synchronization. Doing so may cause data loss.
Do not update LONGTEXT fields during synchronization. Doing so causes the task to fail.
Do not perform DDL operations that change database or table schemas during schema synchronization. Doing so causes the task to fail.
The task fails if the source contains an empty string of type VARCHAR2 and the corresponding destination column has a NOT NULL constraint. Oracle processes VARCHAR2 empty strings as null values.

Create a synchronization task

Step 1: Open the task creation wizard

Log on to the Data Management (DMS) console.Data Synchronization Tasks page of the new DTS console
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
Navigation options may vary based on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Synchronization Tasks page of the new DTS console.
Select the region where the data synchronization instance resides.
In the new DTS console, select the region in the top navigation bar.
Click Create Task.

Step 2: Configure the source and destination

In the wizard, configure the source and destination databases using the following parameters.

Section	Parameter	Description
N/A	Task Name	A name for the DTS task. DTS generates a name automatically. Specify a descriptive name to make the task easy to identify. The name does not need to be unique.
Source Database	Database Type	Select Oracle.
	Connection Type	The access method for the source database. This example uses Self-managed Database on ECS. To deploy a self-managed database, see Preparation overview.
	Instance Region	The region where the self-managed Oracle database resides.
	ECS Instance ID	The ID of the Elastic Compute Service (ECS) instance that hosts the Oracle database.
	Port Number	The service port of the Oracle database. Default value: 1521.
	Oracle Type	The architecture type. This example uses RAC or PDB Instance. Select Non-RAC Instance to configure the SID parameter, or RAC or PDB Instance to configure the Service Name parameter.
	Database Account	The database account for the source Oracle database.
	Database Password	The password for the database account.
Destination Database	Database Type	Select DataHub.
	Connection Type	Select Alibaba Cloud Instance.
	Instance Region	The region where the destination DataHub project resides.
	Project	The name of the DataHub project.

Click Test Connectivity and Proceed.

DTS automatically adds its CIDR blocks to the security settings of Alibaba Cloud database instances and ECS-hosted databases. For self-managed databases in data centers or from third-party cloud providers, manually add the DTS CIDR blocks to the database allowlist. See Add the CIDR blocks of DTS servers.

Warning

Adding DTS CIDR blocks to an allowlist or security group introduces potential security risks. Before proceeding, strengthen account credentials, restrict exposed ports, validate API calls, and periodically audit allowlist entries. For higher security, connect the database to DTS through Express Connect, VPN Gateway, or Smart Access Gateway.

Step 3: Select synchronization objects and configure settings

Configure the following parameters.

Parameter	Description
Synchronization Type	By default, Incremental Data Synchronization is selected. Also select Schema Synchronization. After the precheck, DTS synchronizes all existing data from the selected objects as the baseline for subsequent incremental synchronization.
Naming Rules of Additional Columns	When DTS writes to a DataHub topic, it adds extra columns. If those column names conflict with existing columns, the task fails. Set this to New Rule or Previous Rule based on your requirements. Check for conflicts before setting this parameter. See Naming rules for additional columns.
Processing Mode for Existing Destination Tables	Precheck and Report Errors: checks for tables with identical names in source and destination before starting. If duplicates exist, the precheck fails and the task cannot start. To resolve this without deleting destination tables, use object name mapping. See Map object names. Ignore Errors and Proceed: skips the duplicate-name check. If schemas match and a record has the same primary key or unique key value, DTS retains the existing destination record during full synchronization and overwrites it during incremental synchronization. If schemas differ, some columns may not synchronize or the task may fail.
Case Policy for Destination Object Names	Controls capitalization of database, table, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names in the destination instance.
Source Objects	Select one or more tables from the Source Objects list and click to move them to the Selected Objects list. Only tables can be selected.
Selected Objects	To rename a single object before synchronization, right-click it. See Map the name of a single object. To rename multiple objects at once, click Batch Edit in the upper-right corner. See Map multiple object names at a time. To filter rows by condition when synchronizing individual tables, right-click an object and set a WHERE condition. See Set filter conditions. To set a shard key, right-click an object and clear Synchronize All Tables. Database name mapping is not supported. If you rename a table or column, other objects that depend on it may fail to synchronize. Note

Step 4: Configure advanced settings

Click Next: Advanced Settings and configure the following parameters.

Parameter	Description
Dedicated Cluster for Task Scheduling	By default, DTS schedules the task to the shared cluster. For higher task stability, purchase a dedicated cluster. See What is a DTS dedicated cluster.
Retry Time for Failed Connections	How long DTS retries a connection if the source or destination becomes unreachable. Valid values: 10–1440 minutes. Default: 720 minutes. We recommend that you set this parameter to a value greater than 30 minutes. DTS resumes the task if it reconnects within this window; otherwise, the task fails. If multiple tasks share the same source or destination, the shortest retry window takes effect. During retry, the DTS instance continues to be charged.
Retry Time for Other Issues	How long DTS retries if DDL or DML operations fail after the task starts. Valid values: 1–1440 minutes. Default: 10 minutes. We recommend that you set this parameter to a value greater than 10 minutes. This value must be smaller than Retry Time for Failed Connections.
Enable Throttling for Full Data Migration	Limits the load DTS places on the destination database server during full data synchronization. Configure QPS (Queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). This parameter appears only when Full Data Synchronization is selected.
Enable Throttling for Incremental Data Synchronization	Limits the load during incremental synchronization. Configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s).
Environment Tag	A tag to identify the DTS instance. Select a tag based on your requirements.
Actual Write Code	The encoding format for data written to the destination.
Configure ETL	Specifies whether to enable the extract, transform, and load (ETL) feature. Select Yes to enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. Select No to skip ETL.
Monitoring and Alerting	Specifies whether to set up alerts for task failures or latency exceeding a threshold. Select Yes to configure an alert threshold and notification contacts. See Configure monitoring and alerting when you create a DTS task.

Step 5: Run the precheck and purchase an instance

Save the task settings and run the precheck.
- To preview the API parameters for this task, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
- Click Next: Save Task Settings and Precheck to proceed.
DTS runs a precheck before starting the task. The task can start only after passing the precheck. If the precheck fails, click View Details next to each failed item, fix the issue, and rerun the precheck. If an alert is triggered for an item that can be ignored, click Confirm Alert Details, then Ignore, then OK, and run the precheck again. Ignoring alerts may result in data inconsistency.
Wait until the Success Rate reaches 100%, then click Next: Purchase Instance.

On the buy page, configure the following parameters.

Parameter	Description
Billing Method	Subscription: pay upfront for a set period. More cost-effective for long-term use. Pay-as-you-go: billed hourly. Suitable for short-term use. Release the instance when it is no longer needed to stop charges.
Resource Group Settings	The resource group for the instance. Default: default resource group. See What is Resource Management?
Instance Class	DTS instance classes vary in synchronization speed. See Instance classes of data synchronization instances.
Subscription Duration	Available for Subscription billing only. Choose 1–9 months, or 1, 2, 3, or 5 years.

Read and accept the Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the dialog box.

The task appears in the task list. Monitor its progress there.