Data Transmission Service (DTS) streams change data from a self-managed Oracle database to an ApsaraMQ for Kafka instance in real time. This lets downstream consumers receive a reliable change stream from Oracle for data lake ingestion, real-time data warehousing, and service decoupling scenarios.
Prerequisites
Before you begin, ensure that you have:
A self-managed Oracle database and a destination ApsaraMQ for Kafka instance. For supported versions, see Overview of data synchronization scenarios
The Oracle database running in ARCHIVELOG mode, with archived log files accessible and a retention period set. See Managing Archived Redo Log Files
Supplemental logging enabled on the Oracle database, with SUPPLEMENTAL_LOG_DATA_PK and SUPPLEMENTAL_LOG_DATA_UI set to Yes. See Supplemental Logging
Enough free storage on the ApsaraMQ for Kafka instance to hold all data from the Oracle database
A topic created in the ApsaraMQ for Kafka instance to receive the synchronized data. See Step 1: Create a topic
Familiarity with DTS capabilities and limits for Oracle synchronization. See Prepare an Oracle database and Overview
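Before configuring the task, it is worth confirming the ARCHIVELOG and supplemental logging prerequisites, which Oracle exposes through the standard `V$DATABASE` view. The sketch below assumes you fetch the three values with the Oracle client of your choice (the query is shown in the docstring); only the decision logic is included:

```python
def check_oracle_prerequisites(log_mode: str,
                               supplemental_pk: str,
                               supplemental_ui: str) -> list[str]:
    """Return a list of prerequisite problems (an empty list means ready).

    The inputs correspond to columns of the standard Oracle view:
      SELECT LOG_MODE, SUPPLEMENTAL_LOG_DATA_PK, SUPPLEMENTAL_LOG_DATA_UI
      FROM V$DATABASE;
    """
    problems = []
    if log_mode.upper() != "ARCHIVELOG":
        problems.append("database is not in ARCHIVELOG mode")
    if supplemental_pk.upper() != "YES":
        problems.append("SUPPLEMENTAL_LOG_DATA_PK is not YES")
    if supplemental_ui.upper() != "YES":
        problems.append("SUPPLEMENTAL_LOG_DATA_UI is not YES")
    return problems
```

A healthy source returns an empty list; anything else must be fixed before the task is created.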
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. See Billing overview |
Supported synchronization topologies
One-way one-to-one synchronization
One-way one-to-many synchronization
One-way many-to-one synchronization
One-way cascade synchronization
For details, see Synchronization topologies.
SQL operations that can be synchronized
| Operation type | Statements |
|---|---|
| DML | INSERT, UPDATE, DELETE |
| DDL | CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE; CREATE/ALTER/DROP VIEW; CREATE/ALTER/DROP PROCEDURE; CREATE/DROP FUNCTION; CREATE/DROP TRIGGER; CREATE/DROP INDEX |
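For planning purposes, the supported-operation table can be expressed as a simple statement classifier. This is a naive sketch that only inspects leading keywords (it does not handle variants such as `CREATE OR REPLACE`), intended for quick what-if checks rather than real SQL parsing:

```python
# Statement keywords DTS synchronizes, per the table above.
SUPPORTED_DML = {"INSERT", "UPDATE", "DELETE"}
SUPPORTED_DDL_PREFIXES = (
    "CREATE TABLE", "ALTER TABLE", "DROP TABLE", "RENAME TABLE", "TRUNCATE TABLE",
    "CREATE VIEW", "ALTER VIEW", "DROP VIEW",
    "CREATE PROCEDURE", "ALTER PROCEDURE", "DROP PROCEDURE",
    "CREATE FUNCTION", "DROP FUNCTION",
    "CREATE TRIGGER", "DROP TRIGGER",
    "CREATE INDEX", "DROP INDEX",
)

def is_synchronized(statement: str) -> bool:
    """Rough check of whether a statement falls in the supported set."""
    head = " ".join(statement.strip().upper().split())
    first_word = head.split(" ", 1)[0]
    return first_word in SUPPORTED_DML or head.startswith(SUPPORTED_DDL_PREFIXES)
```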
Permissions required
| Database | Required permission | References |
|---|---|---|
| Self-managed Oracle database | Fine-grained permissions | Prepare a database account, CREATE USER, GRANT |
To synchronize incremental data from Oracle, enable archive logging and supplemental logging. See Configure an Oracle database.
Limitations
DTS does not synchronize foreign keys. Cascading update and delete operations performed on the source database are not propagated to the destination.
Source database limitations
Objects to synchronize:
Tables to be synchronized must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, duplicate records may appear in the destination.
For Oracle 12c and later, table names cannot exceed 30 bytes.
If you select tables as objects to synchronize and plan to edit them in the destination (for example, renaming tables or columns), a single task supports up to 1,000 tables. Tasks with more than 1,000 tables fail with a request error. Split the tables across multiple tasks, or synchronize the entire database instead.
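When a schema holds more tables than one task allows, the split is mechanical. A minimal sketch of chunking a table list to respect the 1,000-table per-task limit described above:

```python
def split_into_tasks(tables: list[str], max_per_task: int = 1000) -> list[list[str]]:
    """Split a table list into chunks, one chunk per DTS task, so that
    no task exceeds the 1,000-table limit."""
    return [tables[i:i + max_per_task] for i in range(0, len(tables), max_per_task)]
```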
Oracle RAC:
If the source is an Oracle RAC database connected over Express Connect, specify a virtual IP address (VIP) when configuring the task.
For Oracle RAC databases, use a VIP instead of a Single Client Access Name (SCAN) IP address. After you specify a VIP, node failover for the Oracle RAC database is not supported.
Log retention:
| Synchronization scope | Minimum log retention |
|---|---|
| Incremental data synchronization only | More than 24 hours |
| Full data synchronization + incremental data synchronization | At least 7 days |
If DTS cannot read the redo logs or archive logs, the task fails, and data inconsistency or data loss may occur. After full data synchronization completes, you can reduce the retention period, but keep it longer than 24 hours. DTS service-level agreements (SLAs) do not apply if the retention period is shorter than these minimums.
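The retention rules above can be encoded as a small planning check. Note the asymmetry: incremental-only requires strictly more than 24 hours, while full plus incremental requires at least 7 days:

```python
from datetime import timedelta

# Minimum log retention per synchronization scope, from the table above.
MINIMUM_RETENTION = {
    "incremental_only": timedelta(hours=24),
    "full_plus_incremental": timedelta(days=7),
}

def retention_ok(scope: str, retention: timedelta) -> bool:
    """Check a planned archive/redo log retention against the DTS minimum."""
    minimum = MINIMUM_RETENTION[scope]
    if scope == "incremental_only":
        return retention > minimum   # "more than 24 hours"
    return retention >= minimum      # "at least 7 days"
```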
Other source limitations:
A primary/secondary switchover on the source database while a task is running causes the task to fail.
Empty strings of the VARCHAR2 type are treated as null values in Oracle. If the corresponding column in the destination has a NOT NULL constraint, the task fails.
Do not update LONGTEXT fields during synchronization. Doing so causes the task to fail.
Do not execute DDL statements during schema synchronization or full data synchronization. Doing so causes the task to fail.
Single-record size limit
The maximum size of a single record that DTS can write to Kafka is 10 MB. If a source row exceeds this limit, the task is interrupted.
To avoid this, do not synchronize entire tables that contain large fields. Synchronize only specific columns instead. When configuring the task, exclude columns with large values. If a table with large fields is already included in the task, remove the table, add it again, and then apply filter conditions to exclude the large columns.
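Rows at risk of hitting the 10 MB limit can be screened before the task is configured. The sketch below approximates record size with JSON encoding, which is not DTS's exact wire format, so treat the result as a rough pre-check that flags the largest columns to exclude:

```python
import json

MAX_RECORD_BYTES = 10 * 1024 * 1024  # 10 MB DTS limit per Kafka record

def oversized_columns(row: dict, limit: int = MAX_RECORD_BYTES) -> list[str]:
    """If the serialized row exceeds the limit, return the three largest
    columns (by serialized size) as candidates to exclude with filter
    conditions. Returns an empty list if the row fits."""
    encoded = json.dumps(row, default=str).encode("utf-8")
    if len(encoded) <= limit:
        return []
    sizes = {k: len(json.dumps(v, default=str).encode("utf-8"))
             for k, v in row.items()}
    return sorted(sizes, key=sizes.get, reverse=True)[:3]
```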
Other limitations
Renamed tables: If you rename a table in the source database and the new name is not in the selected objects, DTS does not synchronize the renamed table's data to Kafka. To resume synchronization, reselect the objects. See Add an object to a data synchronization task.
Oracle Data Pump: Do not use Oracle Data Pump to write data to the source database during incremental data synchronization. Data loss may occur.
Performance impact: DTS uses read and write resources from both the source and destination databases during full data synchronization, which may increase server load. Schedule synchronization tasks during off-peak hours to minimize impact.
Tablespace size: Concurrent INSERT operations during full data synchronization cause table fragmentation. The destination tablespace will be larger than the source after full data synchronization completes.
Synchronization latency: Latency is calculated based on the timestamp of the latest synchronized record in the destination and the current time in the source. If no DML operation runs on the source for an extended period, the displayed latency may be inaccurate. Run a DML operation on the source to refresh the latency. If you synchronize an entire database, create a heartbeat table that is updated every second to keep the latency accurate.
Exclusive writes: Write data to the destination only through DTS. Writing through other tools may cause data inconsistency or data loss — for example, when using DMS to perform online DDL operations.
Kafka scaling: If the destination Kafka cluster is scaled up or down, restart the Kafka cluster to resume synchronization.
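The latency figure described above is simply the gap between the source clock and the timestamp of the last record written to the destination, which is why it drifts upward when the source is idle. A minimal sketch of that calculation (the heartbeat table name in the docstring is hypothetical):

```python
from datetime import datetime

def displayed_latency_seconds(last_synced_at: datetime, now: datetime) -> float:
    """Latency as the console computes it: current source time minus the
    timestamp of the most recent synchronized record. With no DML on the
    source, last_synced_at stops advancing and this number grows even though
    nothing is actually behind. A once-per-second UPDATE to a heartbeat
    table (e.g. a hypothetical dts_heartbeat with a single timestamp column)
    keeps last_synced_at fresh and the figure meaningful."""
    return (now - last_synced_at).total_seconds()
```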
Create a data synchronization task
Step 1: Go to the Data Synchronization Tasks page
Log on to the Data Management (DMS) console.
In the top navigation bar, click Data + AI.
In the left-side navigation pane, choose DTS (DTS) > Data Synchronization.
Steps may vary depending on the DMS console mode and layout. See Simple mode and Customize the layout and style of the DMS console. You can also go directly to the Data Synchronization Tasks page of the new DTS console.
Step 2: Select the region
On the right side of Data Synchronization Tasks, select the region where the data synchronization instance resides.
In the new DTS console, select the region from the top navigation bar.
Step 3: Configure source and destination databases
Click Create Task. In the Create Data Synchronization Task wizard, configure the following parameters.
Source Database
| Parameter | Description |
|---|---|
| Task Name | A name for the DTS task. DTS generates a default name. Use a descriptive name to make the task easy to identify. The name does not need to be unique. |
| Database Type | Select Oracle. |
| Connection Type | The access method for the source database. This example uses Self-managed Database on ECS. If your source is a self-managed database, set up the network environment first. See Preparation overview. |
| Instance Region | The region where the self-managed Oracle database resides. |
| ECS Instance ID | The ID of the Elastic Compute Service (ECS) instance hosting the Oracle database. |
| Port Number | The service port of the Oracle database. Default: 1521. |
| Oracle Type | The Oracle architecture. Select Non-RAC Instance and configure SID, or select RAC or PDB Instance and configure Service Name. If the source is an Oracle RAC database, connect through a VIP as described in the source database limitations. This example uses Non-RAC Instance. |
| Database Account | The Oracle account. For required permissions, see Permissions required. |
| Database Password | The password for the Oracle account. |
Destination Database
| Parameter | Description |
|---|---|
| Database Type | Select Kafka. |
| Connection Type | Select Express Connect, VPN Gateway, or Smart Access Gateway. DTS does not list ApsaraMQ for Kafka as a dedicated access method. Connect to it as a self-managed Kafka cluster. |
| Instance Region | The region where the ApsaraMQ for Kafka instance resides. |
| Connected VPC | The ID of the virtual private cloud (VPC) that the ApsaraMQ for Kafka instance belongs to. To get the VPC ID, go to the ApsaraMQ for Kafka console, open the Instance Details page, and find the VPC ID in Configuration Information on the Instance Information tab. |
| IP Address or Domain Name | An IP address from the Default Endpoint parameter of the ApsaraMQ for Kafka instance. To get an IP address, go to the Instance Details page and look in Endpoint Information on the Instance Information tab. |
| Port Number | The service port of the ApsaraMQ for Kafka instance. Default: 9092. |
| Database Account | The account for the ApsaraMQ for Kafka instance. Not required for VPC-connected instances. |
| Database Password | The password for the account. |
| Kafka Version | The version of the ApsaraMQ for Kafka instance. |
| Encryption | Select Non-encrypted or SCRAM-SHA-256 based on your security requirements. |
| Topic | The topic to receive the synchronized data. Select from the drop-down list. |
| Topic That Stores DDL Information | The topic to store DDL information. If not set, DDL information is stored in the topic specified by Topic. |
| Use Kafka Schema Registry | Whether to use Kafka Schema Registry. Select No or Yes. If you select Yes, enter the URL or IP address registered in Kafka Schema Registry for your Avro schemas. |
Step 4: Test connectivity
Click Test Connectivity and Proceed.
DTS automatically adds its CIDR blocks to the whitelists of Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB). For ECS-hosted databases, DTS automatically adds its CIDR blocks to the ECS security group rules. If the database is deployed across multiple ECS instances, manually add the DTS CIDR blocks to each ECS instance's security group rules. For on-premises databases or databases from third-party providers, manually add the DTS CIDR blocks to the database whitelist. See CIDR blocks of DTS servers.
Adding DTS CIDR blocks to your whitelist or security group rules may introduce security risks. Before proceeding, take preventive measures such as strengthening credentials, limiting exposed ports, auditing API calls, regularly reviewing whitelist rules, and blocking unauthorized CIDR blocks. Alternatively, connect the database to DTS using Express Connect, VPN Gateway, or Smart Access Gateway.
Step 5: Configure objects and synchronization settings
Configure the following parameters:
| Parameter | Description |
|---|---|
| Synchronization Types | By default, Incremental Data Synchronization is selected. Also select Schema Synchronization and Full Data Synchronization. After the precheck, DTS synchronizes historical data from the source to the destination as the baseline for incremental synchronization. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors: DTS checks for tables with identical names in the source and destination. The task cannot start if conflicts are found. To resolve naming conflicts without deleting or renaming destination tables, use the object name mapping feature. See Map object names. Ignore Errors and Proceed: Skips the naming conflict check. If the source and destination share the same schema, DTS retains existing destination records during full data synchronization and overwrites them during incremental data synchronization. If schemas differ, initialization may partially fail or the task may fail entirely. Use with caution. |
| Data Format in Kafka | The format for data written to Kafka. DTS Avro: data follows the DTS Avro schema. See the schema definition on GitHub. SharePlex JSON: data uses the SharePlex JSON format. See Shareplex Json. |
| Policy for Shipping Data to Kafka Partitions | The partitioning policy for synchronized data. See Specify the policy for migrating data to Kafka partitions. |
| Capitalization of Object Names in Destination Instance | The capitalization policy for database, table, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select columns, tables, or databases from the Source Objects section and click the rightwards arrow to add them to the Selected Objects section. |
| Selected Objects | To rename a single object, right-click it. See Map the name of a single object. To rename multiple objects at once, click Batch Edit. See Map multiple object names at a time. To filter specific SQL operations for a table, right-click the table and select the operations. See SQL operations that can be synchronized. To filter rows, right-click a table and set filter conditions. See Set filter conditions. Renaming an object may cause dependent objects to fail synchronization. |
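To make the partition-policy choice concrete: hash-based policies map a key (such as a table name or a primary-key value) deterministically onto a partition, so all changes for the same key land in order on one partition. The sketch below illustrates the general idea only; it is not DTS's actual partitioner:

```python
import hashlib

def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically map a key, e.g. "schema.table" or a primary-key
    value, to a partition index. Illustrative only: DTS's real shipping
    policy is configured in the console, not reimplemented by you."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because the mapping is deterministic, repeated calls with the same key always pick the same partition, which is what preserves per-key ordering.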
Step 6: Configure advanced settings
Click Next: Advanced Settings and configure the following parameters:
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | DTS uses the shared cluster by default. For improved stability, purchase a dedicated cluster. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | The retry window for connection failures. Valid values: 10–1440 minutes. Default: 720 minutes. Set this to more than 30 minutes. If DTS reconnects within this window, the task resumes. Otherwise, the task fails. If multiple tasks share a source or destination, the shortest retry window applies. During retries, you are charged for the DTS instance. |
| Retry Time for Other Issues | The retry window for DDL or DML operation failures. Valid values: 1–1440 minutes. Default: 10 minutes. Set this to more than 10 minutes. This value must be smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Migration | Limits the load on the destination database server during full data synchronization. Configure Queries per second (QPS) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). This parameter is available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Limits the load during incremental synchronization. Configure RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Environment Tag | A tag to identify the DTS instance. Select based on your requirements. This example uses no tag. |
| Actual Write Code | The encoding format for data written to the destination. |
| Configure ETL | Whether to enable extract, transform, and load (ETL). Select Yes to configure data processing statements. See Configure ETL in a data migration or data synchronization task. Select No to skip ETL. |
| Monitoring and Alerting | Whether to configure alerts for task failures or high synchronization latency. Select No or Yes. If you select Yes, configure the alert threshold and notification settings. See Configure monitoring and alerting when you create a DTS task. |
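The two retry windows interact, so it helps to sanity-check them together before saving. A small validator encoding the documented ranges and the rule that the second window must be strictly smaller than the first:

```python
def validate_retry_settings(connection_retry_min: int, other_retry_min: int) -> list[str]:
    """Check the two retry windows against the documented rules:
    connection failures 10-1440 minutes (recommended above 30), other
    issues 1-1440 minutes (recommended above 10) and strictly smaller
    than the connection retry window. Returns a list of violations."""
    errors = []
    if not 10 <= connection_retry_min <= 1440:
        errors.append("Retry Time for Failed Connections must be 10-1440 minutes")
    if not 1 <= other_retry_min <= 1440:
        errors.append("Retry Time for Other Issues must be 1-1440 minutes")
    if other_retry_min >= connection_retry_min:
        errors.append("Retry Time for Other Issues must be smaller than "
                      "Retry Time for Failed Connections")
    return errors
```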
Step 7: Save settings and run the precheck
To preview OpenAPI parameters before saving, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
Click Next: Save Task Settings and Precheck.
DTS runs a precheck before the task can start. If the precheck fails:
Click View Details next to each failed item, resolve the issue, and run the precheck again.
If an alert is triggered for an item that can be ignored, click Confirm Alert Details, then click Ignore in the dialog box, click OK, and then click Precheck Again. Ignoring an alert may cause data inconsistency.
Step 8: Purchase a synchronization instance
Wait until the Success Rate reaches 100%, then click Next: Purchase Instance.
On the purchase page, configure the following parameters:
| Parameter | Description |
|---|---|
| Billing Method | Subscription: pay upfront for a fixed term. More cost-effective for long-term use. Pay-as-you-go: billed hourly. Suitable for short-term use. Release the instance when no longer needed to stop charges. |
| Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| Instance Class | The instance class, which determines synchronization speed. See Instance classes of data synchronization instances. |
| Subscription Duration | Available only for the Subscription billing method. Valid values: 1–9 months, or 1, 2, 3, or 5 years. |
Step 9: Start the task
Read and select Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. In the dialog box, click OK.
The task appears in the task list. Monitor its progress from there.