Use Data Transmission Service (DTS) to synchronize data from a self-managed Oracle database to a MaxCompute project. DTS supports schema synchronization, full data synchronization, and incremental data synchronization (change data capture via redo and archived logs).
How it works
DTS synchronizes data in three sequential phases:
1. Schema synchronization — DTS creates the destination tables in MaxCompute. Each table name gets a `_base` suffix. For example, if the source table is `customer`, DTS creates `customer_base` in MaxCompute.
2. Full data synchronization — DTS copies all existing rows from the source Oracle database to the corresponding `_base` tables in MaxCompute.
3. Incremental data synchronization — DTS reads Oracle redo and archived logs and writes change records to a separate incremental data table in MaxCompute. The incremental data table name gets a `_log` suffix. For example, changes to `customer` are written to `customer_log`.
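Because the `_base` table holds the initial copy and the `_log` table holds every change event, a consolidated current view can be built by taking the latest image per primary key. The following MaxCompute SQL is a sketch only: it assumes a source table `customer` with primary key `id` and a column `name` (both illustrative), the default `_base`/`_log` naming, and the DTS metadata columns described in the Incremental data table schema section below.

```sql
-- Sketch: reconstruct the current state of customer from its base and log tables.
SET odps.sql.allow.fullscan=true;

SELECT id, name
FROM (
    SELECT t.id, t.name, t.operation_flag,
           -- latest entry per key wins; base rows use record_id 0 as a sentinel
           ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY t.record_id DESC) AS rn
    FROM (
        SELECT id, name, 'I' AS operation_flag, 0 AS record_id
        FROM customer_base                                -- initial full copy
        UNION ALL
        SELECT id, name, operation_flag, record_id
        FROM customer_log
        WHERE after_flag = 'Y' OR operation_flag = 'D'    -- post-images and deletes
    ) t
) m
WHERE rn = 1 AND operation_flag <> 'D';                   -- drop rows deleted at the source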
Billing
| Synchronization type | Fee |
|---|---|
| Schema synchronization and full data synchronization | Free |
| Incremental data synchronization | Charged. For more information, see Billing overview. |
Supported SQL operations
| Operation type | SQL statements |
|---|---|
| DML (data manipulation language) | INSERT, UPDATE, DELETE |
| DDL (data definition language) | ADD COLUMN (attribute columns not supported) |
DTS does not synchronize foreign keys. Cascade and delete operations on the source database are not replicated to the destination.
Prerequisites
Before you begin, make sure you have:
Activated MaxCompute and created a MaxCompute project. See Activate MaxCompute and DataWorks and Create a MaxCompute project.
Configured an IP address whitelist on the MaxCompute client to allow DTS access. See Configure an IP address whitelist to allow access from Alibaba Cloud services to MaxCompute.
Created an AccessKey pair for the Alibaba Cloud account that owns the destination MaxCompute project. See Create an AccessKey pair.
Note: Alternatively, create a RAM user and set it as the super administrator for the MaxCompute project.
Configured the source Oracle database:
Running in ARCHIVELOG mode with accessible archived log files and an appropriate retention period. See Managing archived redo log files.
Supplemental logging enabled, with `SUPPLEMENTAL_LOG_DATA_PK` and `SUPPLEMENTAL_LOG_DATA_UI` set to Yes. See Supplemental logging.
A database account with the required permissions. See Prepare an Oracle database.
Reviewed the capabilities and limits of DTS for Oracle synchronization, and (optionally) evaluated migration feasibility with Advanced Database & Application Migration (ADAM). See Overview.
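The Oracle-side prerequisites above can be verified directly on the source database. The statements below are a sketch, run as a suitably privileged user; exact behavior can vary by Oracle version and edition.

```sql
-- Confirm the database runs in ARCHIVELOG mode (should return ARCHIVELOG).
SELECT log_mode FROM v$database;

-- Check the supplemental logging settings DTS requires (both should be YES).
SELECT supplemental_log_data_pk, supplemental_log_data_ui FROM v$database;

-- Enable primary-key and unique-key supplemental logging if either is NO.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY, UNIQUE) COLUMNS;
```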
Limitations
Source database requirements
Tables must have a primary key or unique key, with all fields unique. Without this, duplicate rows may appear in the destination.
Oracle 12c and later: table names cannot exceed 30 bytes.
If you rename tables or columns at the destination (Selected Objects > right-click), a single task supports up to 1,000 tables. For larger workloads, split tables across multiple tasks or synchronize the entire database without renaming.
Oracle Real Application Cluster (RAC): use a virtual IP address (VIP), not a Single Client Access Name (SCAN) IP address. After you specify the VIP, node failover is not supported.
If the Oracle RAC database connects over Express Connect, specify a VIP when configuring the task.
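Tables that violate the primary-key/unique-key requirement can be found before configuring the task. This query is a sketch; `APP_USER` is a placeholder schema name to replace with your own.

```sql
-- Sketch: list tables in one schema that have neither a primary key nor a
-- unique constraint, and so may produce duplicate rows in the destination.
SELECT t.owner, t.table_name
FROM all_tables t
WHERE t.owner = 'APP_USER'
  AND NOT EXISTS (
        SELECT 1
        FROM all_constraints c
        WHERE c.owner = t.owner
          AND c.table_name = t.table_name
          AND c.constraint_type IN ('P', 'U')
      );
```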
Log retention requirements
The redo logs and archived logs must meet the following minimum retention periods:
| Synchronization mode | Minimum retention period |
|---|---|
| Incremental data synchronization only | 24 hours |
| Full data synchronization + incremental data synchronization | 7 days |
If logs are purged before DTS can read them, the task fails. In edge cases, data inconsistency or loss may occur. After full data synchronization completes, you can reduce the retention period to 24 hours.
Retention periods below these minimums may cause DTS to miss log data, which can compromise the Service Level Agreement (SLA) guarantees.
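A quick way to see how far back the still-available archived logs reach is to query `v$archived_log` on the source; the window between the oldest undeleted log and the current time should cover the minimums above. A sketch:

```sql
-- Sketch: the oldest archived log still on disk versus the current time.
SELECT MIN(first_time) AS oldest_available_log, SYSDATE AS current_time
FROM v$archived_log
WHERE deleted = 'NO';
```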
Other limitations
Do not perform a primary/secondary switchover while the task is running — the task will fail.
Empty `VARCHAR2` strings (treated as NULL by Oracle) cause failures if the corresponding destination column has a `NOT NULL` constraint.
Tables with the Fine-Grained Audit (FGA) policy enabled: DTS cannot read the `ORA_ROWSCN` pseudocolumn and the task fails. Disable the FGA policy on affected tables, or exclude them from synchronization.
Do not update `LONGTEXT` fields while synchronization is running.
Do not run DDL operations during schema synchronization or initial full data synchronization.
Do not use Oracle Data Pump to write to the source database during incremental data synchronization — data loss may occur.
External tables cannot be synchronized.
MaxCompute does not support primary key constraints. If network errors occur during synchronization, DTS may write duplicate rows to MaxCompute.
Write to the destination MaxCompute project only through DTS to prevent data inconsistency.
If DTS needs to recover a failed instance, the process may take up to 8 hours and may involve restarting the instance or adjusting DTS instance parameters (database parameters are not modified). For the parameters that may be changed, see Modify instance parameters.
Performance considerations
Synchronize data during off-peak hours. Full data synchronization consumes read and write resources on both the source and destination databases.
After full data synchronization, the destination tablespace may be larger than the source due to fragmentation from concurrent INSERT operations.
Synchronization latency is calculated from the timestamp of the latest synchronized record in the destination versus the current timestamp in the source. If no DML operations run on the source for an extended period, the latency reading may be inaccurate. Run a DML operation on the source to refresh it, or enable a heartbeat table if you synchronize an entire database.
Required database account permissions
| Database | Required permissions | Reference |
|---|---|---|
| Self-managed Oracle database | Fine-grained permissions | Prepare a database account, CREATE USER, GRANT |
To synchronize incremental data, archive logging and supplemental logging must be enabled on the Oracle database. See Configure an Oracle database.
Create a data synchronization task
Step 1: Open the Data Synchronization page
Use one of the following consoles to start:
DTS console
Log on to the DTS console.
In the left-side navigation pane, click Data Synchronization.
In the upper-left corner, select the region where the data synchronization task will run.
DMS console
The exact navigation path may vary based on the DMS console mode. See Simple mode and Customize the layout and style of the DMS console.
Log on to the DMS console.
In the top navigation bar, move the pointer over Data + AI and choose DTS (DTS) > Data Synchronization.
From the drop-down list next to Data Synchronization Tasks, select the region where the task will run.
Step 2: Configure source and destination databases
Click Create Task to open the task configuration page. Configure the parameters as follows:
| Section | Parameter | Description |
|---|---|---|
| N/A | Task Name | A name for the DTS task. DTS generates a name automatically. Specify a descriptive name for easy identification. Task names do not need to be unique. |
| Source Database | Select Existing Connection | Select a registered database instance from the drop-down list to auto-populate the connection parameters. If the instance is not registered, configure the parameters manually. |
| Database Type | Select Oracle. | |
| Access Method | Select the method that matches where the Oracle database is deployed. This example uses Self-managed Database on ECS. For other access methods, see Preparation overview. | |
| Instance Region | The region where the source Oracle database resides. | |
| ECS Instance ID | The ID of the ECS instance running the source Oracle database. | |
| Port Number | The Oracle service port. Default: 1521. | |
| Oracle Type | Non-RAC Instance: requires the SID (system ID). RAC or PDB Instance: requires the Service Name. This example uses RAC or PDB Instance with a Service Name. | |
| Database Account | The Oracle account. For required permissions, see Required database account permissions. | |
| Database Password | The password for the Oracle account. | |
| Destination Database | Select Existing Connection | Select a registered database instance from the drop-down list to auto-populate the connection parameters. If the instance is not registered, configure the parameters manually. |
| Database Type | Select MaxCompute. | |
| Access Method | Select Alibaba Cloud Instance. | |
| Instance Region | The region where the destination MaxCompute project resides. | |
| Project | The name of the destination MaxCompute project. | |
| AccessKey ID of Alibaba Cloud Account | The AccessKey ID from the AccessKey pair you prepared in the Prerequisites section. | |
| AccessKey Secret of Alibaba Cloud Account | The AccessKey secret from the AccessKey pair you prepared in the Prerequisites section. |
Step 3: Test connectivity
At the bottom of the page, click Test Connectivity and Proceed. In the CIDR Blocks of DTS Servers dialog box, click Test Connectivity.
The CIDR blocks of DTS servers must be added to the security settings of both the source and destination databases. This can be done automatically or manually. See Add DTS server IP addresses to a whitelist.
Click OK to grant permissions to your MaxCompute account, then click Test Connectivity and Proceed.
Step 4: Configure objects to synchronize
In the Configure Objects step, set the following parameters:
| Parameter | Description |
|---|---|
| Synchronization Types | Incremental Data Synchronization is selected by default. Also select Schema Synchronization and Full Data Synchronization to run all three phases. |
| Naming Rules of Additional Columns | DTS adds extra columns to destination tables. If an additional column name conflicts with an existing column, the task fails and data loss can occur. Select New Rule or Previous Rule based on your requirements. Check for naming conflicts before setting this parameter. See Naming rules for additional columns. |
| Partition Definition of Incremental Data Table | Select partition names based on your requirements. See Partition. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors (default): checks for identical table names in source and destination. The precheck fails if conflicts exist. To resolve conflicts, use the object name mapping feature to rename destination tables. See Map object names. Ignore Errors and Proceed: skips the conflict check. During full synchronization, existing destination records are kept; during incremental synchronization, existing records are overwritten. If schemas differ, initialization may fail or only some columns may be synchronized. |
| Capitalization of Object Names in Destination Instance | Controls the case of database, table, and column names in the destination. Default: DTS default policy. See Specify the capitalization of object names in the destination instance. |
| Source Objects | Select one or more tables and click the rightwards arrow icon to add them to the Selected Objects section. |
| Selected Objects | To rename a single object: right-click it and specify a new name. See Map the name of a single object. To rename multiple objects: click Batch Edit. See Map multiple object names at a time. To filter rows: right-click the table and set filter conditions. See Specify filter conditions. Renaming an object may cause other objects that depend on it to fail synchronization. |
Step 5: Configure advanced settings
Click Next: Advanced Settings and configure the following:
| Parameter | Description |
|---|---|
| Dedicated Cluster for Task Scheduling | By default, DTS schedules tasks to a shared cluster. Purchase a dedicated cluster for improved stability. See What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | How long DTS retries after a connection failure. Valid values: 10–1440 minutes. Default: 720 minutes. Set this to at least 30 minutes. If multiple tasks share the same source or destination database, the shortest retry time applies. DTS charges for the instance during retries. |
| Retry Time for Other Issues | How long DTS retries after a DDL or DML failure. Valid values: 1–1440 minutes. Default: 10 minutes. Set this to a value greater than 10 minutes, and to a value smaller than Retry Time for Failed Connections. |
| Enable Throttling for Full Data Synchronization | Throttle the full synchronization to reduce load on the database servers. Configure QPS (queries per second) to the source database, RPS of Full Data Migration, and Data migration speed for full migration (MB/s). Available only when Full Data Synchronization is selected. |
| Enable Throttling for Incremental Data Synchronization | Throttle the incremental synchronization by configuring RPS of Incremental Data Synchronization and Data synchronization speed for incremental synchronization (MB/s). |
| Actual Write Code | The encoding format for data written to the destination. Select based on your requirements. |
| Environment Tag | An optional tag to identify the DTS instance. |
| Configure ETL | Enables the extract, transform, and load (ETL) feature. Select Yes to enter data processing statements in the code editor. See Configure ETL in a data migration or data synchronization task. Select No to skip. See also What is ETL? |
| Monitoring and Alerting | Configure alerts for task failures or high synchronization latency. Select Yes and set the alert threshold and notification contacts. See Configure monitoring and alerting when you create a DTS task. |
Step 6: Run the precheck
Click Next: Save Task Settings and Precheck.
To preview the API parameters for this task configuration, hover over the Next: Save Task Settings and Precheck button and click Preview OpenAPI parameters.
DTS runs a precheck before the task can start. If the precheck fails:
Click View Details next to each failed item, resolve the issue, and click Precheck Again.
For alert items that can be ignored: click Confirm Alert Details, then Ignore in the dialog, then OK, then Precheck Again. Ignored alerts may result in data inconsistency.
Step 7: Purchase a data synchronization instance
Wait until Success Rate reaches 100%, then click Next: Purchase Instance.
On the buy page, configure the following:
| Section | Parameter | Description |
|---|---|---|
| New Instance Class | Billing Method | Subscription: pay upfront for a fixed term — more cost-effective for long-term use. Pay-as-you-go: billed hourly — suitable for short-term use. Release the instance when no longer needed to stop charges. |
| | Resource Group Settings | The resource group for the instance. Default: default resource group. See What is Resource Management? |
| | Instance Class | The synchronization throughput class. See Instance classes of data synchronization instances. |
| | Subscription Duration | Available for the Subscription billing method. Options: 1–9 months, 1 year, 2 years, 3 years, or 5 years. |

Read and accept Data Transmission Service (Pay-as-you-go) Service Terms.
Click Buy and Start, then click OK in the confirmation dialog.
The task appears in the task list. Monitor its progress from the list.
Incremental data table schema
Run `set odps.sql.allow.fullscan=true;` in MaxCompute to allow full table scans on the project.
DTS writes incremental change records to the _log table in MaxCompute. Each row in the incremental data table represents one change event and includes the following metadata fields in addition to the source table columns:
| Field | Description |
|---|---|
| `record_id` | Unique identifier for the log entry. Auto-increments for each new entry. For an UPDATE operation, DTS generates two entries (pre-update and post-update values) with the same `record_id`. |
| `operation_flag` | The operation type: I (INSERT), D (DELETE), or U (UPDATE). |
| `utc_timestamp` | The operation timestamp in UTC, taken from the redo log. |
| `before_flag` | Whether the row contains pre-update values: Y or N. |
| `after_flag` | Whether the row contains post-update values: Y or N. |
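These metadata fields make it possible to filter the change stream directly. The query below is a sketch: `customer_log` and the `id` column are illustrative names, while the metadata columns are the DTS fields described above.

```sql
-- Sketch: fetch only post-images (and deletes) for one key, in change order.
SET odps.sql.allow.fullscan=true;

SELECT record_id, operation_flag, utc_timestamp, id
FROM customer_log
WHERE id = 1001
  AND (after_flag = 'Y' OR operation_flag = 'D')
ORDER BY record_id;
```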