You can use Data Transmission Service (DTS) to migrate data from a self-managed Oracle database to a DataHub project for real-time streaming analytics and application development.
Prerequisites
Make sure that the following requirements are met:
-
DataHub is activated and a DataHub project is created to receive the migrated data. For more information, see Get started with DataHub and Manage projects
-
The self-managed Oracle database runs in ARCHIVELOG mode with accessible archived log files and an appropriate retention period. For more information, see Managing Archived Redo Log Files
-
Supplemental logging is enabled for the self-managed Oracle database, with
SUPPLEMENTAL_LOG_DATA_PKandSUPPLEMENTAL_LOG_DATA_UIset to Yes. For more information, see Supplemental Logging -
DTS capabilities and limits for Oracle data migration are understood. Use Advanced Database & Application Migration (ADAM) to evaluate the feasibility and costs of migration. For more information, see Prepare an Oracle database and Overview
For supported Oracle database and DataHub project versions, see Overview of data migration scenarios.
Limitations
Source database limits
Networking and connectivity
-
The server hosting the source database must have sufficient outbound bandwidth. Insufficient bandwidth reduces migration speed.
-
If the source database is an Oracle RAC database connected over Express Connect, specify a VIP when configuring the source database.
-
If the source database is an Oracle RAC database connected over Express Connect, VPN Gateway, Smart Access Gateway, Database Gateway, or Cloud Enterprise Network (CEN), use a single VIP rather than a Single Client Access Name (SCAN) IP address. After specifying the VIP, node failover is not supported for the Oracle RAC database.
Data type constraints
-
If a VARCHAR2 field in the source Oracle database contains an empty string (evaluated as null in Oracle) and the corresponding destination field has a NOT NULL constraint, the migration task fails.
Object requirements
-
Tables to be migrated must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, duplicate data records may appear in the destination database.
-
For Oracle 12c or later, table names cannot exceed 30 bytes in length.
-
When migrating tables with modifications in the destination database (such as renaming tables or columns), a single task supports up to 1,000 tables. For more than 1,000 tables, configure multiple tasks to migrate in batches, or migrate the entire database.
Incremental migration requirements
-
Redo logging and archive logging must be enabled.
-
Log retention requirements:
-
Incremental data migration only: Store redo logs and archive logs for more than 24 hours.
-
Full data migration and incremental data migration: Store redo logs and archive logs for at least seven days. After full data migration completes, reduce the retention period to more than 24 hours.
-
-
If DTS fails to obtain redo logs and archive logs, the task may fail. In extreme cases, data inconsistency or loss may occur. The Service Level Agreement (SLA) of DTS does not guarantee service reliability or performance if log retention requirements are not met.
Operations during migration
-
During schema migration and full data migration, do not perform DDL operations that change database or table schemas. Otherwise, the data migration task fails.
-
During full data migration only, do not write data to the source database. Otherwise, data inconsistency may occur between the source and destination databases. To maintain data consistency, select schema migration, full data migration, and incremental data migration as the migration types.
-
During data migration, do not update LONGTEXT fields. Otherwise, the data migration task fails.
Other limits
-
A single string in the destination DataHub project cannot exceed 2 MB in length.
-
Only tables can be selected as migration objects. DTS does not support the migration of triggers.
-
Migrate data during off-peak hours. During full data migration, DTS consumes read and write resources of both the source and destination databases, which may increase server loads.
-
During full data migration, concurrent INSERT operations cause fragmentation in the tables of the destination cluster. After full data migration completes, the size of used tablespace of the destination cluster is larger than that of the source database.
-
DTS attempts to resume failed migration tasks within the last seven days. Before switching workloads to the destination database, stop or release failed tasks. Alternatively, execute the
REVOKEstatement to revoke write permissions from the accounts that DTS uses to access the destination database. Otherwise, data in the source database overwrites data in the destination database after a failed task resumes. -
Data inconsistency between the source and destination databases may occur if data from other sources is written to the destination database during migration.
Billing
| Migration type | Task configuration fee | Data transfer fee |
|---|---|---|
| Schema migration | Free of charge. | Free of charge in this example. Note
Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview. |
| Incremental data migration | Charged. For more information, see Billing overview. |
Supported SQL operations for incremental migration
| Operation type | SQL statement |
|---|---|
| DML | INSERT, UPDATE, and DELETE |
| DDL | CREATE TABLE, ALTER TABLE, DROP TABLE, RENAME TABLE, and TRUNCATE TABLE; CREATE VIEW, ALTER VIEW, and DROP VIEW; CREATE PROCEDURE, ALTER PROCEDURE, and DROP PROCEDURE; CREATE FUNCTION, DROP FUNCTION, CREATE TRIGGER, and DROP TRIGGER; CREATE INDEX and DROP INDEX |
Required permissions for database accounts
| Database type | Schema migration | Incremental data migration | References |
|---|---|---|---|
| Self-managed Oracle database | Permissions of the schema owner | Fine-grained permissions | CREATE USER and GRANT |
To migrate incremental data from an Oracle database, enable archive logging and supplemental logging to obtain incremental data. For more information, see the Configure an Oracle database section of the "Prepare an Oracle database" topic.
Procedure
Step 1: Open the data migration tasks page
-
Log on to the Data Management (DMS) console.
-
In the top navigation bar, click DTS.
-
In the left-side navigation pane, choose DTS (DTS) > Data Migration.
Operations may vary based on the mode and layout of the DMS console. For more information, see Simple mode and Customize the layout and style of the DMS console.
The Data Migration Tasks page of the new DTS console is also available.
Step 2: Select a region and create a task
-
From the drop-down list next to Data Migration Tasks, select the region in which the data migration instance resides.
On the new DTS console, select the region in the upper-left corner.
-
Click Create Task.
Step 3: Configure the source and destination databases
On the Create Task page, configure the source and destination database connections.
After configuring the source and destination databases, read the Limits displayed in the upper part of the page. Otherwise, the task may fail or data inconsistency may occur.
Task settings
| Parameter | Description |
|---|---|
| Task Name | DTS automatically assigns a task name. Specify a descriptive name to help identify the task. A unique task name is not required. |
Source database settings
| Parameter | Description |
|---|---|
| Select an existing DMS database instance. (Optional. If you have not registered a DMS database instance, ignore this option and configure database settings in the section below.) | If an existing DMS database instance is selected, DTS automatically populates the database parameters. Otherwise, configure the parameters for the source database manually. |
| Database Type | Select Oracle. |
| Access Method | Select a value based on the deployment location of the source database. In this example, Self-managed Database on ECS is selected. Important
For other access methods, set up the required environment for the self-managed Oracle database. For more information, see Preparation overview. |
| Instance Region | The region where the source Oracle database resides. Select the correct region to ensure connectivity. |
| ECS Instance ID | The ID of the Elastic Compute Service (ECS) instance on which the source Oracle database is deployed. |
| Port Number | The service port number of the source Oracle database. Default value: 1521. |
| Oracle Type | The architecture of the source Oracle database. Select Non-RAC Instance and specify the SID parameter, or select RAC or PDB Instance and specify the Service Name parameter. In this example, RAC or PDB Instance is selected with Service Name specified. |
| Database Account | The account of the self-managed Oracle database. For required permissions, see Required permissions for database accounts. |
| Database Password | The password of the database account. |
Destination database settings
| Parameter | Description |
|---|---|
| Select an existing DMS database instance. (Optional. If you have not registered a DMS database instance, ignore this option and configure database settings in the section below.) | If an existing DMS database instance is selected, DTS automatically populates the database parameters. Otherwise, configure the parameters for the destination database manually. |
| Database Type | Select DataHub. |
| Access Method | Select Alibaba Cloud Instance. |
| Instance Region | The region in which the destination DataHub project resides. |
| Project | The name of the destination Project. |
Step 4: Test connectivity
-
In the lower part of the page, click Test Connectivity and Proceed.
-
If a whitelist is configured for the self-managed Oracle database, add the CIDR blocks of DTS servers to the whitelist, then click Test Connectivity and Proceed. DTS handles whitelist configuration differently depending on the database type:
-
Alibaba Cloud database instances (such as ApsaraDB RDS for MySQL or ApsaraDB for MongoDB): DTS automatically adds CIDR blocks of DTS servers to the IP address whitelist.
-
Self-managed databases on ECS: DTS automatically adds CIDR blocks to the ECS security group rules. Make sure the ECS instance can access the database. For multiple ECS instances, manually add CIDR blocks to each security group.
-
Self-managed databases in a data center or on third-party cloud: Manually add the CIDR blocks of DTS servers to the database IP address whitelist. For more information, see the CIDR blocks of DTS servers section of the Add the CIDR blocks of DTS servers topic.
-
Adding public CIDR blocks of DTS servers to a database whitelist or ECS security group introduces security risks. Before using DTS, acknowledge the potential risks and take preventive measures, including but not limited to: strengthening username and password security, limiting exposed ports, authenticating API calls, regularly checking whitelists or security group rules and removing unauthorized CIDR blocks, or connecting the database to DTS through Express Connect, VPN Gateway, or Smart Access Gateway.
Step 5: Configure migration objects and advanced settings
Migration objects
| Parameter | Description |
|---|---|
| Migration Types | Select Schema Migration and Incremental Data Migration. Full Data Migration cannot be selected. Note
If Incremental Data Migration is not selected, do not write data to the source database during migration to maintain data consistency. |
| Naming Rules of Additional Columns | When DTS migrates data to DataHub, it adds additional columns to the destination topic. If an additional column has the same name as an existing column in the destination topic, the migration task fails. Select New Rule or Previous Rule based on your requirements. Warning
Before configuring this parameter, check whether additional columns have name conflicts with existing columns in the destination topic. Name conflicts may cause task failure and data loss. For more information, see Modify the naming rules for additional columns. |
| Processing Mode of Conflicting Tables | Precheck and Report Errors: Checks whether the destination database contains tables with the same names as those in the source database. If identical table names exist, the precheck returns an error and the task cannot start. Note
If identical table names exist and the destination tables cannot be deleted or renamed, use the object name mapping feature to rename migrated tables. For more information, see Map object names. Ignore Errors and Proceed: Skips the precheck for identical table names. Warning
Selecting Ignore Errors and Proceed may cause data inconsistency. If schemas match, DTS skips data records with the same primary keys. If schemas differ, only specific columns are migrated or the task fails. |
| Capitalization of Object Names in Destination Instance | The capitalization of database names, table names, and column names in the destination instance. By default, DTS default policy is selected. For more information, see Specify the capitalization of object names in the destination instance. |
| Source Objects | Select one or more objects from the Source Objects section and click the Note
Only tables can be selected as migration objects. |
| Selected Objects | - To rename a single object, right-click it in the Selected Objects section. For more information, see Map the name of a single object. - To rename multiple objects at a time, click Batch Edit in the upper-right corner of the Selected Objects section. For more information, see Map multiple object names at a time. Note
Database name mapping is not supported. Renaming an object may cause dependent objects to fail during migration. Object names include table names and column names. - To filter data with WHERE conditions, right-click the table in the Selected Objects section and configure filter conditions in the dialog box. For more information, see Set filter conditions. - To configure the ShardKey parameter, right-click the table in the Selected Objects section and clear Synchronize All Tables in the dialog box. |
Advanced settings
Click Next: Advanced Settings to configure the following parameters.
| Parameter | Description |
|---|---|
| Select the dedicated cluster used to schedule the task | By default, DTS schedules the task to a shared cluster. You can optionally purchase a dedicated cluster to run the migration task. For more information, see What is a DTS dedicated cluster. |
| Retry Time for Failed Connections | The retry duration when a connection to the source or destination database fails. Valid values: 10 to 1440 minutes. Default value: 720. We recommend that you set this to a value greater than 30. If DTS reconnects within the specified duration, it resumes the task. Otherwise, the task fails. Note
If multiple tasks share the same source or destination database, the most recently configured retry duration takes effect. DTS charges for the instance during retries. Set the retry duration based on your requirements and release the DTS instance promptly after the source or destination instance is released. |
| The wait time before a retry when other issues occur in the source and destination databases. | The retry duration for other issues, such as failed DDL or DML operations. Valid values: 1 to 1440 minutes. Default value: 10. We recommend that you set this to a value greater than 10. If the operations succeed within the specified duration, DTS resumes the task. Otherwise, the task fails. Important
This value must be smaller than the Retry Time for Failed Connections value. |
| Enable Throttling for Incremental Data Migration | Whether to enable throttling for incremental data migration. Configure the RPS of Incremental Data Migration and Data migration speed for incremental migration (MB/s) parameters to reduce destination server loads. Note
Available only when Incremental Data Migration is selected for Migration Types. |
| Environment Tag | The environment tag for identifying the DTS instance. Select a tag based on your requirements. |
| Actual Write Code | The encoding format of data written to the destination database. |
| Configure ETL | Whether to enable the extract, transform, and load (ETL) feature. For more information, see What is ETL?. Yes: Enables ETL. Enter data processing statements in the code editor. For more information, see Configure ETL in a data migration or data synchronization task. No: Disables ETL. |
| Monitoring and Alerting | Whether to configure alerting for the migration task. If the task fails or migration latency exceeds the threshold, alert contacts are notified. No: Disables alerting. Yes: Enables alerting. Configure the alert threshold and notification settings. For more information, see Configure monitoring and alerting for a new DTS task. |
Step 6: Run precheck and start the task
-
Save the task settings and run a precheck. You can also preview the OpenAPI parameters before starting the precheck.
-
To view the API parameters for the DTS task, hover over Next: Save Task Settings and Precheck and click Preview OpenAPI parameters.
-
Click Next: Save Task Settings and Precheck in the lower part of the page.
- DTS performs a precheck before starting the migration task. The task can only start after the precheck passes. - If the precheck fails, click View Details next to each failed item. Analyze the causes and fix the issues, then run the precheck again. - If an alert is generated during the precheck: - If the alert item cannot be ignored, click View Details next to the failed item, fix the issues, and run the precheck again. - If the alert item can be ignored, click Confirm Alert Details. In the View Details dialog box, click Ignore, then click OK. Click Precheck Again to rerun the precheck. Ignoring alert items may cause data inconsistency and expose your business to potential risks.
-
-
Wait until the success rate reaches 100%, then click Next: Purchase Instance.
-
On the Purchase Instance page, configure the following parameters.
Parameter Description Resource Group Settings The resource group for the data migration instance. Default value: default resource group. For more information, see What is Resource Management?. Instance Class The instance class determines the migration speed. Select an instance class based on your scenario. For more information, see Specifications of data migration instances. -
Select the Data Transmission Service (Pay-as-you-go) Service Terms check box.
-
Click Buy and Start to start the data migration task. You can view the task progress in the task list.