You can use Data Transmission Service (DTS) to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance or a self-managed Kafka cluster. The data migration feature allows you to extend message processing capabilities. This topic describes how to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance.

Prerequisites

  • The version of the self-managed Oracle database is 9i, 10g, 11g, 12c, 18c, or 19c.
  • Supplemental logging, including SUPPLEMENTAL_LOG_DATA_PK and SUPPLEMENTAL_LOG_DATA_UI, is enabled for the self-managed Oracle database. For more information, see Supplemental Logging.
  • The self-managed Oracle database is running in ARCHIVELOG mode. Archived log files are accessible, and a suitable retention period is set for archived log files. For more information, see Managing Archived Redo Log Files. Sample statements that check and enable supplemental logging and ARCHIVELOG mode are provided after this list.
  • The network environment is deployed for the source Oracle database. For more information, see Preparation overview.
  • The tables to be migrated from the self-managed Oracle database contain primary keys or UNIQUE NOT NULL indexes.
  • The version of the Message Queue for Apache Kafka instance is 0.10.1.0 to 2.x. The version of the self-managed Kafka cluster is 0.10.1.0 to 2.7.0.
  • The available storage space of the destination Message Queue for Apache Kafka instance is larger than the total size of the data in the self-managed Oracle database.
  • In the destination Message Queue for Apache Kafka instance, a topic is created to receive the migrated data. For more information, see Create a topic.
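
You can check and enable supplemental logging and ARCHIVELOG mode by using statements similar to the following sketch. The exact procedure depends on your database version and deployment, so treat these statements as a starting point rather than a complete runbook:

-- Check whether the database runs in ARCHIVELOG mode. The query should return ARCHIVELOG.
SELECT log_mode FROM v$database;
-- If the database runs in NOARCHIVELOG mode, enable ARCHIVELOG mode.
-- This requires restarting the database to the MOUNT state (run from SQL*Plus or an equivalent client).
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
-- Check the supplemental logging status. Both columns should return YES.
SELECT supplemental_log_data_pk, supplemental_log_data_ui FROM v$database;
-- Enable database-level supplemental logging for primary key and unique index columns.
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (PRIMARY KEY) COLUMNS;
ALTER DATABASE ADD SUPPLEMENTAL LOG DATA (UNIQUE) COLUMNS;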

Precautions

  • DTS uses the read and write resources of the source and destination databases during full data migration. This may increase the loads of the database servers. If the databases perform poorly, the instance specifications are low, or the data volume is large, the database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours, for example, when the CPU utilization of the source and destination databases is less than 30%.
  • If a data migration task fails, DTS automatically resumes the task. Before you switch your workloads to the destination database, stop or release the data migration task. Otherwise, the data from the source database will overwrite the data in the destination database after the task is resumed.
  • If the version of your Oracle database is 12c or later, the names of the tables to be migrated cannot exceed 30 bytes in length.
  • The tables to be migrated in the source database must have PRIMARY KEY or UNIQUE constraints, and all fields must be unique. Otherwise, the destination database may contain duplicate data records. You can identify tables that violate these two requirements by using the sample queries after this list.
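
The following queries show one way to find tables that violate the preceding two precautions. The queries assume a schema named TESTDB, which is a placeholder that you must replace with the name of your own schema:

-- Find tables that have no PRIMARY KEY or UNIQUE constraint.
SELECT table_name
FROM all_tables
WHERE owner = 'TESTDB'
AND table_name NOT IN (
    SELECT table_name
    FROM all_constraints
    WHERE owner = 'TESTDB'
    AND constraint_type IN ('P', 'U')
);
-- For Oracle 12c and later: find tables whose names exceed 30 bytes.
SELECT table_name
FROM all_tables
WHERE owner = 'TESTDB'
AND LENGTHB(table_name) > 30;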

Billing

  • Schema migration and full data migration: the task configuration is free of charge. An Internet traffic fee is charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Pricing.
  • Incremental data migration: the task configuration is charged. For more information, see Pricing.

Migration types

  • Schema migration: DTS migrates the schemas of the required objects from the source database to the destination database. In this scenario, DTS can migrate only the schemas of tables.
  • Full data migration: DTS migrates the historical data of the required objects from the source database to the destination database.
    Note During schema migration and full data migration, we recommend that you do not perform data definition language (DDL) operations on the required objects. Otherwise, the objects may fail to be migrated.
  • Incremental data migration: after full data migration is complete, DTS retrieves redo log files from the source Oracle database and migrates incremental data to the destination database in real time. Incremental data migration allows you to ensure service continuity when you migrate data from a self-managed Oracle database.

During incremental data migration, DTS can synchronize DML operations, such as INSERT, UPDATE, and DELETE, as well as DDL operations.

Data format

The data that is migrated to the Kafka cluster is stored in the Avro format. You must parse the migrated data based on the Avro schema. For more information, see DTS Avro schema.

Before you begin

Log on to the source Oracle database, create an account for data collection, and grant permissions to the account.

Note If you have created a database account and the account has the permissions that are listed in the following table, skip this step.
For a self-managed Oracle database, the account requires the permissions of the schema owner for schema migration and full data migration, and the DBA permission for incremental data migration.

For more information about how to create and authorize a database account, see the following topics:

Self-managed Oracle database: CREATE USER and GRANT

Notice If you need to migrate incremental data from an Oracle database but the DBA permission cannot be granted to the database account, you can grant fine-grained permissions to the account. The following list shows the privileges that the account requires. Sample GRANT statements for a database account named dtstest follow the list.
create session;
connect;
resource;
execute on sys.dbms_logmnr;
select on v_$logmnr_contents;
select on v_$log;
select on v_$logfile;
select on v_$archived_log;
select on v_$logmnr_logs;
select on v_$parameter;
select on v_$database;
select on all_objects;
select on all_tab_cols;
select on dba_registry;
select any table;
select any transaction;
select on v_$active_instances;
select on v_$instance;
select on sys.USER$;
select on SYS.OBJ$;
select on SYS.COL$;
select on SYS.IND$;
select on SYS.ICOL$;
select on SYS.CDEF$;
select on SYS.CCOL$;
select on SYS.TABPART$;
select on SYS.TABSUBPART$;
select on SYS.TABCOMPART$;
select on gv_$listener_network;
-- Grant permissions on the pluggable database (PDB) and container database (CDB).
-- Grant permissions on the PDB:
create session;
connect;
resource;
select on all_objects;
select on all_tab_cols;
select on dba_registry;
select any table;
select any transaction;
select on v_$log;
select on v_$logfile;
select on v_$archived_log;
select on v_$parameter;
select on v_$database;
select on v_$active_instances;
select on v_$instance;
select on V_$PDBS;
select on sys.USER$;
select on SYS.OBJ$;
select on SYS.COL$;
select on SYS.IND$;
select on SYS.ICOL$;
select on SYS.CDEF$;
select on SYS.CCOL$;
select on SYS.TABPART$;
select on SYS.TABSUBPART$;
select on SYS.TABCOMPART$;

-- Grant permissions on the CDB:
create session;
LOGMINING;
select on v_$logmnr_contents;

-- The following sample statements show you how to grant permissions to a database account named dtstest.
create user dtstest IDENTIFIED BY rdsdt_dtsacct;
grant create session to dtstest;
grant connect to dtstest;
grant resource to dtstest;
grant execute on sys.dbms_logmnr to dtstest;
grant select on v_$logmnr_contents to dtstest;
grant select on v_$log to dtstest;
grant select on v_$logfile to dtstest;
grant select on v_$archived_log to dtstest;
grant select on v_$logmnr_logs to dtstest;
grant select on v_$parameter to dtstest;
grant select on v_$database to dtstest;
grant select on all_objects to dtstest;
grant select on all_tab_cols to dtstest;
grant select on dba_registry to dtstest;
grant select any table to dtstest;
grant select any transaction to dtstest;
grant select on v_$active_instances to dtstest;
grant select on v_$instance to dtstest;
grant select on sys.USER$ to dtstest;
grant select on SYS.OBJ$ to dtstest;
grant select on SYS.COL$ to dtstest;
grant select on SYS.IND$ to dtstest;
grant select on SYS.ICOL$ to dtstest;
grant select on SYS.CDEF$ to dtstest;
grant select on SYS.CCOL$ to dtstest;
grant select on SYS.TABPART$ to dtstest;
grant select on SYS.TABSUBPART$ to dtstest;
grant select on SYS.TABCOMPART$ to dtstest;
grant select on gv_$listener_network to dtstest;
-- The following sample statements show you how to grant permissions on the PDB and CDB to a database account named dtstest.
-- Grant permissions on the PDB:
create user dtstest IDENTIFIED BY rdsdt_dtsacct;
grant create session to dtstest;
grant connect to dtstest;
grant resource to dtstest;
grant select on all_objects to dtstest;
grant select on all_tab_cols to dtstest;
grant select on dba_registry to dtstest;
grant select any table to dtstest;
grant select any transaction to dtstest;
-- v$log privileges
grant select on v_$log to dtstest;
-- v$logfile privileges
grant select on v_$logfile to dtstest;
-- v$archived_log privileges
grant select on v_$archived_log to dtstest;
-- v$parameter privileges
grant select on v_$parameter to dtstest;
-- v$database privileges
grant select on v_$database to dtstest;
-- v$active_instances privileges
grant select on v_$active_instances to dtstest;
-- v$instance privileges
grant select on v_$instance to dtstest;
-- V$PDBS privileges
grant select on V_$PDBS to dtstest;
grant select on sys.USER$ to dtstest;
grant select on SYS.OBJ$ to dtstest;
grant select on SYS.COL$ to dtstest;
grant select on SYS.IND$ to dtstest;
grant select on SYS.ICOL$ to dtstest;
grant select on SYS.CDEF$ to dtstest;
grant select on SYS.CCOL$ to dtstest;
grant select on SYS.TABPART$ to dtstest;
grant select on SYS.TABSUBPART$ to dtstest;
grant select on SYS.TABCOMPART$ to dtstest;

-- Grant permissions on the CDB:
create user dtstest IDENTIFIED BY rdsdt_dtsacct;
grant create session to dtstest;
grant LOGMINING TO dtstest;
-- v$logmnr_contents privileges
grant select on v_$logmnr_contents to dtstest;
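
After you grant the permissions, you can run queries similar to the following to verify that the dtstest account received them. These verification queries are a convenience check, not part of the required procedure:

-- List the system privileges, object privileges, and roles that are granted to dtstest.
SELECT privilege FROM dba_sys_privs WHERE grantee = 'DTSTEST';
SELECT owner, table_name, privilege FROM dba_tab_privs WHERE grantee = 'DTSTEST';
SELECT granted_role FROM dba_role_privs WHERE grantee = 'DTSTEST';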

Procedure

  1. Log on to the DTS console.
  2. In the left-side navigation pane, click Data Migration.
  3. At the top of the Migration Tasks page, select the region where the destination instance resides.
  4. In the upper-right corner of the page, click Create Migration Task.
  5. Configure the source and destination databases.
    Section Parameter Description
    N/A Task Name DTS automatically generates a task name. We recommend that you specify an informative name for easy identification. You do not need to use a unique task name.
    Source Database Instance Type Select an instance type based on the deployment of the source database. In this example, select User-Created Database with Public IP Address.
    Note If you select other instance types, you must deploy the network environment for the source database. For more information, see Preparation overview.
    Instance Region If the instance type is set to User-Created Database with Public IP Address, you do not need to specify the instance region.
    Note If a whitelist is configured for the self-managed Oracle database, you must add the CIDR blocks of DTS servers to the whitelist of the database. You can click Get IP Address Segment of DTS next to Instance Region to obtain the CIDR blocks of DTS servers.
    Database Type Select Oracle.
    Hostname or IP Address Enter the endpoint that is used to connect to the self-managed Oracle database. In this example, enter the public IP address.
    Port Number Enter the service port number of the self-managed Oracle database. The default port number is 1521.
    Note The service port of the self-managed Oracle database must be accessible over the Internet.
    Instance Type
    • Non-RAC Instance: If you select this option, you must specify the SID.
    • RAC Instance: If you select this option, you must specify the Service Name.
    Database Account Enter the account of the self-managed Oracle database. For more information about the permissions that are required for the account, see Before you begin.
    Database Password Enter the password of the database account.
    Note After you specify the source database parameters, click Test Connectivity next to Database Password to verify whether the specified parameters are valid. If the specified parameters are valid, the Passed message appears. If the Failed message appears, click Check next to Failed. Modify the source database parameters based on the check results.
    Destination Database Instance Type Select User-Created Database Connected over Express Connect, VPN Gateway, or Smart Access Gateway.
    Note You cannot select Message Queue for Apache Kafka as the instance type. However, you can use Message Queue for Apache Kafka as a self-managed Kafka database to configure data migration.
    Instance Region Select the region where the destination Message Queue for Apache Kafka instance resides.
    Peer VPC Select the ID of the virtual private cloud (VPC) to which the destination Message Queue for Apache Kafka instance belongs. To obtain the VPC ID, you can log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Basic Information section, you can view the VPC ID.
    Database Type Select Kafka.
    IP Address Enter an IP address that is included in the Default Endpoint parameter of the Message Queue for Apache Kafka instance.
    Note To obtain an IP address, you can log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Basic Information section, you can obtain an IP address from the Default Endpoint parameter.
    Port Number Enter the service port number of the Message Queue for Apache Kafka instance. The default port number is 9092.
    Database Account Enter the username that is used to log on to the Message Queue for Apache Kafka instance.
    Note If the instance type of the Message Queue for Apache Kafka instance is VPC Instance, you do not need to specify the database account or database password.
    Database Password Enter the password of the username.
    Topic Click Get Topic List, and select a topic name from the drop-down list.
    Topic That Stores DDL Information Click Get Topic List, and select a topic name from the drop-down list. The topic is used to store the DDL information. If you do not specify this parameter, the DDL information is stored in the topic that is specified by the Topic parameter.
    Kafka Version Select the version of the Message Queue for Apache Kafka instance.
    Encryption Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.
    Whether to Use Kafka Schema Registry Kafka Schema Registry provides a serving layer for your metadata. It provides a RESTful interface for storing and retrieving your Avro schemas.
    • No: Kafka Schema Registry is not used.
    • Yes: Kafka Schema Registry is used. In this case, you must enter the URL or IP address that is registered in Kafka Schema Registry for your Avro schemas.
  6. In the lower-right corner of the page, click Set Whitelist and Next.
  7. Select the migration types, the migration policy, and the objects to be migrated.
    Setting Description
    Select the migration types Select Schema Migration, Full Data Migration, and Incremental Data Migration.
    Notice If Incremental Data Migration is not selected, do not write data to the source database during full data migration. This ensures data consistency between the source and destination databases.
    Select the data format used in Kafka The data that is migrated to the Kafka cluster is stored in the Avro format. You must parse the migrated data based on the Avro schema. For more information, see DTS Avro schema.
    Select the policy for migrating data to Kafka partitions Select a migration policy based on your business requirements.
    Select the objects to be migrated Select one or more tables from the Available section and click the right arrow to move the tables to the Selected section.
    Note DTS maps the table names to the topic name that you select in Step 5. For information about how to change the topic name, see Object name mapping.
    Specify whether to rename object names You can use the object name mapping feature to change the names of the objects that are migrated to the destination instance. For more information, see Object name mapping.
    Specify the retry time for failed connections to the source or destination database By default, if DTS fails to connect to the source or destination database, DTS retries within the next 12 hours. You can specify the retry time based on your needs. If DTS reconnects to the source and destination databases within the specified time, DTS resumes the data migration task. Otherwise, the data migration task fails.
    Note When DTS retries a connection, you are charged for the DTS instance. We recommend that you specify the retry time based on your business needs. Release the DTS instance at your earliest opportunity after the source and destination instances are released.
  8. In the lower-right corner of the page, click Precheck.
    Notice
    • Before you can start the data migration task, a precheck is performed. You can start the data migration task only after the task passes the precheck.
    • If the task fails to pass the precheck, you can click the info icon next to each failed item to view details.
      • You can troubleshoot the issues based on the causes and run a precheck again.
      • If you do not need to troubleshoot the issues, you can ignore failed items and run a precheck again.
  9. After the data migration task passes the precheck, click Next.
  10. In the Confirm Settings dialog box, specify the Channel Specification parameter and select Data Transmission Service (Pay-As-You-Go) Service Terms.
  11. Click Buy and Start to start the data migration task.

Stop the migration task

Warning We recommend that you prepare a rollback solution to migrate incremental data from the destination database to the source database in real time. This allows you to minimize the negative impact of switching your workloads to the destination database. For more information, see Switch workloads to the destination database. If you do not need to switch your workloads, you can perform the following steps to stop the migration task.
  • Full data migration

    Do not manually stop a task during full data migration. Otherwise, the system may fail to migrate all data. Wait until the migration task automatically ends.

  • Incremental data migration

    The task does not automatically end during incremental data migration. You must manually stop the migration task.

    1. Wait until the task progress bar shows Incremental Data Migration and The migration task is not delayed. Then, stop writing data to the source database for a few minutes. In some cases, the progress bar shows the delay time of incremental data migration.
    2. After the status of incremental data migration changes to The migration task is not delayed, manually stop the migration task.

What to do next

The database accounts that are used for data migration have read and write permissions. After data migration is complete, delete the account of the self-managed Oracle database, and modify the permissions of the Resource Access Management (RAM) user that is used to access the destination Message Queue for Apache Kafka instance. For more information, see Grant permissions to RAM users.
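
For example, if you created the dtstest account by using the sample statements in this topic and no other workloads depend on it, you can drop the account by using a statement similar to the following:

-- Remove the migration account and its schema objects after the migration is complete.
DROP USER dtstest CASCADE;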