You can use Data Transmission Service (DTS) to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance or a self-managed Kafka cluster. The data migration feature allows you to extend message processing capabilities. This topic describes how to migrate data from a self-managed Oracle database to a Message Queue for Apache Kafka instance.

Prerequisites

  • The version number of the self-managed Oracle database is 9i, 10g, 11g, 12c, 18c, or 19c.
  • Supplemental logging is enabled for the self-managed Oracle database, including SUPPLEMENTAL_LOG_DATA_PK and SUPPLEMENTAL_LOG_DATA_UI. For more information, see Supplemental Logging.
  • The self-managed Oracle database is running in ARCHIVELOG mode. Archived log files are accessible and a suitable retention period is set for archived log files. For more information, see Managing Archived Redo Log Files.
  • The network environment is deployed for the source self-managed Oracle database. For more information, see Preparation overview.
  • The tables to be migrated from the self-managed Oracle database contain primary keys or UNIQUE NOT NULL indexes.
  • The version number of the Message Queue for Apache Kafka instance is in the range from 0.10.1.0 to 2.x. The version number of the self-managed Kafka cluster is in the range from 0.10.1.0 to 2.7.0.
  • The available storage space of the destination Message Queue for Apache Kafka instance is larger than the total size of the data in the self-managed Oracle database.
  • In the destination Kafka instance, a topic is created to receive the migrated data. For more information, see Create a topic.
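
The version and logging prerequisites can be checked from a SQL session before you configure the task. The following queries are a minimal sketch, not part of the DTS procedure itself. They assume that the session account can read the V$ views (for example, a DBA account).

```sql
-- Report the database version banner.
SELECT banner FROM v$version;

-- LOG_MODE should be ARCHIVELOG. The SUPPLEMENTAL_LOG_DATA_* columns show
-- whether minimal, primary key, and unique index supplemental logging are
-- enabled at the database level.
SELECT log_mode,
       supplemental_log_data_min,
       supplemental_log_data_pk,
       supplemental_log_data_ui
FROM   v$database;
```

A value of YES in the primary key and unique index columns indicates that the corresponding database-level logging is on. The minimal column can also report IMPLICIT when it is enabled as a side effect of another option.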

Usage notes

  • DTS uses read and write resources of the source and destination databases during full data migration, which may increase the load on the database servers. If the database performance is poor, the specifications are low, or the data volume is large, database services may become unavailable. For example, DTS occupies a large amount of read and write resources in the following cases: a large number of slow SQL queries are performed on the source database, the tables have no primary keys, or a deadlock occurs in the destination database. Before you migrate data, evaluate the impact of data migration on the performance of the source and destination databases. We recommend that you migrate data during off-peak hours, for example, when the CPU utilization of the source and destination databases is less than 30%.
  • If a data migration task fails and stops, DTS automatically resumes the task. Before you switch your workloads to the destination database, stop or release the data migration task. Otherwise, the data from the source database overwrites the data in the destination database after the task is resumed.
  • If the self-managed Oracle database is deployed in a Real Application Cluster (RAC) architecture and is connected to DTS over an Alibaba Cloud virtual private cloud (VPC), you must connect the Single Client Access Name (SCAN) IP address of the Oracle RAC and the virtual IP address (VIP) of each node to the VPC and configure routes. The settings ensure that your DTS task can run as expected. For more information, see Connect an on-premises data center to DTS by using VPN Gateway.
    Important When you configure the source Oracle database in the DTS console, you can specify the SCAN IP address of the Oracle RAC as the database endpoint or IP address.
  • If the version of your Oracle database is 12c or later, the names of the tables to be migrated cannot exceed 30 bytes in length.
  • The tables to be migrated in the source database must have PRIMARY KEY or UNIQUE constraints and all fields must be unique. Otherwise, the destination database may contain duplicate data records.
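
The two table-level notes above (name length and key constraints) can be checked with dictionary queries before you select objects. This is a sketch under assumptions: APP_SCHEMA is a placeholder for the schema that owns the tables to be migrated, and the session account can see those tables in ALL_TABLES and ALL_CONSTRAINTS.

```sql
-- APP_SCHEMA is a placeholder; replace it with the owning schema.

-- Tables whose names are longer than 30 bytes.
SELECT owner, table_name, LENGTHB(table_name) AS name_bytes
FROM   all_tables
WHERE  owner = 'APP_SCHEMA'
AND    LENGTHB(table_name) > 30;

-- Tables that have neither a PRIMARY KEY ('P') nor a UNIQUE ('U') constraint.
SELECT t.owner, t.table_name
FROM   all_tables t
WHERE  t.owner = 'APP_SCHEMA'
AND    NOT EXISTS (
         SELECT 1
         FROM   all_constraints c
         WHERE  c.owner = t.owner
         AND    c.table_name = t.table_name
         AND    c.constraint_type IN ('P', 'U')
       );
```

The second query looks only at declared constraints. A table that has a unique index but no constraint still appears in the result and needs a manual check.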

Billing

Migration type | Task configuration fee | Internet traffic fee
Schema migration and full data migration | Free of charge. | Charged only when data is migrated from Alibaba Cloud over the Internet. For more information, see Billing overview.
Incremental data migration | Charged. For more information, see Billing overview. | Same as above.

Migration types

Migration type | Description
Schema migration | DTS migrates the schemas of the required objects from the source database to the destination database. In this scenario, DTS can migrate only the schemas of tables.
Full data migration | DTS migrates the historical data of required objects from the source database to the destination database.
Note During schema migration and full data migration, do not perform DDL operations on the objects to be migrated. Otherwise, the objects may fail to be migrated.
Incremental data migration | After full data migration is complete, DTS retrieves redo log files from the source Oracle database. Then, DTS migrates incremental data from the source Oracle database to the destination database in real time. Incremental data migration ensures service continuity when you migrate data between self-managed databases.

During incremental data migration, DTS can synchronize DML and DDL operations.
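
Because incremental data migration reads redo log files, it can help to confirm that recent archived logs are still available on the source database. The query below is a minimal sketch, assuming that the session account can read V$ARCHIVED_LOG. The 24-hour window is an illustrative choice, not a DTS requirement.

```sql
-- List archived redo logs generated in the last 24 hours.
-- DELETED = 'YES' means the file has been removed and can no longer be read.
SELECT thread#, sequence#, name, first_time, next_time, deleted
FROM   v$archived_log
WHERE  first_time > SYSDATE - 1
ORDER  BY thread#, sequence#;
```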

Before you begin

Log on to the self-managed Oracle database, create an account that you want to use to collect data, and grant permissions to the account.

Note If you have created a database account and the account has permissions that are listed in the following table, skip this step.
Database | Schema migration | Full data migration | Incremental data migration
Self-managed Oracle database | Permissions of the schema owner | Permissions of the schema owner | Database administrator (DBA)

For more information about how to create a database account and grant permissions to the database account, see the following topics:

Self-managed Oracle database: CREATE USER and GRANT

Enable logging and grant fine-grained permissions to an Oracle database account

Important If you need to migrate incremental data from an Oracle database but the database administrator (DBA) permissions cannot be granted to the database account, you can enable archive logging and supplemental logging, and grant fine-grained permissions to the account.

  1. Enable archive logging and supplemental logging.
    Type | Procedure
    Archive logging | Execute the following statements to enable archive logging:
    shutdown immediate;
    startup mount;
    alter database archivelog;
    alter database open;
    archive log list;
    Supplemental logging | Enable supplemental logging at the database or table level based on your business requirements.
    Note You can enable database-level supplemental logging to ensure the stability of Data Transmission Service (DTS) tasks. You can enable table-level supplemental logging to reduce the disk usage of the source Oracle database.
    • Enable database-level supplemental logging
      1. Execute the following statement to enable minimal supplemental logging:
        alter database add supplemental log data;
      2. Execute the following statement to enable primary key and unique key supplemental logging at the database level:
        alter database add supplemental log data (primary key,unique index) columns;
    • Enable table-level supplemental logging
      1. Execute the following statement to enable minimal supplemental logging:
        alter database add supplemental log data;
      2. Enable table-level supplemental logging by using one of the following methods:
        • Enable primary key supplemental logging at the table level
          alter table table_name add supplemental log data (primary key) columns;
        • Enable table-level supplemental logging for all columns
          alter table table_name add supplemental log data (all) columns;
    Force logging | Execute the following statement to enable force logging:
    alter database force logging;
  2. Grant fine-grained permissions to an Oracle database account.
    -- Create a database account named rdsdt_dtsacct and grant permissions to the account.
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create session to rdsdt_dtsacct;
    grant connect to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select on V_$LOGMNR_LOGS to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    -- system tables
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;
    
    -- Switch to the pluggable database (PDB). Create a database account named rdsdt_dtsacct and grant permissions to the account.
    ALTER SESSION SET container = ORCLPDB1;
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create  session to rdsdt_dtsacct;
    grant connect  to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    -- V$PDBS privileges
    grant select on V_$PDBS to rdsdt_dtsacct;
    grant select on v$database to rdsdt_dtsacct;
    grant select on dba_objects to rdsdt_dtsacct;
    grant select on DBA_TAB_COMMENTS to rdsdt_dtsacct;
    grant select on dba_tab_cols to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;
    
    -- Switch to CDB$ROOT, the root container of the container database (CDB). Create a database account and grant permissions to the account.
    ALTER SESSION SET container = CDB$ROOT;
    -- Create a database account named rdsdt_dtsacct and grant permissions to the account. Before you create the account, you must change a default parameter of the Oracle database for the session.
    alter session set "_ORACLE_SCRIPT"=true;
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create session to rdsdt_dtsacct;
    grant connect to rdsdt_dtsacct;
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant LOGMINING TO rdsdt_dtsacct;
    grant EXECUTE_CATALOG_ROLE to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    
    -- Create a database account named rdsdt_dtsacct and grant permissions to the account.
    create user rdsdt_dtsacct IDENTIFIED BY rdsdt_dtsacct;
    grant create  session to rdsdt_dtsacct;
    grant connect  to rdsdt_dtsacct;
    grant resource to rdsdt_dtsacct;
    grant select on V_$LOGMNR_LOGS to rdsdt_dtsacct;
    grant select on  all_objects to rdsdt_dtsacct;
    grant select on  all_tab_cols to rdsdt_dtsacct;
    grant select on  dba_registry to rdsdt_dtsacct;
    grant select any table to rdsdt_dtsacct;
    grant select any transaction to rdsdt_dtsacct;
    grant select on v$database to rdsdt_dtsacct;
    grant select on dba_objects to rdsdt_dtsacct;
    grant select on DBA_TAB_COMMENTS to rdsdt_dtsacct;
    grant select on dba_tab_cols to rdsdt_dtsacct;
    -- v$log privileges
    grant select on v_$log to rdsdt_dtsacct;
    -- v$logfile privileges
    grant select on v_$logfile to rdsdt_dtsacct;
    -- v$archived_log privileges
    grant select on v_$archived_log to rdsdt_dtsacct;
    -- v$parameter privileges
    grant select on v_$parameter to rdsdt_dtsacct;
    -- v$database privileges
    grant select on v_$database to rdsdt_dtsacct;
    -- v$active_instances privileges
    grant select on v_$active_instances to rdsdt_dtsacct;
    -- v$instance privileges
    grant select on v_$instance to rdsdt_dtsacct;
    -- v$logmnr_contents privileges
    grant select on v_$logmnr_contents to rdsdt_dtsacct;
    grant select on sys.USER$ to rdsdt_dtsacct;
    grant select on SYS.OBJ$ to rdsdt_dtsacct;
    grant select on SYS.COL$ to rdsdt_dtsacct;
    grant select on SYS.IND$ to rdsdt_dtsacct;
    grant select on SYS.ICOL$ to rdsdt_dtsacct;
    grant select on SYS.CDEF$ to rdsdt_dtsacct;
    grant select on SYS.CCOL$ to rdsdt_dtsacct;
    grant select on SYS.TABPART$ to rdsdt_dtsacct;
    grant select on SYS.TABSUBPART$ to rdsdt_dtsacct;
    grant select on SYS.TABCOMPART$ to rdsdt_dtsacct;
    grant LOGMINING TO rdsdt_dtsacct;
    grant EXECUTE_CATALOG_ROLE to rdsdt_dtsacct;
    grant execute on sys.dbms_logmnr to rdsdt_dtsacct;
    grant select_catalog_role TO rdsdt_dtsacct;
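
After the two steps above, the result can be checked from the data dictionary. The following queries are a sketch under assumptions: the session account has DBA-level read access to the DBA_* views, APP_SCHEMA is a placeholder for the schema that owns the migrated tables, and the account name is the rdsdt_dtsacct example used above (stored in uppercase in the dictionary).

```sql
-- Database-level logging state after step 1.
SELECT log_mode, force_logging FROM v$database;

-- Table-level supplemental log groups, if table-level logging was used.
SELECT owner, table_name, log_group_name, log_group_type
FROM   dba_log_groups
WHERE  owner = 'APP_SCHEMA';

-- System privileges, roles, and object privileges held by the account.
SELECT privilege FROM dba_sys_privs WHERE grantee = 'RDSDT_DTSACCT';
SELECT granted_role FROM dba_role_privs WHERE grantee = 'RDSDT_DTSACCT';
SELECT owner, table_name, privilege
FROM   dba_tab_privs
WHERE  grantee = 'RDSDT_DTSACCT'
ORDER  BY owner, table_name;
```

On a multitenant database, the DBA_* views report the container of the current session, so the privilege queries would need to be run in each container in which the account was created.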

Procedure

  1. Log on to the DTS console.
    Note If you are redirected to the Data Management (DMS) console, you can click the old icon in the lower-right corner to go to the previous version of the DTS console.
  2. In the left-side navigation pane, click Data Migration.
  3. In the upper part of the Migration Tasks page, select the region in which the destination instance resides.
  4. In the upper-right corner of the page, click Create Migration Task.
  5. Configure the source and destination databases.
    Section | Parameter | Description
    N/A | Task Name | The task name that DTS automatically generates. We recommend that you specify a name that indicates your business requirements for easy identification. You do not need to use a unique name.
    Source Database | Instance Type | The access method of the source database. In this example, User-Created Database with Public IP Address is selected.
    Note If the source self-managed database is of another type, you must set up the environment that is required for the self-managed database. For more information, see Preparation overview.
    Source Database | Instance Region | If you select User-Created Database with Public IP Address as the instance type, you do not need to specify the Instance Region parameter.
    Note If an IP address whitelist is configured for the self-managed Oracle database, you must add the CIDR blocks of DTS servers to the IP address whitelist of the database. You can click Get IP Address Segment of DTS next to Instance Region to obtain the CIDR blocks of DTS servers.
    Source Database | Database Type | The type of the source database. Select Oracle.
    Source Database | Hostname or IP Address | The endpoint that is used to connect to the self-managed Oracle database. In this example, the public IP address of the database is used.
    Source Database | Port Number | The service port number of the self-managed Oracle database. Default value: 1521.
    Note The service port of the self-managed Oracle database must be accessible over the Internet.
    Source Database | Instance Type | The architecture type of the self-managed Oracle database.
    • If you select Non-RAC Instance, you must specify the SID parameter.
    • If you select RAC or PDB Instance, you must specify the Service Name parameter.
    Source Database | Database Account | The account of the self-managed Oracle database. For information about the permissions that are required for the account, see Before you begin.
    Source Database | Database Password | The password of the account of the self-managed Oracle database.
    Note After you specify the information about the self-managed Oracle database, you can click Test Connectivity next to Database Password to check whether the information is valid. If the information is valid, the Passed message appears. If the Failed message appears, click Check next to Failed. Then, modify the information based on the check results.
    Destination Database | Instance Type | The access method of the destination database. Select User-Created Database Connected Over Express Connect, VPN Gateway, or Smart Access Gateway.
    Note You cannot select Message Queue for Apache Kafka for the Instance Type parameter. Instead, configure the Message Queue for Apache Kafka instance as a self-managed Kafka cluster for data migration.
    Destination Database | Instance Region | The region in which the destination Message Queue for Apache Kafka instance resides.
    Destination Database | Peer VPC | The ID of the virtual private cloud (VPC) to which the destination Message Queue for Apache Kafka instance belongs. To obtain the VPC ID, perform the following operations: Log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Configuration Information section, view the VPC ID.
    Destination Database | Database Type | The type of the destination database. Select Kafka.
    Destination Database | IP Address | The IP address that is included in the Default Endpoint parameter of the Message Queue for Apache Kafka instance.
    Note To obtain an IP address, perform the following operations: Log on to the Message Queue for Apache Kafka console and go to the Instance Details page of the Message Queue for Apache Kafka instance. In the Endpoint Information section of the Instance Information tab, view the IP address included in the Default Endpoint parameter.
    Destination Database | Port Number | The service port number of the Message Queue for Apache Kafka instance. Default value: 9092.
    Destination Database | Database Account | The database account that is used to log on to the Message Queue for Apache Kafka instance.
    Note If the Message Queue for Apache Kafka instance is of the VPC Instance type, you do not need to specify the Database Account and Database Password parameters.
    Destination Database | Database Password | The password of the database account that is used to log on to the Message Queue for Apache Kafka instance.
    Destination Database | Topic | Click Get Topic list next to Topic and select a topic from the drop-down list.
    Destination Database | Topic for storing DDL | Click Get Topic list next to Topic for storing DDL, and select a topic from the drop-down list. The topic is used to store the DDL information. If you do not specify this parameter, the DDL information is stored in the topic that is specified by the Topic parameter.
    Destination Database | Kafka Version | The version of the Message Queue for Apache Kafka instance.
    Destination Database | Encryption | Specifies whether to encrypt the connection to the destination cluster. Select Non-encrypted or SCRAM-SHA-256 based on your business and security requirements.
    Destination Database | Whether to Use Kafka schema registry | Kafka Schema Registry provides a serving layer for your metadata. It provides a RESTful API to store and retrieve your Avro schemas.
    • No: does not use Kafka Schema Registry.
    • Yes: uses Kafka Schema Registry. In this case, you must enter the URL or IP address that is registered in Kafka Schema Registry for your Avro schemas.
  6. In the lower-right corner of the page, click Set Whitelist and Next.
    Warning If the CIDR blocks of DTS servers are automatically or manually added to the whitelist of the database or instance, or to the ECS security group rules, security risks may arise. Therefore, before you use DTS to migrate data, you must understand and acknowledge the potential risks and take preventive measures, including but not limited to the following measures: enhance the security of your username and password, limit the ports that are exposed, authenticate API calls, regularly check the whitelist or ECS security group rules and forbid unauthorized CIDR blocks, or connect the database to DTS by using Express Connect, VPN Gateway, or Smart Access Gateway.
  7. Select the migration types, the migration policy, and the objects to be migrated.
    Parameter or setting | Description
    Migration Types | Select Schema Migration, Full Data Migration, and Incremental Data Migration.
    Important If Incremental Data Migration is not selected, we recommend that you do not write data to the source database during full data migration. This ensures data consistency between the source and destination databases.
    Select the data format used in Kafka | The data that is migrated to the Kafka cluster is stored in the Avro format. You must parse the migrated data based on the Avro schema. For more information, see DTS Avro schema.
    Select the policy for migrating data to Kafka partitions | Select a migration policy based on your business requirements. For more information, see Specify the policy for migrating data to Kafka partitions.
    Select the objects to be migrated | Select one or more tables from the Available section and click the Rightwards arrow icon to add the tables to the Selected section.
    Note DTS maps the table names to the name of the topic that you select in Step 5. For more information about how to change the topic name, see Object name mapping.
    Specify whether to rename objects | You can use the object name mapping feature to rename the objects that are migrated to the destination instance. For more information, see Object name mapping.
    Specify the retry time range for failed connections to the source or destination database | By default, if DTS fails to connect to the source or destination database, DTS retries within the following 12 hours. You can specify the retry time range based on your business requirements. If DTS is reconnected to the source or destination database within the specified retry time range, DTS resumes the data migration task. Otherwise, the data migration task fails.
    Note Within the retry time range in which DTS attempts to reconnect to the source and destination databases, you are charged for using the DTS instance. We recommend that you specify the retry time range based on your business requirements. You can also release the DTS instance at the earliest opportunity after the source and destination instances are released.
  8. In the lower-right corner of the page, click Precheck.
    Important
    • Before you can start the data migration task, a precheck is performed. You can start the data migration task only after the task passes the precheck.
    • If the task fails to pass the precheck, you can click the Info icon next to each failed item to view details.
      • After you troubleshoot the issues based on the causes, you can run a precheck again.
      • If you do not need to troubleshoot the issues, you can ignore failed items and run a precheck again.
  9. After the task passes the precheck, click Next.
  10. In the Confirm Settings dialog box, specify the Channel Specification parameter and select Data Transmission Service (Pay-As-You-Go) Service Terms.
  11. Click Buy and Start to start the data migration task.
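
Step 5 asks for either the SID or the Service Name of the source database. If you are unsure which values apply, the following queries are one way to look them up. They are a sketch that assumes that the session account can read the V$ views. V$PDBS exists only on Oracle 12c and later.

```sql
-- Instance name; on a single-instance database this normally matches the SID.
SELECT instance_name FROM v$instance;

-- Service names registered with the running instance.
SELECT name FROM v$services;

-- On a multitenant database, list the pluggable databases and their state.
SELECT name, open_mode FROM v$pdbs;
```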

Stop the data migration task

Warning We recommend that you prepare a rollback solution to migrate incremental data from the destination database to the source database in real time. This allows you to minimize the negative impact of switching your workloads to the destination database. For more information, see Switch workloads to the destination database. If you do not need to switch your workloads, you can perform the following steps to stop the data migration task.
  • Full data migration

    Do not manually stop a task during full data migration. Otherwise, the system may fail to migrate all data. Wait until the migration task automatically ends.

  • Incremental data migration

    The task does not automatically end during incremental data migration. You must manually stop the migration task.

    1. Wait until the task progress bar shows Incremental Data Migration and The migration task is not delayed. Then, stop writing data to the source database for a few minutes. In some cases, the progress bar shows the delay time of incremental data migration.
    2. After the status of incremental data migration changes to The migration task is not delayed, manually stop the migration task.
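
While writes to the source database are paused, a quick look at open transactions can serve as a sanity check. The query below is only a sketch. It assumes that the session account can read V$TRANSACTION, and on an Oracle RAC database GV$TRANSACTION would be the cluster-wide equivalent.

```sql
-- Count transactions that are currently open on this instance.
SELECT COUNT(*) AS open_transactions FROM v$transaction;
```

A result of zero shows only that no transaction is open at that moment. It does not prove that applications have stopped writing, so treat it as a supplement to the delay status shown in the DTS console.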

What to do next

The database accounts that are used for data migration have read and write permissions. After data migration is complete, you must delete the account of the self-managed Oracle database. You must also modify the permissions of the RAM user in the destination Kafka instance. For more information, see Grant permissions to RAM users.
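
As a minimal sketch of the cleanup on the Oracle side, assuming the example account name rdsdt_dtsacct from Before you begin and a session that has the DROP USER privilege:

```sql
-- Remove the migration account after the DTS task has been stopped or released.
-- CASCADE also drops any objects that the account owns.
DROP USER rdsdt_dtsacct CASCADE;
```

The statement fails while the account still has an open session, so stop or release the DTS task first. If the account was created in more than one container, the statement would need to be run in each of them.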