All Products
Search
Document Center

ApsaraDB for SelectDB:Migrate data from a MySQL data source

Last Updated:Mar 28, 2026

ApsaraDB for SelectDB supports four methods to migrate data from a MySQL data source: Data Transmission Service (DTS), Flink change data capture (CDC), DataWorks, and catalog. The MySQL data source can be a self-managed MySQL database, an ApsaraDB RDS for MySQL instance, or a PolarDB for MySQL cluster. Select a method based on whether you need a one-time migration or continuous synchronization.

Choose a migration method

All four methods support historical data migration. The key differences are in continuous synchronization and schema handling:

MethodHistorical data migrationIncremental data synchronizationSchema migrationDatabase migrationDDL synchronizationData verification
DTSYesYesYesYesYesYes
DataWorksYesYesYesYesYesNo
Flink CDCYesYesYesYesYesNo
CatalogYesNoNoNoNoNo

Decision guide:

  • One-time data migration only: Use catalog. It uses federated queries via INSERT INTO SELECT — no pipeline to configure or maintain.

  • Continuous synchronization with data verification: Use DTS. It is the only method that supports end-to-end data verification.

  • Continuous synchronization with custom pipeline control: Use Flink CDC. It gives you fine-grained control over parallelism and filtering.

  • Continuous synchronization through a visual interface: Use DataWorks.

Limitations

  • DataWorks does not support writing fields of the BITMAP, HyperLogLog (HLL), or QUANTILE_STATE data type.

  • The catalog method supports only historical data migration. It does not support schema migration, DDL synchronization, or incremental synchronization.

  • When you synchronize data from MySQL to SelectDB using DTS without selecting schema migration, create tables that use the Unique Key model in the destination instance in advance.

  • When synchronizing tables using DTS, editing table objects (such as renaming) is limited to 1,000 tables per task. For more than 1,000 tables, configure multiple tasks or synchronize at the database level.

Prerequisites

Before you begin, make sure that:

Migrate data using DTS

DTS supports historical data migration, incremental data synchronization, schema migration, DDL synchronization, and data verification. The following steps use an ApsaraDB RDS for MySQL instance as the source. For the complete reference, see Use DTS to import data.

Create a data synchronization task

  1. Log on to the ApsaraDB for SelectDB console.

  2. In the top navigation bar, select the region where your instance resides.

  3. In the left-side navigation pane, click Instances. On the Instances page, click the instance ID to go to the Instance Details page.

  4. In the left-side navigation pane of the Instance Details page, click Data Pipeline. On the Data Synchronization tab, click Create Data Synchronization Task.

    Note

    In DTS, data synchronization covers both existing data migration and real-time incremental synchronization. Data migration in DTS is for existing historical data only.

  5. Configure the source and destination databases.

    SectionParameterDescription
    Task informationTask NameThe name of the DTS task. DTS generates a name automatically. Specify a descriptive name to identify the task easily. The name does not need to be unique.
    Source DatabaseDatabase TypeSelect MySQL.
    Access MethodSelect Alibaba Cloud Instance.
    Instance RegionThe region where the source ApsaraDB RDS for MySQL instance resides.
    RDS Instance IDThe ID of the source ApsaraDB RDS for MySQL instance. For example, rm-2z3m****.
    Database AccountThe database account for the source instance. For required permissions, see the "Permissions required for database accounts" section in Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance.
    Database PasswordThe password for the database account.
    EncryptionSelect Non-encrypted or SSL-encrypted. If you select SSL encryption, enable SSL on the RDS instance first. See Use a cloud certificate to enable SSL encryption.
    Destination DatabaseDatabase TypeSelect SelectDB.
    Access MethodSelect Alibaba Cloud Instance.
    Instance RegionThe region where the destination ApsaraDB for SelectDB instance resides.
    Instance IDThe ID of the destination ApsaraDB for SelectDB instance.
    Database AccountThe database account for the destination instance. For required permissions, see the "Permissions required for database accounts" section in Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance.
    Database PasswordThe password for the database account.
  6. Click Test Connectivity and Proceed.

  7. Configure the objects to synchronize and advanced settings.

    ParameterDescription
    Synchronization TypesBy default, Incremental Data Synchronization is selected. Also select Schema Synchronization and Full Data Synchronization. After the precheck, DTS migrates the historical data of selected objects, which serves as the baseline for incremental synchronization.
    Important

    If you do not select Schema Synchronization, create tables using the Unique Key model in the destination SelectDB instance before starting. See Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance and Data models.

    Source ObjectsSelect objects from the Source Objects section and click the arrow icon to move them to Selected Objects. You can select columns, tables, or databases.
    Important

    (1) If you select databases, DTS does not synchronize new tables created after the task starts. Select tables instead if you need to capture new tables. To change objects after the task starts, choose More > Reselect Objects in the task list. (2) If you select tables and need to rename them, the task supports up to 1,000 tables. For more than 1,000 tables, configure multiple tasks or synchronize at the database level.

    Selected ObjectsTo rename a single object in the destination, right-click it in Selected Objects. See Map the name of a single object. To rename multiple objects at once, click Batch Edit. See Map multiple object names at a time.
    Note

    (1) To filter which SQL operations to synchronize for a specific object, right-click the object and select the operations. See Synchronize data from an ApsaraDB RDS for MySQL instance to an ApsaraDB for SelectDB instance. (2) To filter rows by condition, right-click a table and specify a WHERE clause. See Specify filter conditions. (3) Renaming an object may cause dependent objects to fail synchronization.

  8. (Optional) Click Next: Configure Database and Table Fields. In the dialog box, configure the Primary Key Column, Distribution Key, and Engine parameters for tables in the destination SelectDB instance.

    Note

    (1) This step is available only when Schema Synchronization is selected. To edit parameters, set Definition Status to All. (2) Primary Key Column accepts multiple columns. One or more primary key columns can also be used as the distribution key. Only Unique is available for Engine.

  9. Save the task settings and run a precheck. Wait until the success rate reaches 100%, then click Next: Purchase Instance.

  10. On the Purchase Instance page, configure the billing method and instance class. Read and agree to the Data Transmission Service (Pay-as-you-go) Service Terms, then click Buy and Start. Monitor task progress in the task list.

Migrate data using Flink CDC

Flink CDC supports historical data migration, incremental synchronization, schema migration, and DDL synchronization. The following steps use Flink CDC to synchronize data from an upstream MySQL database to an ApsaraDB for SelectDB instance. For the complete reference, see Use Flink to import data.

Set up the Flink environment

The following steps deploy a Flink standalone cluster version 1.16.

  1. Download and extract the Flink package.

    wget https://archive.apache.org/dist/flink/flink-1.16.3/flink-1.16.3-bin-scala_2.12.tgz
    tar -zxvf flink-1.16.3-bin-scala_2.12.tgz

    If this version is no longer available, download another version from Apache Flink downloads.

  2. Download the MySQL CDC and SelectDB connector JARs to the FLINK_HOME/lib directory.

    cd flink-1.16.3/lib/
    wget https://repo1.maven.org/maven2/com/ververica/flink-sql-connector-mysql-cdc/2.4.2/flink-sql-connector-mysql-cdc-2.4.2.jar
    wget https://repo1.maven.org/maven2/com/selectdb/flink-selectdb-connector-1.16/2.1.0/flink-selectdb-connector-1.16-2.1.0.jar
  3. Start the Flink standalone cluster.

    bin/start-cluster.sh
  4. Create an ApsaraDB for SelectDB instance. See Create an instance.

  5. Connect to the SelectDB instance over the MySQL protocol. See Connect to an instance.

  6. Create a test database and table in SelectDB.

    -- Create the database
    CREATE DATABASE test_db;
    
    -- Create the table
    USE test_db;
    CREATE TABLE employees (
        emp_no       int NOT NULL,
        birth_date   date,
        first_name   varchar(20),
        last_name    varchar(20),
        gender       char(2),
        hire_date    date
    )
    UNIQUE KEY(`emp_no`)
    DISTRIBUTED BY HASH(`emp_no`) BUCKETS 1;

Synchronize data using Flink CDC

Run the following command to start a Flink CDC synchronization job. All parameters use --mysql-conf and --sink-conf prefixes to separate MySQL source configuration from SelectDB sink configuration.

<FLINK_HOME>/bin/flink run \
    -Dexecution.checkpointing.interval=10s \
    -Dparallelism.default=1 \
    -c org.apache.doris.flink.tools.cdc.CdcTools \
    lib/flink-doris-connector-1.16-1.5.2.jar \
    mysql-sync-database \
    --database test_db \
    --including-tables "tbl1|test.*" \
    --mysql-conf hostname=127.0.0.1 \
    --mysql-conf username=root \
    --mysql-conf password=123456 \
    --mysql-conf database-name=mysql_db \
    --sink-conf fenodes=selectdb-cn-****.selectdbfe.rds.aliyuncs.com:8080 \
    --sink-conf username=admin \
    --sink-conf password=****
ParameterDescription
execution.checkpointing.intervalThe checkpoint interval, which controls how frequently data is committed to SelectDB. Set to 10s.
parallelism.defaultThe parallelism of the Flink job. Increase this value to speed up synchronization.
databaseThe destination database name in the SelectDB instance.
including-tablesThe MySQL tables to synchronize. Separate multiple table names with vertical bars (|). Regular expressions are supported. For example, --including-tables table1|tbl.* synchronizes table1 and all tables whose names start with tbl.
excluding-tablesThe tables to exclude. Uses the same syntax as including-tables.
mysql-confConfiguration for the MySQL CDC source. The hostname, username, password, and database-name parameters are required. For all available options, see MySQL CDC Connector.
sink-confConfiguration for the SelectDB sink. See the "Configuration items of Doris Sink" section in Use Flink to import data.
table-confConfiguration for the destination SelectDB table. These options map to table properties used in CREATE TABLE.
Note

1. To synchronize data, you must install the Flink CDC dependencies — such as flink-sql-connector-mysql-cdc-${version}.jar and flink-sql-connector-oracle-cdc-${version}.jar — in the $FLINK_HOME/lib directory. 2. To synchronize from a full database, Flink 1.15 or later is required. For other connector versions, see org/apache/doris on Maven Central.

Migrate data using a catalog

The catalog method uses federated queries to read MySQL data directly in SelectDB and insert it into a local table. It is the simplest approach for one-time historical data migration. For the complete reference, see JDBC data source.

Import historical data using a catalog

  1. Connect to the SelectDB instance using a MySQL client.

    Note

    If you use Data Management (DMS) to connect, the SWITCH command is not supported. Use a MySQL client instead.

  2. Create a Java Database Connectivity (JDBC) catalog for the MySQL data source.

    ParameterRequiredDefaultDescription
    userYesThe account used to connect to the MySQL database.
    passwordYesThe password for the account.
    jdbc_urlYesThe JDBC URL for the MySQL database.
    driver_urlYesThe name of the JDBC driver JAR file.
    driver_classYesThe JDBC driver class name.
    lower_case_table_namesNofalseWhether to synchronize database and table names in lowercase.
    only_specified_databaseNofalseWhether to limit migration to specific databases.
    include_database_listNo""The databases to include. Takes effect only when only_specified_database is true. Separate multiple names with commas. Names are case-sensitive.
    exclude_database_listNo""The databases to exclude. Takes effect only when only_specified_database is true. Separate multiple names with commas. Names are case-sensitive.
    CREATE CATALOG jdbc_mysql PROPERTIES (
        "type"="jdbc",
        "user"="root",
        "password"="123456",
        "jdbc_url" = "jdbc:mysql://127.0.0.1:3306/demo",
        "driver_url" = "mysql-connector-java-8.0.25.jar",
        "driver_class" = "com.mysql.cj.jdbc.Driver",
        "checksum" = "fdf55dcef04b09f2eaf42b75e61ccc9a"
    )
  3. Create the destination table in SelectDB, then use INSERT INTO SELECT to copy the data.

    -- Create the destination table
    CREATE TABLE selectdb_table ...
    
    -- Import data from MySQL via the catalog
    INSERT INTO selectdb_table SELECT * FROM mysql_catalog.mysql_database.mysql_table;

    For details on the INSERT INTO SELECT syntax, see Import data by using INSERT INTO statements.

Migrate data using DataWorks

DataWorks uses its Data Integration feature to synchronize data from MySQL to SelectDB. For the complete reference, see Import data by using DataWorks.

Important

DataWorks does not support writing fields of the BITMAP, HyperLogLog (HLL), or QUANTILE_STATE data type.

Add data sources

Before creating a synchronization task, add both a MySQL data source and a SelectDB data source to DataWorks.

  1. Add a MySQL data source. See MySQL data source.

  2. Add an ApsaraDB for SelectDB data source. See Add and manage data sources. Use the following parameters.

    ParameterDescription
    Data Source NameThe name of the data source.
    Host Address/IP AddressThe JDBC URL in the format jdbc:mysql://<ip>:<port>/<dbname>. To get the VPC endpoint or public endpoint and MySQL port, go to the Instance Details page in the SelectDB console and view the Network Information section of the Basic Information page. Example: jdbc:mysql://selectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:9030/test_db
    HTTP Connection AddressThe HTTP address and port in the format <ip>:<port>. To get the HTTP port, go to the Instance Details page and view the HTTP Port field in the Network Information section. Example: selectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:8080
    UsernameThe username of the owner account of the SelectDB instance.
    PasswordThe password of the owner account.

Configure a synchronization task

Configure a synchronization task using the codeless UI or the code editor:

What's next