All Products
Search
Document Center

ApsaraDB for SelectDB:Use DataWorks to import data

Last Updated:Mar 28, 2026

When you need to load data from a relational database such as MySQL into ApsaraDB for SelectDB, you can use the Data Integration feature of DataWorks with SelectDB Writer to run an offline batch synchronization job — without writing custom ingestion code.

This topic walks you through adding the required data sources in DataWorks and configuring a batch synchronization task that reads from MySQL and writes to ApsaraDB for SelectDB.

Limits

  • Data Integration only supports offline writes to ApsaraDB for SelectDB. Real-time streaming is not supported.

  • SelectDB Writer cannot write fields of the following data types: BITMAP, HyperLogLog (HLL), or QUANTILE_STATE.

Prerequisites

Before you begin, make sure you have:

  • An ApsaraDB for SelectDB instance

  • A MySQL database containing the data to synchronize

  • Access to the DataWorks console with permissions to create data synchronization tasks

Add data sources

Before configuring the synchronization task, add both a MySQL data source and an ApsaraDB for SelectDB data source to DataWorks.

  1. Add a MySQL data source. For details, see MySQL data source.

  2. Add an ApsaraDB for SelectDB data source. For details, see Add and manage data sources. Use the following parameters:

    If MySQL and ApsaraDB for SelectDB are in the same virtual private cloud (VPC), use the VPC endpoint. If they are in different VPCs, use the Public endpoint.
    Important

    Add the IP addresses of your DataWorks resource groups to the IP address whitelist of ApsaraDB for SelectDB. Exclusive resource group IPs and shared resource group IPs must be added to separate whitelists. For details, see Configure an IP address whitelist.

    ParameterDescription
    Data Source NameThe name of the data source.
    JDBC URLThe Java Database Connectivity (JDBC) connection string. Format: jdbc:mysql://<ip>:<port>/<dbname>. Find the VPC endpoint and MySQL port in the Network information section on the Basic information tab of your instance. Example: jdbc:mysql://selectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:9030/test_db
    HTTP addressThe HTTP access address. Format: <ip>:<port>. Find the VPC endpoint and HTTP port in the Network information section. Example: selectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:8080
    UsernameThe username of the ApsaraDB for SelectDB instance owner account.
    PasswordThe password of the ApsaraDB for SelectDB instance owner account.

Configure a batch synchronization task

DataWorks provides two ways to configure the synchronization task:

MethodBest for
Codeless UIVisual configuration without writing scripts
Code editorFull control over the synchronization script (recommended for complex mappings)

The following sections provide a complete code example and parameter reference for the code editor approach.

Sample code and parameters

Code example

The following script uses MySQL Reader and SelectDB Writer to synchronize data from a MySQL table to ApsaraDB for SelectDB.

{
  "type": "job",
  "version": "2.0",
  "steps": [
    {
      "stepType": "mysql",
      "parameter": {
        "column": [
          "<id>",
          "<table_id>",
          "<table_no>",
          "<table_name>",
          "<table_status>"
        ],
        "connection": [
          {
            "datasource": "<mysql_datasource>",
            "table": [
              "<mysql_table_name>"
            ]
          }
        ],
        "where": "",
        "splitPk": "",
        "encoding": "UTF-8"
      },
      "name": "Reader",
      "category": "reader"
    },
    {
      "stepType": "selectdb",
      "parameter": {
        "postSql": [],
        "preSql": [],
        "username": "<selectdb_username>",
        "password": "<selectdb_password>",
        "loadUrl": [
          "<ip:port>"
        ],
        "column": [
          "<id>",
          "<table_id>",
          "<table_no>",
          "<table_name>",
          "<table_status>"
        ],
        "connection": [
          {
            "datasource": "<selectdb_datasource>",
            "table": [
              "<selectdb_table_name>"
            ]
          }
        ],
        "maxBatchRows": 1000000,
        "loadProps": {
          "format": "csv",
          "column_separator": "\\x01",
          "line_delimiter": "\\x02"
        }
      },
      "name": "Writer",
      "category": "writer"
    }
  ],
  "setting": {
    "errorLimit": {
      "record": "0"
    },
    "speed": {
      "throttle": false,
      "concurrent": 1
    }
  },
  "order": {
    "hops": [
      {
        "from": "Reader",
        "to": "Writer"
      }
    ]
  }
}

Replace the placeholders with your actual values:

PlaceholderDescriptionExample
<mysql_datasource>Name of the MySQL data source added in DataWorksmy_mysql_source
<mysql_table_name>Source table name in MySQLorders
<selectdb_username>ApsaraDB for SelectDB usernameadmin
<selectdb_password>ApsaraDB for SelectDB password
<ip:port>VPC endpoint and HTTP port of the SelectDB instanceselectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:8080
<selectdb_datasource>Name of the ApsaraDB for SelectDB data source added in DataWorksmy_selectdb_source
<selectdb_table_name>Destination table name in SelectDBorders_replica

SelectDB Writer parameters

datasource

The name of the data source configured in DataWorks. Must match the name used when adding the data source.

  • Required: Yes

table

The name of the destination table to write data into.

  • Required: Yes

column

The column names to write to in the destination table. Separate multiple column names with commas. To write to all columns, use ["*"].

  • Required: Yes

  • Example: ["id", "name", "age"]

loadUrl

The HTTP endpoint of the ApsaraDB for SelectDB instance. Use the VPC endpoint and HTTP port.

  • Required: Yes

  • Format: ip:port

  • Example: selectdb-cn-4xl3jv1****.selectdbfe.rds.aliyuncs.com:8080

username

The username for accessing the ApsaraDB for SelectDB database.

  • Required: Yes

password

The password for accessing the ApsaraDB for SelectDB database.

  • Required: Yes

preSql

SQL statements to run before the synchronization task starts — for example, to delete stale data. The codeless UI supports one statement; the code editor supports multiple.

  • Required: No

postSql

SQL statements to run after the synchronization task completes — for example, to add a timestamp. The codeless UI supports one statement; the code editor supports multiple.

  • Required: No

maxBatchRows

The maximum number of rows written per batch. maxBatchRows controls how frequently data is flushed to SelectDB — a higher value means fewer, larger batches.

  • Required: No

  • Default: 500000

loadProps

Parameters for the COPY INTO request that control the import format. If left blank or set to {}, JSON format is used by default.

  • Required: No

  • Default: json

JSON format (default): The strip_outer_array parameter is automatically set to true.

"loadProps": {
  "format": "json",
  "strip_outer_array": true
}

CSV format: Specify delimiters explicitly. If you do not specify the row and column delimiters, the imported data is converted into strings by default — \t is used as the column delimiter and \n as the row delimiter.

"loadProps": {
  "format": "csv",
  "column_separator": "\\x01",
  "line_delimiter": "\\x02"
}

Supported data types

SelectDB Writer supports the following ApsaraDB for SelectDB data types for offline writes.

Data typeSupported
INTYes
BIGINTYes
LARGEINTYes
SMALLINTYes
TINYINTYes
BOOLEANYes
DECIMALYes
DOUBLEYes
FLOATYes
CHARYes
VARCHARYes
STRINGYes
DATEYes
DATEV2Yes
DATETIMEYes
DATETIMEV2Yes
ARRAYYes
JSONBYes
BITMAPNo
HLLNo
QUANTILE_STATENo

Related topics