All Products
Search
Document Center

DataWorks:SelectDB

Last Updated:Mar 25, 2026

DataWorks Data Integration supports writing data to Alibaba Cloud SelectDB via single-table or Entire-Database Offline Synchronization.

Supported SelectDB versions

SelectDB Writer uses MySQL Driver 5.1.47. The following versions are supported:

  • Standard SelectDB: 2.2.31 and 2.2.22

  • Alibaba Cloud SelectDB: 2.4, 3.0, and 4.0

For details on driver capabilities, see MySQL Connectors.

Limitations

Writing data to BITMAP, HLL (HyperLogLog), or QUANTILE_STATE fields is not supported.

Supported field types

The following table lists the supported field types using SelectDB 2.2.22 as an example. For a complete list for each version, see the official SelectDB documentation.

SelectDB field typeOffline write (SelectDB Writer)
INTSupported
BIGINTSupported
LARGEINTSupported
SMALLINTSupported
TINYINTSupported
BOOLEANSupported
DECIMALSupported
DOUBLESupported
FLOATSupported
CHARSupported
VARCHARSupported
STRINGSupported
DATESupported
DATEV2Supported
DATETIMESupported
DATETIMEV2Supported
ARRAYSupported
JSONBSupported
BITMAPNot supported
HLL (HyperLogLog)Not supported
QUANTILE_STATENot supported

Prerequisites

Before you configure a synchronization task, complete the following steps.

Step 1: Confirm the SelectDB version

Data Integration has version requirements for SelectDB. Verify that your SelectDB version meets the requirements listed in Supported SelectDB versions. Check your version in the SelectDB console.

Step 2: Create an account and grant permissions

Create a dedicated SelectDB account for DataWorks to use when accessing the data source. If you use the default admin user of the SelectDB data warehouse instead, set a password for that user.

Step 3: Configure network connectivity

Choose a connection type based on your network environment.

Resource group networkSelectDB connection typeSteps
VPCPrivate network (recommended)In the SelectDB console, create a private endpoint following the Network Configuration instructions. After the endpoint status changes from Creating to Available and the connection status changes from Connecting to Connected, bind the endpoint's VPC to the resource group's VPC. See Network connectivity solution.
PublicInternetAdd the elastic IP addresses (EIPs) of the resource group to the SelectDB whitelist. See Add a whitelist.

A private network connection provides higher bandwidth and a more stable connection than an Internet connection.

Add a data source

Add SelectDB as a data source in DataWorks following the instructions in Data source management. The following parameters are specific to SelectDB:

  • MySQL Connection Address: The Java Database Connectivity (JDBC) connection string. Copy the JDBC Connection String from Connection Method > Other Methods in the SelectDB console. Both public and private IP addresses are supported.

  • HTTP Connection Address: The HTTP protocol endpoint. Copy the HTTP Protocol Endpoint from Connection Method > Other Methods in the SelectDB console. Both public and private IP addresses are supported.

  • Username: The username for the SelectDB data warehouse.

  • Password: The password for the specified username.

View parameter descriptions directly in the DataWorks console when adding a data source.

Develop a synchronization task

Single-table offline synchronization

Configure a task using either the codeless UI or the code editor:

For the complete parameter reference and script examples, see Appendix: Script demo and parameter reference.

Entire-database offline synchronization

Synchronize an entire database from MySQL or PostgreSQL to SelectDB. For the configuration procedure, see Configure an Entire-Database Offline Synchronization task.

Appendix: Script demo and parameter reference

Use the code editor to configure a batch synchronization task in script format. For the general script format requirements, see Configure an offline synchronization task using the code editor.

Script examples

All examples use "stepType": "selectdb" to identify the SelectDB Writer plug-in.

Example 1: Write in JSON format (default)

JSON is the default import format. Use this when the source data is already structured as JSON objects.

{
  "stepType": "selectdb",
  "parameter": {
    "datasource": "selectdb_datasource",
    "table": "selectdb_table_name",
    "column": ["id", "table_id", "table_no", "table_name", "table_status"],
    "loadProps": {
      "format": "json",
      "strip_outer_array": true
    }
  },
  "name": "Writer",
  "category": "writer"
}

Example 2: Write in CSV format

Use CSV format when you need custom column and row delimiters.

{
  "stepType": "selectdb",
  "parameter": {
    "datasource": "selectdb_datasource",
    "table": "selectdb_table_name",
    "column": ["id", "table_id", "table_no", "table_name", "table_status"],
    "loadProps": {
      "format": "csv",
      "column_separator": "\\x01",
      "line_delimiter": "\\x02"
    }
  },
  "name": "Writer",
  "category": "writer"
}
If you use CSV format without specifying column_separator and line_delimiter, all imported data is converted to strings, with \t as the column delimiter and \n as the row delimiter.

Writer parameters

ParameterDescriptionRequiredDefault value
datasourceThe name of the data source added in DataWorks.YesNone
tableThe name of the destination table.YesNone
columnThe columns to write data to, specified as a list. For example: ["id", "name", "age"]. To write to all columns in order, use ["*"].YesNone
preSqlSQL statements to run before the synchronization task starts. The codeless UI supports one statement; the code editor supports multiple statements.NoNone
postSqlSQL statements to run after the synchronization task completes. The codeless UI supports one statement; the code editor supports multiple statements.NoNone
maxBatchRowsThe maximum number of rows per import batch. The import starts when either maxBatchRows or batchSize is reached.No500000
batchSizeThe maximum data volume per import batch, in bytes. The import starts when either batchSize or maxBatchRows is reached.No94371840
maxRetriesThe number of retry attempts after a batch import fails.No3
labelPrefixThe prefix for each uploaded file's label. The final label is labelPrefix + UUID, forming a globally unique label that prevents duplicate imports.Nodatax_selectdb_writer_
loadPropsThe COPY INTO request parameters, used to configure the import format. If omitted or set to {}, data is imported in JSON format with strip_outer_array=true.NoJSON format (strip_outer_array=true)
clusterNameThe name of the SelectDB Cloud cluster. View the name in the SelectDB console.NoNone
flushIntervalThe interval between batch writes, in milliseconds. If maxBatchRows and batchSize thresholds are not reached within this interval, the batch is imported based on the interval.No30000

What's next