ApsaraDB for SelectDB: Integrate data

Last Updated: Mar 28, 2026

Data Integration is a visual tool for importing external data into ApsaraDB for SelectDB instances and databases without writing code. Use it to load benchmark datasets for performance testing or production data from Object Storage Service (OSS).

Supported integration types

Type | Description
Sample data | Preloaded benchmark datasets (ClickBench, TPC-H, Github Demo, SSB-FLAT) for performance testing
OSS | Data files stored in an OSS bucket, supporting JSON, CSV, ORC, Parquet, and automatic format detection

Prerequisites

Before you begin, ensure that you have:

  • An ApsaraDB for SelectDB instance running version 3.0.7 or later. For details, see Create an instance.
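When checking the version prerequisite in a script, compare version numbers numerically rather than as strings, because a plain string comparison would rank "3.0.10" below "3.0.7". The following is a minimal sketch, assuming the instance version is available as a dotted string of integers. The function names are hypothetical and are not part of any SelectDB API.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '3.0.7' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum version stated in the prerequisites.
MINIMUM_VERSION = parse_version("3.0.7")


def supports_data_integration(instance_version: str) -> bool:
    """Return True if the version is 3.0.7 or later (hypothetical helper)."""
    # Tuples compare element by element, so (3, 0, 10) ranks above (3, 0, 7).
    return parse_version(instance_version) >= MINIMUM_VERSION
```

For example, supports_data_integration("3.0.10") returns True, while supports_data_integration("3.0.6") returns False.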

Open Data Integration

  1. Log on to the ApsaraDB for SelectDB console.

  2. In the top navigation bar, select the region where your instance resides.

  3. In the left-side navigation pane, click Instances. On the Instances page, find your instance and click its ID to go to the Instance Details page.

  4. In the left-side navigation pane, click Data Development and Management (Studio) > Data Integration.

    Note

    The first time you open Data Development and Management (Studio), the console prompts you to add your machine's public IP address to the webui_whitelist IP address whitelist. Click OK to proceed.

  5. If you haven't logged on to the WebUI system before, the WebUI logon page appears. Log on with the admin account. If you don't know the password, see Reset the password of an account.

The Integration page opens. If you haven't created any tasks yet, the Stage page opens instead — you can create your first task from there.
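If the WebUI does not load after step 4, one thing to check is whether your machine's public IP address is covered by the webui_whitelist entries. The following is a minimal sketch of that check using Python's standard ipaddress module. It assumes the whitelist entries are single IP addresses or CIDR blocks. The entry format and the function name are assumptions for illustration, not something this page specifies.

```python
import ipaddress


def is_whitelisted(client_ip: str, whitelist: list[str]) -> bool:
    """Return True if client_ip matches any whitelist entry (hypothetical helper)."""
    address = ipaddress.ip_address(client_ip)
    for entry in whitelist:
        # strict=False accepts entries such as "203.0.113.7/24" whose host bits are set.
        network = ipaddress.ip_network(entry, strict=False)
        if address in network:
            return True
    return False
```

A single address such as "203.0.113.5" is treated as a one-address network, so it matches only itself.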

Load sample data

Use this option to import benchmark datasets for performance testing.

  1. On the Integration page, click Create in the upper-left corner.

  2. On the New Integration page, select a dataset in the Sample Data section.

    Sample data | Description
    ClickBench | The ClickBench datasets
    TPC-H | The TPC-H datasets
    Github Demo | The GitHub events
    SSB-FLAT | The SSB-FLAT datasets
  3. Configure the following parameters and click Create and Load.

    Parameter | Description | Example
    Integration name | Name for this integration task | test
    Comment | Description of the task | test comment
    Cluster | The cluster to run the task on | new_cluster
    Sample data size | Amount of sample data to load | 1GB

Import data from OSS

Use this option to load your own data files from an OSS bucket.

  1. On the Integration page, click Create in the upper-left corner.

  2. On the New Integration page, click Object Storage in the Stage section.

  3. On the New Integration - Object Storage OSS page, configure the parameters in the following sections and click Confirm.

Connection settings

Parameter | Description | Example
Integration name | Name for this integration task | test
Comment | Description of the task | test comment
Bucket | Name of the OSS bucket | test_bucket_name
Default data file path | Default path within the bucket |
Authentication | Authorization method to access OSS | Access Key
Access Key | The AccessKey ID of your Alibaba Cloud account | akdemo
Secret Key | The AccessKey secret of your Alibaba Cloud account | skdemo
Advanced settings | Default properties applied during all object imports |

File configuration

Parameter | Description | Example
File type | File format of OSS objects. Options: JSON, ORC, CSV, Parquet, Automatic Recognition | JSON
Compression method | Compression format of OSS objects | gz
Column separator | Column delimiter for data in OSS objects | \t
Line delimiter | Row delimiter for data in OSS objects | \n
File size | Size limits on OSS objects | Unlimited
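To make the separator and compression settings concrete, the following is a minimal sketch that produces a gzip-compressed, delimited text payload using the example values from the table: a tab (\t) as the column separator, a line feed (\n) as the line delimiter, and gz compression. The function name is hypothetical. The sketch also assumes that field values contain no tabs or line feeds, because this page does not describe how quoted fields are handled on import.

```python
import csv
import gzip
import io


def encode_gzip_tsv(rows: list[list[str]]) -> bytes:
    """Serialize rows as tab-separated text and gzip the result (hypothetical helper)."""
    # newline="" stops StringIO from translating the line terminator.
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))
```

The returned bytes can be written to a local file and then uploaded to the OSS bucket. With these settings, File type would be CSV and Compression method would be gz.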

Loading configuration

Parameter | Description | Default
On Error | How to handle errors during import. Continue keeps importing, Abort stops the task, and Customized applies a custom policy. | Abort
Strict mode | Controls how type conversion errors are handled. Open filters out error data after column type conversion. Close does not filter out error data after column type conversion. | Open

The following rules apply when Strict mode is set to Open:

  • Error data refers to NULL values generated in NOT NULL destination columns after type conversion. Strict mode does not apply to destination columns whose NULL values are generated by functions.

  • If a destination column restricts values to a specific range and the converted value falls outside that range, strict mode does not apply. For example, if the source value is 10 and the destination column is DECIMAL(1,0), the value 10 can be converted but falls outside the allowed range.
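The first strict mode rule can be illustrated with a small model. The following sketch is not SelectDB's implementation. It is a simplified simulation, assuming that a failed type conversion yields NULL (None in Python) and that a row counts as error data when a NOT NULL column ends up NULL after conversion. The range rule and function-generated NULL values are not modeled, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Column:
    """Hypothetical description of a destination column."""
    name: str
    convert: Callable[[str], object]
    nullable: bool


def to_int_or_null(raw: str) -> Optional[int]:
    """Convert text to int; a failed conversion yields None (modeling NULL)."""
    try:
        return int(raw)
    except ValueError:
        return None


def load_rows(rows, columns, strict_mode=True):
    """Return (kept, filtered) rows under the simplified strict mode model."""
    kept, filtered = [], []
    for raw_row in rows:
        converted = [col.convert(value) for col, value in zip(columns, raw_row)]
        # Error data in this model: NULL produced in a NOT NULL column by conversion.
        has_error = any(
            value is None and not col.nullable
            for col, value in zip(columns, converted)
        )
        if strict_mode and has_error:
            filtered.append(raw_row)
        else:
            kept.append(converted)
    return kept, filtered
```

In this model, a row whose NOT NULL id column contains "x" is filtered when strict_mode is True and kept with a NULL id when strict_mode is False. A failed conversion in a nullable column never causes filtering.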

Manage integration tasks

Search for a task: On the Integration page, click the Search icon in the upper-right corner and enter the task name.

Delete a task: On the Integration page, find the task and click the Delete icon in the Actions column.

Note

Deleting a task does not affect data that has already been imported, but may affect data currently being imported. Deleted tasks cannot be recovered.

Related topics

API reference