This topic describes how to use an external table to import MaxCompute data to an ApsaraDB for ClickHouse cluster (for ClickHouse 19.15.2.2).

Import data

  1. Log on to the destination ApsaraDB for ClickHouse cluster and create an external table.
    The following syntax is used:
    CREATE TABLE <table_name> [on cluster default]
    (
    'col_name' col_type,[col_name col_type]
    )
    ENGINE = MaxCompute('<tunnel-endpoint>', '<project-name>', '<table-name>', '<partition-spec>', '<access-key-id>', '<access-key-secret>');
    Parameter Description
    table_name The name of the custom external table.
    'col_name' col_type The name and type of the column.

    The column type of the ApsaraDB for ClickHouse table must map that of the MaxCompute table. For more information, see Data type mapping between MaxCompute and ApsaraDB for ClickHouse.

    ENGINE = MaxCompute Indicates an external table. The storage engine is MaxCompute.
    tunnel-endpoint The endpoint of MaxCompute.
    Note To ensure normal connection between the ApsaraDB for ClickHouse cluster and MaxCompute, the tunnel-endpoint must be an endpoint of VPC type and MaxCompute must be in the same region as the ApsaraDB for ClickHouse cluster, such as both in the China (Shanghai) region. For more information about how to view the tunnel endpoints of MaxCompute, see Configure endpoints.
    project-name The name of the MaxCompute project.
    Note It must be consistent with the name in MaxCompute.
    table-name The name of the MaxCompute table.
    Note It must be consistent with the name in MaxCompute.
    partition-spec The MaxCompute partition that you want to access.

    You must specify partitions. The expression to specify multiple partitions cannot contain spaces or single quotation marks ('). Example: sale_date=2020-09-01,region=beijing.

    access-key-id The AccessKey ID used to access the MaxCompute table.
    Note The account corresponding to the AccessKey ID must have the read permissions on the MaxCompute project and table.
    access-key-secret The AccessKey secret used to access the MaxCompute table.
    Example
    CREATE TABLE default.odps_src_tbl on cluster default
    (
        `shop_name` String,
        `customer_id` String,
        `total_price` Float64
    )
    ENGINE = MaxCompute('http://dt.cn-shanghai.maxcompute.aliyun-inc.com', 'test_project', 'test_tbl', 'sale_date=2020-09-01,region=beijing', '<your-access-key-id>', '<your-access-key-secret>');
  2. Create an ApsaraDB for ClickHouse table.
    CREATE TABLE default.odps_dst_tbl_local on cluster default
    (
        `shop_name` String,
        `customer_id` String,
        `total_price` Float64
    )
    ENGINE= MergeTree()
    ORDER BY customer_id;
    
    CREATE TABLE default.odps_dst_tbl_dist on cluster default
    as default.odps_dst_tbl_local
    ENGINE= Distributed(default, default, odps_dst_tbl_local, rand());
  3. Import data from the external table to ApsaraDB for ClickHouse table.
    insert into odps_dst_tbl_dist select * from odps_src_tbl;

Data type mapping between MaxCompute and ApsaraDB for ClickHouse

MaxCompute data type ApsaraDB for ClickHouse data type
BOOLEAN UINT8
TINYINT UINT8 or INT8
SMALLLINT UINT16 or INT16
INT UINT32 or INT32
BIGINT UINT64 or INT64
FLOAT FLOAT32
DOUBLE FLOAT64
CHAR STRING
VARCHAR STRING
BINARY STRING
STRING STRING
DATE UINT16
DATETIME UINT32
UUID Not supported
INTERVAL Not supported
DECIMAL Not supported
TIMESTAMP Not supported
Composite data type Not supported