This topic describes how to use external tables to import data from MaxCompute to AnalyticDB for MySQL.

Prerequisites

  • A MaxCompute project and a MaxCompute table are created. For more information about the MaxCompute project, table, and data used in this example, see the "Description of sample data" section of this topic.
  • An AnalyticDB for MySQL cluster is created in the region where the MaxCompute project is located. For more information, see Create a cluster.
    Note If the AnalyticDB for MySQL cluster is in elastic mode, you must log on to the AnalyticDB for MySQL console, view the cluster information, and then enable Elastic Network Interface (ENI) in the Network Information section. Enable ENI
  • The CIDR blocks of the virtual private cloud (VPC) in which the AnalyticDB for MySQL cluster is located are added to a whitelist of the MaxCompute project.

    You can log on to the AnalyticDB for MySQL console and view the VPC ID on the Cluster Information page. Then, you can log on to the VPC console and find the VPC ID on the VPCs page to view the CIDR blocks. For more information about how to configure a MaxCompute whitelist, see Manage IP address allowlists.

Description of sample data

In this example, a MaxCompute project named odps_project1 and a MaxCompute table named odps_nopart_import_test are used. The following statement shows how to create this table:
CREATE TABLE IF NOT EXISTS odps_nopart_import_test (
    id int,
    name string,
    age int)
partitioned by (dt string);
The following statement shows how to create a partition in the odps_nopart_import_test table:
ALTER TABLE odps_nopart_import_test 
ADD 
PARTITION (dt='202207');
The following statement shows how to insert data to the partition:
INSERT INTO odps_project1.odps_nopart_import_test 
PARTITION (dt='202207') 
VALUES (1,'james',10),(2,'bond',20),(3,'jack',30),(4,'lucy',40);

Procedure

The following example demonstrates how to import data from the odps_nopart_import_test table to the test_adb database of an AnalyticDB for MySQL cluster.

  1. Connect to the AnalyticDB for MySQL cluster and enter the test_adb database.
  2. Execute the following CREATE TABLE statement to create an external table named odps_nopart_import_test_external_table in the test_adb database:
    CREATE TABLE IF NOT EXISTS odps_nopart_import_test_external_table
    (
        id int,
        name string,
        age int,
        dt string
    )
     ENGINE='ODPS'
     TABLE_PROPERTIES='{
     "endpoint":"http://service.cn.maxcompute.aliyun-inc.com/api",
     "accessid":"L*******FsE",
     "accesskey":"CcwF********iWjv",
     "partition_column":"dt",
     "project_name":"odps_project1",
     "table_name":"odps_nopart_import_test"
     }';                 
    Parameter Description
    ENGINE='ODPS' The storage engine of the external table. To read and write MaxCompute data, set the storage engine to ODPS.
    endpoint The endpoint of MaxCompute.
    Note AnalyticDB for MySQL can access MaxCompute only by using VPC endpoints.

    For more information about the VPC endpoint of each region, see Endpoints in different regions (VPC).

    accessid The AccessKey ID of an Alibaba Cloud account or a RAM user used to access MaxCompute.

    For more information about how to obtain an AccessKey ID and an AccessKey secret, see Obtain the AccessKey pair of an account.

    accesskey The AccessKey secret of an Alibaba Cloud account or a RAM user used to access MaxCompute.

    For more information about how to obtain an AccessKey ID and an AccessKey secret, see Obtain the AccessKey pair of an account.

    partition_column The partition_column parameter is required to create a partitioned table. If the MaxCompute table does not contain partitions, you must create a table without partitions in AnalyticDB for MySQL and do not need to configure the partition_column parameter.
    project_name The name of the MaxCompute project.
    table_name The name of the MaxCompute table.
  3. Execute the following CREATE TABLE statement to create a table named adb_nopart_import_test in the test_adb database to store data imported from MaxCompute:
    CREATE TABLE IF NOT EXISTS adb_nopart_import_test
    (   id int,
        name string,
        age int,
        dt string,
        PRIMARY KEY(id,dt)
    )
    DISTRIBUTE BY HASH(id)
    PARTITION BY VALUE('dt') LIFECYCLE 30;  
  4. Use the external table odps_nopart_import_test_external_table to import data from MaxCompute to the adb_nopart_import_test table of AnalyticDB for MySQL.
    • Method 1: Execute an INSERT INTO statement to import data. If the primary key has duplicate values, data is not repeatedly inserted and the INSERT INTO statement is equivalent to INSERT IGNORE INTO. For more information, see INSERT INTO. Sample statement:
      INSERT INTO adb_nopart_import_test
      SELECT * FROM odps_nopart_import_test_external_table;   
      Execute a SELECT statement to query the data written to the table. Example:
      SELECT * FROM adb_nopart_import_test;
      The following information is returned:
      +------+-------+------+---------+
      | id   | name  | age  |   dt    |
      +------+-------+------+---------+
      |    1 | james |   10 |  202207 |
      |    2 | bond  |   20 |  202207 |
      |    3 | jack  |   30 |  202207 |
      |    4 | lucy  |   40 |  202207 |
      +------+-------+------+---------+
      If you need to import data from a specific partition to the adb_nopart_import_test table, you can execute the following statement:
      INSERT INTO adb_nopart_import_test
      SELECT * FROM odps_nopart_import_test_external_table 
      WHERE dt = '202207'; 
    • Method 2: Execute an INSERT OVERWRITE INTO statement to import data. If the primary key has duplicate values, the original value is overwritten by the new value. Sample statement:
      INSERT OVERWRITE INTO adb_nopart_import_test
      SELECT * FROM odps_nopart_import_test_external_table;
    • Method 3: Execute a SUBMIT JOB statement that contains INSERT OVERWRITE INTO to submit an asynchronous import task. You can add a hint (/* direct_batch_load=true*/) before the data import statement to accelerate the task. For more information, see Asynchronous writing. Sample statement:
      SUBMIT job 
      INSERT OVERWRITE INTO adb_nopart_import_test
      SELECT * FROM odps_nopart_import_test_external_table;  
      The following information is returned:
      +---------------------------------------+
      | job_id                                |
      +---------------------------------------+
      | 2020112122202917203100908203303****** |

      For more information about how to submit an asynchronous task, see Asynchronously submit an import or export task.