All Products
Search
Document Center

Background and preparations

Last Updated: Apr 05, 2020

You can create Object Storage Service (OSS) schemas and tables in Data Lake Analytics (DLA), perform data cleansing for different types of files in OSS, and then write the result data into AnalyticDB for MySQL 3.0. This allows you to perform instant multidimensional analysis and business exploration for hundreds of billions of data records within milliseconds.

Prerequisites

  • You must activate DLA , AnalyticDB for MySQL 3.0, and OSS in the same region.

  • You must create databases and tables in AnalyticDB for MySQL to store OSS data. For more information, see CREATE DATABASE and CREATE TABLE.

    The following shows the table schema information.

    1. CREATE TABLE shipping (
    2. id bigint NOT NULL COMMENT '',
    3. origin_state varchar NOT NULL COMMENT '',
    4. origin_zip varchar NOT NULL COMMENT '',
    5. destination_state varchar NOT NULL COMMENT '',
    6. destination_zip varchar NOT NULL COMMENT '',
    7. package_weight int NOT NULL COMMENT '',
    8. PRIMARY KEY (id)
    9. )
    10. DISTRIBUTED BY HASH (id)
    11. COMMENT ''
    1. CREATE TABLE order_table (
    2. customer_id bigint NOT NULL COMMENT '',
    3. order_id varchar NOT NULL COMMENT '',
    4. order_time timestamp NOT NULL COMMENT '',
    5. order_amount double NOT NULL COMMENT '',
    6. order_type varchar NOT NULL COMMENT '',
    7. address varchar NOT NULL COMMENT '',
    8. city varchar NOT NULL COMMENT '',
    9. order_season bigint COMMENT '',
    10. PRIMARY KEY (customer_id,order_time)
    11. )
    12. DISTRIBUTED BY HASH(customer_id)
    13. PARTITION BY VALUE(DATE_FORMAT(order_time, '%Y%m%d')) LIFECYCLE 30
    14. COMMENT ''

Precautions

  • AnalyticDB for MySQL uses the logic of update in the case of a primary key conflict. If the entire ETL task of INSERT FROM SELECT fails, the entire ETL task will be retried.

  • A delay occurs when data written to AnalyticDB for MySQL is queried. The delay depends on the cluster configuration of AnalyticDB for MySQL.

  • We recommend that you split the ETL task into small batches. For example, if the business permits, 200 GB of OSS data can be split into 100 folders with 2 GB of data in each folder. In DLA, you can create 100 tables and perform ETL on 100 tables separately. If a single ETL task fails, you only need to retry the task.

Procedure

  1. Create an OSS schema and table.

  2. Create an AnalyticDB for MySQL schema and table.

  3. Write OSS data into AnalyticDB for MySQL.