This topic describes how to use the external table feature of MaxCompute to migrate the file data that is compressed in the Snappy format from Object Storage Service (OSS) to MaxCompute. The data in the file that is compressed in the Snappy format is separated by commas (,).

Prerequisites

  • MaxCompute is activated. A MaxCompute project is created. For more information, see Create a MaxCompute project.
  • OSS is activated and a bucket is created. A data file that is compressed in the Snappy format is stored in the bucket. For more information about how to activate OSS and create a bucket, see Create buckets.

    In this example, the OSS bucket in which the data file is stored resides in the China (Beijing) region. The file is saved in the mfosscostfee/demo7/ directory. You can also use a data file based on your business requirements.

Migrate the compressed data

  1. Connect to the MaxCompute development tool.
    The following table describes the development tools that you can use to create an OSS external table.
    How to create Development tool
    Execute MaxCompute SQL statements MaxCompute client
    Query editor in the MaxCompute console
    ODPS SQL nodes in the DataWorks console
    SQL scripts in MaxCompute Studio
    Perform operations on the web UI SQL scripts in MaxCompute Studio
    DataWorks console
  2. Create an OSS external table.
    For more information about how to create an OSS external table, see Create an OSS external table. Sample statement:
    -- Create an OSS external table.
    create external table if not exists mc_oss_ext_snap_split
    (
    str1 string,
    str2 string,
    str3 string,
    str4 string,
    str5 string,
    str6 string,
    str7 string,
    str8 string,
    str9 string,
    str10 string,
    str11 string,
    str12 string,
    str13 string,
    str14 string,
    str15 string,
    str16 string,
    str17 string,
    str18 string
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
    WITH SERDEPROPERTIES ('field.delim'=',') -- The data in the file that is compressed in the Snappy format is separated by commas (,).
    STORED AS TEXTFILE
    location 'oss://oss-cn-beijing-internal.aliyuncs.com/mfosscostfee/demo7/'
    ;                        
  3. Use the OSS external table to read data and write the data to a MaxCompute internal table.
    Sample statements:
    -- Create a MaxCompute internal table.
    create table if not exists mc_oss_snap_split
    (
    str1 string,
    str2 string,
    str3 string,
    str4 string,
    str5 string,
    str6 string,
    str7 string,
    str8 string,
    str9 string,
    str10 string,
    str11 string,
    str12 string,
    str13 string,
    str14 string,
    str15 string,
    str16 string,
    str17 string,
    str18 string
    );
    -- Use the OSS external table to read data and write the data to the MaxCompute internal table.
    insert into table mc_oss_snap_split select * from mc_oss_ext_snap_split limit 10;
  4. Query data in the MaxCompute internal table.
    Sample statement:
    select str2,str3 from mc_oss_snap_split;
    The following result is returned:
    +------------+------------+
    | str2       | str3       |
    +------------+------------+
    | 113.221620 | 23.398279  |
    | 113.288735 | 23.157167  |
    | 113.040365 | 23.681102  |
    | 113.910224 | 22.757139  |
    | 119.086087 | 33.583632  |
    | 113.363475 | 23.141354  |
    | 113.328440 | 23.130362  |
    | 113.249651 | 23.205976  |
    | 113.258325 | 23.159060  |
    | 117.463688 | 38.836611  |
    +------------+------------+
    The preceding result indicates that the compressed data is migrated from OSS to MaxCompute.