After you create an Object Storage Service (OSS) partitioned table in Data Lake Analytics (DLA), you can execute the INSERT OVERWRITE statement to rewrite OSS data to the partitioned table.

Write data to a partitioned table

Step 1: Create a schema

  1. Log on to the DLA console.

  2. In the left-side navigation pane, click SQL access point. On the page that appears, find your VPC and click Log on in DMS in the Actions column. Then, execute the following SQL statement to create an OSS schema.

​CREATE SCHEMA dla_oss_demo with DBPROPERTIES(
 catalog='oss',
 location= 'oss://oss-bucket-name/dla/'
 );​
  • catalog: the type of the schema that you want to create. In this example, the OSS schema is to be created.

  • location: the directory of the OSS bucket in which the file is saved. This directory must end with a forward slash (/).

Step 2: Create a partitioned table

​CREATE EXTERNAL TABLE dla (
    prod_name string,
    cnt bigint)
PARTITIONED BY (y string, m string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' 
STORED AS TEXTFILE
LOCATION 'oss://bucket-name/dla/';​

Directories in which OSS files are saved in this example:

  • https://bucket-name.oss-cn-hangzhou.aliyuncs.com/dla/y=2019/m=3/cnt.txt

  • https://bucket-name.oss-cn-hangzhou.aliyuncs.com/dla/y=2018/m=12/cnt2.txt

Step 3: Run the MSCK command to update the partition information

MSCK REPAIR TABLE dla;

Step 4: Run the SHOW PARTITIONS command to view the partition information

​show partitions dla;
+-----------+
| y=2018/m=12|
| y=2019/m=2 |​

Step 5: Query data from the partitioned table

​select * from dla;
+---------------------------+
|prod_name | cnt | y  |  m  |
+---------------------------+
|Mouse      | 1   |2019 |  3  |
|Monitor    | 2   |2019 |  3  |
|Keyboard      | 4   |2019 |  3  |
|Desk    | 100 |2018 |  12 |
|Chair      | 150 |2018 |  12 |​

Step 6: Insert a data record into the partitioned table

​insert into dla values ("Hard disk",10,"2019","3");
+---------------------------+
select * from dla;
+---------------------------+
|prod_name | cnt | y  |  m  |
+---------------------------+
|Hard disk      | 10  |2019 |  3  |
|Mouse      | 1   |2019 |  3  |
|Monitor    | 2   |2019 |  3  |
|Keyboard      | 4   |2019 |  3  |
|Desk    | 100 |2018 |  12 |
|Chair      | 150 |2018 |  12 |​

Step 7: Execute the INSERT OVERWRITE statement to insert data into the partitioned table

​insert overwrite dla values ("Computer",10,"2019","3");
select * from dla;
+---------------------------+
|prod_name | cnt | y  |  m  |
+---------------------------+
|Computer      | 10  |2019 |  3  |
|Desk    | 100 |2018 |  12 |
|Chair      | 150 |2018 |  12 |​

After the INSERT OVERWRITE statement is executed, the data in the table is overwritten by the inserted data. No change has been made to the data in the dla/y=2018/m=12/ directory of the OSS bucket. However, the cnt.txt file in the dla/y=2019/m=3/ directory has been deleted and only the data record that is inserted by using INSERT OVERWRITE exists.