MaxCompute allows you to insert data into a dynamic partition by using INSERT INTO or INSERT OVERWRITE.

Before you execute these statements, make sure that you are granted the ALTER permission on the destination table and the DESCRIBE permission on metadata in the source table. For more information, see Authorize users.

You can execute the INSERT INTO or INSERT OVERWRITE statement on the following platforms:

Description

When you use MaxCompute SQL to process data, you need only to specify the names of partition key columns in INSERT INTO or INSERT OVERWRITE, instead of the values of the partition key columns. The values of the partition key columns are specified in select_statement. MaxCompute automatically inserts data into the destination partitions based on the column values.

For more information about how to insert data into a static partition, see Insert or update data into a table or static partitions (INSERT INTO and INSERT OVERWRITE).

Limits

When you insert data into a dynamic partition by using INSERT INTO or INSERT OVERWRITE, take note of the following limits:
  • A maximum of 10,000 dynamic partitions can be generated after INSERT INTO is executed. A maximum of 60,000 dynamic partitions can be generated after INSERT OVERWRITE is executed.
  • In a distributed environment, an SQL statement for inserting or updating data into dynamic partitions can generate a maximum of 512 dynamic partitions. If the number of dynamic partitions exceeds this limit, an exception occurs.
  • Column values of dynamic partitions cannot be NULL or contain special characters in INSERT statements or the subqueries. Otherwise, FAILED: ODPS-0123031:Partition exception - invalid dynamic partition value: province=xxx is returned.
  • Clustered tables do not support dynamic partitions.

Precautions

If you want to update table data into a dynamic partition, take note of the following points:
  • If you want to use INSERT INTO or INSERT OVERWRITE to insert data into a partition that does not exist, MaxCompute automatically creates the partition.
  • If you want to run multiple jobs at the same time to insert data into partitions that do not exist, MaxCompute automatically creates partitions for the job that is successfully executed. However, only one partition is created for this job.
  • If you cannot control the concurrency of jobs, we recommend that you run the ALTER TABLE command to create partitions in advance. For more information, see Partition and column operations.
  • If a destination table has multi-level partitions, you can specify some partitions as static partitions in an INSERT statement. However, the static partitions must be high-level partitions.
  • To insert data into a dynamic partition, you must specify partition key columns in select_statement. Otherwise, the data fails to be inserted.

Syntax

insert {into|overwrite} table <table_name> partition (<ptcol_name>[, <ptcol_name> ...]) 
<select_statement> from <from_statement>;
  • table_name: required. The name of the destination table into which you want to insert data.
  • ptcol_name: required. The name of the partition key column in the destination table.
  • select_statement: required. The SELECT clause that is used to query data that you want insert into the destination table from the source table.

    If the destination table has only level-1 dynamic partitions, the value of the last field in select_statement indicates the column value of the dynamic partition in the destination table. The mappings between column values in the source table in select_statement and column values in the destination table are determined by the column sequence, instead of column names. If the sequence of columns in the source table is different from that in the destination table, we recommend that you specify columns in select_statement based on the column sequence in the destination table.

  • from_statement: required. The FROM clause that indicates the data source, such as a source table name.

Examples

  • Example 1: Insert data from a source table into a destination table. You can obtain the partitions generated based on the region column only after the statement is executed. Sample commands:
    -- Create a destination table named total_revenues.
    create table total_revenues (revenue double) partitioned by (region string);
    
    -- Insert data from the sale_detail table into the total_revenues table. For more information about the sale_detail table, see Insert or update data into a table or static partitions (INSERT INTO and INSERT OVERWRITE).
    insert overwrite table total_revenues partition(region)
    select total_price as revenue, region from sale_detail;
    
    -- Execute the SHOW PARTITIONS statement to view the partitions in the total_revenues table.
    show partitions total_revenues;
       
    -- The following result is returned:
    region=china  
    
    -- Enable a full table scan only for the current session. Execute the SELECT statement to view the data in the sale_detail_dypart table.  
    set odps.sql.allow.fullscan=true; 
    select * from sale_detail_dypart;    
    
    -- The following result is returned:
    +------------+------------+
    | revenue    | region     |
    +------------+------------+
    | 100.1      | china      |
    | 100.2      | china      |
    | 100.3      | china      |
    +------------+------------+        
  • Example 2: Insert data from a source table into a destination table. If the destination table has multi-level partitions, level-1 partition sale_date must be specified. Sample commands:
    -- Create a destination table named sale_detail_dypart. 
    create table sale_detail_dypart like sale_detail; 
    
    -- Specify a level-1 partition and insert data into the destination table.
    insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
    select shop_name,customer_id,total_price,region from sale_detail;
    
    -- Enable a full table scan only for the current session. Execute the SELECT statement to view the data in the sale_detail_dypart table.
    set odps.sql.allow.fullscan=true; 
    select * from sale_detail_dypart;
    
    -- The following result is returned:
    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    +------------+-------------+-------------+------------+------------+
  • Example 3: The mappings between the columns in select_statement and the columns in dynamic partitions in a destination table is determined by the column sequence, instead of the column names. Sample commands:
    -- Insert data from the sale_detail table into the sale_detail_dypart table.
    insert overwrite table sale_detail_dypart partition (sale_date, region)
    select shop_name,customer_id,total_price,sale_date,region from sale_detail;
    
    -- Enable a full table scan only for the current session. Execute the SELECT statement to view the data in the sale_detail_dypart table.
    set odps.sql.allow.fullscan=true; 
    select * from sale_detail_dypart;
    
    -- The following result is returned: The sale_date column in the dynamic partition of the sale_detail_dypart table is determined by the sale_date column in the sale_detail table. The region column in the dynamic partition of the sale_detail_dypart table is determined by the region column in the sale_detail table.
    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | 2013       | china      |
    | s2         | c2          | 100.2       | 2013       | china      |
    | s3         | c3          | 100.3       | 2013       | china      |
    +------------+-------------+-------------+------------+------------+
    
    -- Insert data from the sale_detail table into the sale_detail_dypart table and change the sequence of columns in select_statement.
    insert overwrite table sale_detail_dypart partition (sale_date, region)
    select shop_name,customer_id,total_price,region,sale_date from sale_detail;
    
    -- Enable a full table scan only for the current session. Execute the SELECT statement to view the data in the sale_detail_dypart table.
    set odps.sql.allow.fullscan=true; 
    select * from sale_detail_dypart;
    
    -- The following result is returned: The sale_date column in the dynamic partition of the sale_detail_dypart table is determined by the region column in the sale_detail table. The region column in the dynamic partition of the sale_detail_dypart table is determined by the sale_date column in the sale_detail table.
    +------------+-------------+-------------+------------+------------+
    | shop_name  | customer_id | total_price | sale_date  | region     |
    +------------+-------------+-------------+------------+------------+
    | s1         | c1          | 100.1       | china      | 2013       |
    | s2         | c2          | 100.2       | china      | 2013       |
    | s3         | c3          | 100.3       | china      | 2013       |
    +------------+-------------+-------------+------------+------------+
  • Example 4: If you insert data into a dynamic partition, you must specify columns in the dynamic partition in select_statement. Otherwise, the data fails to be inserted. Example of incorrect usage:
    insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
    select shop_name,customer_id,total_price from sale_detail;
  • Example 5: If you specify only low-level sub-partitions when you insert data into dynamic partitions, you may fail to insert data into high-level partitions. Example of incorrect usage:
    insert overwrite table sales partition (region='china', sale_date)
    select shop_name,customer_id,total_price,sale_date from sale_detail;
  • Example 6: When MaxCompute inserts data into a dynamic partition, if the data type of a partition key column does not exactly match the data type of the column in select_statement, an implicit conversion is performed. Sample commands:
    -- Create a source table named src.
    create table src (c int, d string) partitioned by (e int);
    
    -- Add a partition to the src table.
    alter table src add if not exists partition (e=201312);
    
    -- Append data to the src table.
    insert into src partition (e=201312) values (1,100.1),(2,100.2),(3,100.3);
    
    -- Create a destination table named parttable.
    create table parttable(a int, b double) partitioned by (p string);
    
    -- Insert data from the src table into the parttable table.
    insert into parttable partition (p) select c, d, current_timestamp() from src;
    
    -- Query data in the parttable table.
    select * from parttable;
    
    -- The following result is returned:
    +------------+------------+------------+
    | a          | b          | p          |
    +------------+------------+------------+
    | 1          | 100.1      | 2020-11-25 15:13:28.686 |
    | 2          | 100.2      | 2020-11-25 15:13:28.686 |
    | 3          | 100.3      | 2020-11-25 15:13:28.686 |
    +------------+------------+------------+
    Note If your data is ordered, it is randomly scattered when it is inserted into a dynamic partition. This reduces the data compression ratio. In this case, we recommend that you use Tunnel commands to upload the data to dynamic partitions to increase the data compression ratio. For more information about how to use Tunnel commands, see Migrate data from ApsaraDB RDS to MaxCompute based on dynamic partitioning.