To insert data into a partitioned table by using the INSERT OVERWRITE statement, you can specify a partitioning column value in the statement. You can also specify the name of a partitioning column in the INSERT OVERWRITE statement but set the partitioning column value in the SELECT clause. The method of specifying the name of a partitioning column in the INSERT OVERWRITE statement indicates that data is inserted in dynamic partition mode. This topic describes the basic syntax and provides examples on how to insert data in dynamic partition mode.

Syntax

Statement
INSERT OVERWRITE TABLE tablename PARTITION (partcol1, partcol2 ...) select_statement FROM from_statement;
Parameters
  • tablename

    The name of the destination table into which you want to insert the data.

  • partcol1, partcol2 ...

    The names of partitioning columns in the destination table into which you want to insert the data.

  • select_statement

    The statement used to query data in the source table. The fields in the select_statement clause provide the dynamic partition values in the destination table. If the destination table only has level-1 dynamic partitions, the value of the last field in the select_statement clause is the dynamic partition value in the destination table.

    Note
    • If you execute an SQL statement that supports the dynamic partition feature in a distributed environment, a single process can provide up to 512 dynamic partitions. If the number of dynamic partitions exceeds this limit, an exception occurs.
    • An SQL statement that supports the dynamic partition feature can generate up to 2,000 dynamic partitions. If the number of dynamic partitions exceeds this limit, an exception occurs.
    • A dynamic partition value cannot be NULL or contain special characters. Otherwise, FAILED: ODPS-0123031:Partition exception - invalid dynamic partition value: province=xxx is returned.
    • If a destination table has multi-level partitions, you can specify some partitions as static partitions in an INSERT statement. However, the static partitions must be high-level partitions.
    • If a destination table is a hash clustering table, the dynamic partition feature is not supported.

Precautions

In dynamic partition mode, the mapping between the fields in the select_statement clause and dynamic partitions in a destination table is determined by the field order, not by the column name. If the field order in a source table is different from that in a destination table, we recommend that you specify the fields in the select_statement clause based on the order in the destination table.

For example, create destination table sale_detail_dypart.
create table sale_detail_dypart like sale_detail; 
  • Execute the following statements to insert the data from source table sale_detail into destination table sale_detail_dypart:
    insert overwrite table sale_detail_dypart partition (sale_date, region)
      select shop_name,customer_id,total_price,sale_date,region from sale_detail;

    In these statements, sale_date in sale_detail determines the value of the sale_date dynamic partition in sale_detail_dypart, and region in sale_detail determines the value of the region dynamic partition in sale_detail_dypart.

  • Execute the following statements to insert the data from sale_detail into sale_detail_dypart:
    insert overwrite table sale_detail_dypart partition (sale_date, region)
      select shop_name,customer_id,total_price,region,sale_date from sale_detail;

    In these statements, region in sale_detail determines the value of the sale_date dynamic partition in sale_detail_dypart, and sale_date in sale_detail determines the value of the region dynamic partition in sale_detail_dypart.

Note By default, the full scan is disabled for partitioned tables of a newly created project. If a full table scan statement is executed on a partitioned table of the project, a failure message is returned. To address this issue, you must specify a partition condition during the query. Therefore, when you execute a full table scan statement, you must insert set odps.sql.allow.fullscan=true; before the statement and then commit the statements for execution.

Examples

  • Example 1
    -- Create destination table total_revenues.
    create table total_revenues (revenue bigint) partitioned by (region string);
    
    -- Insert the data from sale_detail to total_revenues.
    insert overwrite table total_revenues partition(region)
      select total_price as revenue, region from sale_detail;

    You can obtain the partitions generated based on the region field only after the SELECT statement is executed.

  • Example 2: Insert the data from a source table into a destination table. If the destination table has multi-level partitions, level-1 partition sale_date must be specified.
    insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
      select shop_name,customer_id,total_price,region from sale_detail;
  • Example 3: To insert data in dynamic partition mode, the dynamic partition column must be included in the SELECT statement. Otherwise, the execution fails.
    insert overwrite table sale_detail_dypart partition (sale_date='2013', region)
      select shop_name,customer_id,total_price from sale_detail;
  • Example 4: If you specify only low-level sub-partitions when you insert data in dynamic partition mode, inserting data into high-level partitions will fail.
    insert overwrite table sales partition (region='china', sale_date)
      select shop_name,customer_id,total_price,sale_date from sale_detail;
  • Example 5: If the types of values in partitioning columns are inconsistent with those in the SELECT clause, an error is returned when data is inserted in dynamic partition mode in MaxCompute V1.0. MaxCompute V2.0 supports the implicit conversion of data types.
    -- Create destination table parttable.
    create table parttable(a int, b double) partitioned by (p string);
    -- Insert the data from source table src into parttable.
    insert into parttable partition(p) select key, value, current_timestmap() from src;
    -- Query data in parttable.
    select * from parttable;
    The following table lists the results.
    a b p
    0 NULL 2017-01-23 22:30:47.130406621
    0 NULL 2017-01-23 22:30:47.130406621
    Note If your data is ordered, it is randomly scattered when it is inserted in dynamic partition mode, which reduces the compression ratio. In this case, we recommend that you use Tunnel commands to upload the data to dynamic partitions to increase the compression ratio. For more information, see Migrate data from RDS to MaxCompute to implement dynamic partitioning.