MaxCompute allows you to insert data into a dynamic partition by using INSERT INTO
or INSERT OVERWRITE
.
You can execute the statements on the following platforms:
Prerequisites
Before you execute the INSERT INTO
or INSERT OVERWRITE
statement, make sure that you are granted the Update permission on the destination table and the Select permission on the metadata of the source table. For more information, see MaxCompute permissions.
Feature description
When you use MaxCompute SQL to process data, you need to only specify the names of partition key columns in INSERT INTO
or INSERT OVERWRITE
, instead of the values of the partition key columns. After you specify the values of the partition key columns in SELECT
, MaxCompute automatically inserts data into the destination partitions based on the column values.
For more information about how to insert data into a static partition, see Insert or update data into a table or a static partition (INSERT INTO and INSERT OVERWRITE).
Limits
If you insert data into a dynamic partition by using INSERT INTO
or INSERT OVERWRITE
, take note of the following limits:
A maximum of 10,000 dynamic partitions can be generated after
INSERT INTO
is executed. A maximum of 60,000 dynamic partitions can be generated afterINSERT OVERWRITE
is executed.In a distributed environment, an SQL statement used to insert or update data into dynamic partitions can generate a maximum of 512 dynamic partitions. If the number of dynamic partitions exceeds this limit, an exception occurs.
The values of the dynamic partitions cannot be NULL, and cannot contain special characters. Otherwise, the following error is reported:
FAILED: ODPS-0123031: Partition exception - invalid dynamic partition value: province=xxx
.NoteThe value of a partition key column cannot contain double-byte characters, such as Chinese characters. The value of a partition key column must start with a letter and can contain letters, digits, and supported special characters. It must be 1 to 255 bytes in length. The following special characters are supported: spaces, colons (:), underscores (_), dollar signs ($), number signs (#), periods (.), exclamation points (!), and at signs (@). The behavior of other characters is not defined, such as escape characters
\t
,\n
, and/
.Clustered tables do not support dynamic partitions.
Precautions
If you want to update table data into a dynamic partition, take note of the following points:
If you want to use INSERT INTO or INSERT OVERWRITE to
INSERT INTO PARTITION
that does not exist, MaxCompute automatically creates a partition.If you want to run multiple jobs at the same time to
INSERT INTO PARTITION
that do not exist, MaxCompute automatically creates partitions for the first job that is successfully executed. However, only one partition is created for this job.If you cannot control the job concurrency of the
INSERT INTO PARTITION
, we recommend that you run theALTER TABLE
command to create partitions in advance. For more information, see Partition operations.If a destination table has multiple levels of partitions, you can specify some partitions as static partitions in an
INSERT
statement. However, the static partitions must be high-level partitions.To insert data into a dynamic partition, you must specify partition key columns in
SELECT
. Otherwise, the data will fail to be inserted.
Syntax
INSERT {INTO|OVERWRITE} TABLE <table_name> PARTITION (<ptcol_name>[, <ptcol_name> ...])
<select_statement> FROM <from_statement>;
Parameters
Parameter | Required | Description |
table_name | Yes | The name of the destination table into which you want to insert data. |
ptcol_name | Yes | The name of the partition key column in the destination table. |
select_statement | Yes | The If the destination table has only level-1 dynamic partitions, the value of the last field in |
from_statement | Yes | The |
Sample data
--Create a partitioned table sale_detail.
CREATE TABLE IF NOT EXISTS sale_detail
(
shop_name STRING,
customer_id STRING,
total_price DOUBLE
)
PARTITIONED BY (sale_date STRING, region STRING);
--Add a partition to the src table. This operation is optional. If you do not create a partition in advance, a partition is automatically created when you write data to the table.
ALTER TABLE sale_detail ADD PARTITION (sale_date='2013', region='china');
--Append data to the src table. The abbreviated form of INSERT INTO TABLE table_name is INSERT INTO table_name. The TABLE keyword in INSERT OVERWRITE TABLE table_name cannot be omitted.
INSERT INTO sale_detail PARTITION (sale_date='2013', region='china') VALUES ('s1','c1',100.1),('s2','c2',100.2),('s3','c3',100.3);
--Enable a full table scan only for the current session. Execute the SELECT statement to view data in the sale_detail table.
SET odps.sql.allow.fullscan=true;
SELECT * FROM sale_detail;
--Return results.
+------------+-------------+-------------+------------+------------+
| shop_name | customer_id | total_price | sale_date | region |
+------------+-------------+-------------+------------+------------+
| s1 | c1 | 100.1 | 2013 | china |
| s2 | c2 | 100.2 | 2013 | china |
| s3 | c3 | 100.3 | 2013 | china |
+------------+-------------+-------------+------------+------------+
Examples
The following examples use data from the sale_detail table, as shown in Sample data.
Example 1: Insert data from a source table into a destination table. You can obtain the partitions generated based on the region column only after the statement is executed. The following statements show an example:
--Create a destination table total_revenues. CREATE TABLE total_revenues (revenue DOUBLE) PARTITIONED BY (region string); --Insert the data from the sale_detail table into the total_revenues table. SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE total_revenues PARTITION(region) SELECT total_price AS revenue, region FROM sale_detail; --Execute the SHOW PARTITIONS statement to view the partitions of the total_revenues table. SHOW PARTITIONS total_revenues; --Return results. region=china --Enable a full table scan only for the current session. Execute the SELECT statement to view data in the total_revenues table. SET odps.sql.allow.fullscan=true; SELECT * FROM total_revenues; --Return results. +------------+------------+ | revenue | region | +------------+------------+ | 100.1 | china | | 100.2 | china | | 100.3 | china | +------------+------------+
Example 2: Insert data from a source table into a destination table. If the destination table has multiple levels of partitions, the level-1 partition sale_date must be specified. The following statements show an example:
--Create a destination table sale_detail_dypart. CREATE TABLE sale_detail_dypart LIKE sale_detail; --Specify a level-1 partition and insert data into the destination table. SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE sale_detail_dypart PARTITION (sale_date='2013', region) SELECT shop_name,customer_id,total_price,region FROM sale_detail; --Enable a full table scan only for the current session. Execute the SELECT statement to view data in the sale_detail_dypart table. SET odps.sql.allow.fullscan=true; SELECT * FROM sale_detail_dypart; --Return results. +------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+
Example 3: The mappings between the columns in select_statement and the columns in dynamic partitions in a destination table are determined by the column sequence, instead of the column names. The following statements show an example:
--Insert data from the sale_detail table into the sale_detail_dypart table. SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE sale_detail_dypart PARTITION (sale_date, region) SELECT shop_name,customer_id,total_price,sale_date,region FROM sale_detail; --Enable a full table scan only for the current session. Execute the SELECT statement to view data in the sale_detail_dypart table. SET odps.sql.allow.fullscan=true; SELECT * FROM sale_detail_dypart; --Return results. The sale_date column in the dynamic partition of the sale_detail_dypart table is determined by the sale_date column in the sale_detail table. The region column in the dynamic partition of the sale_detail_dypart table is determined by the region column in the sale_detail table. +------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+ --Insert data from the sale_detail table into the sale_detail_dypart table and change the sequence of columns in select_statement. SET odps.sql.allow.fullscan=true; INSERT OVERWRITE TABLE sale_detail_dypart PARTITION (sale_date, region) SELECT shop_name,customer_id,total_price,region,sale_date FROM sale_detail; --Enable a full table scan only for the current session. Execute the SELECT statement to view data in the sale_detail_dypart table. SET odps.sql.allow.fullscan=true; SELECT * FROM sale_detail_dypart; --Return results. The sale_date column in the dynamic partition of the sale_detail_dypart table is determined by the region column in the sale_detail table. The region column in the dynamic partition of the sale_detail_dypart table is determined by the sale_date column in the sale_detail table. +------------+-------------+-------------+------------+------------+ | shop_name | customer_id | total_price | sale_date | region | +------------+-------------+-------------+------------+------------+ | s1 | c1 | 100.1 | china | 2013 | | s2 | c2 | 100.2 | china | 2013 | | s3 | c3 | 100.3 | china | 2013 | | s1 | c1 | 100.1 | 2013 | china | | s2 | c2 | 100.2 | 2013 | china | | s3 | c3 | 100.3 | 2013 | china | +------------+-------------+-------------+------------+------------+
Example 4 (Incorrect Example): If you insert data into a dynamic partition, you must specify the columns in the dynamic partition in
SELECT
. Otherwise, the data will fail to be inserted. Sample statement of incorrect usage:INSERT OVERWRITE TABLE sale_detail_dypart PARTITION (sale_date='2013', region) SELECT shop_name,customer_id,total_price FROM sale_detail;
Returned result:
FAILED: ODPS-0130071:[1,24] Semantic analysis exception - wrong columns count 3 in data source, requires 4 columns (includes dynamic partitions if any)
Example 5 (Incorrect Example): If you specify only low-level sub-partitions when you insert data into dynamic partitions, you may fail to insert data into high-level partitions. Sample statement of incorrect usage:
INSERT OVERWRITE TABLE sale_detail_dypart PARTITION (sale_date, region='china') SELECT shop_name,customer_id,total_price,sale_date FROM sale_detail_dypart;
Returned result:
FAILED: ODPS-0130071:[1,72] Semantic analysis exception - static partition region must be a high level partition than any dynamic partitions
Example 6: If the data type of a partition key column does not exactly match the data type of the column in
SELECT
, an implicit conversion is performed when MaxCompute inserts data into a dynamic partition. The following statements show an example:--Create a source table src. CREATE TABLE src (c INT, d STRING) PARTITIONED BY (e INT); --Add a partition to the src table. ALTER TABLE src ADD IF NOT EXISTS PARTITION (e=201312); --Append data to the src table. INSERT INTO src PARTITION (e=201312) VALUES (1,100.1),(2,100.2),(3,100.3); --Create a destination table parttable. CREATE TABLE parttable(a INT, b DOUBLE) PARTITIONED BY (p STRING); --Insert data from the src table into the parttable table. SET odps.sql.allow.fullscan=true; INSERT INTO parttable PARTITION (p) SELECT c, d, CURRENT_TIMESTAMP() FROM src; --Query data in the parttable table. SET odps.sql.allow.fullscan=true; SELECT * FROM parttable; --The return result is as follows. +------------+------------+------------+ | a | b | p | +------------+------------+------------+ | 1 | 100.1 | 2024-12-10 15:59:34.492 | | 2 | 100.2 | 2024-12-10 15:59:34.492 | | 3 | 100.3 | 2024-12-10 15:59:34.492 | +------------+------------+------------+
NoteIf your data is ordered, it is randomly scattered when the data is inserted into a dynamic partition. This reduces the data compression ratio. In this case, we recommend that you use Tunnel commands to upload the data to dynamic partitions to increase the data compression ratio. For more information about how to use Tunnel commands, see Migrate data from ApsaraDB RDS to MaxCompute based on dynamic partitioning.