INSERT OVERWRITE SELECT provides high-performance batch data writes for AnalyticDB for MySQL tables by clearing partition data and writing new data in bulk.
How it works
The INSERT OVERWRITE SELECT statement first clears all existing data from a partition and then writes new data to it in batches.
-
If the target table is a partitioned table,
INSERT OVERWRITE SELECToverwrites only the partitions to which the new data belongs. Other partitions are not affected. -
If the target table is a non-partitioned table,
INSERT OVERWRITE SELECTclears all existing data in the table and then writes the new data in batches.
Write jobs for a single table run sequentially (concurrency is fixed at 1). The default cluster-level write concurrency is 2. Do not adjust this value—it may degrade performance and overload the cluster.
To adjust the write concurrency, Submit a ticket to technical support for an evaluation.
Overview
Key characteristics:
-
High resource consumption: Run during off-peak hours to avoid impacting cluster performance.
-
Batch visibility: Written data becomes visible all at once only after the job completes.
-
Partition overwrite: Only partitions containing new data are overwritten.
-
Automatic index building: Indexes are built synchronously during writes, so the target table is query-ready immediately after the job completes.
Common use cases:
-
Partition-level data writes.
-
Data initialization (full data write).
-
Large-scale bulk data writes. Not recommended for small data volumes.
Precautions
Do not use INSERT OVERWRITE SELECT and real-time write methods (such as INSERT INTO, REPLACE INTO, DELETE, and UPDATE) to write to the same table concurrently. Otherwise, data from the real-time writes will be lost.
Syntax
INSERT OVERWRITE table_name (column_name[,...])
select_statement
Parameters
-
table_name: The name of the target table. -
column_name: The name of a column in the target table. -
select_statement: The SELECT statement.The data type of each column in the SELECT statement must match the data type of the corresponding column in the target table.
If the number of columns in the SELECT statement is greater than that in the target table, the write operation fails. If the number of columns in the SELECT statement is less than that in the target table, the remaining columns in the target table are populated with their default values. If a column has no default value, the system populates it with NULL.
Examples
Sample data
Prepare test tables and data for the overwrite examples.
-
Create a source table named
test_sourceand a target table namedtest_target.CREATE TABLE test_source (a BIGINT, b BIGINT) DISTRIBUTED BY HASH(a);CREATE TABLE test_target (a BIGINT, b BIGINT) DISTRIBUTED BY HASH(a) PARTITION BY VALUE(b) LIFECYCLE 10; -
Write test data into the source table
test_source.INSERT INTO test_source VALUES (1,1); INSERT INTO test_source VALUES (1,2); INSERT INTO test_source VALUES (1,3); INSERT INTO test_source VALUES (2,1); INSERT INTO test_source VALUES (2,2); INSERT INTO test_source VALUES (2,3);
Overwrite data
-
Insert initial data into the
test_targettable.INSERT OVERWRITE test_target SELECT * FROM test_source WHERE a = 1;Query the
test_targettable. The query returns the following result.+-----+------+ |1 |1 | |1 |2 | |1 |3 | +-----+------+ -
Overwrite the partition where b=1 in the
test_targettable.INSERT OVERWRITE test_target (a,b) SELECT a,b FROM test_source WHERE a = 2 AND b = 1;After the overwrite, querying
test_targetreturns:+-----+------+ |2 |1 | |1 |2 | |1 |3 | +-----+------+ -
Overwrite the partitions where b=2 and b=3 in the
test_targettable.INSERT OVERWRITE test_target SELECT * FROM test_source WHERE a = 2 AND b >= 2 AND b <= 3;After the overwrite, querying
test_targetreturns:+-----+------+ |2 |1 | |2 |2 | |2 |3 | +-----+------+
Asynchronous writes
Submit a job
SUBMIT JOB runs the write operation asynchronously in the background:
SUBMIT JOB
INSERT OVERWRITE adb_table
SELECT * FROM adb_external_table;
Optimize write performance
Add the hint /* direct_batch_load=true */ before the write operation to improve performance with lower resource consumption:
/* direct_batch_load=true*/
SUBMIT JOB
INSERT OVERWRITE adb_table
SELECT * FROM adb_external_table;
/* direct_batch_load=true*/ requires kernel version 3.1.5 or later. If performance does not improve, Submit a ticket for assistance. To check your kernel version, see View instance version information.
Query job progress
SUBMIT JOB returns a job_id. Use it to query job status:
SHOW JOB STATUS WHERE job='<job_id>';
A status of SUCCEEDED indicates the job is complete.