This topic describes the INSERT OVERWRITE INTO SELECT
statement, which is a high-performance data writing method used in AnalyticDB for MySQL.
Scenarios
INSERT OVERWRITE INTO SELECT
statement is suitable for the following scenarios:
- Data is written to a table by partition.
- All data of a table is initialized.
- Large amounts of data are written to a table.
How it works
The INSERT OVERWRITE INTO SELECT statement writes data to AnalyticDB for MySQL by using external tables. To write data to AnalyticDB for MySQL, you must define an external table of a specific data source in AnalyticDB for MySQL and execute the INSERT OVERWRITE INTO SELECT
statement.
INSERT OVERWRITE INTO SELECT
statement:
- During a data writing task, large amounts of cluster resources are consumed to ensure high performance. We recommend that you use this statement during off-peak hours.
- The written data is invisible when the data writing task is in progress, and becomes visible only after the task is completed. The original data in the destination table does not change until the
INSERT OVERWRITE INTO SELECT
statement is complete. After theINSERT OVERWRITE INTO SELECT
statement is complete, the system writes inserted data to the destination table and clears the original data of the changed partitions. - If a partition is written to AnalyticDB for MySQL by using the
INSERT OVERWRITE INTO SELECT
statement, data of the existing partition that has the same name is overwritten. - Table indexes are automatically being created at the same time when data is being written and available immediately after the data writing task is completed. This improves query performance.
Precautions
Do not execute the INSERT OVERWRITE INTO SELECT
and real-time writing statements (INSERT INTO, REPLACE INTO, DELETE, or UPDATE) on the same table at the same time. Otherwise, the data written in real time is discarded.
Syntax
INSERT OVERWRITE INTO table_name [(column_name,...)]
select_statement
Parameters
- table_name: the name of the destination table.
- column_name: the name of the column in the destination table.
- select_statement: the SELECT statement.
The data type of each column in the SELECT statement must match the data type in [(column_name,...)].
If the number of columns in the SELECT statement is larger than the number of columns in [(column_name,...)], the data writing fails. If the number of columns in the SELECT statement is smaller than the number of columns in [(column_name,...)], the additional columns in [(column_name,...)] are automatically filled with the default values. If no default value is specified, NULL is used.
Examples
- Write data of the cust_id column in the Customers table to the vend_id column of the Vendors table.
INSERT OVERWRITE INTO Vendors (vend_id) SELECT cust_id FROM Customers;
- Write data by partition.
- Create a source table named
test_source
and a destination table namedtest_target
.CREATE TABLE test_source (a bigint, b bigint) distribute by hash(a) ;
CREATE TABLE test_target (a bigint, b bigint) distribute by hash(a) partition by value(b) lifecycle 10;
- Initialize the
test_source
table.INSERT INTO test_source values (1,1); INSERT INTO test_source values (1,2); INSERT INTO test_source values (1,3); INSERT INTO test_source values (2,1); INSERT INTO test_source values (2,2); INSERT INTO test_source values (2,3);
- Initialize the
test_target
table.INSERT OVERWRITE INTO test_target SELECT * FROM test_source where a = 1;
- Write data to the
test_target
table by overwriting partition 1.
Write data to theINSERT OVERWRITE INTO test_target SELECT * FROM test_source WHERE a = 2 and b = 1;
test_target
table by overwriting partitions 2 and 3.INSERT OVERWRITE INTO test_target SELECT * FROM test_source WHERE a = 2 and b >= 2 and b <= 3;
- Create a source table named
Asynchronous writing
Submit a task
SUBMIT JOB
statement to submit an asynchronous task. Sample statement:
SUBMIT JOB
INSERT OVERWRITE INTO adb_table
SELECT * FROM adb_external_table;
Optimize writing performance
/* direct_batch_load=true*/
) to the beginning of a data writing statement to accelerate the writing task while saving resources. Sample statement:
/* direct_batch_load=true*/
SUBMIT JOB
INSERT OVERWRITE INTO adb_table
SELECT * FROM adb_external_table;
/* direct_batch_load=true*/
hint is supported only for AnalyticDB for MySQL V3.1.5 and later. If performance is not improved, you can
Submit a ticket. For more information about how to view the minor engine version of a cluster, see
How can I view the version of an AnalyticDB for MySQL cluster?.
Query the progress
SUBMIT JOB
statement, the value of job_id is returned. You can use the job_id value to query the status of the writing task. Sample statement:
SHOW job status WHERE job='<job_id>';
If the writing task is completed, FINISH is displayed in the status column.