INSERT INTO in StarRocks works similarly to MySQL, but each execution runs as an independent import job rather than an in-place write. The result is synchronously returned when the statement completes. Use it to load a small number of rows for verification or to run extract, transform, and load (ETL) transformations between StarRocks tables.
Use cases
-
Test data loading: Insert a few rows to verify table structure or StarRocks behavior before building a full pipeline.
-
ETL between StarRocks tables: Transform data in one StarRocks table and write the results to another using
INSERT INTO SELECT. -
Loading from a foreign table: Map an external MySQL table as a foreign table, then use
INSERT INTO SELECTto load its data into a StarRocks internal table.
INSERT INTO is not suitable for routine production ingestion. Each execution creates a new data version, and frequent small inserts generate excessive versions that degrade query performance. For streaming or high-frequency small-batch ingestion, use Stream Load or Routine Load instead.
Syntax
INSERT INTO table_name
[ PARTITION (p1, ...) ]
[ WITH LABEL label ]
[ (column [, ...]) ]
[ [ hint [, ...] ] ]
{ VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
Parameters
| Parameter | Description |
|---|---|
table_name |
The destination table. Use the db_name.table_name format. |
PARTITION |
The partitions to load data into. Partitions must exist in table_name. Separate multiple partition names with commas. If omitted, data is loaded into all partitions. |
label |
A unique identifier for the import job within a database. Specify a label manually rather than relying on auto-generated labels — if the connection drops due to a network error, a known label lets you query the job result afterward. |
column_name |
The destination columns to write to. Columns can be listed in any order. If omitted, all columns in table_name are used. Any source column not mapped to a destination column must have a default value, or the statement fails. If column types differ, an implicit conversion is attempted; if conversion fails, a syntax parsing error is returned. |
expression |
An expression that assigns a value to a column. |
DEFAULT |
Assigns the column's default value. |
query |
A query statement whose result set is written to the destination table. All SQL statements supported by StarRocks are valid. |
VALUES |
One or more literal data rows. Use only for quick demo verification — not for testing or production. For loading multiple rows, prefer INSERT INTO SELECT. |
Prerequisites
Before you begin, make sure that you have:
-
A destination table in StarRocks
-
The required permissions to write to that table
Set up the example tables
The examples in this topic use the following database and tables. Run these statements to set them up.
-
Create the database:
CREATE DATABASE IF NOT EXISTS load_test; -
Switch to the database:
USE load_test; -
Create the destination table:
CREATE TABLE insert_wiki_edit ( event_time DATETIME, channel VARCHAR(32) DEFAULT '', user VARCHAR(128) DEFAULT '', is_anonymous TINYINT DEFAULT '0', is_minor TINYINT DEFAULT '0', is_new TINYINT DEFAULT '0', is_robot TINYINT DEFAULT '0', is_unpatrolled TINYINT DEFAULT '0', delta INT SUM DEFAULT '0', added INT SUM DEFAULT '0', deleted INT SUM DEFAULT '0' ) AGGREGATE KEY(event_time, channel, user, is_anonymous, is_minor, is_new, is_robot, is_unpatrolled) PARTITION BY RANGE(event_time) ( PARTITION p06 VALUES LESS THAN ('2015-09-12 06:00:00'), PARTITION p12 VALUES LESS THAN ('2015-09-12 12:00:00'), PARTITION p18 VALUES LESS THAN ('2015-09-12 18:00:00'), PARTITION p24 VALUES LESS THAN ('2015-09-13 00:00:00') ) DISTRIBUTED BY HASH(user) BUCKETS 10 PROPERTIES("replication_num" = "1");
Insert rows with VALUES
Use INSERT INTO VALUES to write a small number of literal rows directly into a table. This is intended only for quick verification — not for production or performance testing.
INSERT INTO insert_wiki_edit VALUES
("2015-09-12 00:00:00","#en.wikipedia","GELongstreet",0,0,0,0,0,36,36,0),
("2015-09-12 00:00:00","#ca.wikipedia","PereBot",0,1,0,1,0,17,17,0);
Expected output:
Query OK, 2 rows affected (0.29 sec)
{'label':'insert_1f12c916-5ff8-4ba9-8452-6fc37fac2e75', 'status':'visible', 'txnId':'601'}
2 rows affected confirms that both rows were loaded. status: visible means the data is immediately queryable.
Load data with INSERT INTO SELECT
Use INSERT INTO SELECT to load data from another StarRocks table or to run ETL transformations as part of the load.
Assign a label to the job so you can query its result even if the connection drops:
INSERT INTO insert_wiki_edit WITH LABEL insert_load_wikipedia
SELECT * FROM routine_wiki_edit;
Expected output:
Query OK, 18203 rows affected (0.40 sec)
{'label':'insert_load_wikipedia', 'status':'visible', 'txnId':'618'}
Check the import job status
If a connection drops before the result is returned, use the label to query the job status. Execute the SHOW LOAD statement and use the label to identify the job. The URL in the returned result can also be used to query error data.
If you did not assign a label, use the auto-generated label from the import result to run the same query.
Understand return values
A successful INSERT INTO statement returns a status line immediately after execution. The following table describes each field.
| Field | Description |
|---|---|
rows affected |
Total number of rows loaded. |
warnings |
Number of rows filtered out (excluded from the load). |
label |
The import job identifier. Auto-generated if not specified with WITH LABEL. |
status |
visible: the loaded data is immediately queryable. committed: the imported data is invisible. |
txnId |
The transaction ID for this import job. |
err |
Present only on unexpected errors. Use SHOW LOAD with the returned URL to inspect filtered-out rows. |
Example return values
Empty source table — no rows loaded:
Query OK, 0 rows affected (0.02 sec)
Successful load with auto-generated label — 4 rows loaded, data immediately queryable:
Query OK, 4 rows affected (0.38 sec)
{'label':'insert_8510c568-9eda-4173-9e36-6adc7d35291c', 'status':'visible', 'txnId':'4005'}
Successful load with custom label — same result, label is user-specified:
Query OK, 4 rows affected (0.38 sec)
{'label':'my_label1', 'status':'visible', 'txnId':'4005'}
Partial load with filtered rows — 2 rows loaded, 2 filtered out due to format errors:
Query OK, 2 rows affected, 2 warnings (0.31 sec)
{'label':'insert_f0747f0e-7a35-46e2-affa-13a235f4020d', 'status':'visible', 'txnId':'4005'}
This can also return status: committed, meaning the imported data is invisible.
Failed load — all rows rejected, no data loaded:
ERROR 1064 (HY000): all partitions have no load data. url: http://10.74.167.16:8042/api/_load_error_log?file=__shard_2/error_log_insert_stmt_ba8bb9e158e4879-ae8de8507c0bf8a2_ba8bb9e158e4879_ae8de8507c0bf8a2
Open the URL to inspect the rows that caused the failure.
Configuration
FE node parameters
| Parameter | Default | Description |
|---|---|---|
insert_load_default_timeout_second |
3600 (1 hour) | Timeout for all INSERT INTO import jobs, in seconds. This setting applies globally — you cannot set a different timeout for individual jobs. If a job exceeds this limit, it is canceled and moves to the CANCELLED state. Adjust this value on the frontend (FE) node if your jobs consistently time out. |
Session parameters
| Parameter | Default | Description |
|---|---|---|
enable_insert_strict |
true |
Controls how the job handles invalid rows. true: the job fails if any row is filtered out. false: the job succeeds if at least one row is loaded; a label is returned on partial failure. Set with SET enable_insert_strict = false. |
query_timeout |
— | Session-level query timeout in seconds. INSERT INTO is subject to this timeout. Increase it with SET query_timeout = <seconds> if your load jobs time out before the FE-level timeout. |
Usage notes
-
By default, rows that do not conform to the destination table's format (for example, a string exceeding the declared length) are silently filtered out. Set
enable_insert_strict = trueto fail the job instead of filtering. -
Each
INSERT INTOexecution generates a new data version. Avoid running it at high frequency in production — excessive versions degrade query performance.
What's next
-
Stream Load — for high-throughput or streaming ingestion
-
Routine Load — for continuous ingestion from message queues