The UPSERT syntax works in the same manner as the combination of the INSERT and UPDATE syntaxes. If the specified row already exists, the UPDATE syntax is executed. If a row does not exist, the INSERT syntax is executed. When you execute an UPSERT statement, you must specify the column that is used as the primary key. The UPSERT syntax can be used to insert individual rows or multiple rows of data with timestamps in a batch. In this case, the UPSERT syntax is equivalent to the INSERT syntax.
Applicable engines and versions
The UPSERT syntax is applicable to all versions of LindormTable and LindormTSDB.
Comparison between the INSERT operation in rational databases and the UPSERT operation in Lindorm
The UPSERT operation in Lindorm is different from the INSERT operation in rational databases.
INSERT operations in rational databases
For a table with a specified primary key, if you perform two consecutive INSERT operations without specifying the ON DUPLICATE KEY clause to write data with the same primary key value, the second INSERT operation fails. In this case, you must explicitly execute the UPDATE statement to update data or specify the ON DUPLICATE KEY clause in the second INSERT operation.
INSERT operations in Lindorm
LindormTable: If you perform two consecutive UPSERT operations to write data with the same primary key, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation without errors reported. In this case, LindormTable stores the rows written by the two operation as different versions. By default, when a SELECT statement is executed to query the row of data, the latest version of each column in the row is returned.
LindormTSDB: If you perform two consecutive UPSERT operations to write data with the same primary key, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation.
For the example on how to use the UPSERT syntax to insert rows with the same primary key, see Write rows with the same primary key.
Syntax
upsert_statement ::= { UPSERT | INSERT } [ hint_expression ]
INTO table_identifier columns_delaration
VALUES value_list ( ',' value_list)*
[ ON DUPLICATE KEY column_identifier =
value_literal | IGNORE ]
columns_delaration ::= '(' column_identifier ( ',' column_identifier)* ')'
value_list ::= '(' value_expression( ',' value_expression)* ')'Parameters
HINT expression (hint_expression)
HINT expressions are supported only by LindormTable.
You can use the _l_ts_ hint in an UPSERT statement to specify a timestamp for the row that you want to update or insert. For more information about the _l_ts_ hint, see Parameters of hintOptions.
ON DUPLICATE KEY
The ON DUPLICATE KEY clause is supported only by LindormTable.
The ON DUPLICATE KEY clause is used to check whether the specified row exists and is similar to the CHECK AND PUT operation in HBase. The clause complies to the following rules:
You can use the
ON DUPLICATE KEYclause followed by theUPDATEkeyword to update the values in the specified column.If the specified row already exists, the statement following the
UPDATEkeyword is executed to update the value of the specified column.If the specified row does not exist, data is not updated and no error is reported in LindormTable earlier than 2.7.8. However, an error is reported and data in the
VALUESclause is inserted to the table in LindormTable 2.7.8 and later.
You can also use the
ON DUPLICATE KEYclause followed by theIGNOREkeyword. In this case, if the specified row already exists, no data is updated and no error is reported. If the specified row does not exist, data written to the table.In LindormTable 2.7.8 and later, you can use the
ON DUPLICATE KEYclause followed by theERRORkeyword. In this case, if the specified row already exists, an error is reported. If the specified row does not exist, data written to the table.The ON DUPLICATE KEY clause is applicable only to tables for which the CONSISTENCY parameter is set to
strong. For more information about the CONSISTENCY parameter, see Table attributes (table_options). To change the value of the CONSISTENCY parameter for a table, see ALTER TABLE.
In Lindorm SQL 2.8.8.2 and later, if you want to execute the UPSERT statement to automatically insert the current timestamp, you can include the NOW() function to the statement. Example: UPSERT INTO tb (id, ts) VALUES (1, NOW());. For more information about how to view the version of Lindorm SQL, see SQL versions.
Examples
In the following examples, the sample table sensor is created by executing the following statement:
CREATE TABLE sensor (
device_id VARCHAR NOT NULL,
region VARCHAR NOT NULL,
time TIMESTAMP NOT NULL,
temperature DOUBLE,
humidity BIGINT,
PRIMARY KEY(device_id, region, time)
)WITH(VERSIONS=2);Write data to a table
UPSERT INTO sensor(device_id, region, time, temperature, humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:00',12.1,45);Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Write data to specific columns
UPSERT INTO sensor(device_id, region, time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2);Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Use the ON DUPLICATE KEY IGNORE clause in the statement to write data to a table
Use the ON DUPLICATE KEY IGNORE clause to write data to a table in LindormTable.
If the row that meets the conditions specified by device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2 exists, no data is written. If the row does not exist, data is written to the table.
UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY IGNORE;Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Use the ON DUPLICATE KEY UPDATE clause in the statement to update data in a table
Use the ON DUPLICATE KEY UPDATE clause to write data to a table in LindormTable.
LindormTable earlier than V2.7.8:
If the row that meets the conditions specified by
device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2exists, the value of thetemperaturecolumn is updated to30. If the row does not exist, the value of thetemperaturecolumn is not updated and no error is reported.UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY UPDATE temperature = 30;LindormTable 2.7.8 and later:
If the row that meets the conditions specified by
device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2exists, the value of thetemperaturecolumn is updated to30. If the row does not exist, data in theVALUESclause is inserted to the table.UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY UPDATE temperature = 30;
Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Use the ON DUPLICATE KEY ERROR clause in the statement to write data to a table
Use the ON DUPLICATE KEY ERROR clause to write data to a table in LindormTable 2.7.8 and later.
If the row that meets the conditions specified by device_id='F07A1260', region='north-cn', time='2021-04-22 15:33:10', temperature=13.2 exists, an error is reported. If the row does not exist, data is written to the table.
UPSERT INTO sensor(device_id,region,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2) ON DUPLICATE KEY ERROR;Write data with timestamps
Write a row of data to the table sensor in LindormTable and set the timestamp of the row to 111232.
UPSERT /*+ _l_ts_(111232) */ INTO sensor (device_id ,region ,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:00',12.1);Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Write multiple rows of data at the same time
UPSERT INTO sensor (device_id ,region ,time,temperature) VALUES('F07A1260','north-cn','2021-04-22 15:33:20',10.6), ('F07A1261','south-cn','2021-04-22 15:33:00',18.1), ('F07A1261','south-cn','2021-04-22 15:33:10',19.7);Verify the result
You can execute the SELECT * FROM sensor; statement to check whether the data is written to the table.
Write rows with the same primary key
Use the UPSERT statement to consecutively write two rows of data with the same primary key in LindormTable. Then, use hints to query the data written to the table.
If you perform two consecutive UPSERT operations to write data with the same primary key in LindormTSDB, the second UPSERT operation overwrites the row of data that is written by the first UPSERT operation.
Write data for the first time.
UPSERT INTO sensor(device_id ,region ,time,temperature,humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',13.2,45);Query the data written by the first operation.
SELECT * FROM sensor WHERE device_id='F07A1260' AND region='north-cn';Sample output:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 13.2 | 45 | +-----------+----------+-------------------------------+-------------+----------+Write data for the second time.
UPSERT INTO sensor(device_id ,region ,time,temperature,humidity) VALUES('F07A1260','north-cn','2021-04-22 15:33:10',16.7,52);Query the data written by the second operation.
SELECT * FROM sensor WHERE device_id='F07A1260' AND region='north-cn';Sample output:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 16.7 | 52 | +-----------+----------+-------------------------------+-------------+----------+The values of the device_id, region, and time primary key columns in the two operations are the same. According to the results, the value of the temperature column written by the first operation is overwritten by the value written by the second operation.
Use the _l_versions_ hint to query all versions of written data.
SELECT /*+ _l_versions_(2) */ device_id, region, time,temperature,humidity FROM sensor WHERE device_id='F07A1260';Sample output:
+-----------+----------+-------------------------------+-------------+----------+ | device_id | region | time | temperature | humidity | +-----------+----------+-------------------------------+-------------+----------+ | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 16.7 | 52 | | F07A1260 | north-cn | 2021-04-22 15:33:10 +0000 UTC | 13.2 | 45 | +-----------+----------+-------------------------------+-------------+----------+According to the results, the values written by the two operations are saved as two versions.