This topic describes how to overwrite data in AnalyticDB for PostgreSQL.

The INSERT ON CONFLICT statement allows you to update an existing row that contains a primary key when you execute the INSERT statement to insert a new row that contains the same primary key. This feature is also known as UPSERT or INSERT OVERWRITE. It is similar to the REPLACE INTO statement of MySQL.

This feature is supported in AnalyticDB for PostgreSQL 6.0 and not supported in AnalyticDB for PostgreSQL 4.3.

SQL syntax

The overwrite syntax is based on the INSERT statement described as follows:

[ WITH [ RECURSIVE ] with_query [, ...] ]
INSERT INTO table_name [ AS alias ] [ ( column_name [, ...] ) ]
    { DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [, ...] | query }
    [ ON CONFLICT [ conflict_target ] conflict_action ]
    [ RETURNING * | output_expression [ [ AS ] output_name ] [, ...] ]

The valid value of conflict_target:

    ( { index_column_name | ( index_expression ) } [ COLLATE collation ] [ opclass ] [, ...] )

Valid values of conflict_action:

    DO NOTHING
    DO UPDATE SET { column_name = { expression | DEFAULT } |
                    ( column_name [, ...] ) = ( { expression | DEFAULT } [, ...] )
                  } [, ...]
              [ WHERE condition ]
			

The overwrite feature adds an ON CONFLICT clause to the common INSERT syntax. This clause contains the following parts:

  • conflict_target: specifies the columns on which the primary keys conflict. If you specify conflict_action as DO NOTHING, you can omit conflict_target. If conflict_action is a DO UPDATE clause, you must specify the primary key columns or unique index columns for conflict_target.
  • conflict_action: specifies the operations to be performed when the primary key conflict occurs. You can set conflict_action to DO NOTHING or DO UPDATE. DO NOTHING indicates to discard the data to be inserted and retain the original data. DO UPDATE indicates to overwrite the original data and execute the UPDATE statements.

Data to be inserted is stored in a virtual table named EXCLUDED. You can use the DO UPDATE SET clause to reference columns in the EXCLUDED table. For example, a table named tbl contains a primary key column named pri_key and a non-primary key column named col_name. If you want to overwrite an existing col_name value, you can execute the following statements:

insert into tbl values (0, 1), (2, 3), (4, 5)
on conflict (pri_key) do update set tbl.col_name = excluded.col_name;
			

The statement creates a virtual table named EXCLUDED that contains three rows and two columns and stores the inserted data (0, 1), (2, 3), and (4, 5). You can use excluded.col_name to reference columns in the table.

Limits

  • Only AnalyticDB for PostgreSQL 6.0 supports the overwrite feature. AnalyticDB for PostgreSQL 4.3 does not support this feature.
  • The table whose data is to be overwritten must be a row store table. The table cannot be a column store table because column store tables do not support unique indexes.
  • The table cannot be a partition table.
  • You cannot update distribution columns or primary key columns in the UPDATE SET clause.
  • You cannot execute subqueries in the UPDATE WHERE clause.
  • The table cannot be an updatable view.
  • You cannot insert multiple data records for a primary key in an INSERT statement. This is a universal limit based on the standard SQL syntax.

Examples

  • Basic usage

Execute the following statement to create a table named t1 that contains four columns. The a column is the primary key column.

create table t1 (a int primary key, b int, c int, d int default 0);
			

Insert a row whose primary key is 0:

insert into t1 values (0,0,0,0);

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 0 | 0 | 0
(1 row)
			

Insert a new row whose primary key is also 0. The following error message is displayed.

insert into t1 values (0,1,1,1);

ERROR:  duplicate key value violates unique constraint "t1_pkey"
DETAIL:  Key (a)=(0) already exists.
			

However, operations are required instead of an error message in some scenarios. The original data must be retained or updated after the primary key conflict occurs. You can use the overwrite feature to perform these operations.

Execute the following statement to retain the original data and discard the new data:

-- Use the ON CONFLICT DO NOTHING clause.

insert into t1 values (0,1,1,1) on conflict do nothing;

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 0 | 0 | 0
(1 row)
			

Execute the following statement to overwrite the original data with the new data:

-- Use the ON CONFLICT DO UPDATE clause.

insert into t1 values (0,2,2,2) on conflict (a) do update set (b, c, d) = (excluded.b, excluded.c, excluded.d);

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 2 | 2 | 2
(1 row)
			

The ON CONFLICT DO NOTHING and DO UPDATE clauses enable you to overwrite data as needed. The EXCLUDED table indicates the virtual table that stores the (0,2,2,2) row.

The preceding statement can also be written in the following format:

insert into t1 values (0,2,2,2) on conflict (a) do update set b = excluded.b, c = excluded.c, d = excluded.d;
			

The overwrite feature can also be used in the following ways:

  • Execute the following statement to overwrite some columns of the original data with the new data:
-- Overwrite the c column with the value of excluded.c, which indicates the c column of the new data.

insert into t1 values (0,0,3,0) on conflict (a) do update set c = excluded.c;

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 2 | 3 | 2
(1 row)
			
  • Execute the following statement to update some columns of the original data:
-- Add 1 to the original d column value.

insert into t1 values (0,0,3,0) on conflict (a) do update set d = t1.d + 1;

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 2 | 3 | 3
(1 row)
			
  • Execute the following statement to set column values to default values:
-- Set the d column value to the default value 0 as defined in the CREATE TABLE statement.

insert into t1 values (0,0,3,0) on conflict (a) do update set d = default;

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 2 | 3 | 0
(1 row)
			
  • Execute the following statements to insert multiple rows:
-- Insert two rows. The original row with the primary key 0 is retained because a primary key conflict occurs and the row with the primary key 1 is inserted to the table.

insert into t1 values (0,0,0,0), (1,1,1,1) on conflict do nothing;

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 2 | 3 | 0
 1 | 1 | 1 | 1
(2 rows)

-- Insert two rows. The new row with the primary key 0 overwrites the original row because a primary key conflict occurs and the row with the primary key 2 is inserted to the table.

insert into t1 values (0,0,0,0), (2,2,2,2) on conflict (a) do update set (b, c, d) = (excluded.b, excluded.c, excluded.d);

select * from t1;
 a | b | c | d
---+---+---+---
 0 | 0 | 0 | 0
 1 | 1 | 1 | 1
 2 | 2 | 2 | 2
(3 rows)
			
  • Execute the following statements to insert data obtained by subqueries to merge data in two tables or perform complex INSERT INTO and SELECT statements:
create table t2 (like t1);
insert into t2 values (2,22,22,22),(3,33,33,33);

-- Insert rows from the t2 table to the t1 table. If a primary key conflict occurs, the rows that have the same primary keys in t2 overwrite the rows in t1.

insert into t1 select * from t2 on conflict (a) do update set (b, c, d) = (excluded.b, excluded.c, excluded.d);

select * from t1;
 a | b  | c  | d
---+----+----+----
 0 |  0 |  0 |  0
 1 |  1 |  1 |  1
 2 | 22 | 22 | 22
 3 | 33 | 33 | 33
(4 rows)