All Products
Search
Document Center

Hologres:DELETE

Last Updated:Feb 28, 2024

You can execute the DELETE statement to delete rows that meet specified conditions from a table. This topic describes how to use the DELETE statement in Hologres.

Syntax

The DELETE statement uses the following syntax:

DELETE FROM <table_name> [ * ] [ [ AS ] <alias> ]
    [ WHERE <condition>  ]

The following table describes the parameters.

Parameter

Description

table_name

The name of the desired table.

alias

The substitute name for the table.

condition

The condition for deleting rows.

How it works

When you execute the DELETE statement, data is written to a memory table and then flushed to small files. In this process, if the data table is a row-oriented table, the data to be deleted is flushed to a small file, and the small file is deleted during the compaction. If the data table is a column-oriented table, the system generates a tag table in memory and then flushes the table to a tag table file. The tag table file records the file IDs and row IDs of the data to be deleted. The rows that are recorded in the tag table file are deleted during the compaction. To improve the execution efficiency of the DELETE statement, you can execute the DELETE statement by using a fixed plan. For more information, see Accelerate the execution of SQL statements by using fixed plans. You can also set a proper primary key or index (distribution key, segment key, or clustering key) for the table. This enables the system to quickly locate the file that needs to be deleted. Otherwise, full-table scan is performed, which negatively impacts performance. If you execute the SQL statement delete from tablename where pk =xxx to delete data based on the primary key, the deletion efficiency of a row-oriented table is higher than that of a column-oriented table.

Limits

  • You cannot execute the DELETE statement to delete rows from a parent partitioned table if its child tables contain data. To delete rows from the parent partitioned table, you must delete the rows from its child tables first.

  • To delete all rows from a table, we recommend that you use the TRUNCATE statement, which is more efficient. For more information about the TRUNCATE statement, see TRUNCATE.

  • We recommend that you use a fixed plan to optimize the execution efficiency of the DELETE statement. For more information, see DELETE statements in the "Accelerate the execution of SQL statements by using fixed plans" topic.

  • Hologres uses the DELETE statement to delete rows based on tags in the same way as the DELETE statement in PostgreSQL. Storage space is released only after the next compaction is complete. This is because the LSM structure stores the latest data in memory. After the DELETE statement is executed, temporary data whose amount does not reach the threshold for compaction may exist and still occupies storage space. If you want to delete all data, we recommend that you use the TRUNCATE statement.

Example

The following DELETE statements show how to delete rows from a table in Hologres.

CREATE TABLE delete_test (
    id INT PRIMARY KEY,
    a INT,
    b text
);

INSERT INTO delete_test VALUES
(1, 10, 'a'),
(2, 30, 'b'),
(3, 50,  ''),
(4, 70, null);



DELETE FROM delete_test AS dt WHERE dt.a = 10;

DELETE FROM delete_test AS dt WHERE dt.b is null;

DELETE FROM delete_test AS dt WHERE dt.b='';
            

For more information about the DELETE statement, see PostgreSQL DELETE.

FAQ

Why does the metric that indicates the storage usage significantly increase when I execute the DELETE statement but then decrease after the write operations are complete?delete存储用量

Based on the principle of the DELETE statement, Hologres tags the old data during the execution of the DELETE statement and flushes the new data to new small files. Hologres merges these small files during the compaction at the backend. During the compaction, the old data is cleared, and the new data is merged. To accelerate data deletion, Hologres writes the data at the backend, and then compresses and sorts the data when the asynchronous compaction is performed. Therefore, the storage usage significantly increases during the data deletion. After the compaction process is complete, the storage usage decreases. For more information, see How it works in this topic.