Use the DELETE statement to remove one or more rows that meet a condition from a Transactional table or a Delta Table.
Prerequisites
Before you run a DELETE or UPDATE statement, make sure you have the Select and Update permissions on the target Transactional table or Delta Table. For details, see MaxCompute permissions.
Limitations
DELETE and UPDATE work only on Transactional tables and Delta Tables and are subject to the limits described in Parameters for Transaction Table and Delta Table.
The UPDATE syntax for Delta Tables does not support modifying primary key (PK) columns.
Syntax
DELETE FROM <table_name> [[AS] alias] [WHERE <condition>];Parameters
| Parameter | Required | Description |
|---|---|---|
table_name | Yes | The name of the Transactional table or Delta Table to delete rows from. |
alias | No | An alias for the table. Use an alias when you reference the table in the WHERE clause subquery. |
condition | No | A WHERE clause that identifies which rows to delete. Omit the clause to delete all rows in the table. For WHERE clause syntax, see WHERE clause (WHERE_condition). |
Usage notes
When to use DELETE
DELETE generates delta files rather than rewriting table data in place, so it does not immediately reduce storage. Use the following guidance to pick the right operation for your workload.
| Scenario | Recommended operation |
|---|---|
| Small number of rows, infrequent deletes, infrequent reads | DELETE or UPDATE. After multiple operations, merge base files and delta files to reclaim storage. |
| More than 5% of rows, infrequent deletes, frequent reads | INSERT OVERWRITE or INSERT INTO. See Insert or overwrite data. |
For example, if a job deletes or updates 10% of a table's rows ten times a day, compare the total scan cost of ten DELETE runs against ten INSERT OVERWRITE runs. For read-heavy tables, INSERT OVERWRITE is usually more efficient.
Batch your delete operations
MaxCompute runs each DELETE statement as a batch job and charges based on the amount of data scanned. Avoid patterns where a script generates one statement per row — the accumulated scan cost across many small statements is significantly higher than a single statement that covers all affected rows.
Use a subquery to target multiple rows in one statement:
-- Recommended: one statement covers all rows that need updating
UPDATE table1 SET col1 = (SELECT value1 FROM table2 WHERE table1.id = table2.id AND table1.region = table2.region);-- Avoid: one statement per row generates one scan per statement
UPDATE table1 SET col1 = 1 WHERE id = '2021063001' AND region = 'beijing';
UPDATE table1 SET col1 = 2 WHERE id = '2021063002' AND region = 'beijing';Reduce storage after deleting data
DELETE writes delta files on top of the table's base files. To reclaim the storage freed by deleted rows, merge the base files and delta files after you complete your delete operations.
Examples
Example 1: Delete rows from a non-partitioned Transactional table
Create a Transactional table, insert rows, then delete all rows where id = 2.
-- Create a Transactional table.
CREATE TABLE IF NOT EXISTS acid_delete (id BIGINT) TBLPROPERTIES ("transactional"="true");
-- Insert data.
INSERT OVERWRITE TABLE acid_delete VALUES (1), (2), (3), (2);
-- View the inserted data.
SELECT * FROM acid_delete;
+------------+
| id |
+------------+
| 1 |
| 2 |
| 3 |
| 2 |
+------------+
-- Delete all rows where id is 2.
-- On the MaxCompute client (odpscmd), confirm the operation by entering yes or no when prompted.
DELETE FROM acid_delete WHERE id = 2;
-- The following command is equivalent: it uses a table alias in the WHERE clause.
DELETE FROM acid_delete ad WHERE ad.id = 2;
-- View the result.
SELECT * FROM acid_delete;
+------------+
| id |
+------------+
| 1 |
| 3 |
+------------+Example 2: Delete rows from a partitioned Transactional table
Create a partitioned Transactional table, insert rows into two partitions, then delete rows where the partition is 2019 and id = 2.
-- Create a partitioned Transactional table.
CREATE TABLE IF NOT EXISTS acid_delete_pt (id BIGINT) PARTITIONED BY (ds STRING) TBLPROPERTIES ("transactional"="true");
-- Add partitions.
ALTER TABLE acid_delete_pt ADD IF NOT EXISTS PARTITION (ds = '2019');
ALTER TABLE acid_delete_pt ADD IF NOT EXISTS PARTITION (ds = '2018');
-- Insert data.
INSERT OVERWRITE TABLE acid_delete_pt PARTITION (ds = '2019') VALUES (1), (2), (3);
INSERT OVERWRITE TABLE acid_delete_pt PARTITION (ds = '2018') VALUES (1), (2), (3);
-- View the inserted data.
SELECT * FROM acid_delete_pt;
+------------+------------+
| id | ds |
+------------+------------+
| 1 | 2018 |
| 2 | 2018 |
| 3 | 2018 |
| 1 | 2019 |
| 2 | 2019 |
| 3 | 2019 |
+------------+------------+
-- Delete rows in the 2019 partition where id is 2.
-- On the MaxCompute client (odpscmd), confirm the operation by entering yes or no when prompted.
DELETE FROM acid_delete_pt WHERE ds = '2019' AND id = 2;
-- View the result.
SELECT * FROM acid_delete_pt;
+------------+------------+
| id | ds |
+------------+------------+
| 1 | 2018 |
| 2 | 2018 |
| 3 | 2018 |
| 1 | 2019 |
| 3 | 2019 |
+------------+------------+Example 3: Delete rows based on a join with another table
Create a target Transactional table and a reference table, then delete rows from the target table whose id does not appear in the reference table.
-- Create a target Transactional table and a reference table.
CREATE TABLE IF NOT EXISTS acid_delete_t (id INT, value1 INT, value2 INT) TBLPROPERTIES ("transactional"="true");
CREATE TABLE IF NOT EXISTS acid_delete_s (id INT, value1 INT, value2 INT);
-- Insert data.
INSERT OVERWRITE TABLE acid_delete_t VALUES (2, 20, 21), (3, 30, 31), (4, 40, 41);
INSERT OVERWRITE TABLE acid_delete_s VALUES (1, 100, 101), (2, 200, 201), (3, 300, 301);
-- Delete rows from acid_delete_t where the id does not match any id in acid_delete_s.
-- On the MaxCompute client (odpscmd), confirm the operation by entering yes or no when prompted.
DELETE FROM acid_delete_t WHERE NOT EXISTS (SELECT * FROM acid_delete_s WHERE acid_delete_t.id = acid_delete_s.id);
-- The following command is equivalent: it uses table aliases.
DELETE FROM acid_delete_t a WHERE NOT EXISTS (SELECT * FROM acid_delete_s b WHERE a.id = b.id);
-- View the result. Only rows with id 2 and 3 remain.
SELECT * FROM acid_delete_t;
+------------+------------+------------+
| id | value1 | value2 |
+------------+------------+------------+
| 2 | 20 | 21 |
| 3 | 30 | 31 |
+------------+------------+------------+Example 4: Delete rows from a Delta Table
Create a Delta Table, insert rows, then delete rows where val = 2 in a specific partition.
-- Create a Delta Table with a primary key column.
CREATE TABLE IF NOT EXISTS mf_dt (
pk BIGINT NOT NULL PRIMARY KEY,
val BIGINT NOT NULL
) PARTITIONED BY (dd STRING, hh STRING)
TBLPROPERTIES ("transactional"="true");
-- Insert data.
INSERT OVERWRITE TABLE mf_dt PARTITION (dd='01', hh='02') VALUES (1, 1), (2, 2), (3, 3);
-- View the inserted data.
SELECT * FROM mf_dt WHERE dd='01' AND hh='02';
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
| 1 | 1 | 01 | 02 |
| 3 | 3 | 01 | 02 |
| 2 | 2 | 01 | 02 |
+------------+------------+----+----+
-- Delete rows where val is 2 in partition dd='01', hh='02'.
DELETE FROM mf_dt WHERE val = 2 AND dd='01' AND hh='02';
-- View the result. Only rows where val is 1 and 3 remain.
SELECT * FROM mf_dt WHERE dd='01' AND hh='02';
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
| 1 | 1 | 01 | 02 |
| 3 | 3 | 01 | 02 |
+------------+------------+----+----+What's next
UPDATE: Update column values in rows that meet a condition in partitioned or non-partitioned Transactional tables.
ALTER TABLE: Merge base files and delta files to reclaim storage after delete operations.