Perform lock free DML on large amounts of data - Data Management

Lock-free data manipulation language (DML) in Data Management (DMS) lets you run UPDATE, DELETE, and INSERT...SELECT statements on large datasets without locking tables or degrading database performance. The DMS engine automatically splits the target data into batches and executes the statement batch by batch, pausing briefly between each batch to protect primary-to-secondary replication and reduce database load.

Background

As business data grows, you often need to clear historical data or update large portions of a table. Two common approaches both carry significant risks:

Writing a script to divide data: Using the LIMIT clause to paginate deletes or updates is a frequent mistake. Because LIMIT without an index scan forces a full-table scan, the table can still get locked. Even when division is implemented correctly, an improperly tuned batch interval can cause severe replication lag between primary and secondary databases.
Manual DBA extraction: Manually splitting large datasets is time-consuming, error-prone, and difficult to tune once in progress.

Beyond the risks above, two additional failure modes can occur when operating directly on large tables:

A single SQL statement that affects too many rows can cause the number of log entries to exceed the upper limit, causing the statement to fail entirely.
An SQL statement executed without using specified indexes can cause table locking, increase database load, and in severe cases lead to business failures.

Lock-free DML eliminates both risks by automating batch splitting and throttling inside DMS.

How it works

Lock-free DML processes a single SQL statement through three stages:

Data splitting — The DMS engine analyzes the statement and divides the matching rows into multiple batches based on index ranges, avoiding full-table scans.
Batch execution — DMS runs the statement on one batch at a time, holding only short-lived, row-level locks during each batch.
Buffer interval — After each batch completes, DMS waits briefly before starting the next batch. This interval lets the secondary database catch up with replication, keeping primary-to-secondary lag within acceptable bounds.

Benefits

Your business is not affected during large data change operations.
The impact on database performance and database storage is reduced.
Execution efficiency is ensured even for large amounts of data changes.

Supported databases

Database engine	Supported instances
MySQL	ApsaraDB RDS for MySQL, PolarDB for MySQL, MyBase for MySQL, PolarDB for Xscale, third-party MySQL databases
PostgreSQL	ApsaraDB RDS for PostgreSQL, PolarDB for PostgreSQL, MyBase for PostgreSQL, third-party PostgreSQL databases
MariaDB	ApsaraDB for MariaDB, third-party MariaDB databases
OceanBase	ApsaraDB for OceanBase in MySQL mode
Oracle-compatible	PolarDB for PostgreSQL (Compatible with Oracle)

Limitations

Only UPDATE, DELETE, and INSERT...SELECT statements are supported.

UPDATE and DELETE

Constraint	Details
Table scope	Single table only
WHERE clause	Required. To update or delete all rows, use `WHERE 1=1`
Subqueries	Not supported
LIMIT clause	Not supported — using `LIMIT` for manual pagination forces a full-table scan, which causes the table locking that this feature is designed to prevent

INSERT...SELECT

Constraint	Details
SELECT scope	Single table only
WHERE clause	Required in the SELECT clause. To select all rows, use `WHERE 1=1`
Disallowed clauses	`LIMIT`, `ORDER BY`, and `GROUP BY` are not supported in the SELECT clause

Use cases

Clear historical data — Delete rows that are no longer needed without blocking reads or writes on the production table.
Update all rows — Apply a column update across an entire table without causing replication lag or service disruption.

What's next

Perform lock-free DML operations