Delta tables support two historical query modes: time travel queries and incremental queries. A time travel query reads a snapshot of the table at a specific point in time or version. An incremental query returns only the rows that changed between two points in time or between two versions.
Both query types extend standard MaxCompute Data Query Language (DQL) syntax. The full DQL syntax and limits apply, with one addition: the FROM clause accepts a time or version qualifier.
Syntax
[WITH <cte>[, ...] ]
SELECT [ALL | DISTINCT] <select_expr>[, <except_expr>)][, <replace_expr>] ...
FROM <table_reference>
[TIMESTAMP | VERSION AS OF expr]
[TIMESTAMP | VERSION BETWEEN start_expr AND end_expr]
[WHERE <where_condition>]
[GROUP BY {<col_list> | ROLLUP(<col_list>)}]
[HAVING <having_condition>]
[ORDER BY <order_condition>]
[DISTRIBUTE BY <distribute_condition> [SORT BY <sort_condition>]|[ CLUSTER BY <cluster_condition>] ]
[LIMIT <number>]
[WINDOW <window_clause>]
Use TIMESTAMP | VERSION AS OF expr for time travel queries and TIMESTAMP | VERSION BETWEEN start_expr AND end_expr for incremental queries.
Time travel queries
A time travel query returns a historical snapshot of the table — the state of the data at or before the specified point in time or version.
TIMESTAMP AS OF
SELECT * FROM <table> TIMESTAMP AS OF <expr>
expr accepts any of the following formats:
| Format | Example | Description |
|---|---|---|
| TIMESTAMP string | '2023-01-01 00:00:00.123' |
Precise timestamp with milliseconds |
| DATETIME string | '2023-01-01 00:00:00' |
Timestamp without milliseconds |
| DATE string | '2023-01-01' |
Date only; time defaults to midnight |
current_timestamp() |
current_timestamp() |
Current time |
getDate() + N |
getDate() - 3600 |
N seconds relative to now; negative = past, positive = future |
get_latest_timestamp(tablename [, number]) |
get_latest_timestamp('mf_tt2', 2) |
Timestamp of the Nth most recent DML operation (default: 1 = latest). The returned timestamp may be identical for different values of number. |
For cross-project access, format tablename as ProjectName.TableName. For the three-layer model, use ProjectName.SchemaName.TableName.
Limits:
-
The valid query range is
[CreateTableTimestamp, expr], whereCreateTableTimestampis the time the table creation was committed. -
If
expris earlier than the table creation time or more than N hours ago, an error is returned. N is set by theacid.data.retain.hoursproperty at table creation time. For example, ifacid.data.retain.hoursis72andexpris 80 hours ago, the query fails. -
If
expris exactly N hours ago, an error may also be returned due to second-level latency in internal systems.
Avoid using TIMESTAMP AS OF current_timestamp() - <seconds> for queries near the retention boundary. Use get_latest_timestamp() to safely reference recent commits.
VERSION AS OF
SELECT * FROM <table> VERSION AS OF <expr>
expr accepts:
| Format | Example | Description |
|---|---|---|
| BIGINT constant | 3 |
A specific version number |
get_latest_version(tablename [, number]) |
get_latest_version('mf_tt2', 2) |
Version of the Nth most recent DML operation (default: 1 = latest). Unlike get_latest_timestamp, the returned version varies with each value of number. |
For tablename formatting, follow the same rules as get_latest_timestamp().
Limits:
-
Each DML operation generates a strictly incrementing version number. Run
SHOW HISTORY FOR TABLE/PARTITIONto view all versions. -
The valid version range is
[CreateTableVersion, expr]. TheCreateTableVersiondefaults to1. -
If the version corresponds to a commit time older than N hours (where N =
acid.data.retain.hours), or if the version is less than1, an error is returned. -
If
exprexceeds the latest DML version, an error is returned.
Use get_latest_version() to retrieve a valid version number and avoid out-of-range errors.
Incremental queries
An incremental query returns only the rows that were added or modified within a time range or version range — equivalent to the delta between two snapshots.
Data generated by compaction is not treated as new data and is excluded from incremental query results.
TIMESTAMP BETWEEN
SELECT * FROM <table> TIMESTAMP BETWEEN <start_expr> AND <end_expr>
The time range is (start_expr, end_expr] — left-open, right-closed. Both expressions follow the same formats supported by TIMESTAMP AS OF.
Limits:
-
If
start_expris earlier than the table creation time or more than N hours ago, an error is returned. N =acid.data.retain.hours. -
If
end_expris later than the last DML commit time, the behavior depends onacid.incremental.query.out.of.time.range.enabled:-
Default (
false): an error is returned. -
Set to
true: the query returns all incremental data within(start_expr, end_expr].
-
To allow queries that extend beyond the latest commit, set the property to true:
ALTER TABLE <table> SET tblproperties("acid.incremental.query.out.of.time.range.enabled"="true");
VERSION BETWEEN
SELECT * FROM <table> VERSION BETWEEN <start_expr> AND <end_expr>
The version range is (start_expr, end_expr] — left-open, right-closed. Both expressions follow the same formats supported by VERSION AS OF.
Limits:
-
The system resolves
start_exprto a commit time. If that time is more than N hours ago or the version is less than1, an error is returned. N =acid.data.retain.hours. -
If
end_exprexceeds the latest DML version, the behavior depends onacid.incremental.query.out.of.time.range.enabled:-
Default (
false): an error is returned. -
Set to
true: the query returns all incremental data within(start_expr, end_expr].
-
Usage notes
-
Only Delta tables support time travel queries and incremental queries.
-
Duplicate keys: When multiple rows share the same primary key, only the latest row is returned. Rows in the
DELETEstate are excluded. -
Change Data Capture (CDC): Querying the update state of data in formats similar to Change Data Capture (CDC) is not yet supported and is planned for a future release.
-
Dropped or renamed tables: You cannot query historical data from a table after it has been dropped or renamed. Restore the table first, then run the query.
-
Same table, multiple qualifiers: If you want to perform a time travel query or an incremental query on the same table in an SQL statement, you must set the timestamps or versions for the queries to the same values.
-
Partitioned tables: Specify a partition in the
WHEREclause to limit the scan to that partition and reduce query time. -
Concurrency: Delta tables use Multi-Version Concurrency Control (MVCC) to isolate concurrent reads and writes. The Read Committed isolation level is supported.
Examples
The following examples use a partitioned Delta table mf_tt2.
Set up example data
-- Table creation. Version = 1.
-- Run "SHOW HISTORY FOR TABLE mf_tt2" to confirm.
CREATE TABLE mf_tt2 (
pk bigint NOT NULL PRIMARY KEY,
val bigint NOT NULL)
PARTITIONED BY (dd string, hh string)
tblproperties ("transactional"="true");
-- INSERT OVERWRITE. Version = 2.
INSERT OVERWRITE TABLE mf_tt2 PARTITION (dd='01', hh='01') VALUES (1, 1), (2, 2), (3, 3);
-- INSERT INTO. Version = 3.
INSERT INTO TABLE mf_tt2 PARTITION (dd='01', hh='01') VALUES (3, 30), (4, 4), (5, 5);
To check the table creation time and version history before running queries:
-- Get the table creation timestamp
DESC EXTENDED mf_tt2;
The following result is returned.
+------------------------------------------------------------------------------------+
| Owner: ALIYUN$****_doctest@test.aliyunid.com | Project: doc_test_prod |
| TableComment: |
+------------------------------------------------------------------------------------+
| CreateTime: 2023-06-26 09:31:38 |
| LastDDLTime: 2023-06-26 09:31:38 |
| LastModifiedTime: 2023-06-26 09:32:31 |
+------------------------------------------------------------------------------------+
| InternalTable: YES | Size: 8541 |
+------------------------------------------------------------------------------------+
| Native Columns: |
+------------------------------------------------------------------------------------+
| Field | Type | Label | ExtendedLabel | Nullable | DefaultValue | Comment |
+------------------------------------------------------------------------------------+
| pk | bigint | | | false | NULL | |
| val | bigint | | | false | NULL | |
+------------------------------------------------------------------------------------+
| Partition Columns: |
+------------------------------------------------------------------------------------+
| dd | string | |
| hh | string | |
+------------------------------------------------------------------------------------+
| Extended Info: |
+------------------------------------------------------------------------------------+
| TableID: bec515a56cc9492c8f906a224c62**** |
| IsArchived: false |
| PhysicalSize: 25623 |
| FileNum: 9 |
| StoredAs: AliOrc |
| CompressionStrategy: normal |
| ClusterType: hash |
| BucketNum: 16 |
| ClusterColumns: [pk] |
| SortColumns: [pk ASC] |
+------------------------------------------------------------------------------------+-- Get all DML version numbers and commit times
SHOW HISTORY FOR TABLE mf_tt2 PARTITION (dd='01', hh='01');
The following result is returned.
ID = 20230626021756157ghict5k****
ObjectType ObjectId ObjectName VERSION(LSN) Time Operation
PARTITION 4764c8e1cb634a4fb9c21f3fc850**** dd=01/hh=01 0000000000000002 2023-06-26 09:31:56 CREATE
PARTITION 4764c8e1cb634a4fb9c21f3fc850**** dd=01/hh=01 0000000000000003 2023-06-26 09:32:32 APPEND
The SHOW HISTORY output shows each operation, its version number, commit time, and operation type (CREATE, APPEND, etc.).
Time travel query examples
Snapshot at a specific datetime — all data as of 09:33:00:
SELECT * FROM mf_tt2 TIMESTAMP AS OF '2023-06-26 09:33:00' WHERE dd = '01' AND hh = '01';
Returns all 5 rows written by versions 2 and 3 (pk 1–5, with pk=3 showing val=30 from the latest write).
Snapshot at version 2 — before the INSERT INTO:
SELECT * FROM mf_tt2 VERSION AS OF 2 WHERE dd = '01' AND hh = '01';
Returns the 3 rows from the INSERT OVERWRITE: pk=1, pk=2, pk=3 (val=3).
Snapshot at the current time:
SELECT * FROM mf_tt2 TIMESTAMP AS OF current_timestamp() WHERE dd = '01' AND hh = '01';
Snapshot 10 seconds ago:
SELECT * FROM mf_tt2 TIMESTAMP AS OF current_timestamp() - 10 WHERE dd = '01' AND hh = '01';
Snapshot at the second most recent commit (using `get_latest_timestamp`):
SELECT * FROM mf_tt2 TIMESTAMP AS OF get_latest_timestamp('mf_tt2', 2) WHERE dd = '01' AND hh = '01';
Returns the 3 rows from version 2.
Snapshot at the second most recent version (using `get_latest_version`):
SELECT * FROM mf_tt2 VERSION AS OF get_latest_version('mf_tt2', 2) WHERE dd = '01' AND hh = '01';
Returns the 3 rows from version 2.
Incremental query examples
Changes between two commit timestamps:
SELECT * FROM mf_tt2 TIMESTAMP BETWEEN '2023-06-26 09:31:40' AND '2023-06-26 09:32:00' WHERE dd = '01' AND hh = '01';
Returns the 3 rows written by version 2.
Changes between version 2 and version 3:
SELECT * FROM mf_tt2 VERSION BETWEEN 2 AND 3 WHERE dd = '01' AND hh = '01';
Returns the 3 rows written by version 3: pk=3 (val=30), pk=4, pk=5.
Last 300 seconds with `acid.incremental.query.out.of.time.range.enabled` set to `false` (default):
SELECT * FROM mf_tt2 TIMESTAMP BETWEEN current_timestamp() - 301 AND current_timestamp() WHERE dd = '01' AND hh='01';
Returns an error because end_expr exceeds the latest commit timestamp:
FAILED: ODPS-0130071:[0,0] Semantic analysis exception - physical plan generation failed:
com.aliyun.odps.meta.exception.MetaException: ...
Incremental query can't exceed current version. Current version timestamp: 2023-06-26 09:32:32, input timestamp is: 2023-06-26 10:47:55
To allow the query to extend beyond the latest commit, enable the property:
ALTER TABLE mf_tt2 SET tblproperties("acid.incremental.query.out.of.time.range.enabled"="true");
Then re-run the query. The result is empty (no new data was written in the last 300 seconds):
+------------+------------+----+----+
| pk | val | dd | hh |
+------------+------------+----+----+
+------------+------------+----+----+
Changes from the third most recent commit to the most recent commit:
SELECT * FROM mf_tt2 TIMESTAMP BETWEEN get_latest_timestamp('mf_tt2', 3) AND get_latest_timestamp('mf_tt2') WHERE dd = '01' AND hh = '01';
Returns all 5 rows (covers both version 2 and version 3 commits).
Changes from the third most recent version to the most recent version:
SELECT * FROM mf_tt2 VERSION BETWEEN get_latest_version('mf_tt2', 3) AND get_latest_version('mf_tt2') WHERE dd = '01' AND hh = '01';
Returns all 5 rows.