This topic describes the atomicity, consistency, isolation, and durability (ACID) semantics for concurrent jobs in MaxCompute, and the ACID semantics for transactional tables.

Terms

  • Operation: a single job submitted in MaxCompute.
  • Data object: an object that stores data, such as a non-partitioned table or a partition.
  • INTO job: an SQL job that contains the INTO keyword, such as INSERT INTO or DYNAMIC INSERT INTO.
  • OVERWRITE job: an SQL job that contains the OVERWRITE keyword, such as INSERT OVERWRITE or DYNAMIC INSERT OVERWRITE.
  • Data upload by using Tunnel: an INTO or OVERWRITE job.

Description of ACID semantics

  • Atomicity: An operation is either fully completed or not performed at all. An operation is never partially performed.
  • Consistency: The integrity of data objects is maintained when an operation is performed.
  • Isolation: An operation is performed independently of other concurrent operations.
  • Durability: After an operation is complete, modified data is permanently valid and is not lost even if a system failure occurs.

ACID semantics for concurrent write jobs in MaxCompute

  • Atomicity
    • If multiple jobs conflict with each other, MaxCompute ensures that only one job succeeds.
    • The atomicity of the CREATE, OVERWRITE, and DROP operations on a single table or partition can be ensured.
    • The atomicity of cross-table operations such as MULTI-INSERT cannot be ensured.
    • In extreme cases, the following operations may not be atomic:
      • A DYNAMIC INSERT OVERWRITE operation that is performed on more than 10,000 partitions.
      • An INTO operation. Atomicity cannot be ensured if data cleanup fails during a transaction rollback. However, a cleanup failure does not cause loss of the original data.
  • Consistency
    • The consistency can be ensured for OVERWRITE jobs.
    • If an INTO job fails due to a conflict, data from the failed job may remain.
  • Isolation
    • For non-INTO operations, MaxCompute ensures read committed isolation.
    • For INTO operations, uncommitted data may be read in some cases.
  • Durability
    • MaxCompute ensures data durability.
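
The difference between OVERWRITE and INTO jobs described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the functions `overwrite_job` and `into_job`, their parameters, and the row-by-row append are hypothetical and do not describe how MaxCompute is implemented. The model only mirrors the stated outcomes: a failed OVERWRITE job leaves the data unchanged, while a failed INTO job may leave residual rows if cleanup fails, without losing the original data.

```python
def overwrite_job(table, new_rows, fails=False):
    """Toy model of an OVERWRITE job: old data is replaced in one step."""
    if fails:
        return list(table)   # nothing is published, the old data stays
    return list(new_rows)    # the old data is replaced as a whole


def into_job(table, new_rows, fails_after=None, cleanup_succeeds=True):
    """Toy model of an INTO job that appends rows one at a time.

    fails_after: index of the row at which the job fails (None = no failure).
    cleanup_succeeds: whether the rollback manages to remove appended rows.
    """
    original_len = len(table)
    result = list(table)
    for i, row in enumerate(new_rows):
        if fails_after is not None and i == fails_after:
            # The job failed; try to remove the rows appended so far.
            if cleanup_succeeds:
                del result[original_len:]
            return result
        result.append(row)
    return result
```

In this model, `into_job([1, 2], [3, 4, 5], fails_after=2, cleanup_succeeds=False)` keeps the original rows but also leaves the two rows that were appended before the failure.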

ACID semantics for transactional tables

In addition to the ACID semantics for concurrent write jobs, MaxCompute supports the following ACID semantics for transactional tables:
  • For INTO operations, MaxCompute ensures read committed isolation. If an INTO job fails due to a conflict, data from the failed job does not remain.
  • The atomicity of the UPDATE, DELETE, and small file MERGE operations on a non-partitioned table or a partition can be ensured.

    For example, if two UPDATE operations are performed on a partition at the same time, only one UPDATE operation succeeds. Neither of the following cases can occur: an UPDATE operation is partially performed, or both UPDATE operations succeed.
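
The two-UPDATE example can be sketched with a version check at commit time. This is a minimal illustration that assumes an optimistic concurrency scheme; the source does not state which mechanism MaxCompute uses, and the names `Partition`, `UpdateJob`, `ConflictError`, and `run_concurrent_updates` are hypothetical.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Raised when the target changed after the job took its snapshot."""


@dataclass
class Partition:
    # Hypothetical stand-in for a partition of a transactional table.
    rows: dict = field(default_factory=dict)
    version: int = 0


class UpdateJob:
    def __init__(self, partition, changes):
        self.partition = partition
        self.changes = changes
        # Remember which version of the data the job started from.
        self.base_version = partition.version

    def commit(self):
        # Reject the whole job if another job committed in the meantime.
        if self.partition.version != self.base_version:
            raise ConflictError("partition changed since the job started")
        # Apply every change in one step, so the update is never partial.
        self.partition.rows = {**self.partition.rows, **self.changes}
        self.partition.version += 1


def run_concurrent_updates():
    part = Partition(rows={"k1": 1, "k2": 2})
    first = UpdateJob(part, {"k1": 10})
    second = UpdateJob(part, {"k1": 20, "k2": 30})  # starts before `first` ends

    outcomes = []
    for job in (first, second):  # `first` ends earlier
        try:
            job.commit()
            outcomes.append("succeeded")
        except ConflictError:
            outcomes.append("failed")
    return outcomes, part.rows
```

In this sketch the job that ends earlier commits, and the job that ends later is rejected as a whole, so none of its changes reach the partition.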

Conflict of concurrent operations

When jobs are concurrently performed on the same destination table, a conflict may occur. In the event of a conflict, the job that ends earlier succeeds, and the job that ends later may fail due to the conflict.

The following table describes the results of jobs that are submitted at the same time on a non-partitioned table or a partition.

| Job that ends earlier | INSERT OVERWRITE or TRUNCATE job that ends later | INSERT INTO job that ends later | UPDATE or DELETE job that ends later | Small file MERGE job that ends later |
| --- | --- | --- | --- | --- |
| INSERT OVERWRITE or TRUNCATE job | Both jobs succeed. Data from the job that ends later overwrites data from the job that ends earlier. | Both jobs succeed. The INSERT INTO job that ends later appends its data to data from the job that ends earlier. | The UPDATE or DELETE job that ends later reports an error, because the job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. | The small file MERGE job that ends later reports an error, because the job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. |
| INSERT INTO job | Both jobs succeed. Data from the job that ends later overwrites data from the INSERT INTO job that ends earlier. | Both jobs succeed. The INSERT INTO job that ends later appends its data to data from the INSERT INTO job that ends earlier. | The UPDATE or DELETE job that ends later reports an error, because the INSERT INTO job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. | The small file MERGE job that ends later reports an error, because the INSERT INTO job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. |
| UPDATE or DELETE job | Both jobs succeed. Data from the job that ends later overwrites data from the UPDATE or DELETE job that ends earlier. | Both jobs succeed. The INSERT INTO job that ends later appends its data to data from the UPDATE or DELETE job that ends earlier. | The UPDATE or DELETE job that ends later reports an error, because the UPDATE or DELETE job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. | The small file MERGE job that ends later reports an error, because the UPDATE or DELETE job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. |
| Small file MERGE job | Both jobs succeed. Data from the job that ends later overwrites data from the small file MERGE job that ends earlier. | Both jobs succeed. The INSERT INTO job that ends later appends its data to data from the small file MERGE job that ends earlier. | The UPDATE or DELETE job that ends later reports an error, because the small file MERGE job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. | The small file MERGE job that ends later reports an error, because the small file MERGE job that ends earlier modified the data of the non-partitioned table or partition on which it is performed. |

In conclusion, conflicting jobs succeed or report errors based on the following rules:
  • INSERT operations do not report errors due to conflicts when data changes.
  • The UPDATE, DELETE, and small file MERGE operations report errors due to conflicts when data in the destination non-partitioned table or partition changes.
Note: In extreme cases, if multiple jobs run concurrently while metadata is being updated, the jobs may report errors due to conflicts caused by metadata changes.
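
The two summary rules can be written as a small lookup function. This is a sketch for illustration only: the function `later_job_outcome` and its string labels are hypothetical, "MERGE" stands for a small file MERGE job, and the model covers only conflicts caused by data changes, not the metadata conflicts mentioned in the note.

```python
# Job types whose later completion never fails because of data changes.
INSERT_LIKE = {"INSERT OVERWRITE", "TRUNCATE", "INSERT INTO"}
# Job types that fail if the destination data changed before they ended.
CHANGE_SENSITIVE = {"UPDATE", "DELETE", "MERGE"}


def later_job_outcome(earlier, later):
    """Return the modeled outcome for the job that ends later."""
    known = INSERT_LIKE | CHANGE_SENSITIVE
    if earlier not in known or later not in known:
        raise ValueError(f"unknown job type: {earlier!r} or {later!r}")
    # Every earlier job type in the table changes the destination data,
    # so a change-sensitive later job always reports a conflict error.
    if later in CHANGE_SENSITIVE:
        return "error"
    if later == "INSERT INTO":
        return "succeeds: appends to the earlier job's data"
    return "succeeds: overwrites the earlier job's data"
```

For example, `later_job_outcome("INSERT INTO", "UPDATE")` returns `"error"`, which matches the corresponding cell of the table.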