All Products
Search
Document Center

MaxCompute:SQL processing optimization

Last Updated:Aug 01, 2023

To help you use Transaction Table 2.0 in an efficient manner, MaxCompute supports all DQL and DML statements in SQL. The modules of SQL engine kernel, such as Compiler, Optimizer, and Runtime, are developed to support features related to Transaction Table 2.0 and optimize computing performance. This topic describes the core architecture of DQL and DML statements.

DQL statements

  • DQL statements are developed and optimized to support time travel queries and incremental queries.

  • If data is read from a Transaction Table 2.0 source table rather than a common table, a large number of delta files are merged and the intermediate historical status is deleted. This significantly affects reading efficiency. For more information, see Time travel. In business scenarios that require high query frequency and efficiency, we recommend that you perform the compaction operation on a Transaction Table 2.0 table on a regular basis. This helps effectively reduce the number of delta files and delete the intermediate historical status. This way, query performance increases. However, the compaction operation requires computing and storage costs. Therefore, you need to configure appropriate compaction policies based on your business requirements.

  • When you use DQL statements for Transaction Table 2.0 in specific scenarios, the computing cost is reduced. For example, you execute SQL statements to query data of a Transaction Table 2.0 source table of the pay-as-you-go MaxCompute service. The table may contain a large number of intermediate status data of the UPDATE and DELETE types. When the storage reader reads all intermediate status data, the storage reader merges the data and returns only the last status data for subsequent computing. The computing fee is calculated based on the actual I/O size of the data that is queried, which includes the I/O size of intermediate status data. To save the computing costs of data queries, we recommend that you perform the compaction operation to delete the intermediate status in the scenarios where the UPDATE and DELETE operations are frequently performed and data is frequently queried. This way, the I/O size of the data that is queried is reduced. This helps decrease query costs and increase query performance.

DML statements

DML statements can be used to write and modify multiple data records at a time. The SQL engine provides optimized DML syntax for Transaction Table 2.0 tables. The optimizations of DML syntax include parsing of specific syntax, planner optimization for specific operators, deduplication logic for primary key columns, and concurrent writing of data of the UPSERT type that is constructed in the runtime.

After data processing is complete, Meta Service detects transaction conflicts and atomically updates the metadata of data files to ensure read/write isolation and transactional consistency. For more information, see Transaction management.