This topic describes the benefits of the Query Engine (QE) of Hologres and how Hologres executes a query.
- Distributed execution
Hologres QE uses a distributed execution model that works with the Storage Disaggregation architecture. An execution plan is represented by a directed acyclic graph (DAG) that consists of asynchronous operators. Such plans can express a variety of complex queries and fit the data storage model of Hologres. This allows Query Optimizer (QO) to apply a variety of optimization techniques to queries.
- Fully asynchronous execution
Hologres QE provides an end-to-end, fully asynchronous framework that eliminates the bottlenecks of high-concurrency systems and makes full use of resources. This minimizes the impact of the read latency caused by the Storage Disaggregation architecture.
- Vectorization and column-oriented processing
Hologres QE processes data in a vectorized manner in operators whenever possible. Hologres QE is deeply integrated with Storage Engine (SE). Flexible execution models are built to take full advantage of a variety of indexes. Vectorization and materialization are deferred as much as possible to prevent unnecessary data reads or computing.
- Adaptive incremental processing
Hologres QE implements adaptive incremental processing for common real-time data scenarios.
- Optimization of specific queries
Hologres QE provides unique optimization for specific queries.
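The first two benefits above can be pictured with a toy sketch: an execution plan modeled as a small DAG of asynchronous operators that pass column batches to each other. This is only an illustration written with Python's `asyncio`. The operator names (`scan`, `filter_gt`, `sum_column`), the batch layout, and the sample data are hypothetical and do not reflect Hologres internals.

```python
import asyncio

# Hypothetical toy model: each operator is a coroutine that reads column
# batches from an input queue and writes results to an output queue.
# None of these names correspond to real Hologres internals.

END = None  # sentinel that marks the end of a stream


async def scan(shard, out):
    """Emit the shard's data as column batches (dicts of column -> list)."""
    for batch in shard:
        await asyncio.sleep(0)  # stands in for an asynchronous storage read
        await out.put(batch)
    await out.put(END)


async def filter_gt(column, threshold, inp, out):
    """Keep rows where batch[column] > threshold, one whole batch at a time."""
    while (batch := await inp.get()) is not END:
        keep = [i for i, v in enumerate(batch[column]) if v > threshold]
        await out.put({name: [vals[i] for i in keep] for name, vals in batch.items()})
    await out.put(END)


async def sum_column(column, inputs):
    """Final operator: sum one column across all upstream branches."""
    total = 0
    for inp in inputs:
        while (batch := await inp.get()) is not END:
            total += sum(batch[column])
    return total


async def run_plan(shards, threshold):
    """Wire scan -> filter per shard, then one aggregate: a small DAG."""
    tasks, filtered = [], []
    for shard in shards:
        scanned, out = asyncio.Queue(), asyncio.Queue()
        tasks.append(asyncio.create_task(scan(shard, scanned)))
        tasks.append(asyncio.create_task(filter_gt("amount", threshold, scanned, out)))
        filtered.append(out)
    total = await sum_column("amount", filtered)
    await asyncio.gather(*tasks)
    return total


# Two made-up shards, each holding a list of column batches.
SHARDS = [
    [{"id": [1, 2], "amount": [10, 50]}, {"id": [3], "amount": [70]}],
    [{"id": [4, 5], "amount": [5, 90]}],
]

result = asyncio.run(run_plan(SHARDS, threshold=20))
print(result)
```

Because no operator blocks while it waits for input, the scans of both shards can make progress while the aggregate is still consuming the first branch. That is the general idea behind hiding storage read latency, shown here at a very small scale.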
Query execution process
- A frontend (FE) authenticates and parses the SQL query and distributes the SQL query to different modules of Hologres QE.
- Hologres QE selects an execution path based on the characteristics of the SQL query.
- If the SQL query is a point query, the FE distributes the SQL query to Fixed QE, which obtains the data without going through QO. This shortens the execution path and improves query performance. This execution path is called a fixed plan. Point queries (similar to key-value queries in ApsaraDB for HBase) and point writes are processed by using fixed plans.
- If the SQL query is an online analytical processing (OLAP) query, the FE distributes the SQL query to QO. QO parses the SQL query and generates an execution plan. The execution plan includes information such as the estimated cost of operator execution, statistics, and the narrowed-down query range. Based on the execution plan, QO decides whether to use HQE, PostgreSQL Query Engine (PQE), Seahawks Query Engine (SQE), or Hive Query Engine (Hive QE) to compute the operators. HQE, PQE, and SQE are described below:
- Hologres Query Engine (HQE)
Developed by Alibaba Cloud, HQE uses a scalable Massively Parallel Processing (MPP) architecture to implement fully parallel computing. HQE uses vectorized operators to maximize CPU utilization and query performance. HQE is the main module of Hologres QE.
- PostgreSQL Query Engine (PQE)
PQE provides compatibility with PostgreSQL. PQE supports a variety of PostgreSQL extensions, such as PostGIS and user-defined functions (UDFs) that are written in PL/Java, PL/SQL, or PL/Python. Functions and operators that are not supported by HQE can be executed by using PQE. HQE is continuously optimized in each version, with the final goal of supporting all features of PQE.
- Seahawks Query Engine (SQE)
SQE allows Hologres to seamlessly connect to MaxCompute. This provides high-performance access to all types of MaxCompute files, without the need to migrate or import data. This also allows Hologres to access complex tables such as hash-clustered tables and range-clustered tables, and to implement interactive analysis of PB-level batch data.
- After Hologres QE determines the execution plan, it uses SE to obtain data, merges the data from different shards, and returns the query result to the client.
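The path selection described above can be summarized in a minimal sketch. It is a deliberate simplification: the `Query` fields, the selection rules in `choose_path`, and the plain concatenation in `merge_shard_results` are assumptions made for illustration. In the real system, QO makes this decision from the execution plan and can assign different operators of one query to different engines.

```python
from dataclasses import dataclass


# Hypothetical classification of a statement. A real frontend would derive
# this from the parsed SQL, which is out of scope for this sketch.
@dataclass
class Query:
    kind: str                          # "point" or "olap"
    needs_pg_extension: bool = False   # e.g. a function that HQE does not support
    reads_maxcompute: bool = False     # e.g. data that resides in MaxCompute


def choose_path(query):
    """Return the stages a query passes through, as a list of names."""
    if query.kind == "point":
        # Fixed plan: Fixed QE obtains the data and QO is skipped.
        return ["FE", "Fixed QE", "SE"]
    # OLAP query: QO generates the plan and selects an engine.
    # Assumed, simplified rules; real selection happens per operator.
    if query.reads_maxcompute:
        engine = "SQE"
    elif query.needs_pg_extension:
        engine = "PQE"
    else:
        engine = "HQE"
    return ["FE", "QO", engine, "SE"]


def merge_shard_results(shard_results):
    """Combine the rows returned by each shard into one result list."""
    merged = []
    for rows in shard_results:
        merged.extend(rows)
    return merged
```

For example, `choose_path(Query("point"))` returns a path without the `QO` stage, while `choose_path(Query("olap", needs_pg_extension=True))` goes through `QO` and `PQE`.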