PolarDB-X supports hybrid transaction/analytical processing (HTAP): it handles highly concurrent transactional requests as well as analytical queries.

Note: Analytical queries involve large amounts of data and complex calculations. An example is the aggregation of data within a specified period of time. Compared with common simple queries, analytical queries take longer to execute and consume more computing resources. A single analytical query can take several seconds or even minutes to execute.
To accelerate complex analytical queries, PolarDB-X splits each compute task into subtasks and assigns the subtasks to multiple compute nodes, so that the computing capabilities of multiple nodes are combined to speed up query execution. This request processing method is known as massively parallel processing (MPP). By default, only read-only clusters in PolarDB-X support MPP.
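
The split-and-merge idea behind MPP can be illustrated with a minimal Python sketch. It is not PolarDB-X code: the names split_into_subtasks, partial_sum, and parallel_total are hypothetical, and threads stand in for compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(rows, num_nodes):
    """Split the input rows into num_nodes slices, one per simulated node."""
    return [rows[i::num_nodes] for i in range(num_nodes)]


def partial_sum(rows):
    """Work done by one simulated compute node: a partial aggregate."""
    return sum(amount for _, amount in rows)


def parallel_total(rows, num_nodes=4):
    """Run the partial aggregates in parallel, then merge the results."""
    subtasks = split_into_subtasks(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(partial_sum, subtasks))
    return sum(partials)


# Illustrative data: (day, amount) pairs for 100 days.
orders = [(day, day * 10) for day in range(1, 101)]
total = parallel_total(orders)
```

Each simulated node aggregates only its own slice, and the final merge combines a handful of partial results instead of scanning all rows again.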

Query optimizer

The PolarDB-X optimizer is designed for HTAP workloads and supports complex queries. A typical TP query involves only a few tables (for example, no more than three), performs JOIN operations based on indexes, and reads a small amount of data. Complex queries do not have these characteristics and place higher demands on the optimizer.

PolarDB-X uses a cost-based optimizer that searches for an optimal execution plan based on the volume of the queried data and the data distribution. For example, the optimizer can adjust the order in which the JOIN operations are performed, select an appropriate join or aggregation algorithm, and decorrelate correlated subqueries. Because the quality of the execution plan determines query efficiency, query optimization is crucial for analytical queries.
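
To show what cost-based join ordering means in principle, here is a toy Python sketch. It makes strong simplifying assumptions that are not taken from PolarDB-X: plans are left-deep, every join has the same selectivity, and the cost of a plan is the sum of its estimated intermediate result sizes. The table names and row counts are invented for illustration.

```python
from itertools import permutations


def plan_cost(order, row_counts, selectivity):
    """Estimated cost of a left-deep join: sum of intermediate result sizes."""
    rows = row_counts[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Assumed cardinality estimate: |left| * |right| * selectivity.
        rows = rows * row_counts[table] * selectivity
        cost += rows
    return cost


def best_join_order(row_counts, selectivity=0.001):
    """Enumerate all join orders and keep the one with the lowest cost."""
    return min(
        permutations(row_counts),
        key=lambda order: plan_cost(order, row_counts, selectivity),
    )


# Hypothetical table statistics.
stats = {"orders": 1_000_000, "customers": 10_000, "regions": 10}
best = best_join_order(stats)
worst_cost = plan_cost(("orders", "customers", "regions"), stats, 0.001)
best_cost = plan_cost(best, stats, 0.001)
```

Under this model the cheapest plans join the two small tables first and bring in the large table last. A production optimizer uses far richer statistics and prunes the search space rather than enumerating every permutation.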

Intelligent routing provided by HTAP

Analytical processing (AP) affects transaction processing (TP). This is one of the main issues in the application of HTAP databases. To resolve this issue, we recommend that you deploy a PolarDB-X read-only cluster whose hardware is isolated from that of the primary instance. This minimizes the impact of AP on TP.

The PolarDB-X optimizer classifies requests into TP workloads and AP workloads based on the estimated costs. AP workloads are rewritten as distributed execution plans and sent to read-only clusters. This ensures that AP workloads do not affect TP workloads on the primary instance.
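
The routing decision can be pictured with the following Python sketch. The threshold value, the Query class, and the functions classify and route are all hypothetical; the source only states that classification is based on estimated cost, not how the boundary is chosen.

```python
from dataclasses import dataclass

# Hypothetical cost boundary between TP and AP workloads.
AP_COST_THRESHOLD = 100_000.0


@dataclass
class Query:
    sql: str
    estimated_cost: float  # assumed to come from the optimizer's cost model


def classify(query, threshold=AP_COST_THRESHOLD):
    """Label a query as an AP or TP workload based on its estimated cost."""
    return "AP" if query.estimated_cost >= threshold else "TP"


def route(query, threshold=AP_COST_THRESHOLD):
    """Send AP workloads to the read-only cluster, TP to the primary."""
    if classify(query, threshold) == "AP":
        return "read-only cluster"
    return "primary instance"


point_lookup = Query("SELECT * FROM orders WHERE id = 42", 12.0)
monthly_report = Query(
    "SELECT region, SUM(amount) FROM orders GROUP BY region", 2_500_000.0
)
targets = [route(point_lookup), route(monthly_report)]
```

Because the decision depends only on the estimate, the application can send every statement to the same endpoint and does not need to label queries itself.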


Distributed execution

Each distributed execution plan is divided into multiple stages. The request in each stage is divided into multiple shards. The shards are distributed to multiple compute nodes for execution. Compute nodes are connected by using high-speed networks. In each stage, the calculation result of a subtask is used in other subtasks. In the last stage, the calculation results of all subtasks are collected and sent back to the client that initiates the query.
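
A staged plan of this kind can be sketched in Python as a two-stage GROUP BY. This is a simplified single-process model, not the PolarDB-X executor: the function names are hypothetical, lists stand in for compute nodes, and the shuffle step stands in for data exchange over the network.

```python
from collections import defaultdict


def stage1_partial_aggregate(shard):
    """Stage 1: each node pre-aggregates its own shard of (key, value) rows."""
    partial = defaultdict(int)
    for key, value in shard:
        partial[key] += value
    return dict(partial)


def shuffle(partials, num_nodes):
    """Exchange: send all partial results for a key to the same stage-2 node."""
    buckets = [defaultdict(list) for _ in range(num_nodes)]
    for partial in partials:
        for key, value in partial.items():
            buckets[hash(key) % num_nodes][key].append(value)
    return buckets


def stage2_final_aggregate(bucket):
    """Stage 2: each node merges the partial results it received."""
    return {key: sum(values) for key, values in bucket.items()}


def run_query(shards, num_nodes=2):
    """Run both stages, then gather the results for the client."""
    partials = [stage1_partial_aggregate(shard) for shard in shards]
    buckets = shuffle(partials, num_nodes)
    result = {}
    for bucket in buckets:
        result.update(stage2_final_aggregate(bucket))
    return result


shards = [
    [("cn", 1), ("us", 2)],
    [("cn", 3)],
    [("us", 4), ("de", 5)],
]
totals = run_query(shards)
```

The output of the stage-1 subtasks becomes the input of the stage-2 subtasks, and only the final gather step returns data to the client.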

Globally consistent reads

If the read/write splitting architecture is used, data replication latency may cause read-after-write inconsistency: a read that is routed to a read-only instance may not see data that was just written to the primary instance. By default, globally consistent reads are supported for the queries routed to PolarDB-X read-only instances. This ensures that stale data is not read. After data is written to the primary instance, the written data can then be read from the read-only instances.
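
The difference between a stale read and a consistent read can be illustrated with a toy Python model. The source does not describe how PolarDB-X implements globally consistent reads, so the approach shown here (the replica catches up to the primary's latest log position before it answers) is only an assumption for illustration, and the classes Primary and Replica are hypothetical.

```python
class Primary:
    """Toy primary instance that records writes in an ordered log."""

    def __init__(self):
        self.log = []

    def write(self, key, value):
        self.log.append((key, value))
        return len(self.log)  # log position of this write


class Replica:
    """Toy read-only instance that applies the primary's log with a delay."""

    def __init__(self, primary):
        self.primary = primary
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def apply(self, up_to):
        """Apply log entries until position up_to has been reached."""
        while self.applied < up_to:
            key, value = self.primary.log[self.applied]
            self.data[key] = value
            self.applied += 1

    def stale_read(self, key):
        """Read whatever has been replicated so far (may miss recent writes)."""
        return self.data.get(key)

    def consistent_read(self, key):
        """Catch up to the primary's latest position, then read.

        A real system would wait for replication; this model pulls the log.
        """
        self.apply(len(self.primary.log))
        return self.data.get(key)


primary = Primary()
replica = Replica(primary)
primary.write("balance", 100)
before = replica.stale_read("balance")      # replication has not run yet
after = replica.consistent_read("balance")  # replica catches up first
```

In this model the stale read misses the write because the replica has not applied it yet, while the consistent read returns the written value.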