Features

This topic describes the key features of PolarDB-X.

Distributed linear scalability

PolarDB-X horizontally partitions the data in a table across multiple data nodes. The partition to which each row belongs is determined by a partitioning function. PolarDB-X supports common partitioning functions such as hash and range.
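As a rough illustration of the two partitioning functions named above, the following Python sketch maps a row to a partition by hash and by range. The function names, the use of CRC32 as the hash, and the boundary values are assumptions made for this example; they do not describe how PolarDB-X implements partitioning internally.

```python
import bisect
import zlib
from typing import Sequence


def hash_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition by using a stable hash (CRC32, chosen only for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def range_partition(value: int, upper_bounds: Sequence[int]) -> int:
    """Return the index of the range that contains value.

    upper_bounds must be sorted in ascending order. Each bound is exclusive.
    Values at or above the last bound fall into a final catch-all partition.
    """
    return bisect.bisect_right(upper_bounds, value)
```

With the hypothetical bounds [100, 200, 300], the value 50 falls into partition 0, the value 100 falls into partition 1, and any value of 300 or greater falls into the catch-all partition 3.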

Scale-out and migration

As your business grows, the amount of data increases. In most cases, you need to add data nodes to handle the additional data. When a new data node is added to a cluster, PolarDB-X automatically triggers a scale-out task to rebalance data.

The following figure provides an example in which the data of the orders table is distributed across four data nodes. After two more data nodes are added to the cluster, a rebalancing task is triggered. PolarDB-X migrates some of the data partitions from the original data nodes to the new data nodes. The migration runs in the background on idle resources and does not affect your online business.
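The rebalancing step in the example can be sketched in Python as follows. This is a simplified greedy model, not the scheduler that PolarDB-X uses: the node names, the partition counts, and the rule of moving one partition at a time from the busiest node to the idlest node are all assumptions for illustration.

```python
def rebalance(assignment, new_nodes):
    """Even out partition counts after new data nodes join.

    assignment maps a node name to the list of partition IDs that it holds and
    is modified in place. Returns the list of (partition, source, target) moves.
    """
    for node in new_nodes:
        assignment.setdefault(node, [])

    moves = []
    while True:
        busiest = max(assignment, key=lambda n: len(assignment[n]))
        idlest = min(assignment, key=lambda n: len(assignment[n]))
        # Stop when no node holds more than one partition above any other node.
        if len(assignment[busiest]) - len(assignment[idlest]) <= 1:
            return moves
        partition = assignment[busiest].pop()
        assignment[idlest].append(partition)
        moves.append((partition, busiest, idlest))
```

For example, if four hypothetical nodes dn1 to dn4 each hold 6 of 24 partitions and dn5 and dn6 are added, the sketch moves 8 partitions to the new nodes so that every node ends up with 4. Only the moved partitions are migrated. The rest of the data stays in place.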

High availability and disaster recovery

When you deploy a database cluster in a production environment, multiple replicas are created to ensure high availability and data durability. To ensure strong consistency among replicas, modern databases often use a majority consensus protocol such as Paxos for data replication. Paxos requires at least three nodes in a cluster. Each write operation must be accepted by more than half of the nodes before it takes effect. This way, even if one of the three nodes stops responding, the cluster can still provide services. PolarDB-X uses X-Paxos for data replication. X-Paxos is an implementation of Paxos developed by Alibaba that provides optimized features and enhanced performance. For more than 10 years, X-Paxos has provided reliable and highly available services that ensure business stability during peak hours of the Double 11 Shopping Festival.
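The majority rule described above comes down to simple arithmetic, shown in the following Python sketch. The function names are hypothetical, and the sketch covers only the quorum calculation, not the Paxos or X-Paxos protocol itself.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def is_write_accepted(acks: int, cluster_size: int) -> bool:
    """A write takes effect only after a majority of nodes accept it."""
    return acks >= majority(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can fail while a majority is still reachable."""
    return cluster_size - majority(cluster_size)
```

In a three-node cluster, a majority is two nodes, so the cluster keeps accepting writes when one node is down. A five-node cluster tolerates two failed nodes.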

PolarDB-X instances can be deployed in multiple data centers to implement disaster recovery based on data centers. Common deployment methods include three data centers in the same zone and three data centers across two zones. The second method is used in hybrid cloud deployment scenarios. In most cases, one of the three data centers functions as the primary data center due to the characteristics of the Paxos protocol. The primary data center is responsible for providing external services.

Distributed transactions

PolarDB-X supports distributed transactions and can ensure that the transactions follow the ACID principles: atomicity, consistency, isolation, and durability.

PolarDB-X uses Timestamp Oracle (TSO) and multiversion concurrency control (MVCC) to ensure that reads are performed on a consistent snapshot. This way, the intermediate state of a distributed transaction, such as a money transfer that is only partially applied, is never read. In the following figure, to commit a transaction, the compute node executes the transaction and obtains a timestamp from TSO. Then, the compute node commits the timestamp together with the data to the multi-version storage engine that runs on the data node. During the read process, if the data that you want to query is stored in multiple partitions, PolarDB-X obtains a global timestamp as the version of the data to be read. PolarDB-X then checks whether each version of each row is visible at that timestamp. This ensures that PolarDB-X reads only the data that is written by transactions that are committed before the global timestamp.
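The visibility rule described above can be sketched in Python as follows. This is a minimal in-memory model under stated assumptions: the class and method names are hypothetical, the timestamp oracle is a simple counter, and each VersionedStore object stands in for the multi-version storage engine of one data node.

```python
import itertools


class TimestampOracle:
    """Hands out strictly increasing timestamps (a stand-in for TSO)."""

    def __init__(self):
        self._counter = itertools.count(1)

    def next_timestamp(self) -> int:
        return next(self._counter)


class VersionedStore:
    """Keeps every committed version of a key as a (commit_ts, value) pair."""

    def __init__(self):
        self._versions = {}

    def commit(self, commit_ts, writes):
        """Store the written values together with the commit timestamp."""
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        """Return the newest value committed before snapshot_ts, or None."""
        visible = [v for v in self._versions.get(key, []) if v[0] < snapshot_ts]
        if not visible:
            return None
        return max(visible, key=lambda version: version[0])[1]
```

In a money transfer that touches two stores, a reader that took its snapshot timestamp before the transfer committed sees the old balances on both stores, even if one store has already received the new version. A reader with a later snapshot timestamp sees the new balances on both.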

Distributed transactions are a fundamental feature in distributed systems. For example, in a read/write splitting solution, the multiple versions of data for a transaction are also synchronized to the learner replicas. This ensures that read-only instances do not read stale data due to synchronization latency. In a log file that records global data changes, distributed transactions are sorted by timestamp. When PolarDB-X performs a point-in-time recovery (PITR), PolarDB-X uses the timestamps of distributed transactions to accurately identify the globally consistent version of data at the corresponding point in time.

HTAP

PolarDB-X supports hybrid transaction/analytical processing (HTAP). This allows PolarDB-X to handle both highly concurrent transactional requests and complex analytical queries. Analytical queries are performed on large amounts of data and require complex computations. For example, you can perform analytical queries to aggregate data within a specific period of time. Compared with common simple queries, analytical queries consume more computing resources and can take several seconds or even minutes to execute.

To accelerate complex analytical queries, PolarDB-X splits each computing task and assigns the subtasks to multiple compute nodes. This way, you can use the computing capabilities of multiple nodes to accelerate query execution. This request processing method is known as massively parallel processing (MPP).
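The split-and-merge pattern behind MPP can be sketched in Python as follows. In this sketch, threads in a local pool stand in for compute nodes, and the function names and the choice of an average as the query are assumptions for illustration. The actual PolarDB-X execution engine is not shown here.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_count(rows):
    """Subtask run by one worker: a partial aggregate over its share of rows."""
    return sum(rows), len(rows)


def parallel_average(rows, workers=4):
    """Compute an average by merging partial aggregates from several workers."""
    if not rows:
        raise ValueError("rows must not be empty")
    # Split the input so that each worker receives roughly the same number of rows.
    chunks = [rows[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_count, chunks))
    # Merge step: combine partial sums and counts into the final result.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count
```

Each worker returns a partial sum and count rather than a partial average, because sums and counts can be merged exactly while averages of unequal chunks cannot.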

The optimizer of PolarDB-X is designed for HTAP workloads and can handle complex queries. PolarDB-X uses a cost-based optimizer that searches for an optimal execution plan based on the data volume and data distribution. For example, the optimizer can adjust the order in which JOIN operations are performed, select an appropriate join or aggregation algorithm, and decorrelate subqueries.
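To give a sense of what adjusting the JOIN order based on cost means, the following Python sketch enumerates left-deep join orders and picks the cheapest one. The cost model is a deliberate simplification and an assumption of this example: every join uses the same selectivity, and the cost of a plan is the total size of its intermediate results. The table names and row counts are hypothetical, and a production optimizer relies on collected statistics and a far richer cost model.

```python
from itertools import permutations


def plan_cost(order, row_counts, selectivity):
    """Total size of the intermediate results of a left-deep join order."""
    size = row_counts[order[0]]
    cost = 0.0
    for table in order[1:]:
        # Estimated rows after joining the running result with the next table.
        size = size * row_counts[table] * selectivity
        cost += size
    return cost


def best_join_order(row_counts, selectivity):
    """Exhaustively search all join orders and return the cheapest one."""
    return min(
        permutations(row_counts),
        key=lambda order: plan_cost(order, row_counts, selectivity),
    )
```

With a large orders table and two much smaller tables, the cheapest order under this model joins the small tables first and the orders table last, which keeps the intermediate results small.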

The PolarDB-X optimizer classifies requests into transaction processing workloads and analytical processing workloads based on the estimated costs. Analytical processing workloads are rewritten as distributed execution plans and sent to the read-only instances for computing. This ensures that analytical processing workloads do not affect transaction processing workloads on the primary instance.
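The cost-based classification described above can be sketched in Python as follows. The threshold value, the function names, and the string labels are hypothetical. The actual criteria that the PolarDB-X optimizer applies are not specified in this topic.

```python
# Hypothetical cut-off: plans with an estimated cost at or above this value
# are treated as analytical processing (AP) workloads.
AP_COST_THRESHOLD = 10000


def classify(estimated_cost, threshold=AP_COST_THRESHOLD):
    """Label a request as transaction processing (TP) or analytical processing (AP)."""
    return "AP" if estimated_cost >= threshold else "TP"


def route(estimated_cost, threshold=AP_COST_THRESHOLD):
    """Send AP requests to read-only instances and TP requests to the primary instance."""
    if classify(estimated_cost, threshold) == "AP":
        return "read-only instance"
    return "primary instance"
```

A point lookup with a low estimated cost stays on the primary instance, while a large aggregation with a high estimated cost is routed to the read-only instances.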

Compatibility with the MySQL ecosystem

PolarDB-X is developed to ensure full compatibility with the MySQL ecosystem. This section describes the compatibility between PolarDB-X and MySQL in terms of SQL syntax, transaction behavior, and data import and export. For more information, see relevant topics.

PolarDB-X is compatible with the MySQL protocol. Common MySQL clients can connect to PolarDB-X instances by using drivers such as Java Database Connectivity (JDBC) drivers, Open Database Connectivity (ODBC) drivers, and Go drivers. PolarDB-X also supports protocol features such as SSL, the prepared statement protocol, and Load.

PolarDB-X is compatible with DML, Data Access Language (DAL), and DDL statements in MySQL.

PolarDB-X is compatible with most MySQL functions, including JSON functions, encryption functions, and decryption functions.
PolarDB-X is compatible with views, common table expressions (CTEs), window functions, and analytic functions in MySQL 8.0.
PolarDB-X supports all data types in MySQL, including TIMESTAMP and DECIMAL.
PolarDB-X is compatible with common strings, character sets, and collations in MySQL.
PolarDB-X is compatible with most information_schema views.

For more information, see Developer Guide.