Apsara PolarDB uses a cloud-native architecture that combines the strengths of commercial databases (stability, reliability, high performance, and scalability) with those of open source cloud databases (simplicity, openness, and rapid iteration). This topic describes the architecture and features of PolarDB.

Figure 1. Architecture

One primary node and multiple read-only nodes

PolarDB uses a distributed cluster-based architecture. Each standard PolarDB cluster consists of one primary node and up to 15 read-only nodes. At least one read-only node is required so that the cluster can fail over and remain highly available. The primary node processes read and write requests, and the read-only nodes process only read requests. The primary node and read-only nodes in each cluster run in an active-active architecture.
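The node roles and limits described above can be pictured with a small toy model. This is only a sketch: the `Cluster` class, its method names, and the rule of promoting the first read-only node are assumptions made for illustration and do not describe how PolarDB selects a new primary node.

```python
from dataclasses import dataclass, field
from typing import List

MAX_READ_ONLY_NODES = 15  # limit stated in the text above


@dataclass
class Cluster:
    """Hypothetical model of one cluster: a primary plus read-only nodes."""
    primary: str
    read_only: List[str] = field(default_factory=list)

    def add_read_only(self, name: str) -> None:
        # Enforce the documented maximum of 15 read-only nodes.
        if len(self.read_only) >= MAX_READ_ONLY_NODES:
            raise ValueError("a cluster supports at most 15 read-only nodes")
        self.read_only.append(name)

    def fail_over(self) -> str:
        """Promote a read-only node after the primary fails.

        Assumption: the first read-only node is promoted. The real
        selection policy is not described in this topic.
        """
        if not self.read_only:
            raise RuntimeError("no read-only node available for failover")
        self.primary = self.read_only.pop(0)
        return self.primary
```

The model shows why at least one read-only node is needed: without one, `fail_over` has nothing to promote.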

Compute and storage decoupling

PolarDB decouples compute from storage, so you can scale clusters that are deployed on Alibaba Cloud to meet your business requirements. Database engine servers function as compute nodes and store only metadata. Database storage servers function as remote storage nodes and store data files and redo logs. Only the metadata of redo logs needs to be synchronized among compute nodes, which reduces the replication delay between the primary node and read-only nodes. If the primary node fails, a read-only node can take over as the primary node within a short period.
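One way to picture why synchronizing only redo log metadata keeps replication light is the following toy sketch. It assumes that the metadata is a log position and that read-only nodes fetch the records themselves from shared storage. `SharedStorage`, `ReadOnlyNode`, and `on_metadata` are hypothetical names, and the actual synchronization mechanism is not described in this topic.

```python
class SharedStorage:
    """Hypothetical stand-in for the remote storage nodes holding redo logs."""

    def __init__(self):
        self.redo_log = []

    def append(self, record: str) -> int:
        # The primary writes the full record to shared storage once.
        self.redo_log.append(record)
        return len(self.redo_log)  # position just past the new record

    def read(self, start: int, end: int):
        return self.redo_log[start:end]


class ReadOnlyNode:
    """Hypothetical read-only compute node that stores no log data itself."""

    def __init__(self, storage: SharedStorage):
        self.storage = storage
        self.applied_upto = 0
        self.applied = []

    def on_metadata(self, log_end: int) -> None:
        # Only a small position value travels between compute nodes;
        # the records are read from the storage both nodes share.
        self.applied.extend(self.storage.read(self.applied_upto, log_end))
        self.applied_upto = log_end
```

In this model the primary node sends a single integer per update batch instead of the log records, which is the intuition behind the reduced replication delay.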

Read/write splitting

By default, read/write splitting is enabled for PolarDB clusters at no additional cost. The feature is transparent to users and provides high availability and self-adaptive load balancing. It automatically forwards SQL requests to the nodes of a PolarDB cluster based on cluster endpoints, which lets the cluster process a large number of concurrent SQL requests in high-throughput scenarios. For more information, see Read/write splitting.
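The routing idea behind read/write splitting can be sketched as follows. This is a simplified illustration, not the PolarDB implementation: the `Router` class is hypothetical, statements are classified by a naive prefix check, and plain round-robin stands in for the self-adaptive load balancing that the feature actually provides.

```python
import itertools

# Assumption: only these statement prefixes count as reads in this sketch.
# Statements such as SELECT ... FOR UPDATE would need extra handling.
READ_PREFIXES = ("select", "show")


class Router:
    """Hypothetical router that splits reads and writes across nodes."""

    def __init__(self, primary, read_only):
        self.primary = primary
        # With no read-only nodes, reads fall back to the primary,
        # which also processes read requests.
        self._readers = itertools.cycle(read_only or [primary])

    def route(self, sql: str) -> str:
        statement = sql.lstrip().lower()
        if statement.startswith(READ_PREFIXES):
            return next(self._readers)  # round-robin over read-only nodes
        return self.primary  # writes always go to the primary node
```

For example, with `Router("primary", ["ro-1", "ro-2"])`, successive `SELECT` statements alternate between the two read-only nodes, while an `UPDATE` is sent to the primary node.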

High-speed network connections

Compute nodes and storage nodes of PolarDB databases are connected over a high-speed network, and data is transmitted between them by using the Remote Direct Memory Access (RDMA) protocol. Together, these two features eliminate I/O performance bottlenecks.

Shared distributed storage

In PolarDB, compute nodes share one set of data, which reduces your storage costs. PolarDB uses distributed storage and a distributed file system, so you can smoothly scale up the storage capacity of databases online. Online scale-ups are not limited by the storage capacity of any individual database server, which allows your databases to process hundreds of terabytes of data.

Multiple data replicas and the Parallel-Raft protocol

Storage nodes of PolarDB databases maintain multiple data replicas to ensure reliability and use the Parallel-Raft protocol to ensure data consistency among these replicas.
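As a rough intuition for how a Raft-style protocol keeps replicas consistent, a write is typically treated as committed once a majority of replicas acknowledge it. The sketch below shows only that majority rule. It does not model Parallel-Raft itself, and the function names and the replica counts in the comments are illustrative assumptions, not values stated in this topic.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int) -> bool:
    """Assumption: a write commits once a majority has acknowledged it."""
    return acks >= majority(replicas)


# With 3 replicas (an illustrative count), 2 acknowledgements suffice,
# so the loss of a single replica does not block writes.
```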