PolarDB: Product overview

Last Updated: May 19, 2025

PolarDB is a next-generation cloud-native relational database developed by Alibaba. With an architecture that decouples storage from compute and integrates software with hardware, it provides database services that scale automatically within seconds and offer high performance, high availability, mass storage, and reliability. It is 100% compatible with the MySQL and PostgreSQL ecosystems, highly compatible with Oracle syntax, and supports both centralized and distributed architectures. PolarDB delivers up to 6 times the transaction performance and up to 400 times the analytical performance of open-source databases, at 50% of the total cost of ownership (TCO) of self-managed databases.

With PolarDB, you can choose the database engine that best suits your application while maintaining compatibility with the underlying database engine.

  • PolarDB for MySQL

    • Ecosystem compatibility: 100% compatible with MySQL

    • Architecture: shared storage with decoupled compute and storage

    • Product type: public cloud, Apsara Stack Enterprise Edition, DBStack

  • PolarDB for PostgreSQL

    • Ecosystem compatibility: 100% compatible with PostgreSQL, highly compatible with Oracle syntax

  • PolarDB-X

    • Architecture: shared-nothing, integrating centralized and distributed architectures

Benefits

Ecosystem compatibility

  • 100% compatible with MySQL and PostgreSQL ecosystems.

  • Highly compatible with Oracle syntax. Provides an end-to-end Oracle migration solution with zero service interruption, minimal risks, and predictable progress. This solution has helped over 500 customers migrate from Oracle.

  • With centralized or distributed architecture, PolarDB can easily integrate into existing systems and help you smoothly upgrade databases.

High performance

  • Provides online transaction processing (OLTP) performance up to 6 times that of open-source databases.

  • Provides online analytical processing (OLAP) performance up to 400 times that of open-source databases, along with complex query acceleration and real-time analysis capabilities such as parallel query and In-Memory Column Index (IMCI).

High availability

  • Supports multiple high availability configurations, including single-zone deployment, dual-zone deployment, three-zone deployment (RPO=0), and cross-region deployment, to tolerate failures and protect data.

  • Provides up to 99.995% service availability (SLA).
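To put the 99.995% availability figure in perspective, the short calculation below converts it into a downtime budget. It is only an arithmetic sketch: the function name `max_downtime_minutes` is made up for illustration, and the 30-day and 365-day measurement periods are assumptions, since the actual SLA terms define how availability is measured.

```python
def max_downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Budget for an assumed 30-day month and 365-day year at 99.995% availability.
print(round(max_downtime_minutes(99.995), 2))            # 2.16
print(round(max_downtime_minutes(99.995, days=365), 2))  # 26.28
```

Under these assumptions, 99.995% leaves roughly two minutes of downtime per month.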

Mass storage

  • PolarDB for MySQL and PolarDB for PostgreSQL support up to 500 TB of storage.

  • PolarDB-X supports petabyte-scale storage.

Scalability

  • Intelligent proxy supports multiple read consistency levels.

  • Supports dynamic, elastic scaling in serverless mode and linear scaling in distributed mode.

  • Uses low-latency physical replication technology to improve the efficiency and stability of replication between nodes.

Security

  • Access control: RAM users, IP whitelists, security groups, virtual private clouds (VPCs).

  • Data security: transparent data encryption (TDE), backup and restoration, flashback query.

  • Transmission security: SSL encryption.

Architecture

PolarDB for MySQL

Designed based on the cloud-native architectural philosophy, PolarDB for MySQL combines the stability, high performance, and scalability of commercial databases with the simplicity, openness, and rapid iteration of open source cloud databases. PolarDB decouples computing from storage and effectively and seamlessly integrates software and hardware to deliver a database service that provides auto scaling within seconds, high performance, massive storage capacity, and robust security and reliability.

[Figure: PolarDB for MySQL architecture]

  • PolarProxy

    PolarProxy in a PolarDB cluster serves as a proxy between an application and the cluster. It receives and routes all requests from the application, performs authentication, and provides advanced features, such as automatic read/write splitting, load balancing, consistency levels, connection pools, and overload protection.

  • Compute nodes

    • PolarDB uses multi-node clusters to provide services. Each PolarDB cluster consists of one primary node that handles both read and write operations and multiple read-only nodes. A Multi-master Cluster contains multiple read/write nodes and multiple read-only nodes.

    • An active-active failover mechanism is used to enable smooth and automatic role transitions between nodes and ensure high database availability.

    • Compute nodes primarily provide the database SQL engine and are available in general-purpose and dedicated specifications.

  • Shared distributed storage

    Multiple compute nodes share one copy of data, which significantly reduces storage costs. The new distributed storage system and distributed file system allow the storage capacity to seamlessly and dynamically scale without being constrained by the storage limits of individual database servers. This allows the cluster to handle data volumes of up to hundreds of terabytes.
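The automatic read/write splitting and load balancing that PolarProxy provides can be pictured with the toy router below. This is a minimal sketch, not PolarProxy's actual logic: the class `ReadWriteRouter`, the keyword list, the round-robin policy, and the node names are all assumptions made for illustration.

```python
import itertools


class ReadWriteRouter:
    """Toy router: writes go to the primary, reads rotate across read-only nodes."""

    # Assumed, incomplete list of statement types treated as writes.
    WRITE_KEYWORDS = ("INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP")

    def __init__(self, primary, read_only_nodes):
        self.primary = primary
        # Fall back to the primary when the cluster has no read-only nodes.
        self._readers = itertools.cycle(read_only_nodes or [primary])

    def route(self, sql, in_transaction=False):
        words = sql.split()
        first_word = words[0].upper() if words else ""
        # Keep transactional statements on the primary so they see their own writes.
        if in_transaction or first_word in self.WRITE_KEYWORDS:
            return self.primary
        return next(self._readers)


router = ReadWriteRouter("primary", ["ro-1", "ro-2"])
print(router.route("SELECT * FROM orders"))               # ro-1
print(router.route("UPDATE orders SET status = 'paid'"))  # primary
print(router.route("SELECT 1"))                           # ro-2
```

A real proxy also has to account for the configured consistency level, session state, and node health, which this sketch leaves out.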

PolarDB for PostgreSQL

PolarDB for PostgreSQL clusters support both centralized and distributed architectures:

  • Centralized

    Designed based on the cloud-native architectural philosophy, PolarDB for PostgreSQL combines the stability, high performance, and scalability of commercial databases with the simplicity, openness, and rapid iteration of open source cloud databases. PolarDB decouples computing from storage and effectively and seamlessly integrates software and hardware to deliver a database service that provides auto scaling within seconds, high performance, massive storage capacity, and robust security and reliability.

  • Distributed

    Built on centralized PolarDB for PostgreSQL clusters, a distributed cluster adopts a two-layer architecture of compute nodes and data nodes and provides distributed scaling with decoupled compute and storage resources. It also supports all features of centralized PolarDB for PostgreSQL clusters and meets the performance and reliability requirements of enterprises.

Centralized (Enterprise Edition and Standard Edition)

[Figure: centralized PolarDB for PostgreSQL architecture]

  • Database proxy (Proxy)

    The database proxy is a network proxy service located between the database and applications, which handles all requests when applications access the database. The proxy layer provides security authentication and supports advanced features such as automatic read/write splitting, load balancing, consistency levels, connection pooling, persistent connections, and overload protection.

  • Database compute nodes

    • PolarDB uses a multi-node cluster architecture. A one-write-multiple-read cluster has one read/write node and multiple read-only nodes.

    • The read/write node and read-only nodes use an active-active failover mechanism to provide highly available database services.

    • Compute nodes primarily provide database SQL engine functionality and are available in general-purpose and dedicated specifications.

  • Shared distributed storage

    Multiple compute nodes share a single copy of data instead of each storing its own, which significantly reduces storage costs. Built on a new distributed block storage system and distributed file system, storage capacity can be smoothly expanded online. This avoids the capacity limits of a single database server and lets the cluster handle data volumes of hundreds of terabytes.

Distributed

[Figure: distributed PolarDB for PostgreSQL architecture]

  • Database nodes

    • The cluster consists of compute nodes and data nodes. Compute nodes manage cluster metadata and create distributed execution plans. Data nodes store actual data shards.

    • Each data node and compute node in the cluster uses a centralized architecture that decouples storage and computing. The cluster supports database proxy and one-write-multiple-read mode and provides failover mechanisms for compute nodes and data nodes. The cluster allows the addition of read-only nodes to enhance the read capability of individual compute nodes or data nodes.

  • Distributed features

    • The cluster supports manual sharding and provides horizontal scaling capabilities, making it suitable for business scenarios that involve data volumes below the petabyte (PB) scale.

    • The cluster ensures consistency in distributed transactions.

    • The cluster supports dual-zone deployment. Operations are performed on the active working cluster in the primary zone. In the event of a failure in the primary zone, the hot standby cluster in the secondary zone can seamlessly take over to ensure uninterrupted service.

    • The cluster offers 24/7 non-disruptive operation during maintenance, upgrades, and configuration changes. The cluster supports the addition of heterogeneous compute nodes and data nodes.
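The split between compute nodes, which hold metadata and plan queries, and data nodes, which store shards, can be illustrated as follows. The sketch assumes hash-based placement on a distribution key purely for illustration; the source does not say which sharding method the cluster uses. The functions `shard_for` and `plan_lookup` and the shard map are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a distribution key to a shard number with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def plan_lookup(key: str, shard_to_node: dict) -> str:
    """Return the data node that a single-key lookup would be sent to."""
    return shard_to_node[shard_for(key, len(shard_to_node))]


# Hypothetical metadata held by a compute node: shard number -> data node name.
shard_to_node = {0: "dn-1", 1: "dn-1", 2: "dn-2", 3: "dn-2"}
print(plan_lookup("customer-42", shard_to_node))
```

Keeping the shard-to-node map separate from the hash function means that shards can be moved to a new data node by updating metadata only, without rehashing keys.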

PolarDB-X

In PolarDB-X, data nodes independently operate in a centralized manner and are fully compatible with the single-node database model (100% compatible with MySQL 5.7 and 8.0). If distributed scaling is required due to business expansion, you can upgrade the architecture to a distributed architecture with ease. The corresponding distributed components integrate seamlessly with the original data nodes to facilitate scaling. This allows you to take advantage of the availability and scalability brought by the distributed architecture without data migration or application modification.

Centralized (Standard Edition)

[Figure: centralized PolarDB-X (Standard Edition) architecture]

  • Data node

    Data nodes are used to persistently store data in PolarDB-X and provide reliable and consistent storage services based on the Paxos protocol. The self-developed Lizard transaction system offers higher availability and approximately 35% performance improvement compared to the MySQL distributed engine.

  • Multi-replica architecture

    PolarDB-X uses Paxos to ensure strong consistency between replicas (RPO = 0). Paxos requires each write operation to be confirmed by more than half of the nodes. This ensures that a cluster can continue to provide services as expected even if one of the cluster nodes fails. Paxos can ensure strong consistency by eliminating inconsistency issues among replicas. Replicas are classified into the following roles:

    • Leader

      A leader processes client requests and makes decisions. A leader maintains logs to ensure data consistency and recoverability.

    • Follower

      A follower accepts and executes instructions from the leader. When the leader fails or becomes inaccessible, a follower can be elected to become the new leader.

    • Logger

      Similar to a follower, a logger supports the majority consensus protocol but does not provide data services. When the leader fails or becomes inaccessible, loggers participate in the leader election. A logger may be elected as a temporary leader, but it does not provide data services. After the followers have caught up and synchronized their state with the most recent data, the logger gives up leadership.

    • Learner

      A learner passively receives state information from the system and does not participate in the leader election or decision-making process. Therefore, a learner produces minimal overhead on the system.
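The majority rule described above, where each write must be confirmed by more than half of the nodes, can be sketched as follows. This is not a Paxos implementation. It only models the commit condition, and it assumes that learners are excluded from the majority count because they take no part in decision-making. The `Replica` class and `is_committed` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    role: str          # "leader", "follower", "logger", or "learner"
    acked: bool = False

    @property
    def votes(self) -> bool:
        # Learners receive state but take no part in elections or decisions.
        return self.role != "learner"


def is_committed(replicas) -> bool:
    """A write commits once more than half of the voting replicas confirm it."""
    voters = [r for r in replicas if r.votes]
    acks = sum(1 for r in voters if r.acked)
    return acks > len(voters) // 2


cluster = [
    Replica("a", "leader", acked=True),
    Replica("b", "follower", acked=True),
    Replica("c", "logger", acked=False),   # one failed voter is tolerated
    Replica("d", "learner", acked=False),  # not counted toward the majority
]
print(is_committed(cluster))  # True: 2 of 3 voters confirmed
```

With three voting replicas, one can be unavailable and writes still commit, which matches the single-node failure tolerance described above.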

Distributed (Enterprise Edition)

[Figure: distributed PolarDB-X (Enterprise Edition) architecture]

  • Global meta service (GMS)

    GMS maintains globally consistent system metadata, such as table metadata, schema metadata, and statistics metadata. It also manages security-related information, such as user accounts and permissions, and provides the Timestamp Oracle (TSO) service.

  • Compute node

    Compute nodes are the entry point of the system. These nodes are stateless and include modules such as the SQL parser, optimizer, and executor. Compute nodes are responsible for distributed data routing, computation, dynamic scheduling, distributed transaction coordination based on the Two-Phase Commit (2PC) protocol, distributed DDL execution, and global secondary index maintenance. Compute nodes provide enterprise-level features such as SQL throttling and the three-role mode.

  • Data node

    Data nodes persistently store row-oriented data. Data nodes ensure data durability and guarantee strong consistency based on Paxos. Data nodes use multiversion concurrency control (MVCC) to maintain the visibility of distributed transactions. Data nodes can also execute computing tasks that are pushed down in distributed architectures, such as Project, Filter, Join, and Aggregation.

  • Columnar node

    Columnar nodes provide persistent clustered columnar indexes (CCIs), build CCIs based on Object Storage Service (OSS), and consume the binary logs of distributed transactions in real time to keep CCIs up to date. When combined with compute nodes, columnar nodes can provide snapshot-consistent query capabilities for CCIs.

  • Change data capture (CDC) node

    CDC nodes provide incremental subscription capabilities fully compatible with MySQL binlog format and protocol. CDC nodes also provide primary/secondary replication capabilities compatible with MySQL replication protocol.
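The distributed transaction coordination based on Two-Phase Commit (2PC) that compute nodes perform follows a general pattern that can be sketched in a few lines. This is the textbook form of the protocol with hypothetical names (`Participant`, `two_phase_commit`). It does not reflect PolarDB-X internals, and it omits timeouts, logging, and recovery.

```python
class Participant:
    """Stand-in for a data node taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local changes can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Phase 1: collect votes. Phase 2: commit only if every vote was yes."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.rollback()
    return "aborted"


print(two_phase_commit([Participant("dn-1"), Participant("dn-2")]))                    # committed
print(two_phase_commit([Participant("dn-1"), Participant("dn-2", can_commit=False)]))  # aborted
```

The coordinator role here corresponds to a compute node and the participants to data nodes. A single "no" vote in the prepare phase rolls back the transaction on every participant.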

How to use PolarDB

You can use the following methods to manage PolarDB clusters:

  • Console: The PolarDB console provides a web-based graphical user interface (GUI).

  • API: You can use the API to perform all operations that are available in the PolarDB console.

  • SDK: You can use SDKs to perform all operations that are available in the PolarDB console.

  • CLI: You can use Alibaba Cloud CLI to perform all operations that are available in the PolarDB console.

References

  • PolarDB for MySQL: Billing, User Guide, Performance White Paper

  • PolarDB for PostgreSQL: Billable items, Performance White Paper

  • PolarDB for PostgreSQL (Compatible with Oracle): Billable items

  • PolarDB-X: Billing overview, User Guide