PolarDB: Integrated centralized-distributed architecture

Last Updated: Apr 16, 2025

This topic describes the integrated centralized-distributed architecture of PolarDB-X.

Background information

When purchasing a database, you must choose between a centralized and a distributed database. For most small and medium-sized enterprises (SMEs), centralized databases are enough to meet daily business needs. These databases have moderate resource requirements and reasonable costs, and they are easy to manage. Distributed databases, on the other hand, provide higher performance, handle complex business scenarios, and efficiently meet demands for high throughput, large storage capacity, low latency, easy scalability, and ultra-high availability. However, distributed databases are more expensive and have higher technical barriers and O&M costs, which makes them less suitable for SMEs.

When SMEs experience sudden spikes in business activities, they require high-concurrency and high-throughput databases to handle the increased load. As the business grows, operations initially relying on centralized databases may require distributed expansion.

To address these challenges, PolarDB for Xscale (PolarDB-X) introduces an architecture that integrates centralized and distributed capabilities. The architecture provides the availability and scalability of distributed databases while maintaining the functionalities and performance of centralized ones.

Features and benefits

In an integrated centralized-distributed database, data nodes (DNs) operate independently in a centralized mode, fully replicating the functionality and behavior of a standalone database. If a business requires distributed scalability, the architecture can be upgraded in place to a distributed mode. During the transition, the distributed components integrate with the existing data nodes, which eliminates the need to migrate data or modify applications. This allows users to benefit from the enhanced availability and scalability provided by distributed systems.

Instance editions

PolarDB-X instances are available in Standard Edition (Centralized) and Enterprise Edition (Distributed).

image
  • Standard Edition (Centralized)

    PolarDB-X Standard Edition operates in centralized mode. In this edition, the distributed DNs use a multi-replica mechanism to independently provide services. The minimum resource requirement for this edition is 2 CPU cores and 4 GB of memory.

    image

    PolarDB-X Standard Edition uses the Paxos majority-based replication protocol. Compared with the primary-secondary replication protocol of MySQL, Paxos ensures strong consistency across replicas and can help achieve financial-grade high availability (RPO = 0, RTO < 10 seconds). The Lizard distributed transaction engine provides enhanced availability and delivers approximately 35% higher performance than native MySQL engines.

  • Enterprise Edition (Distributed)

    PolarDB-X Enterprise Edition operates in distributed mode and provides a complete set of distributed components, including compute nodes (CNs), DNs, change data capture nodes (CDCs), columnar nodes (COLUMNAR), and global meta service (GMS). This edition is highly compatible with the MySQL ecosystem and supports strongly consistent distributed transactions, distributed parallel query execution, and horizontal scaling. The following figure shows the technical architecture:

    image
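The majority rule behind the Paxos-based replication of Standard Edition can be illustrated with a minimal sketch. It models only the quorum arithmetic, not the Paxos protocol itself or the PolarDB-X implementation. The function names and the replica counts in the usage lines are assumptions made for illustration.

```python
def majority(replica_count: int) -> int:
    """Return the smallest number of replicas that forms a majority."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return replica_count // 2 + 1


def is_committed(acks: int, replica_count: int) -> bool:
    """A log entry counts as committed once a majority has acknowledged it."""
    return acks >= majority(replica_count)


def tolerated_failures(replica_count: int) -> int:
    """Return how many replicas can fail while a majority stays reachable."""
    return replica_count - majority(replica_count)


# Hypothetical three-replica deployment: two acknowledgements are enough,
# so the loss of any single replica does not lose committed data.
print(majority(3), is_committed(2, 3), tolerated_failures(3))
```

Because every committed entry is already stored on a majority of replicas, any surviving majority contains it. This is the property that makes RPO = 0 possible under a majority-based protocol.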

Upgrade Standard Edition to Enterprise Edition

As the business grows, PolarDB-X Standard Edition may encounter bottlenecks such as degraded query performance caused by large centralized tables, sustained high loads from concurrent queries, and an inability to meet analytical demands. In such scenarios, vertical scaling is neither sufficient nor cost-effective.

To overcome these challenges, PolarDB-X supports the upgrade from Standard Edition to Enterprise Edition. Enterprise Edition uses a distributed architecture and hybrid transaction/analytical processing (HTAP) capabilities to resolve the issues related to centralized databases while maintaining the efficiency and performance of standalone MySQL databases.

image
Note
  • Standard Edition and Enterprise Edition share the same set of DNs. During an upgrade, the system enhances the architecture by adding distributed components, such as CNs, CDCs, and GMS, to the Standard Edition without migrating data. In addition, both editions operate on the same dataset. Data inconsistency or corruption does not occur during upgrades or rollbacks.

  • Centralized tables can be converted in place into distributed tables with the online data definition language (DDL) capabilities. This grants the system distributed scalability.

  • The connection endpoints remain unchanged after an upgrade from Standard Edition to Enterprise Edition. You do not need to modify the logic or application code. During the upgrade, you may experience a short connection interruption (a few minutes).

  • Enterprise Edition uses a transparent distribution architecture and is highly compatible with the centralized MySQL ecosystem, which eliminates the need for application modifications.

  • To address the additional performance overhead introduced by distributed characteristics, PolarDB-X uses table groups and partition groups to ensure that associated data is stored together. This improves the performance of distributed transactions and complex queries. In scenarios where operations are confined to a single partition, the performance matches that of a traditional centralized database.
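The co-location idea behind table groups and partition groups can be sketched as follows. This is an illustrative model only, not the PolarDB-X partitioning algorithm: the CRC32 hash, the table names (orders, order_items), the node names, and the TableGroup class are all assumptions.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TableGroup:
    """Tables in one group share a partition count and a partition-to-node map."""
    partition_count: int
    placement: tuple[str, ...]  # placement[i] is the DN that holds partition i

    def node_for(self, key: str) -> str:
        # Deterministic hash so that equal keys always map to the same partition.
        partition = zlib.crc32(key.encode("utf-8")) % self.partition_count
        return self.placement[partition]


# Two hypothetical tables partitioned by the same key in the same table group.
group = TableGroup(partition_count=4, placement=("dn-0", "dn-1", "dn-0", "dn-1"))
tables = {"orders": group, "order_items": group}


def touches_single_node(key: str, table_names: list[str]) -> bool:
    """True if the rows for `key` in all listed tables live on one DN."""
    return len({tables[name].node_for(key) for name in table_names}) == 1


print(touches_single_node("user-42", ["orders", "order_items"]))
```

When the rows that a transaction or join needs share a partition key and a table group, the operation stays on one DN and avoids cross-node coordination.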

Storage pools and elastic specifications

PolarDB-X incorporates the concepts of storage pool and locality to provide a distributed architecture that supports linear scalability while retaining the benefits of a centralized one.

  • Storage pools organize DNs into distinct, non-overlapping groups to ensure that each DN belongs to only one pool. You can add or remove a DN from a storage pool.

  • You can use the locality attribute to bind databases, tables, or partitions to a specific storage pool.
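The two rules above (each DN belongs to exactly one storage pool, and the locality attribute binds an object to a pool) can be modeled with a small sketch. The StoragePools class, its method names, and the pool, DN, and database names are hypothetical and do not correspond to a PolarDB-X API.

```python
class StoragePools:
    """Toy registry of non-overlapping storage pools and locality bindings."""

    def __init__(self) -> None:
        self._pool_of_dn: dict[str, str] = {}   # DN name -> pool name
        self._locality: dict[str, str] = {}     # object name -> pool name

    def add_dn(self, pool: str, dn: str) -> None:
        current = self._pool_of_dn.get(dn)
        if current is not None and current != pool:
            # Pools must not overlap: a DN belongs to only one pool.
            raise ValueError(f"{dn} already belongs to {current}")
        self._pool_of_dn[dn] = pool

    def remove_dn(self, dn: str) -> None:
        self._pool_of_dn.pop(dn, None)

    def bind(self, obj: str, pool: str) -> None:
        """Bind a database, table, or partition to a pool (the locality)."""
        if pool not in self._pool_of_dn.values():
            raise ValueError(f"unknown or empty pool: {pool}")
        self._locality[obj] = pool

    def candidate_dns(self, obj: str) -> list[str]:
        """Return the DNs on which the bound object may be placed."""
        pool = self._locality[obj]
        return sorted(dn for dn, p in self._pool_of_dn.items() if p == pool)


pools = StoragePools()
pools.add_dn("pool-1", "dn-0")
pools.add_dn("pool-2", "dn-1")
pools.add_dn("pool-2", "dn-2")
pools.bind("tenant_a_db", "pool-1")
print(pools.candidate_dns("tenant_a_db"))
```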

image
Note

The following two scenarios describe how to use storage pools and the locality attribute to achieve distributed scalability in a database system.

  • Scenario 1: A multi-tenant Software as a Service (SaaS) system operates on a PolarDB-X Standard Edition instance. After the instance is upgraded to Enterprise Edition, the tenant data can be vertically distributed into different storage pools. Within each pool, each table is retained in the centralized structure, as shown in Storage Pool 1 in the preceding figure.

  • Scenario 2: An e-commerce business runs on a PolarDB-X Standard Edition instance. As the user base and transaction volume increase, business data can be horizontally distributed into multiple DNs in a storage pool, as shown in Storage Resource Pool 3 in the preceding figure.

The following figure shows how tables are redistributed when a centralized database architecture transitions into a distributed one by using online DDL.

image
Note
  • If the business uses multiple centralized tables, the tables can be vertically distributed across multiple DNs within the storage pool while retaining their original centralized structure.

  • If the business uses large tables, the tables can be converted into distributed tables online by using horizontal partitioning. The system automatically redistributes the partitions to balance the data load across all available nodes within the storage pool.

  • If the business uses large tables and smaller centralized tables, large tables are converted into distributed tables by using horizontal partitioning. The smaller centralized tables retain their original form but are vertically distributed to different storage pools.

  • Different DNs may have varying resource demands due to differences in data distribution and workload patterns. PolarDB-X provides data node management capabilities that allow you to independently adjust the resource allocation of each DN. This ensures that each node has sufficient resources to handle its workload, which improves overall resource utilization and cost efficiency.
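The effect of spreading the partitions of a horizontally partitioned table across the DNs of a storage pool can be sketched with a simple round-robin assignment. This illustrates the balancing goal only. The actual placement and rebalancing strategy of PolarDB-X is not described in this topic, and the function names, the partition count, and the node names are assumptions.

```python
from collections import Counter


def assign_partitions(partition_count: int, nodes: list[str]) -> dict[int, str]:
    """Assign partitions to DNs in round-robin order."""
    if not nodes:
        raise ValueError("the storage pool has no DNs")
    return {p: nodes[p % len(nodes)] for p in range(partition_count)}


def load_per_node(assignment: dict[int, str]) -> Counter:
    """Count how many partitions each DN holds."""
    return Counter(assignment.values())


# Hypothetical pool with three DNs and a table split into eight partitions.
assignment = assign_partitions(8, ["dn-0", "dn-1", "dn-2"])
loads = load_per_node(assignment)
print(dict(loads))
```

With a round-robin assignment, the partition counts of any two DNs differ by at most one. Running the same function after a DN is added to the pool shows how the partitions would be redistributed across the larger node set.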