This page explains why distributed databases have become a necessity and how PolarDB-X evolved to address that need. It is intended for technical decision-makers, architects, and engineers who want to understand the forces shaping PolarDB-X's design.
Why distributed databases
Databases, operating systems, and middleware are the three pillars of enterprise system software. Every application routes its data through the database layer, so its performance and stability directly determine how well upper-layer applications behave.
The scale of database infrastructure reflects this centrality. Gartner's 2017 data shows that worldwide revenue for basic enterprise software totaled USD 195.852 billion, with databases accounting for USD 38.8 billion, or roughly 20% of the total.
The limits of traditional architectures
Traditional enterprise payment systems relied on the IOE architecture: mainframes or minicomputers from IBM, databases from Oracle, and storage from EMC. This architecture is costly and creates strong vendor dependency.
More critically, it does not scale horizontally. As the mobile internet grew and e-commerce volumes surged—Double 11 shopping festivals alone exposed exponential traffic spikes across logistics, payment, and warehouse systems—the performance ceiling of a single high-performance server became a hard constraint. Moore's law was slowing down, and vertical scaling was no longer a viable answer.
Financial institutions faced similar pressure. Inclusive financing and digital payments created demand for systems that could handle high-frequency transactions between core accounts, support mobile payment scenarios, and provide real-time transaction monitoring—all at scale.
What the next generation requires
Standalone databases cannot meet these demands. The next generation of distributed databases must support:
High availability and disaster recovery: financial-grade data durability and fault tolerance.
Horizontal scaling: scale compute and storage independently, on demand.
Low-cost storage: cloud-native economics instead of expensive proprietary hardware.
Transparent distribution: applications interact with the database the same way they would with a standalone MySQL instance.
Hybrid transactional and analytical processing (HTAP): handle both transaction workloads and analytical queries in a single system.
New hardware integration: leverage technologies such as remote direct memory access (RDMA) to push performance further.
PolarDB-X overview
PolarDB-X is a cloud-native distributed database service built by Alibaba Cloud. It integrates the SQL engine from DRDS (Distributed Relational Database Service) and the storage technology from X-DB, Alibaba's self-developed distributed database. You interact with it the same way you would with a standalone MySQL database—the distributed architecture handles scale transparently beneath the surface.
PolarDB-X supports tens of millions of concurrent connections and hundreds of petabytes of data storage. It is designed to solve four core problems:
Mass data storage
Ultra-high concurrent throughput
Performance bottlenecks on large tables
Complex analytical query efficiency
PolarDB-X has been validated across multiple Double 11 shopping festivals and by customers in finance, payment, education, communications, and public utilities. It is the standard distributed database service for all core online business at Alibaba Group.
Evolution of PolarDB-X
PolarDB-X's architecture is the result of more than a decade of production experience at Alibaba scale.
2003–2009: From LAMP to de-IOE
Taobao launched in 2003 on the LAMP stack (Linux, Apache, MySQL, PHP). As user growth outpaced standalone MySQL capacity, Taobao migrated to Oracle. But Oracle databases still could not keep up with the scale of Taobao's business.
In 2009, Alibaba Group launched the de-IOE campaign—a systematic effort to replace IBM, Oracle, and EMC dependencies with independent technologies. This was the starting point for PolarDB-X.
TDDL: Solving scalability with sharding
The de-IOE campaign required a replacement for Oracle. Alibaba developed TDDL (Taobao Distributed Data Layer) combined with AliSQL—both built on the open source MySQL engine. TDDL addressed scalability through database sharding and a distributed architecture. By this time, the x86 architecture had matured to the point where commodity servers offered stability comparable to minicomputers, and MySQL's lightweight thread model made it competitive for high-concurrency workloads.
TDDL was an internal system architecture, not a deliverable product.
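The sharding idea can be illustrated with a minimal sketch. This is not TDDL's actual routing logic, which the page does not describe; it assumes simple hash sharding on a single key, and the names `shard_for` and `ShardedStore` are hypothetical. Each in-memory dictionary stands in for one physical MySQL database.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a sharding key to a shard index using a stable hash.

    zlib.crc32 is used because it returns the same value in every process,
    unlike Python's built-in hash() for strings.
    """
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ShardedStore:
    """Toy router: each dict stands in for one physical database shard."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, user_id: str, row: dict) -> None:
        # A write touches exactly one shard, chosen by the sharding key.
        self.shards[shard_for(user_id, len(self.shards))][user_id] = row

    def get(self, user_id: str):
        # A point read by sharding key also touches exactly one shard.
        return self.shards[shard_for(user_id, len(self.shards))].get(user_id)
```

Because every read and write by sharding key lands on a single shard, adding shards spreads both data and load across more commodity servers. Queries that do not carry the sharding key are the hard case, since in this sketch they would have to visit every shard.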
DRDS: Cloud database service (2014)
In 2014, once the TDDL architecture reached maturity, Alibaba Cloud productized the sharding technology and launched DRDS, a distributed cloud database service built on TDDL and ApsaraDB RDS for MySQL. DRDS used a shared-nothing architecture focused on storage expansion, delivering database capabilities as a managed service for the first time.
PolarDB-X 1.0: From middleware to distributed database (2019)
Alibaba Cloud continued extending DRDS with capabilities that moved it from middleware toward a full distributed database service:
Kernel features: distributed transactions, global secondary indexes (GSI), asynchronous DDL
SQL compatibility: subquery unnesting, JOIN query pushdowns
Operations: smooth scale-out, consistent backup and restoration, flashback queries, SQL audit
In 2019, this evolution reached a milestone: the launch of PolarDB-X as a distinct product line.
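To show why a global secondary index matters in a sharded system, here is a rough sketch under stated assumptions. It is not PolarDB-X's implementation: the table layout, the `OrdersTable` class, and the four-shard setup are all hypothetical. The primary data is partitioned by `order_id`, while the index is partitioned by `buyer_id`.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count for illustration


def shard_of(key: str) -> int:
    """Stable hash routing of a key to a shard index."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS


class OrdersTable:
    def __init__(self) -> None:
        # Primary data, partitioned by order_id.
        self.primary = [dict() for _ in range(NUM_SHARDS)]
        # Global secondary index, partitioned by buyer_id:
        # buyer_id -> set of order_ids.
        self.index = [defaultdict(set) for _ in range(NUM_SHARDS)]

    def insert(self, order_id: str, buyer_id: str, amount: int) -> None:
        # In a real system these two writes may land on different shards,
        # so they would need a distributed transaction to stay consistent.
        self.primary[shard_of(order_id)][order_id] = {
            "buyer_id": buyer_id,
            "amount": amount,
        }
        self.index[shard_of(buyer_id)][buyer_id].add(order_id)

    def orders_by_buyer_scan(self, buyer_id: str) -> list:
        # Without the index: every shard must be scanned.
        return sorted(
            oid
            for shard in self.primary
            for oid, row in shard.items()
            if row["buyer_id"] == buyer_id
        )

    def orders_by_buyer_indexed(self, buyer_id: str) -> list:
        # With the index: a single index shard answers the lookup.
        return sorted(self.index[shard_of(buyer_id)].get(buyer_id, ()))
```

Both lookup methods return the same order IDs, but the indexed path reads one index partition instead of all primary shards. The cost is an extra write per insert, which is one reason the index and distributed transactions appear together in the feature list.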
PolarDB-X 2.0: Cloud-native architecture (2021)
By 2018, the computing layer had exposed fundamental limitations: it could not guarantee the Repeatable Read isolation level, computation pushdown was constrained by SQL compatibility, data queries and transmission were inefficient, and linearizable consistency across replicas could not be ensured. These problems pointed to a single conclusion: the computing and storage layers needed deep integration.
The solution drew on three proven components:
DRDS SQL engine: the SQL processing and optimization layer
X-DB: Alibaba's self-developed distributed storage, built on AliSQL, using the X-Paxos protocol library and X-Engine storage engine, with triplicate storage for reliability at low cost
PolarDB cloud-native architecture: RDMA-based storage-compute decoupling, enabling a cluster of one primary node and one or more read-only nodes, with auto scaling and data restoration within seconds
In 2021, Alibaba Cloud released PolarDB-X 2.0, combining all three. PolarDB-X 2.0 focuses on the problems standalone databases cannot solve: strong consistency in distributed systems, smooth migration from standalone to distributed deployments, low-cost cloud-native storage, and on-demand auto scaling.
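The reliability of three-replica storage under a Paxos-style protocol comes down to simple majority arithmetic. The sketch below shows only that generic rule; it says nothing about X-Paxos internals, which this page does not describe, and the function names are hypothetical.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a majority quorum."""
    return replicas // 2 + 1


def is_committed(acks: int, replicas: int = 3) -> bool:
    """A write counts as committed once a majority has acknowledged it."""
    return acks >= majority(replicas)


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority can still be formed."""
    return replicas - majority(replicas)
```

With three replicas, a majority is two, so a write can commit while one replica is down or slow. Any two majorities share at least one replica, which is what allows a new leader to recover every committed write.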
PolarDB-X 2.0 is available in three deployment modes:
Alibaba Cloud (managed service)
Apsara Stack (on-premises)
Lightweight software edition