This topic describes the benefits, architectures, and scenarios of PolarDB X-Engine Edition.
Challenges and requirements for archiving historical data
In most cases, new data is read or updated more frequently than historical data. Historical data, such as messages or orders generated one year ago, is seldom accessed. As your business grows, a large volume of data that is rarely or never accessed accumulates in your database system. This can cause the following issues:
- Historical data and new data are stored in the same database system. This can result in insufficient disk space.
- A large volume of data shares the memory, cache space, and disk IOPS capabilities of the database system. This can deteriorate the database performance.
- Backing up a large volume of data takes a long time and can fail. Even if the backup succeeds, the backup files must still be stored somewhere.
These issues can be resolved by archiving historical data, for example by storing it as files in low-cost storage services such as Object Storage Service (OSS) or Database Backup (DBS). However, in real business scenarios, historical data is not completely static. Data generated months or years ago may still be queried or updated, either in real time or occasionally. For example, historical orders in Taobao or Tmall, historical messages in DingTalk, and historical Cainiao logistics orders can all be queried within Alibaba Group.
To resolve the issues related to reads and updates of historical data, X-Engine can be used as a separate database to store only archived data. X-Engine must meet the following requirements:
- It must provide a large storage capacity to save online data that is continuously generated. This way, you do not need to worry about the storage capacity.
- It must provide the same interfaces as your online databases. For example, the archive database must support MySQL protocols in the same manner as the online databases. This ensures that your applications can access the online databases and X-Engine, without the need to modify your code.
- It must be cost-efficient. For example, you can compress data to reduce the consumed disk space and use low-cost storage media to store large volumes of data.
- It must provide read and write capabilities that meet the requirements of low-frequency reads and writes.
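The same-interface requirement above can be illustrated with a short sketch. Because both databases speak the MySQL protocol, the application keeps one driver and one set of queries, and only the connection target differs. The host names, the `pick_endpoint` helper, and the 365-day threshold are hypothetical and are not part of any PolarDB API.

```python
from datetime import date, timedelta

# Hypothetical connection settings. Both targets speak the MySQL protocol,
# so the same driver and the same SQL statements work against either one.
ONLINE = {"host": "online-db.example.internal", "port": 3306}
ARCHIVE = {"host": "archive-db.example.internal", "port": 3306}


def pick_endpoint(record_date: date, today: date, hot_days: int = 365) -> dict:
    """Return the settings of the database expected to hold a record.

    Records older than hot_days are assumed to have been archived.
    """
    if today - record_date > timedelta(days=hot_days):
        return ARCHIVE
    return ONLINE
```

In this sketch the choice is driven by the record date. A real application could equally make the choice in configuration and leave the query code untouched.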
Although MySQL is the most widely used open source database system in the world, it does not provide a solution that meets all of the preceding requirements. Engines such as TokuDB and MyRocks provide high compression ratios. However, the volume of data that either engine can store is limited by the disk capacity of a single physical machine.
Solution: PolarDB X-Engine Edition
To address the preceding challenges and meet the requirements to store archived data, PolarDB provides X-Engine Edition. X-Engine provides features that are developed based on the following technological innovations and breakthroughs:
- X-Engine is a storage engine developed by Alibaba Cloud based on the log-structured merge-tree (LSM tree). It combines the LSM tree with the Zstandard (ZSTD) data compression algorithm to achieve a high compression ratio, which allows you to use archive databases at a low cost. Compared to InnoDB, X-Engine helps you save up to 70% of storage space. For more information about X-Engine, see Introduction to X-Engine. X-Engine Edition has limits, especially in terms of compatibility with MySQL. For more information, see Limits.
- PolarDB supports online expansion of the storage capacity based on shared distributed storage. PolarDB connects computing resources and storage resources over a high-speed network and transmits data by using the remote direct memory access (RDMA) protocol. This eliminates the I/O performance bottleneck. X-Engine inherits these benefits when it is integrated in PolarDB.
X-Engine is integrated in PolarDB by using the following technological innovations. This enables PolarDB to run in a dual-engine architecture.
- The write-ahead log (WAL) streams of X-Engine are combined with the redo log streams of InnoDB. This way, the same log streams and transmission channels are used to support both InnoDB and X-Engine. The management logic and the logic of interaction with the shared storage remain unchanged. This architecture can be reused by other engines that are introduced later.
- The I/O module of X-Engine is adapted to Polar File System (PFS) of PolarDB InnoDB. This ensures that InnoDB and X-Engine use the same distributed storage. Backups are accelerated based on the underlying distributed storage.
Compute node architectures of X-Engine Edition
A cluster that runs X-Engine Edition consists of a primary node and one or more read-only nodes. The primary node processes read and write requests. Each Archive Database cluster contains at least one read-only node and supports the Dedicated and General-purpose specifications.
- X-Engine provides a large storage capacity. With a storage capacity of 200 TB and data compression, X-Engine can store more than 500 TB of raw data. X-Engine uses a serverless architecture so that the storage capacity automatically increases as the data volume increases. This way, you do not need to specify the storage capacity when you purchase the PolarDB cluster. You are charged for the actual storage capacity that you use.
- PolarDB X-Engine Edition supports the official MySQL protocols. Compared to other solutions that back up historical data to NoSQL services such as HBase, X-Engine allows applications to access both online databases and X-Engine Edition without the need to modify the code.
- X-Engine uses the backup capability provided by the underlying distributed storage of PolarDB to back up a large volume of data in a short period. The backup files can be uploaded to and permanently stored in low-cost storage, such as OSS.
- X-Engine Edition combines the data compression capabilities of X-Engine, which reduce storage costs, with a multi-node architecture that ensures high availability of clusters. When the primary node in a cluster fails, the cluster can automatically fail over to a read-only node. Then, the read-only node serves as the new primary node. This ensures that the service availability is at least 99.99%.
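The capacity figures in the list above can be checked with simple arithmetic. The following sketch assumes that the space saving is measured against the uncompressed size of the data. The source states the saving of up to 70% relative to InnoDB, so treating the InnoDB footprint as the raw size is a simplification. The function names are illustrative.

```python
def raw_capacity_tb(physical_tb: float, space_saving: float) -> float:
    """Raw data volume that fits in physical_tb of storage.

    space_saving is the fraction of space saved by compression,
    for example 0.7 for a 70% saving.
    """
    if not 0 <= space_saving < 1:
        raise ValueError("space_saving must be in the range [0, 1)")
    return physical_tb / (1 - space_saving)


def required_saving(physical_tb: float, raw_tb: float) -> float:
    """Fraction of space that compression must save to fit raw_tb."""
    if physical_tb <= 0 or raw_tb <= 0:
        raise ValueError("capacities must be positive")
    return 1 - physical_tb / raw_tb
```

Under this assumption, a 70% saving lets 200 TB of storage hold about 667 TB of raw data, and any saving above 60% is enough to exceed 500 TB.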
PolarDB X-Engine Edition provides a large storage capacity and can be used to store the historical data of multiple services. This ensures centralized storage and management for all historical data. X-Engine is suitable for the following scenarios:
- PolarDB X-Engine Edition is used to store cold data of self-managed databases. The self-managed databases can be MySQL, TiDB, PostgreSQL, SQL Server, or other relational databases.
- PolarDB X-Engine Edition is used to store archived data for ApsaraDB RDS for MySQL or PolarDB for MySQL. You can migrate historical data that is not often accessed to PolarDB X-Engine Edition. This releases storage space in the online databases, which reduces costs and improves performance.
- PolarDB X-Engine Edition is used as a relational database service that provides a large storage capacity. This is applicable to scenarios in which a large volume of data needs to be written but the data is accessed at a low frequency, such as monitoring logs.
You can use Data Transmission Service (DTS) to continuously migrate data from the online database to PolarDB X-Engine Edition in real time. You can also use Data Management (DMS) to periodically import online data to PolarDB X-Engine Edition.
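The periodic import pattern can be sketched as a batch job that copies rows older than a cutoff date into the archive database and then deletes them from the online database. This is only an illustration of the pattern, not how DTS or DMS works internally. Python's built-in sqlite3 module stands in for both MySQL-compatible databases so that the sketch is runnable, and the `orders` table, its columns, and the function names are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in database with a hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)"
    )
    return conn


def archive_old_rows(online, archive, cutoff: str, batch_size: int = 1000) -> int:
    """Move rows with created_at < cutoff (ISO date string) to the archive.

    Returns the number of rows moved.
    """
    moved = 0
    while True:
        rows = online.execute(
            "SELECT id, created_at, payload FROM orders "
            "WHERE created_at < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        ).fetchall()
        if not rows:
            return moved
        # Copy first and commit, then delete, so that an interruption
        # between the two steps never loses data. INSERT OR REPLACE is
        # SQLite syntax that makes a retried batch idempotent; MySQL
        # would use REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE.
        archive.executemany(
            "INSERT OR REPLACE INTO orders (id, created_at, payload) VALUES (?, ?, ?)",
            rows,
        )
        archive.commit()
        online.executemany("DELETE FROM orders WHERE id = ?", [(r[0],) for r in rows])
        online.commit()
        moved += len(rows)
```

Small batches keep each transaction short, which limits the impact on the online database while the job runs.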
Supported kernel versions
Only PolarDB for MySQL that runs MySQL 8.0 is supported.
Node specifications and pricing
X-Engine Edition supports General-purpose and Dedicated specifications. For more information, see Specifications of compute nodes.
For more information about billing rules for X-Engine, see Billing rules of compute nodes.