This article discusses what X-Engine is, how it compares with other high-compression storage engines, and how it supports a real-time historical database architecture.
By Hudao and Zhouji
In today's digital age, rapid business growth results in massive amounts of data every day, increasing the storage costs of databases. A common solution is to archive historical data and migrate data that has not been accessed for a long time to inexpensive storage devices in the form of files. These storage devices include Alibaba Cloud Object Storage Service (OSS) and Database Backup Service (DBS).
However, for some core businesses, old data that was generated several months or years ago is still occasionally queried or updated in real-time. Such data includes historical orders from Taobao and Tmall, chat data generated by DingTalk (an enterprise-dedicated instant messaging tool) several years ago, and historical logistic order details from Cainiao's massive logistic data.
If the old data you query must first be restored from historical backups, the query may take several days, which is unacceptable.
If you store historical data that is still occasionally queried in real-time in the same distributed database cluster as data that is frequently accessed, you have to deal with the following challenges:
- The storage cost is extremely high and far exceeds your revenue. For example, the amount of chat data in DingTalk is close to 50 PB even after being highly compressed, not to mention uncompressed data that results in unimaginably high storage costs.
- The performance challenge is huge. Performance deteriorates rapidly after the single-instance capacity exceeds 5 TB, even when you store increasing amounts of data in distributed mode. This affects the query performance for actively accessed data and can cause clusters to crash.
- O&M is extremely complex. It takes a long time to apply a table schema change to massive data volumes.
It is not advisable to store cold backup data and online historical data in the same physical table. To support real-time queries of historical data, you must take the following factors into account:
- Controllable costs: The cost of storing data in an online database usually increases linearly, but storing historical data in the same way is financially unsustainable.
- Real-time queries: The response time (RT) for historical data queries must be almost the same as that of online active database queries.
- Low query frequency: Generally, old data is queried infrequently.
- Unified query entry: Active data and historical data must share the same query entry.
- Minimal transformation costs: Ideally, no code needs to be modified, and the historical database is completely transparent to application developers.
- Updatable history: You may need to update historical data.
- Large data volumes: Usually more than 100 TB.
Introduction to X-Engine
X-Engine is an online transaction processing (OLTP) database storage engine that was independently developed by Alibaba Cloud's database business unit. X-Engine is one of the storage engines used by Apsara PolarDB, a database independently developed by Alibaba Cloud. X-Engine is widely used in many internal business systems of Alibaba Group, including the historical transaction database and the DingTalk historical database, among other core applications. It also serves as a key database technology for the Double 11 Global Shopping Festival, where it has successfully processed traffic peaks hundreds of times greater than average.
Different from the InnoDB storage engine, X-Engine uses a tiered storage architecture based on the Log-Structured Merge-tree (LSM-tree). Tiered storage provides two significant advantages:
- The hot dataset that must be indexed stays small, which yields higher write performance.
- The persistent data pages in the underlying layers are read-only and use a compact storage format. They are compressed by default, which reduces storage costs.
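The tiered design can be illustrated with a minimal sketch (the class and method names below are illustrative, not X-Engine internals): writes land in a small in-memory memtable, full memtables are frozen into immutable sorted runs, and reads check the newest data first.

```python
# Minimal LSM-tree sketch: hot writes land in a small mutable memtable;
# full memtables are frozen into immutable, sorted runs.
# Illustrative only -- not X-Engine's actual data structures.

class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable = {}        # hot, mutable; its index stays small
        self.runs = []            # cold, immutable, sorted runs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Freeze the memtable into a compact, read-only sorted run.
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:              # newest data first
            return self.memtable[key]
        for run in reversed(self.runs):       # then runs, newest to oldest
            for k, v in run:
                if k == key:
                    return v
        return None

lsm = TinyLSM()
lsm.put("order:1", "created")
lsm.put("order:2", "paid")       # memtable is full -> frozen into a run
lsm.put("order:1", "shipped")    # newer version shadows the frozen one
print(lsm.get("order:1"))        # -> shipped
print(lsm.get("order:2"))        # -> paid
```

Because the mutable working set is only the memtable, write-path indexing stays cheap, while the frozen runs can be stored compactly and compressed.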
By virtue of these data features, X-Engine reduces storage consumption to between 10% and 50% of that of the InnoDB storage engine. We tested X-Engine's storage efficiency on two datasets: the well-known LinkBench benchmark and Alibaba's internal transaction business dataset. According to the test results, X-Engine uses half the storage space of InnoDB with compression enabled, and one-third to one-fifth of the space of InnoDB with compression disabled.
Why Did We Choose the Real-Time Historical Database Solution Over Other High-Compression Storage Engines?
- MySQL is currently the most popular open-source database and is often the first choice for online core database clusters. X-Engine requires no SQL code modification and features transaction support, low-cost access, and near-zero learning costs. These benefits are not provided by other high-compression storage engines.
- Compared with RocksDB, which is also based on the LSM-tree, X-Engine delivers more than 10 times the performance.
- X-Engine introduces data reuse technology to the storage layer to optimize compaction performance. This reduces the impact of compaction on system resources in the LSM-tree and maintains stable system performance.
- X-Engine introduces multi-level caches and is integrated with the cache backfilling and prefetch mechanisms. X-Engine implements fine-grained access and caching to make up for the low read performance that is typical of LSM-tree-based engines. X-Engine provides a point query capability comparable to that of InnoDB.
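The compaction that LSM-tree engines rely on can be sketched as a merge of sorted runs in which the newest version of each key wins; X-Engine's data reuse additionally avoids rewriting unchanged extents, which this simplified sketch does not model.

```python
# Sketch of LSM-tree compaction: merge sorted runs into a single run,
# keeping only the newest version of each key. X-Engine's real compaction
# additionally reuses unchanged extents instead of rewriting them,
# which this simplified merge does not model.

def compact(runs):
    """runs: list of sorted (key, value) lists, ordered oldest first."""
    merged = {}
    for run in runs:        # iterate oldest first so newer versions win
        for key, value in run:
            merged[key] = value
    return sorted(merged.items())

old_run = [("a", 1), ("b", 1)]
new_run = [("b", 2), ("c", 2)]
print(compact([old_run, new_run]))  # -> [('a', 1), ('b', 2), ('c', 2)]
```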
The following table compares X-Engine and mainstream historical data storage solutions:
(Table not fully recoverable in this copy. It compared X-Engine with mainstream solutions such as backup to OSS and big data ecosystems along dimensions including the historical data storage model, support for parsing queries of historical backup files, the cost of application code modification, and transaction support, where big data ecosystems support single-row transactions only.)
The Design and Implementation of the Real-Time Historical Database Architecture
The Alibaba Cloud database team launched a real-time online database and historical database solution, which uses X-Engine to store historical data, Data Transmission Service (DTS, an ecosystem tool) to transmit online and historical data, and Data Management (DMS) to ensure safe deletion of historical data. Alibaba Cloud provides multiple implementation methods to satisfy the needs of different business scenarios and customers. The following figure shows the main architecture of the real-time online database and historical database solution. Its design features are listed below:
- The time-tested InnoDB engine is used as the core engine of the OLTP online database to process frequent queries and update requests. This meets the needs of online active data access scenarios, such as high concurrency, high performance, and effective range queries.
- X-Engine is used as the core engine of the historical database to respond to requests for storing, querying, and updating historical data. This meets the needs of historical data access scenarios, such as different access frequencies for cold and hot data, low storage costs, and high performance (although the range query performance may be limited). As mentioned above, X-Engine is a storage engine that was independently developed by the Alibaba Cloud Database Team and has passed stringent stress testing.
- At the database access layer, requests are forwarded to different storage engines based on the business time attribute that you set. Data access requests across different storage engines are aggregated at the application layer.
- To ensure the stability and reliability of online and historical data, you can use DTS, an ecosystem tool provided by Alibaba Cloud, to migrate historical data and delete expired data.
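The access-layer routing described above can be sketched as follows; the retention period, function, and database names are illustrative assumptions, not part of the actual product:

```python
from datetime import date, timedelta

# Sketch of access-layer routing: rows older than the retention window
# are served by the historical (X-Engine) database, the rest by the
# online (InnoDB) database. All names and the 180-day window are
# illustrative assumptions.

RETENTION_DAYS = 180

def route(record_date, today):
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return "historical_db" if record_date < cutoff else "online_db"

today = date(2020, 7, 1)
print(route(date(2020, 6, 1), today))  # -> online_db (30 days old)
print(route(date(2019, 6, 1), today))  # -> historical_db (396 days old)
```

Requests that span both windows would be sent to both engines and aggregated at the application layer, as described above.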
Online Database and Historical Database Splitting Solution
A real-time historical database is typically used to store data volumes that are too large to be stored on a single host. An online database may be a large Distributed Relational Database Service (DRDS) cluster or consist of multiple ApsaraDB for RDS nodes that are split horizontally or vertically based on business needs. To ensure the online database performance, we recommend that you decouple the online database from the historical database:
- Store full data in the historical database cluster
- Use DTS to implement real-time data synchronization between the online database and historical database
- Use DTS to filter delete operations
- Use the latest version of DMS to configure periodic deletion of historical data
Using a DRDS Cluster as the Source Database
Using ApsaraDB for RDS for Data Synchronization
- The underlying ApsaraDB for RDS nodes are interconnected by multiple DTS links, which improves synchronization performance.
- Batch API creation and configuration are supported for the large number of ApsaraDB for RDS nodes.
- Link stability is improved.
- The source and destination databases must have the same number of tables and consistent data routing rules.
Using DRDS for Data Synchronization
- Only one DTS link is configured, which is convenient and cost-efficient.
- Data synchronization performance is poor.
- The DTS link is affected when the source database scales out its DRDS proxies.
- The number of instances and the data routing rules can be configured as needed for the source and destination databases.
Using Multiple ApsaraDB for RDS Nodes as the Source Database
Using Multiple ApsaraDB for RDS Nodes as the Destination Database
- No code modification is required.
- The historical database nodes may encounter disk capacity shortages after running for a period of time.
Using a DRDS Cluster as the Destination Database
- Minor code modifications may be required. The destination database performance cannot be guaranteed without database and table shard keys.
- Multiple online database services can be merged into a historical database cluster, which simplifies the database architecture.
- DTS implements data synchronization based on ApsaraDB for RDS or DRDS.
Using Different Storage Engines on the Same Instance
The online database and historical database splitting solution is complex. ApsaraDB for RDS supports the use of different storage engines on the same instance. When the total data volume is not large, you can use both InnoDB and X-Engine on the same instance.
You can use DMS, DataWorks, and data orchestration to easily transmit data and delete expired data on the same instance. The following figure shows the relevant architecture.
- You can deploy different storage engines on the same instance in an easy and flexible manner.
- It is difficult to tune database kernel parameters so that both storage engines perform well at the same time.
- Historical data compaction may cause performance jitter on the instance.
- Databases or tables on the same instance cannot have the same name. Minor business transformation is required.
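As a hedged illustration of working around the naming restriction, the historical copy of a table can be created under a distinct name with the X-Engine storage engine. The "_history" suffix below is an assumption, and ENGINE=XENGINE is the engine identifier used by ApsaraDB RDS MySQL; verify it for your instance version.

```python
# Hedged sketch: when InnoDB and X-Engine share one instance, the
# historical copy of a table needs a distinct name (or database).
# The "_history" suffix is an assumption; ENGINE=XENGINE is the engine
# identifier used by ApsaraDB RDS MySQL -- verify it for your version.

def history_ddl(table, columns_sql):
    hist_table = table + "_history"
    return "CREATE TABLE {} ({}) ENGINE=XENGINE".format(hist_table, columns_sql)

ddl = history_ddl("orders", "id BIGINT PRIMARY KEY, created DATE")
print(ddl)
# -> CREATE TABLE orders_history (id BIGINT PRIMARY KEY, created DATE) ENGINE=XENGINE
```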
Using DTS to Transmit Online and Historical Data
DTS supports full and incremental synchronization, as well as data synchronization between different database products. DTS provides a powerful conditional filtering function, which is essential for the online database and historical database solution. You can configure DTS tasks to filter delete operations, and create custom data synchronization policies through simple UI operations.
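Conceptually, filtering delete operations on the synchronization link works as sketched below; this models the behavior, not the DTS API:

```python
# Conceptual model of "filter delete operations" on the history link:
# INSERT and UPDATE events are replicated to the historical database,
# DELETE events are dropped so that scrubbing the online database never
# removes historical rows. This models DTS behavior; it is not the DTS API.

def apply_events(history, events):
    for op, key, value in events:
        if op == "delete":
            continue              # filtered out: history keeps the row
        history[key] = value      # insert or update
    return history

events = [
    ("insert", 1, "order created"),
    ("update", 1, "order paid"),
    ("delete", 1, None),          # online scrub -- not replicated
]
history = apply_events({}, events)
print(history)                    # -> {1: 'order paid'}
```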
Using DMS to Delete Expired Data from the Online Database
When you delete expired data from the online database, be sure to avoid performance jitter on the online database while ensuring data deletion efficiency. The latest version of DMS allows you to create a data change task to implement historical data scrubbing. The relevant operations are listed below:
- Delete historical data periodically and specify the scheduling time and duration for individual deletions
- Split large transactions to reduce the time that tables are locked during transaction execution and to avoid synchronization latency between primary and secondary nodes
- Retry data scrubbing in case of an interruption
- View the task running status and analyze the causes of failures
This solution allows for convenient and simple configuration.
The Historical Data Scrubbing Procedure
You can delete expired data without using DMS, but the deletion process is complex. You can perform splitting based on the primary key of a table to ensure that an appropriate amount of data is deleted at a time. This ensures the deletion efficiency without affecting online services.
The historical data deletion policy for the online database (assuming that the primary key is "id", the data retention period is 180 days, and the time attribute column is date_col) works as follows:
1. Initialize the value Y = select min(id) from $table_name.
2. During off-peak hours, run DELETE FROM $table_name WHERE date_col < SUBDATE(CURDATE(), INTERVAL 180 DAY) and id >= Y and id <= Y+100000. When the statement succeeds, set Y = Y + 100000.
3. Have the program sleep for three seconds and then repeat Step 2.
4. During peak hours, stop execution and record the current value of Y. On the next day, resume execution from Y.
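The steps above can be sketched with Python and sqlite3 standing in for the online database (the batch size is shrunk and the sleep omitted for illustration):

```python
import sqlite3

# Sketch of the batched scrub loop above, with sqlite3 standing in for
# the online database. BATCH stands in for the 100000 ID window; in
# production, sleep between batches (Step 3) and use tested values.

BATCH = 2

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, date_col TEXT)")
rows = [(i, "2019-01-01" if i <= 4 else "2099-01-01") for i in range(1, 7)]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Step 1: initialize Y = min(id).
y = conn.execute("SELECT MIN(id) FROM t").fetchone()[0]
max_id = conn.execute("SELECT MAX(id) FROM t").fetchone()[0]

# Step 2: delete one ID window at a time, then advance Y.
# The cutoff stands in for SUBDATE(CURDATE(), INTERVAL 180 DAY).
cutoff = "2020-01-01"
while y <= max_id:
    conn.execute(
        "DELETE FROM t WHERE date_col < ? AND id >= ? AND id < ?",
        (cutoff, y, y + BATCH))
    conn.commit()
    y += BATCH   # Step 3 would sleep here before the next batch

remaining = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
print(remaining)  # -> [5, 6]: only rows inside the retention window remain
```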
Important considerations for data scrubbing in the online database
- Avoid concurrent deletion in the code; that is, do not start a new round of Step 2 before the previous round ends.
- Set the program's sleep duration to an appropriate and tested value. The 3-second duration is just an estimated value. The appropriate value depends on the synchronization latency between the primary and secondary nodes.
- 100000 is also an estimated value. We recommend that you perform testing to determine a value that helps maximize efficiency and minimize business impact. The value 100000 does not correspond to exactly 100,000 records, because the DRDS sequence does not increment by 1 for every insertion.
- If the data deletion program stops responding, or some data remains after the program has run for several days, you can estimate the number of records with an ID less than 1000000 and manually delete the small amount of residual data. If the record quantity is small, run DELETE FROM logs_trans WHERE reqdate < SUBDATE(CURDATE(),INTERVAL 30 DAY) and id < 1000000. Then, have the program repeatedly run the SQL statement, deleting a small number of records at a time.
Extreme Scenario Analysis
Near the retention boundary, the real-time historical database solution may encounter extreme scenarios that cause services to read stale data from the database. As an example, assume that the online database retains data for 180 days.
- A request to update a data record that was generated 179 days ago at 23:59:59 is routed to the online database.
- The update cannot be promptly propagated to the historical database due to an interruption or latency of the data synchronization link.
- When a service then queries this record after it passes the 180-day threshold, the query is routed to the historical database, which returns the stale, un-updated version of the record.
To mitigate this risk, take the following measures:
- Configure link exception alerts to promptly detect and handle link exceptions.
- Generally, the affected data is the data near the critical time before the DTS link is restored. We recommend correcting the data according to business logic.
- We recommend setting a conservative deletion policy for expired data. For example, if you need to keep data for 180 days, set the program to only delete data that was generated 190 days ago. In extreme scenarios, you can compare data in the source and destination databases to correct the data.
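The conservative policy can be expressed as two thresholds with a safety window between them; the day counts below follow the example in the text:

```python
from datetime import date, timedelta

# Sketch of the conservative policy above: queries are routed to the
# historical database after ROUTE_DAYS, but the online copy is only
# scrubbed after DELETE_DAYS. Rows in the 10-day safety window still
# exist in the online database even if the DTS link lags.

ROUTE_DAYS = 180
DELETE_DAYS = 190

def routes_to_history(record_date, today):
    return (today - record_date).days > ROUTE_DAYS

def safe_to_delete_online(record_date, today):
    return (today - record_date).days > DELETE_DAYS

today = date(2020, 7, 1)
boundary = today - timedelta(days=185)   # inside the safety window
print(routes_to_history(boundary, today))      # -> True
print(safe_to_delete_online(boundary, today))  # -> False: still kept online
```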