By Beilou and Lengzhi
Tmall and Taobao trade hundreds of millions of physical and virtual commodities every day. The entire chain of a successful transaction includes many steps, such as member information verification, commodity information retrieval, order creation, inventory deduction, discount application, order payment, logistics information updates, and payment confirmation.
Each step in the chain creates entries or updates their status in the database, so a single successful transaction corresponds to hundreds of database transactions in the backend information system. The database cluster behind the trade system handles tens of billions of transactional reads and writes per day. This poses a major performance challenge for the database system and, because the data grows massively every day, puts heavy pressure on storage costs.
Orders, as the most critical information, must be stored in the database permanently, since they may be involved in transaction disputes and may be needed to answer inquiries at any time. In the 17 years since Taobao was founded, the total number of order-related database entries has reached the trillions, and disk space consumption is measured in petabytes. With such a huge dataset, it is a major technical challenge to keep user queries at low latency while keeping storage costs low, because none of the historical orders can be discarded.
Since Taobao's foundation in 2003, the architecture of the trade order database has evolved several times with the increasing traffic.
In the first stage, traffic was low, so a single Oracle database stored all the order information. New orders were created and historical orders were queried in the same database.
In the second stage, as the historical order data grew, a single database could no longer meet the performance and capacity requirements. Therefore, the database for trade orders was split: a separate Oracle database was built to store the historical data, and orders older than three months were migrated to it. However, because of the huge amount of data, the query performance of this historical database could not meet the requirements, so queries on historical orders were not offered at that time. Users could only query orders created within the last three months.
In the third stage, the historical database was migrated to HBase to improve scalability and reduce storage costs. The HBase solution made historical orders queryable again and reduced storage costs. It combined primary tables with index tables. Primary tables stored the order details, while index tables mapped the ID of a buyer or seller to order numbers. A query first retrieved the order numbers from the index table and then fetched the order details from the primary table.
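The two-step lookup can be sketched in a few lines. This is only an illustration that uses in-memory dictionaries in place of HBase tables; the names `primary_table`, `index_table`, `put_order`, and `orders_for`, as well as the order fields, are hypothetical and do not come from the Taobao system.

```python
from collections import defaultdict

# Primary table: order number -> order details (hypothetical fields).
primary_table = {}
# Index table: (role, buyer or seller ID) -> list of order numbers.
index_table = defaultdict(list)

def put_order(order_id, buyer_id, seller_id, details):
    """Write the order row, then one index entry for each party."""
    primary_table[order_id] = {"buyer": buyer_id, "seller": seller_id, **details}
    index_table[("buyer", buyer_id)].append(order_id)
    index_table[("seller", seller_id)].append(order_id)

def orders_for(role, party_id):
    """Step 1: the index yields order numbers. Step 2: fetch the details."""
    order_ids = index_table.get((role, party_id), [])
    return [primary_table[oid] for oid in order_ids]

put_order("o1", "alice", "shop9", {"item": "book"})
put_order("o2", "alice", "shop7", {"item": "lamp"})
put_order("o3", "bob", "shop9", {"item": "pen"})

alice_items = [o["item"] for o in orders_for("buyer", "alice")]
shop9_items = [o["item"] for o in orders_for("seller", "shop9")]
```

Note that the application has to keep the index entries and the primary row in step itself, which is one reason a store with consistent secondary indexes is attractive for this workload.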
However, this solution had a problem: orders were not migrated strictly according to the 90-day period, and many types of orders were never migrated to the history database at all. As a result, the purchase order list was no longer sorted by time in descending order. A user who scrolled down the order list page by page would find that a recent order seemed to have disappeared. In fact, the order was still there but was not listed in its proper position in the time sequence.
In the fourth stage, the historical database moved to a PolarDB-X cluster based on X-Engine. This solution met the storage cost requirements, provided the same search capability as the online database, and solved the out-of-order problem.
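A small sketch shows how such a list can go out of order. It assumes, purely for illustration, that the order list is assembled by reading the online database first and the history database second; the source does not describe the exact read path. The sample orders, the reference date, and the `migratable` flag are all hypothetical.

```python
from datetime import date, timedelta

TODAY = date(2018, 6, 1)               # arbitrary reference date for the sketch
CUTOFF = TODAY - timedelta(days=90)    # orders older than this should migrate

# (order_id, created, migratable): the flag is a hypothetical stand-in for
# "order types that were not migrated to the history database".
orders = [
    ("o1", date(2018, 5, 20), True),
    ("o2", date(2018, 4, 10), True),
    ("o3", date(2017, 12, 1), False),  # old, but stays in the online database
    ("o4", date(2018, 1, 15), True),
    ("o5", date(2017, 11, 5), True),
]

online = [o for o in orders if o[1] >= CUTOFF or not o[2]]
history = [o for o in orders if o[1] < CUTOFF and o[2]]

def by_time_desc(rows):
    return sorted(rows, key=lambda o: o[1], reverse=True)

def order_list():
    """Online orders first, then history orders, each sorted on its own."""
    return [o[0] for o in by_time_desc(online) + by_time_desc(history)]

listing = order_list()                              # what the user sees
expected = [o[0] for o in by_time_desc(orders)]     # true time order
```

In this sketch `o3` is listed before `o4` even though `o4` is newer, so a user paging through the list sees `o4` later than expected, which matches the symptom described above.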
Looking back at the evolution of Taobao's trade database, in the 10 years since a separate historical database was split off, the business team and the database team have dealt with several core challenges:
In 2018, Taobao users raised an increasing number of complaints about disordered orders. The problem was caused by the way the database stored orders, and it troubled users considerably, so the business team decided to fix it. As the preceding analysis shows, an ideal trade history database needs to meet three requirements: low cost, low latency, and abundant features. With the InnoDB engine, which is also used for the online databases, the solution cannot achieve a low storage cost. With HBase, the solution cannot leverage consistent secondary indexes.
In 2018, the proprietary X-Engine was gradually rolled out within Alibaba Group. Based on the streamline pattern of the trade business, Alibaba designed a native architecture that separates hot data from cold data. Cold data in X-Engine is compactly packed in data pages, and all the data blocks are compressed by default. This architecture achieves high performance at a low cost. Therefore, X-Engine was quickly adopted by many internal services, such as the case described in the article, "How X-Engine Supported the Surge in DingTalk Data Volumes".
When we explored solutions for the trade history database, one idea was to merge the online database with the history database. We could leverage X-Engine's hot-cold separation capabilities to achieve high-performance access to orders from the last 90 days and low-cost storage of orders from more than 90 days ago. At the same time, the order database would also provide functions like secondary indexes, which would not only solve the order sorting issue but also simplify the code in the business layer.
However, the transaction order system has been iterated for ten years under the architecture that separates the online database from the history database, and the code of many business systems is built around this separated architecture. Hence, considering the risks of transforming and migrating business code, we kept the architecture that separates the online and history databases. The only modification was that we replaced the original HBase cluster with a PolarDB-X cluster, a cluster based on X-Engine.
With this solution, the storage cost of the history database is similar to that of the HBase solution. At the same time, because the history database creates the same indexes as the online database, sorting orders by time is supported again. In addition, read-write latency is low.
To preserve continuity with the existing code architecture at the business level, Taobao's trade order database adopts a solution that separates the history database on X-Engine from the online database on InnoDB. In this architecture, the X-Engine history database simultaneously handles the writes of orders migrated from the online database and the read-write operations on historical orders that were created more than 90 days ago.
In fact, recently written entries are accessed frequently and the access frequency decreases sharply over time. The hot-cold separation mechanism within X-Engine is designed for exactly this streamline business pattern. Hence, a database cluster on X-Engine alone can meet all the requirements.
For a new business, or for an existing business that stores massive streamline data and has not yet separated hot data from cold data, we recommend using a single X-Engine cluster to reduce storage costs and simplify the access code for the database layer. Both X-Engine-based distributed databases and PolarDB-X enable scale-out capability and reduce costs.
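To illustrate why a shared secondary index restores time-ordered paging, here is a minimal sketch that uses SQLite from the Python standard library as a stand-in for the history database. The table layout, the index name `idx_buyer_time`, and the `page` helper are assumptions made for the example and are not the actual Taobao schema or a PolarDB-X API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " buyer_id INTEGER NOT NULL,"
    " created_at TEXT NOT NULL)"
)
# Secondary index on (buyer, creation time), the same shape an online
# order database might use for "my orders" listings (assumed here).
conn.execute("CREATE INDEX idx_buyer_time ON orders (buyer_id, created_at)")

conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, 7, "2017-11-05"),
    (2, 7, "2018-05-20"),
    (3, 8, "2018-02-01"),
    (4, 7, "2018-01-15"),
    (5, 7, "2017-12-01"),
])

def page(buyer_id, page_no, size=2):
    """Return one page of a buyer's order IDs, newest first."""
    rows = conn.execute(
        "SELECT order_id FROM orders WHERE buyer_id = ?"
        " ORDER BY created_at DESC LIMIT ? OFFSET ?",
        (buyer_id, size, page_no * size),
    )
    return [r[0] for r in rows]

first_page = page(7, 0)
second_page = page(7, 1)
```

Because every order for a buyer lives in one table with one index on creation time, each page continues exactly where the previous one ended, regardless of how old the orders are.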
Alibaba Cloud has launched X-Engine, which has been verified by the internal services of Alibaba. If you need high performance at a low cost, X-Engine is a good match.