From: Alibaba Cloud Database | 2021-07-08
Recently, at ACM SIGMOD 2021, a top international database conference, a paper on PolarDB Serverless was accepted; the reviewers believe it points the way for the next generation of database services.
The paper, entitled "PolarDB Serverless: A Cloud Native Database for Disaggregated Data Centers", introduces the latest research progress on the architecture of PolarDB, Alibaba Cloud's self-developed database, a serverless design built on the separation of computing and storage.
The acceptance of the PolarDB Serverless paper marks a leading step for Alibaba Cloud PolarDB in exploring the next generation of database architecture.
The following is an introduction to the core content of this breakthrough:
01 Challenges of First-Generation Cloud-Native Databases
Before cloud-native designs, cloud database services still used traditional database architectures: they merely ran on cloud infrastructure, and the database itself was not significantly modified or adapted for the cloud. Limited by this architecture, the ratios between the various resources can only vary within a narrow range, so elasticity and resource utilization are greatly constrained, making it impossible to take full advantage of the cloud.
The first generation of cloud-native databases, represented by Amazon Aurora and Alibaba Cloud PolarDB, modified the database architecture for the first time to separate storage from computing. On this basis they implemented one-writer, multi-reader deployments, adapting the database to the cloud to a certain extent, and pooled storage so that it is billed on a pay-as-you-go basis. This was a great improvement for cloud databases.
However, in this architecture, CPU and memory are still strongly bound, which makes it very difficult for computing to be supplied truly on demand. In other words, CPU and memory resources form a single unit and can only be scaled up or down together. For example, in Amazon Aurora, the ratio of compute to cache resources is fixed at 1 CPU core : 2 GB of memory.
However, a fixed CPU-to-memory ratio is unreasonable for users in some scenarios:
For example, in analytical workloads, a user may synchronize and update data periodically using only a few CPU cores, yet need a large amount of memory, because dimension-table data or intermediate results must be cached in memory to avoid the latency of reading from disk.
In transactional databases, such as e-commerce and other Internet applications, customer workloads often have hot spots. A small amount of memory is therefore sufficient to keep the cache hit rate above 99%, but the CPU needs to burst to 64 cores or more during peak hours; here the CPU demand exceeds the memory demand.
In short, first-generation cloud-native databases cannot decouple compute and memory resources, which is the core reason why current cloud-native databases are still priced higher than RDS and self-built databases and cannot capture most of the market.
02 Breakthrough of the New Architecture
However, with the introduction of the new PolarDB Serverless architecture, this situation may change greatly.
The biggest innovation of PolarDB Serverless is that, for the first time in the industry, memory is decoupled from computing and storage and further pooled, forming three-layer pooling that improves elasticity by an order of magnitude. At the same time, memory pooling greatly reduces cost, achieves fully on-demand usage and on-demand elasticity, and fits a variety of scenarios.
PolarDB Serverless builds a new database form, DCaaDB (Datacenter-as-a-Database):
the entire IDC forms one large multi-tenant database consisting of three independent resource pools: CPU, memory, and storage. As long as a pool is not exhausted, any user (tenant) can flexibly scale any resource to any specification. Users pay only for the CPU, memory, and storage their SQL statements consume, without pre-provisioning any specification.
In this case, pooling will greatly increase the utilization of CPU and memory resources, the cost of cloud-native databases will fall well below that of integrated deployments such as self-built databases and RDS, the value of cloud-native technology will be fully realized, and the database market will be reshuffled.
03 Technical Difficulties
Before PolarDB Serverless, the academic community had done some research on disaggregated architectures and conducted technical experiments, but none of them solved the problems of database performance and elasticity under disaggregation.
PolarDB Serverless solved problems that had long puzzled the industry through technological innovation:
The first challenge is to ensure that the system still executes transactions correctly after the memory-pool design is added. For example, a data page that has been modified must never be read as stale data, even across nodes. We implement this with a global cache-coherence mechanism, similar to the cache-coherence protocol between the cores of a multi-core CPU.
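The cross-node invalidation idea behind such a coherence mechanism can be sketched as follows. This is a minimal, single-process illustration; all names (`PageCacheDirectory`, `Node`) are hypothetical and the real PolarDB protocol is far more involved:

```python
class PageCacheDirectory:
    """Tracks which nodes hold a cached copy of each page, like the
    directory in a directory-based CPU cache-coherence protocol."""
    def __init__(self):
        self.holders = {}  # page_id -> set of node ids caching the page

    def register(self, page_id, node_id):
        self.holders.setdefault(page_id, set()).add(node_id)

    def invalidate_others(self, page_id, writer_id):
        # Before a write becomes visible, every other cached copy must be
        # invalidated so no node can read the stale version.
        stale = self.holders.get(page_id, set()) - {writer_id}
        self.holders[page_id] = {writer_id}
        return stale  # node ids whose cached copy must be dropped

class Node:
    """A database node with a local page cache over shared memory."""
    def __init__(self, node_id, directory):
        self.node_id, self.directory = node_id, directory
        self.cache = {}  # page_id -> page content

    def read(self, page_id, shared_memory):
        if page_id not in self.cache:  # local miss: fetch a fresh copy
            self.cache[page_id] = shared_memory[page_id]
            self.directory.register(page_id, self.node_id)
        return self.cache[page_id]

    def write(self, page_id, value, shared_memory, all_nodes):
        for nid in self.directory.invalidate_others(page_id, self.node_id):
            all_nodes[nid].cache.pop(page_id, None)  # drop stale copies
        self.cache[page_id] = value
        shared_memory[page_id] = value
        self.directory.register(page_id, self.node_id)
```

Because every copy of a page is invalidated before the write completes, a reader on any other node misses its local cache and re-fetches the new version instead of the stale one.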
In addition, while the primary node is splitting or merging a B+ tree index, other nodes must not observe an inconsistent B+ tree structure; we protect this with global page locks. And when a node runs a read-only transaction, it must not read data written by uncommitted transactions; we implement this by synchronizing a global view among the database nodes.
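The global-view idea for read-only transactions can be illustrated with a toy snapshot scheme, assuming (as in MVCC systems generally) that each data version is tagged with the id of the transaction that wrote it. The names here are invented for illustration, not taken from the PolarDB implementation:

```python
class GlobalTxnState:
    """Shared transaction state from which any node can build a read view."""
    def __init__(self):
        self.next_txn_id = 1
        self.active = set()  # ids of transactions that have not committed

    def begin(self):
        tid = self.next_txn_id
        self.next_txn_id += 1
        self.active.add(tid)
        return tid

    def commit(self, tid):
        self.active.discard(tid)

    def read_view(self):
        # Snapshot at this instant: the next unassigned id plus the set of
        # transactions still active. Synchronizing this pair across nodes
        # lets a read-only node judge visibility consistently.
        return (self.next_txn_id, frozenset(self.active))

def visible(version_tid, view):
    """A version is visible iff its writer began before the snapshot
    and had already committed when the snapshot was taken."""
    high, active = view
    return version_tid < high and version_tid not in active
```

A version written by a still-active transaction fails the `version_tid not in active` check, so no node ever reads uncommitted data.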
The second challenge is to execute transactions efficiently. A serverless architecture hurts database performance because the database must access data remotely (in the memory pool or storage pool), incurring additional network latency.
We use RDMA optimizations, in particular one-sided RDMA verbs, including RDMA CAS, to speed up the acquisition of global page locks. To improve concurrency, database nodes also use optimistic locking to avoid taking unnecessary global page locks.
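The shape of CAS-based lock acquisition can be sketched as below. This is only a simulation: real one-sided RDMA CAS is an atomic compare-and-swap executed by the remote NIC on a lock word in remote memory, without involving the remote CPU; here an ordinary mutex stands in for the NIC's atomicity guarantee, and all names are hypothetical:

```python
import threading

class RemoteLockTable:
    """Simulates lock words in remote memory accessed via one-sided CAS."""
    def __init__(self):
        self._words = {}                 # page_id -> 0 (free) or owner node id
        self._mu = threading.Lock()      # stands in for the NIC's atomicity

    def cas(self, page_id, expected, new):
        with self._mu:
            cur = self._words.get(page_id, 0)
            if cur == expected:
                self._words[page_id] = new
                return True
            return False

def acquire_page_lock(locks, page_id, node_id, max_tries=1000):
    # Spin with CAS: swap 0 (unlocked) for our node id. With RDMA CAS each
    # attempt is a single one-sided round trip to the memory node.
    for _ in range(max_tries):
        if locks.cas(page_id, 0, node_id):
            return True
    return False  # gave up; a real system would back off or queue

def release_page_lock(locks, page_id, node_id):
    # Only the owner can release: swap our id back to 0.
    locks.cas(page_id, node_id, 0)
```

Optimistic locking complements this: a node reads the page version without taking the lock, does its work, and only falls back to the global page lock if the version changed underneath it.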
In addition, the PolarDB kernel introduces several technologies to reduce read/write bandwidth. For example, with redo log ingestion, the storage layer can materialize the latest version of a page directly from the redo log, so database processes no longer need to write dirty pages to remote storage. When the database accesses a page that misses the local cache, it must fetch the page from remote memory or remote storage over the network, which is slower than local memory or disk; prefetching to raise the local cache hit rate is therefore key to improving the performance of analytical queries.
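The payoff of prefetching for scan-heavy analytical queries can be shown with a toy local cache that, on a miss, fetches the next few pages along with the missed one. The class and parameters are invented for illustration; real prefetchers detect access patterns rather than assuming sequential scans:

```python
from collections import OrderedDict

class PrefetchingCache:
    """Toy LRU page cache that prefetches the next `prefetch_depth` pages
    on a miss, assuming a sequential scan (common in analytical queries)."""
    def __init__(self, capacity, prefetch_depth, remote):
        self.capacity, self.depth, self.remote = capacity, prefetch_depth, remote
        self.pages = OrderedDict()       # page_id -> content, in LRU order
        self.hits = self.misses = 0

    def _install(self, page_id):
        self.pages[page_id] = self.remote[page_id]  # one remote fetch
        self.pages.move_to_end(page_id)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)          # evict least recently used

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)
        else:
            self.misses += 1
            # Fetch the missed page plus the next `depth` pages in one batch,
            # so the following accesses of a scan hit the local cache.
            for pid in range(page_id, page_id + 1 + self.depth):
                if pid in self.remote:
                    self._install(pid)
        return self.pages[page_id]
```

With a prefetch depth of 3, a sequential scan pays one network round trip per four pages instead of one per page.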
In a serverless architecture, the database changes from a single-machine system to a cross-node deployment, and some database logic is embedded in and runs inside the memory-pool and storage-pool services. The architecture becomes more complex, increasing the types and likelihood of system failures.
The third challenge, for a cloud database service, is to build a reliable system. PolarDB designed strategies for handling the crash of a single node of each node type, ensuring that the system has no single point of failure. In addition, because memory and storage state are decoupled from the database nodes, crash recovery of a node under the PolarDB Serverless architecture is 5.3 times faster than that of a PolarDB instance under the standalone architecture.
These results also give us reason to predict that cloud-native serverless databases built on fully disaggregated architectures will become the development trend of cloud databases over the next five years.
Text | Alibaba Cloud database engineers Jiang Yi and Han Yi