By Lijiu
In May 2022, PolarDB-X began to develop a hybrid row-column storage architecture to enhance the HTAP (Hybrid Transactional/Analytical Processing) integration capability of databases. It aims to further improve the Analytical Processing (AP) capability by storing data in column storage format, which offers better data compression ratios and thus reduces storage costs. This can provide users with a better experience in data analysis scenarios.
PolarDB-X V2.4, released in April 2024, introduced in-memory column indexes (IMCI), adding columnar engine nodes to the original architecture. The following figure shows the current architecture.
• Compute Node (CN)
As the entry point of the system, the CN is designed to be stateless, including SQL parser, optimizer, executor, and other modules. CNs are responsible for distributed data routing, computation, dynamic scheduling, two-phase commit (2PC) coordination for distributed transactions, and global secondary index maintenance. CNs also provide enterprise-level features such as SQL throttling and the three-role mode.
• Data Node (DN)
The DN is responsible for data persistence. It provides high data reliability and strong consistency assurance based on the Paxos protocol. It also maintains the visibility of distributed transactions through Multi-version Concurrency Control (MVCC).
• Global Meta Service (GMS)
GMS maintains globally consistent system metadata such as tables, schema, and statistics. It maintains security information such as accounts and permissions and also provides Timestamp Oracle (TSO).
• Change Data Capture (CDC)
CDC nodes provide incremental data subscription capabilities that are fully compatible with the MySQL Binlog format and protocol, and master-slave replication capabilities that are compatible with the MySQL Replication protocol.
• Column Store Node
Column store nodes provide persistent IMCIs, consume the binary logs of distributed transactions in real time, and build IMCIs based on OSS to meet the requirements of real-time updates. When combined with CNs, column store nodes can provide snapshot-consistent query capabilities.
The first step is to determine the architecture, which needs to meet the following expectations:
• The architecture should not affect the existing row storage. To this end, resources need to be isolated, so column store nodes are added.
• The architecture should separate CNs from DNs, which significantly improves resource utilization and high availability.
• The architecture should continue to use a distributed framework to enhance scalability, so shared storage is adopted.
Given these three considerations, a basic outline of a distributed database begins to take shape.
As shown in the figure, the PolarDB-X on the left is the previous row-store architecture. On this basis, a columnar engine is added to synchronize row-store data and store it in the shared storage in column format. The read-only column-store compute node receives SQL requests from users, parses them into query operators, and returns data to users by reading files from the shared storage. A complete data link is thus formed, with each component having a clear role. The next step is to refine the design of each component.
The second step is to determine the form of shared storage. Column storage is more oriented to analytical computing scenarios. This data scenario has several requirements for storage:
• Massive storage: The AP scenario requires large storage capacity and the ability to scale storage out as data grows.
• High bandwidth: The AP scenario involves a large computing volume and high multi-thread concurrency, which requires a high bandwidth.
• Persistence and security: These are essential features for storage.
• Minimized storage cost: Every user expects to reduce storage costs as much as possible while meeting performance and reliability requirements.
Considering these factors, PolarDB-X opts for Alibaba Cloud Object Storage Service (OSS) in the common scenario. OSS is a massive, secure, low-cost, and highly reliable cloud storage service that provides 99.9999999999% (twelve 9s) data durability and 99.995% data availability. Regular users can reach about 10 Gbit/s of bandwidth, and most importantly, its storage cost is extremely low.
The low cost of OSS significantly reduces column storage expenses. Moreover, column store files support different compression algorithms, typically achieving compression ratios of 3 to 10 times, which reduces costs further. However, OSS is mainly suitable for public cloud users; other application scenarios require support for other shared storage such as NFS and S3. Therefore, the columnar engine and CNs need to abstract the file system interface to facilitate subsequent compatibility with different storage back ends.
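To make the storage layer pluggable, the columnar engine and CNs can access files only through such an abstract interface. Below is a minimal sketch of what that abstraction might look like; the interface and method names are hypothetical, not PolarDB-X's actual API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;

// Hypothetical abstraction over shared storage (OSS, NFS, S3, ...).
// Column store files are append-only, so no random-write method is exposed.
public interface ColumnarFileSystem {
    OutputStream create(String path) throws IOException;   // create a new file for appending
    InputStream open(String path) throws IOException;       // sequential or ranged read
    long getFileLength(String path) throws IOException;     // used to pin a committed length
    void delete(String path) throws IOException;             // reclaim space after compaction
    List<String> list(String prefix) throws IOException;     // enumerate files of a partition
}
```

Each concrete back end (OSS, NFS, S3, and so on) would then provide its own implementation behind this interface.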
The third step is to design a read-only column-store compute node. Although PolarDB-X already has a relatively mature compute node design for row storage, column storage and row storage differ greatly in terms of the optimizer and executor, because AP and TP scenarios are fundamentally different. Therefore, read-only column-store compute nodes need an optimizer and executor tailored for AP scenarios. PolarDB-X chooses to build them on top of the existing row-store compute nodes, using the same codebase but launching in different modes. This approach facilitates the design of optimal execution plans for hybrid row-column storage, making full use of the strengths of both row storage and column storage: a single SQL statement can be processed concurrently by row storage and column storage before the results are returned. With this new hybrid row-column architecture, we are exploring best practices to better adapt to diversified data processing needs and to drive innovation at the forefront of database technology.
The fourth step is to design an appropriate columnar engine, complete data synchronization, and convert row storage to column storage, which will be detailed in the following sections of this article.
The columnar engine is mainly responsible for two tasks: transaction synchronization and column storage. Synchronization refers to the ability to synchronize data from row storage to column storage. This ensures consistency between row and column data. Storage refers to the ability to convert row data into column data to form column store files. As a product, the columnar engine needs to be designed with considerations for high availability, persistence, data verification, and storage space to eventually meet all business production requirements, thus developing into a mature industrial-grade solution.
PolarDB-X is a distributed database product in the MySQL ecosystem. Data synchronization is an important means for data flow between different products in the MySQL ecosystem. MySQL uses the binlog mechanism to synchronize incremental data with downstream systems in real time. For example, synchronization tools such as Kafka, Canal, and DTS support upstream and downstream synchronization through binlogs. PolarDB-X launched the global binlog plan at an early stage. The CDC node is already available to merge and sort the binlogs of multiple DNs under the distributed database to form globally consistent binlogs, which can be used by other MySQL ecosystems or tools for incremental data synchronization.
To ensure the high availability and consistency of distributed transactions, PolarDB-X introduces a global Timestamp Oracle (TSO). Each transaction is committed with a commit TSO value (CTS). CDC nodes sort transactions by CTS to form global binlogs. Compared with ordinary binlogs, the PolarDB-X global binlog therefore records a CTS for each transaction. The CTS represents the commit timestamp of the transaction and increases monotonically. It can also be regarded as a snapshot point of row storage, from which a row-store snapshot version can be derived. This value is also closely related to the column store index, so please keep it in mind.
The columnar engine continues to be compatible with the MySQL ecosystem and uses binlogs to synchronize incremental data. In PolarDB-X, it consumes global binlogs to keep column-oriented data consistent with row-oriented data. Global binlogs can be regarded as the log records of all DML operations, essentially a series of transaction operation records, as shown in the following figure.
All data changes are recorded as transactions, and each Commit event is followed by a CTS record, which represents the commit TSO value of the transaction. When the columnar engine consumes binlogs and encounters tables that have IMCIs, the corresponding INSERT, DELETE, and UPDATE events must be synchronized and written to column storage. The columnar engine consumes binlogs on a per-transaction basis to ensure transaction consistency, so a transaction is never applied halfway. Consuming binlog events is a serial operation; batch processing can significantly increase the speed of binlog consumption. However, batching implies a waiting period, so performance must be balanced against row-to-column latency.
As shown in the preceding figure, users write data to PolarDB-X row storage and commit transactions 1 to 5 in sequence. The CDC node of PolarDB-X needs time to generate global binlogs; this latency is relatively small, generally within several hundred milliseconds. If the columnar engine commits transactions in batches, it must wait until a batch is accumulated, which increases the latency between column storage and row storage, even though batch processing accelerates binlog consumption. The columnar engine therefore dynamically decides when to commit based on traffic: when the buffered data volume reaches a certain size or the column storage latency reaches a threshold, a commit is triggered so that the latency between column storage and row storage stays minimal. Each time the columnar engine commits data, a snapshot version of the column storage is generated, such as V1, V2, and V3 in the figure above. The binlog event contains the CTS value, and the column storage version records this TSO value as its version number. Because the commit TSO value also determines the visibility of the row storage transaction, a row storage snapshot corresponding to each column storage snapshot can be obtained, which provides the theoretical basis for later verification of consistency between column-oriented and row-oriented data.
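The trade-off between throughput and row-to-column latency can be pictured as a simple trigger rule: commit when either the buffered data size or the time since the last commit crosses a threshold. The sketch below only illustrates the idea; the class and threshold names are made up for illustration.

```java
// Illustrative only: decide when the columnar engine should commit a batch
// of consumed binlog transactions (a "group commit").
public class GroupCommitTrigger {
    private final long maxBufferedBytes;   // flush once enough data is buffered
    private final long maxDelayMillis;     // cap the row-to-column latency

    private long bufferedBytes = 0;
    private long lastCommitTime = System.currentTimeMillis();

    public GroupCommitTrigger(long maxBufferedBytes, long maxDelayMillis) {
        this.maxBufferedBytes = maxBufferedBytes;
        this.maxDelayMillis = maxDelayMillis;
    }

    // Called after each fully consumed binlog transaction.
    public boolean onTransactionConsumed(long txnBytes) {
        bufferedBytes += txnBytes;
        long delay = System.currentTimeMillis() - lastCommitTime;
        return bufferedBytes >= maxBufferedBytes || delay >= maxDelayMillis;
    }

    // Called once the batch has been committed.
    public void onCommitted() {
        bufferedBytes = 0;
        lastCommitTime = System.currentTimeMillis();
    }
}
```

Raising the thresholds improves consumption throughput; lowering them keeps the column store closer to the row store.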
As data flows into the columnar engine, it commits one or more transactions at a time while consuming binlog events. Each transaction contains INSERT, DELETE, and UPDATE events, which represent DML operations. These events are organized in row format and consumed row by row. The columnar engine needs to convert this data into column format and upload it to the shared storage OSS.
First, PolarDB-X chooses the Apache ORC format for column storage files. Apache ORC is an efficient column storage format designed for the Hadoop ecosystem to improve the efficiency of large-scale data storage and processing. It is widely used in data warehouse scenarios and analytical queries.
The ORC format organizes data into units called stripes. Each stripe stores data and contains index information. The location of each stripe is recorded at the end of the file, enabling quick data retrieval. This format has the following features:
• High performance: After long-term optimization, Apache ORC is a mature product with high I/O read and write performance.
• High compression ratio: Apache ORC supports multiple compression algorithms, such as ZLIB, SNAPPY, LZO, LZ4, and ZSTD.
• Flexible index structure and statistics: Apache ORC supports a variety of indexes and is embedded with multiple statistics, such as maximum and minimum values.
• Rich data structures: Apache ORC supports complex data types, including structs, lists, and maps.
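As a concrete illustration of the format, the snippet below writes a small batch of rows into a ZSTD-compressed ORC file using the Apache ORC Java API. This is generic ORC usage, not PolarDB-X internals; the file path and schema are arbitrary examples.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.orc.CompressionKind;
import org.apache.orc.OrcFile;
import org.apache.orc.TypeDescription;
import org.apache.orc.Writer;

public class OrcWriteExample {
    public static void main(String[] args) throws Exception {
        TypeDescription schema = TypeDescription.fromString("struct<id:bigint,name:string>");
        Writer writer = OrcFile.createWriter(new Path("example.orc"),
                OrcFile.writerOptions(new Configuration())
                        .setSchema(schema)
                        .compress(CompressionKind.ZSTD));   // one of the supported codecs

        VectorizedRowBatch batch = schema.createRowBatch();
        LongColumnVector id = (LongColumnVector) batch.cols[0];
        BytesColumnVector name = (BytesColumnVector) batch.cols[1];
        for (long i = 0; i < 1000; i++) {
            int row = batch.size++;
            id.vector[row] = i;
            name.setVal(row, ("user-" + i).getBytes(StandardCharsets.UTF_8));
            if (batch.size == batch.getMaxSize()) {   // write the batch when it is full
                writer.addRowBatch(batch);
                batch.reset();
            }
        }
        if (batch.size != 0) {
            writer.addRowBatch(batch);
        }
        writer.close();   // the footer with stripe locations and statistics is written on close
    }
}
```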
The column storage format is determined. Next, we need to consider how to efficiently convert row storage to column storage. To design an appropriate storage model, there are several restrictions to be considered:
• Shared storage OSS supports only append writes rather than random modifications.
• Files in the ORC columnar format typically contain batched data, with individual files being relatively large.
Imagine the simplest scenario that contains only the INSERT operation, where all data is valid. The simplest storage structure is shown in the following figure.
The complete process involves initially appending data to a CSV file, and once the file is full, converting the row storage file to an ORC columnar storage file. In actual scenarios, there are DELETE and UPDATE operations, and the modification of existing data increases the complexity of the storage architecture. The following are some common practices.
The first one is the copy-on-write (COW) model. When the data is updated, a replica is copied from the original data and the data update is performed on the replica. The original data is replaced with the current replica when the update is completed. As shown in the following figure, this method can be disastrous for file updates. In most cases, only a small part of data may be changed in each file, but the entire file needs to be rewritten, resulting in serious write amplification issues. Therefore, the COW model is more suitable for data structures with smaller data blocks like the B+ tree. For this data structure, data nodes can be updated through the COW model, which brings acceptable write amplification.
The second method is the log-structured merge (LSM) tree model. It stores data in an LSM tree structure, turning random writes into sequential writes: DELETE and UPDATE operations are represented by appending records for the new version of a value. Once the in-memory write threshold is reached, SST files are flushed, and background compaction merges data to remove redundant versions and keep it sorted. As shown in the following figure, changing the SSTable file format to column storage adapts well to writes, but read performance suffers greatly because multiple levels of data must be read, involving a large number of files that require merge processing.
The third method is the Delta Main model, which borrows ideas from the LSM tree: data is still written sequentially, and changes are made by appending records. Appended records are treated as Delta Data, and when the Delta Data reaches a threshold, it is converted into Main Data. A unit of data can be a range, a partition, or a file, and you can separate data based on your business requirements, which optimizes read performance. For data within a range, only its Main Data and Delta Data need to be accessed, greatly reducing the amount of data to be merged. Background asynchronous threads can also convert Delta Data into Main Data as much as possible to further reduce read overhead.
PolarDB-X's columnar engine uses the Delta Main model (similar to LSM tree structure) to balance both write and read performance. On this basis, the storage structure of synchronizing column-oriented data through binlogs is designed to find the most appropriate storage model in this scenario.
First, in terms of data layout, the columnar engine follows the partitioning strategy of PolarDB-X: data is divided into several partitions according to the partitioning rules, so the columnar index data is also organized per partition, which benefits parallel writing and reading. Currently, partition strategies such as hash, key, range, and list are available. For more information about partitioning, please refer to the link.
In a single data partition, as binlog data is continuously synchronized, the columnar engine commits transactions in batches; this action is called a group commit. It writes INSERT, DELETE, and UPDATE operations to the partitions where the data is located, and all partition changes are committed in parallel. Data is written to files, while metadata, including IMCI metadata and file metadata, is written to MetaDB. A transaction commit must be atomic: when the commit succeeds, the version of the column-oriented data is updated and the synchronized binlog file position is recorded; if the commit fails, the columnar engine can restart synchronization from the binlog file position of the last committed version. The following figure shows the specific group commit operation.
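Before looking at the individual event types, the commit protocol itself can be sketched as: flush the data files to shared storage first, then commit the metadata (file list, new version TSO, and the binlog position it corresponds to) to MetaDB in one transaction, and on restart resume from the position of the last committed version. The sketch below is an illustration under those assumptions; the accessor interface and method names are hypothetical.

```java
// Illustrative sketch of a group commit: data files are flushed to shared storage first,
// then metadata is committed atomically so a crash never leaves a half-visible version.
public class GroupCommitSketch {

    // Hypothetical MetaDB accessor; the real metadata lives in PolarDB-X's MetaDB.
    interface MetaStore {
        void beginTransaction();
        void registerFiles(java.util.List<String> filePaths, java.util.List<Long> committedLengths);
        void advanceVersion(long versionTso, String binlogFile, long binlogOffset);
        void commit();
        void rollback();
    }

    public void commit(MetaStore metaDb,
                       java.util.List<String> filePaths,       // CSV/ORC/DEL files already on OSS
                       java.util.List<Long> committedLengths,  // lengths visible in this version
                       long versionTso,                        // CTS of the last txn in the batch
                       String binlogFile, long binlogOffset) { // binlog position consumed so far
        metaDb.beginTransaction();
        try {
            metaDb.registerFiles(filePaths, committedLengths);
            metaDb.advanceVersion(versionTso, binlogFile, binlogOffset);
            metaDb.commit();        // the new column-store version becomes visible atomically
        } catch (RuntimeException e) {
            metaDb.rollback();      // nothing becomes visible; consumption restarts from the
            throw e;                // binlog position of the previously committed version
        }
    }
}
```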
The group commit operation handles binlog events in different ways:
• The INSERT event represents adding data: the row is appended to the CSV file, which holds newly written append-only data.
• The DELETE event means deleting a row of data, which requires a delete marker indicating that a specific row in a specific file is invalid. A bitmap is therefore maintained for each file, and deleting a row sets the corresponding bit. The bitmap is a RoaringBitmap, which supports serialization, deserialization, and OR operations to merge bitmaps. In each commit, if a file needs new delete markers, the serialized increment of its bitmap is appended. When reading, all bitmaps of a file are merged into one bitmap through OR operations, representing all delete markers of that file. The bitmap data of all files in a partition is continuously written into a DEL file (actually serialized bitmap data); because each bitmap is generally small, they are aggregated into one file to avoid writing many small pieces of data. (A sketch follows this list.)
• The UPDATE event is converted to the DELETE operation and INSERT operation.
• The DDL event is committed separately and never together with data change operations; each DDL event executes its operation on its own.
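The following sketch shows how per-file delete bitmaps could be maintained and merged. The surrounding class is illustrative, but the serialize, deserialize, and OR calls are the real RoaringBitmap library API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

import org.roaringbitmap.RoaringBitmap;

// Illustrative: delete markers for one column-store file, kept as a RoaringBitmap of row numbers.
public class DeleteBitmapExample {

    // Mark some rows of a file as deleted and serialize the increment for the DEL file.
    public static byte[] markDeleted(List<Integer> deletedRowNumbers) throws IOException {
        RoaringBitmap delta = new RoaringBitmap();
        for (int rowNumber : deletedRowNumbers) {
            delta.add(rowNumber);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        delta.serialize(new DataOutputStream(bytes));   // appended to the partition's DEL file
        return bytes.toByteArray();
    }

    // On read: OR all serialized increments of a file into one bitmap of invalid rows.
    public static RoaringBitmap mergeDeletes(List<byte[]> serializedDeltas) throws IOException {
        RoaringBitmap merged = new RoaringBitmap();
        for (byte[] chunk : serializedDeltas) {
            RoaringBitmap delta = new RoaringBitmap();
            delta.deserialize(new DataInputStream(new ByteArrayInputStream(chunk)));
            merged.or(delta);                            // union of all delete markers
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        byte[] d1 = markDeleted(List.of(3, 17));
        byte[] d2 = markDeleted(List.of(17, 42));
        RoaringBitmap all = mergeDeletes(List.of(d1, d2));
        System.out.println(all.contains(42));   // true: row 42 of this file is invalid
    }
}
```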
There is also a primary key index in the figure, which is used to quickly locate the row of data (and the file it resides in) for a DELETE event so that a delete marker can be added. The columnar engine therefore maintains a key-value store: each row of data has a primary key value that identifies it uniquely, and the location information is {file ID, row number}, 8 bytes (4B + 4B) in size. The primary key index is thus a persistent KV store with a small value.
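Because the value is just {file ID, row number}, it can be packed into a single 8-byte long. A minimal encoding sketch follows; this is for illustration, not the actual PolarDB-X layout.

```java
// Illustrative: pack the {file ID, row number} location into 8 bytes (4B + 4B).
public final class RowLocation {
    public static long encode(int fileId, int rowNumber) {
        return ((long) fileId << 32) | (rowNumber & 0xFFFFFFFFL);
    }

    public static int fileId(long location) {
        return (int) (location >>> 32);
    }

    public static int rowNumber(long location) {
        return (int) location;
    }
}
```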
Group commit means committing transactions in batches, each including multiple INSERT, DELETE, and UPDATE operations. The columnar engine can therefore aggregate changes in memory first: for different operations on the same primary key value, redundant historical versions can be removed directly in memory to reduce commit operations. For example, a row that is inserted and then deleted within one batch requires no writes at all. Any sequence of write operations on the same primary key can be broken down into the following cases.
| Operation type (same primary key value) | Operation | Result of merge |
|---|---|---|
| Insert | mem.add(pk, row) | \ |
| Delete | mem.remove(pk) | \ |
| Update | mem.remove(pk, row1); mem.add(pk, row2) | \ |
| Insert-Insert | \ (primary key conflict, unexpected) | \ |
| Insert-Delete | mem.add(pk, row); mem.remove(pk) | nothing |
| Insert-Update | mem.add(pk, row1); mem.remove(pk, row1); mem.add(pk, row2) | mem.add(pk, row2) |
| Update-Insert | \ (primary key conflict, unexpected) | \ |
| Update-Delete | mem.remove(pk, row1); mem.add(pk, row2); mem.remove(pk, row2) | mem.remove(pk, row1) |
| Update-Update | mem.remove(pk, row1); mem.add(pk, row2); mem.remove(pk, row2); mem.add(pk, row3) | mem.remove(pk, row1); mem.add(pk, row3) |
| Delete-Insert | mem.remove(pk); mem.add(pk, row) | mem.remove(pk); mem.add(pk, row) |
| Delete-Delete | \ (duplicate deletion, unexpected) | \ |
| Delete-Update | \ (modification of non-existent data, unexpected) | \ |
It can be concluded that some operations can be merged and simplified to reduce intermediate data submissions.
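The merge rules in the table can be realized with a small in-memory buffer keyed by primary key. The sketch below reproduces the same logic in simplified form; class and method names are illustrative, not PolarDB-X code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative in-memory merge buffer for one group commit. For each primary key it keeps
// at most one pending "delete old version" flag and at most one pending new row.
public class MergeBuffer<K, R> {

    static final class Pending<R> {
        boolean deleteOldVersion;   // an on-disk version of this key must be marked deleted
        R newRow;                   // the latest value to append, or null if the key ends up deleted
    }

    private final Map<K, Pending<R>> pending = new HashMap<>();

    public void insert(K pk, R row) {
        Pending<R> p = pending.computeIfAbsent(pk, k -> new Pending<>());
        if (p.newRow != null) {
            throw new IllegalStateException("primary key conflict: " + pk);  // Insert-Insert / Update-Insert
        }
        p.newRow = row;             // Delete-Insert keeps the delete flag and adds the new row
    }

    public void delete(K pk) {
        Pending<R> p = pending.get(pk);
        if (p == null) {
            Pending<R> d = new Pending<>();
            d.deleteOldVersion = true;          // plain Delete
            pending.put(pk, d);
        } else if (p.newRow == null) {
            throw new IllegalStateException("duplicate deletion: " + pk);    // Delete-Delete
        } else if (p.deleteOldVersion) {
            p.newRow = null;                    // Update-Delete: only the old version is removed
        } else {
            pending.remove(pk);                 // Insert-Delete: nothing needs to be written
        }
    }

    public void update(K pk, R row) {
        Pending<R> p = pending.get(pk);
        if (p == null) {
            Pending<R> u = new Pending<>();
            u.deleteOldVersion = true;          // plain Update: remove old version, add new one
            u.newRow = row;
            pending.put(pk, u);
        } else if (p.newRow == null) {
            throw new IllegalStateException("update of non-existent data: " + pk); // Delete-Update
        } else {
            p.newRow = row;                     // Insert-Update / Update-Update: keep only the latest row
        }
    }

    public Map<K, Pending<R>> drain() {          // consumed by the group commit
        return pending;
    }
}
```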
From the perspective of reading column storage data from a read-only column-store node, each partition contains three types of files:
• CSV file: This contains row-format data written in an append-only manner. Because the data consists of multiple columns written row by row, similar in shape to the CSV format, it is called a CSV file, although it is not a text file: PolarDB-X's CSV format uses a special binary serialization that avoids issues such as delimiters in ordinary CSV. When the size or row count of a CSV file reaches the threshold, compaction converts it into an ORC file for column storage.
• ORC file: This contains data in column storage format. Once generated, it will not be modified. It is also the main data of column storage. Each file represents a range of data, with entries inside being sorted.
• DEL file: This records the deletion bitmap marker data of all files in a partition. Essentially, each CSV or ORC file has a corresponding bitmap stored in the DEL file.
Delta Data, especially DEL data, should be cached as much as possible so that invalid data can be quickly filtered out when reading an ORC file; keeping DEL data in memory is especially beneficial. ORC files cover ordered ranges and are sorted by sort keys, so they can easily be pruned to read only qualifying data. In contrast, CSV files are unordered and cannot be pruned, so caching this data also improves read performance. Therefore, Delta Data should be kept as small as possible and cached first.
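Putting the three file types together, a read of one partition at a given snapshot version conceptually looks like the sketch below. This is pseudocode-level Java for illustration, not the actual executor; the Snapshot and RowReader interfaces are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.roaringbitmap.RoaringBitmap;

// Illustrative read path for one partition at a given column-store snapshot version.
public class PartitionScanSketch {

    interface Snapshot {                         // hypothetical snapshot view resolved from MetaDB
        List<String> orcFiles();                 // main data, sorted, prunable by sort-key range
        List<String> csvFiles();                 // delta data, unordered, usually cached
        RoaringBitmap deleteBitmap(String file); // merged DEL bitmaps for one file, or null
        long committedLength(String file);       // only read up to the length pinned by the version
    }

    interface RowReader {                        // hypothetical file reader
        List<Object[]> readRows(String file, long committedLength);
    }

    public List<Object[]> scan(Snapshot snapshot, RowReader reader) {
        List<Object[]> result = new ArrayList<>();
        List<String> files = new ArrayList<>();
        files.addAll(snapshot.orcFiles());       // range pruning could skip whole files here
        files.addAll(snapshot.csvFiles());
        for (String file : files) {
            RoaringBitmap deleted = snapshot.deleteBitmap(file);
            List<Object[]> rows = reader.readRows(file, snapshot.committedLength(file));
            for (int rowNumber = 0; rowNumber < rows.size(); rowNumber++) {
                if (deleted == null || !deleted.contains(rowNumber)) {
                    result.add(rows.get(rowNumber));   // keep only rows not marked in the DEL bitmap
                }
            }
        }
        return result;
    }
}
```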
The read-only column-store node does not read the primary key index of the column storage, because, as mentioned earlier, the primary key index exists only to let the columnar engine quickly locate the file containing a deleted primary key and add a delete marker to that file's bitmap. The primary key index of the column storage is therefore a KV store with a small value, and its storage structure also needs to be designed. First, consider the characteristics of this KV usage scenario:
• The key is the primary key and the value is the file location; the location information is {file ID, row number}, totaling 8 bytes (4B + 4B).
• The KV storage structure needs to be persistent and stored in shared storage.
• Requests include READ, INSERT, and DELETE, with no range-based reads.
• Reads must be high-performance, and requests are processed in batches.
Common KV storage structures mainly include hash tables, B+ trees, radix trees, and LSM trees. Since range-based reads are not needed, a hash structure is the best choice for performance. However, few persistent hash-based products exist today; hash structures are mostly used for in-memory caches. Given the requirement for persistence and the append-only nature of shared storage, PolarDB-X chose to develop its own KV storage suited to this scenario. By combining the high performance of the hash structure with the I/O optimization of the LSM tree, PolarDB-X designed a hash-LSM tree KV storage. The LSM tree structure without range-based reads is shown in the following figure.
The structure modifies the LSM tree by dropping range-based sorting. When the append log is full, hash rules disperse the data, and each hash bucket stores its data in SST files. If a single SST file becomes too large, the data can be dispersed again through hash rules, so the size of each SST file is kept under control and oversized hash buckets do not degrade read performance. When reading, the hash rules quickly locate the SST files of the corresponding hash bucket, from which the KV value is read.
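A point lookup in such a hash-partitioned LSM structure can be pictured as: check the in-memory append log first, hash the key to a bucket, then search that bucket's SST files from newest to oldest. The sketch below is illustrative only; the SST reader interface and bucket layout are assumptions.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative point lookup in a hash-partitioned LSM structure (no range reads are needed).
public class HashLsmIndexSketch {

    interface SstFile {                                   // hypothetical SST reader for one bucket
        Long get(byte[] primaryKey);                      // returns the packed {fileId, rowNumber} or null
    }

    private final Map<ByteBuffer, Long> appendLog = new HashMap<>();  // in-memory, freshest writes
    private final int bucketCount;
    private final List<List<SstFile>> buckets;                        // per bucket, newest file first

    public HashLsmIndexSketch(int bucketCount, List<List<SstFile>> buckets) {
        this.bucketCount = bucketCount;
        this.buckets = buckets;
    }

    public Long lookup(byte[] primaryKey) {
        Long location = appendLog.get(ByteBuffer.wrap(primaryKey));   // 1. check the append log
        if (location != null) {
            return location;
        }
        int bucket = Math.floorMod(Arrays.hashCode(primaryKey), bucketCount);  // 2. hash to a bucket
        for (SstFile sst : buckets.get(bucket)) {                     // 3. newest to oldest SST files
            location = sst.get(primaryKey);
            if (location != null) {
                return location;
            }
        }
        return null;                                                  // key not present
    }
}
```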
Compaction is also one of the important operations in the columnar engine and is a critical method for reorganizing files. Both space reclamation and file sorting must be completed through background asynchronous compaction processes, as shown in the following figure.
Data is first written into a row-based CSV file, where it is unordered. When a CSV file is full, it is converted into an ordered ORC column storage file. This row-to-column conversion is called CSV compaction.
The column-oriented ORC files are currently organized into only two levels. The reason is to keep the ranges of ORC files as non-overlapping as possible, which increases the pruning capability of read requests and improves read performance. In most cases data is written randomly, so the larger the ratio of level 1 files to level 0 files, the greater the write amplification, which would normally require more levels to balance read and write performance. However, PolarDB-X's IMCI also has partitioning, which bounds the data volume of each partition and avoids unlimited data growth and doubled write amplification. A two-level structure is therefore sufficient for the data volume.
To prevent level 0 files from growing without bound, multiple ORC files need to be reordered so that their ranges do not overlap. This process selects files from level 0 and any level 1 files whose ranges intersect the selected range, re-sorts these ORC files, and places the results back into level 1. This operation is known as ORC compaction.
PolarDB-X's columnar engine marks the deletion holes in each file by adding a delete bitmap. When the holes in the ORC file are large, the file is rewritten to remove invalid data holes and reclaim storage space. This process is referred to as delete compaction.
In summary, all file reorganization operations within the PolarDB-X columnar storage engine fall under the umbrella term compaction. These tasks are scheduled and run asynchronously in the background. There are also a variety of compaction tasks for different purposes, which will not be detailed here.
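These background tasks can be pictured as a scheduler that periodically checks a few per-partition thresholds and picks the next compaction to run. The sketch below is illustrative only; the statistics interface, threshold values, and task names are assumptions, not PolarDB-X's actual policy.

```java
import java.util.Optional;

// Illustrative: pick the next background compaction task for one partition.
public class CompactionPickerSketch {

    enum TaskType { CSV_COMPACTION, ORC_COMPACTION, DELETE_COMPACTION }

    interface PartitionStats {                 // hypothetical per-partition statistics
        long csvBytes();                       // size of unordered delta (CSV) data
        int level0FileCount();                 // unsorted/overlapping ORC files
        double maxDeleteRatio();               // largest fraction of deleted rows in any ORC file
    }

    public Optional<TaskType> pickNextTask(PartitionStats stats) {
        if (stats.csvBytes() >= 64L << 20) {               // CSV file "full": convert rows to columns
            return Optional.of(TaskType.CSV_COMPACTION);
        }
        if (stats.level0FileCount() >= 8) {                // too many level-0 files: re-sort into level 1
            return Optional.of(TaskType.ORC_COMPACTION);
        }
        if (stats.maxDeleteRatio() >= 0.3) {               // too many delete holes: rewrite the file
            return Optional.of(TaskType.DELETE_COMPACTION);
        }
        return Optional.empty();                           // nothing to do for this partition
    }
}
```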
The preceding section describes how the columnar engine synchronizes data. If an IMCI is created with no initial data, it can start importing data through binlog events and operate normally. However, in actual scenarios, tables often already contain data or the historical binlog file has expired. This situation requires special processing. The following figure shows the IMCI construction process.
When the columnar engine detects the creation of an IMCI DDL event in the binlog event stream, it starts to synchronize the DML operations of the table. The data generated is called incremental data. At the same time, it starts another thread to retrieve all existing data from the PolarDB-X row storage through the select logic; this data is termed full data. After the full data is retrieved, the system merges the full data with the incremental data. This merged dataset constitutes the entire set of data for the IMCI. Then, the engine continues to synchronize the binlog data. There are quite a few implementation details involved in this process, and a follow-up article will detail the construction process.
The columnar engine uses a primary-secondary architecture for high availability. As shown in the following figure, a resident secondary node stays in the started state; once the primary node fails, the secondary node can quickly be elected as the new primary and continue synchronizing data from row storage to column storage, ensuring low column storage latency. This hot standby mode meets most application scenarios. Only two nodes are used because the columnar engine itself stores no data and is stateless: the metadata of column storage is kept in MetaDB, a three-node Paxos architecture with high availability, and the file data of column storage is kept in shared storage, where OSS also provides extremely high availability. Additional columnar engine write nodes can be added and participate in primary election at any time, so one secondary node is usually sufficient.
Columnar engine nodes elect the primary node using a lease mechanism. A system table in MetaDB records all columnar engine nodes; only one node can hold the lease as the primary at any time, and the remaining nodes are secondaries. The primary node continuously renews its lease to avoid frequent primary-secondary switchovers (HA), while the secondary nodes periodically try to acquire the lease. When the primary node fails and its lease expires, a secondary node obtains the lease and becomes the new primary, completing the switchover.
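A lease-based election of this kind can be expressed as a conditional update against the lease row in MetaDB: a node becomes or stays primary only if the current lease is its own or has expired. The sketch below illustrates the idea; the accessor interface and its semantics are hypothetical, not the actual system table.

```java
// Illustrative lease-based election against a MetaDB system table.
// A node becomes (or stays) primary only if the lease is its own or has expired.
public class LeaderLeaseSketch {

    interface LeaseTable {                     // hypothetical accessor for the MetaDB lease row
        // Atomically sets (owner, expireAt) if the current owner equals expectedOwner,
        // or if expectedOwner is null and the current lease has expired or is unset;
        // returns true if the update was applied.
        boolean compareAndSetLease(String expectedOwner, String newOwner, long newExpireAtMillis);
    }

    private final String nodeId;
    private final long leaseMillis;
    private final LeaseTable leaseTable;

    public LeaderLeaseSketch(String nodeId, long leaseMillis, LeaseTable leaseTable) {
        this.nodeId = nodeId;
        this.leaseMillis = leaseMillis;
        this.leaseTable = leaseTable;
    }

    // Called periodically by every columnar engine node.
    public boolean tryAcquireOrRenew() {
        long newExpireAt = System.currentTimeMillis() + leaseMillis;
        boolean isPrimary = leaseTable.compareAndSetLease(nodeId, nodeId, newExpireAt)   // renew own lease
                || leaseTable.compareAndSetLease(null, nodeId, newExpireAt);             // take an expired lease
        return isPrimary;   // true: act as primary; false: stay as hot standby
    }
}
```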
Considering that the secondary node is in the hot standby state, with no load and low resource utilization, and only the primary node is providing services, PolarDB-X's columnar engine also supports the primary node to schedule some background tasks to the secondary node for execution, such as some compaction tasks, to reduce the pressure on the primary node and improve the resource utilization of the secondary node.
The PolarDB-X columnar index was gradually rolled out on the public cloud in November 2023. Many users have adopted it so far, with some deploying it for production services. Through extensive real-world usage, PolarDB-X has identified several common ways that IMCIs are used, including high-performance queries, historical data snapshot reads, and column store-based data archiving.
PolarDB-X is a distributed HTAP database. Routing complex analytical (AP) queries to read-only column-store instances greatly improves query performance without affecting TP traffic, which better serves user scenarios. For traffic diversion, PolarDB-X supports both intelligent routing and manual routing, as shown in the following figure.
You can configure parameters to automatically route traffic to read-only column-store instances to read column-oriented data. You can also access row-oriented or column-oriented data based on different database connection strings to achieve best practices. For more information, see Row-column routing mechanism.
As mentioned earlier, the columnar engine synchronizes data through the global binlog and commits data in batches based on transactions to create column-oriented data versions. Moreover, column-oriented data is written to files by appending or creating new files without modification. Therefore, each column-oriented data version contains a series of files and their length metadata. The column storage version contains a TSO value, which is obtained from the transaction TSO recorded in the binlog. Each version is a readable snapshot version. The following figure shows an example.
In addition to the snapshot versions automatically committed by the column storage, PolarDB-X provides the user command `call polardbx.columnar_flush()`. The command is written as a binlog event and returns a TSO value to the user. When this event is encountered during binlog synchronization, the system forcibly commits a column storage version, which is equivalent to creating the required column-store snapshot version on demand. You can then query column-oriented data at that TSO value and view different versions of the data with `select * from table_name as of tso xxx;`, as shown in the following figure.
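For example, the two commands can be issued from any MySQL-compatible client. The JDBC sketch below uses a hypothetical table `orders` and placeholder connection settings, and assumes that `columnar_flush()` returns the TSO as a one-row result set.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative: force a column-store snapshot, then query it with "AS OF TSO".
// The connection string, credentials, and the table name "orders" are placeholders.
public class SnapshotReadExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://polardbx-host:3306/mydb", "user", "password");
             Statement stmt = conn.createStatement()) {

            long tso;
            stmt.execute("call polardbx.columnar_flush()");   // forces a new column-store version
            try (ResultSet rs = stmt.getResultSet()) {
                rs.next();
                tso = rs.getLong(1);                          // assumed: TSO returned as a result row
            }

            try (ResultSet rs = stmt.executeQuery("select * from orders as of tso " + tso)) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));      // rows as of the forced snapshot
                }
            }
        }
    }
}
```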
As long as the PolarDB-X columnar engine does not clean up the expired files, historical snapshots can still be read. Data files must be retained for a longer time, but this is acceptable given the low cost of OSS storage. You can customize the snapshot retention period to meet your business requirements.
Data archiving is required in many business scenarios. Data that is no longer frequently accessed but still needs to be retained is moved to low-cost storage media. PolarDB-X's IMCI stores large amounts of data on low-cost OSS, which meets the storage characteristics of archiving. In addition, column storage can effectively compress data, with a compression efficiency of about 3 to 10 times, further reducing storage costs. The following figure shows how PolarDB-X archives data by using IMCIs.
A typical archiving scenario is archiving by time: data is divided into partitions by time-based partitioning rules. After the IMCI is created, the columnar engine synchronizes data through binlogs. When data from a past time period needs to be archived, it is deleted from row storage using a specially marked transaction; a special value in the binlog event identifies it as an archiving transaction. Upon detecting an archiving transaction, the columnar engine skips its synchronization and does not delete the corresponding data. As a result, the row storage removes the data to be archived while the column storage retains it, and reading the column-oriented data gives access to the archived data. When archived data eventually needs to be removed, the column storage can drop the corresponding partition (for example, the January 2024 partition in the figure) and clean up its file data. Note that data is currently archived and cleaned up at the partition level, which keeps management simple. PolarDB-X is also exploring row-level archiving: through archiving transactions, row data of any dimension could be deleted from row storage while the column storage keeps it, and column-oriented data could later be cleaned up by specific conditions. These features will be implemented step by step in the future.
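Conceptually, the columnar engine's binlog consumer just skips DELETE events that belong to a transaction carrying the archive mark. The sketch below illustrates the idea with hypothetical interfaces; the actual mark format and apply path are internal to PolarDB-X.

```java
// Illustrative: skip applying deletes that come from an "archiving" transaction,
// so the rows disappear from row storage but remain in column storage.
public class ArchiveAwareApplySketch {

    interface BinlogTransaction {              // hypothetical decoded transaction
        boolean hasArchiveMark();              // special value identifying an archiving transaction
        Iterable<RowEvent> events();
    }

    interface RowEvent {
        boolean isDelete();
        Object primaryKey();
    }

    interface ColumnStoreWriter {              // hypothetical columnar apply interface
        void applyDelete(Object primaryKey);
        void applyOther(RowEvent event);
    }

    public void apply(BinlogTransaction txn, ColumnStoreWriter writer) {
        for (RowEvent event : txn.events()) {
            if (event.isDelete() && txn.hasArchiveMark()) {
                continue;                       // archived data is kept in column storage
            }
            if (event.isDelete()) {
                writer.applyDelete(event.primaryKey());
            } else {
                writer.applyOther(event);
            }
        }
    }
}
```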
This article mainly introduces the infrastructure and general implementation of the PolarDB-X columnar engine, as well as the current usage scenarios of IMCIs. Future articles will detail the design of each module of the columnar engine.