All Products
Search
Document Center

PolarDB:Overview

Last Updated:Apr 02, 2024

This topic describes the In-Memory Column Index (IMCI) feature of PolarDB for MySQL.

Background information

PolarDB for MySQL is intended for online transaction processing (OLTP) scenarios that involve online business and large amounts of data. However, row store-based PolarDB for MySQL cannot easily meet the query performance requirements of all scenarios. In most cases, performing complex analytic queries requires you to export data from PolarDB for MySQL and then import the data to an online analytical processing (OLAP) system for analysis and queries. This requires two database systems and increase costs, architecture complexity, and O&M loads.

PolarDB for MySQL provides the IMCI feature for OLAP scenarios that involve large amounts of data and complex query requirements. With the IMCI feature, PolarDB for MySQL can provide integrated real-time transaction processing and real-time data analysis and offer a one-stop hybrid transaction/analytical processing (HTAP) solution. As a result, you can meet the requirements of OLTP and OLAP scenarios with only one database system.

Required versions

Your cluster meets one of the following requirements:

  • A cluster of PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.22 or later.

  • A cluster of PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.12 or later.

Implementation

To isolate computing resources between OLAP and OLTP services, you can implement IMCI only on read-only nodes. OLAP query requests are sent only to read-only nodes and are not sent to the primary nodes. You can configure request distribution rules to specify whether OLAP query requests are sent to read-only row store nodes or read-only column store nodes.

Billing

The IMCI feature is provided free of charge. However, you are charged for read-only column store nodes based on the pricing of common compute nodes. For more information, see Billing rules of pay-as-you-go compute nodes and Billing rules of subscription compute nodes. IMCIs also occupy billable storage space. For more information, see Storage pricing.

Usage notes

  • The In-Memory Column Index (IMCI) feature cannot be used together with the Voting Disk Service (VDS) module of the failover with hot standby feature. For more information about the failover with hot standby feature, see Overview. If a cluster contains read-only nodes for which the failover with hot standby feature is enabled, VDS is enabled. In this case, you cannot add a read-only column store node to the cluster.

    If you want to add a read-only column store node to the cluster, contact the technical support to disable VDS. When VDS is being disabled, all nodes in the cluster are automatically restarted.

  • If you enable the serverless feature for a PolarDB for MySQL 8.0.2 Enterprise Edition cluster, you cannot add a read-only column store node to the cluster. For more information, see Overview.

How it works

The following figure shows how the IMCI feature works in PolarDB for MySQL.Multi-zone architecture

Three layers are involved: storage engine, operator, and optimizer.

  • Storage engine layer: supports the hybrid row-column storage that ensures real-time transactional consistency.

  • Operator layer: uses the vectorized parallel operator that is oriented to column store. Single-table queries and multi-table queries can be implemented with minimal latency.

  • SQL parser and optimizer layer: uses the cost-based optimizer (CBO) that is oriented to hybrid row-column storage. The optimizer automatically selects row store or column store based on the cost threshold to execute query requests.

This architecture helps PolarDB for MySQL accelerate queries by several orders of magnitude while PolarDB for MySQL is completely compatible with the MySQL protocol.

Benefits

Equipped with the IMCI feature, PolarDB for MySQL has the following benefits:

  • Complete compatibility with MySQL: A system is provided for column store. This system is consistent with the system provided for row store and supports flexible type conversion.

  • Ultimate HTAP performance: PolarDB provides ultimate performance in terms of OLTP. The IMCI feature provides OLAP with performance commensurate with that provided by the OLAP system.

  • Hybrid row-column storage: Both row store and column store are supported, which saves costs. Moreover, transactional consistency is ensured for row store and column store. Column store also has an advantage in lower costs.

Scenarios

The IMCI feature of PolarDB for MySQL provides a one-stop HTAP experience that can be used in a variety of business scenarios.

  • Scenarios where real-time analysis of online data is required, such as real-time reports.

  • Data warehousing scenarios that depend on the large volume data storage capacity of PolarDB to aggregate multiple upstream data sources and use PolarDB as the dedicated data warehouse.

  • Extract, transform, load (ETL)-faced accelerated data computing scenarios that depend on the powerful and flexible computing capability of IMCI provided by PolarDB to implement ETL features by using SQL syntax.Scenarios

Performance improvement

The IMCI feature notably accelerates queries executed by using SQL statements by up to one hundred folds. The following section provides a query test to verify the acceleration effects. The data tables and SQL statements contained in the standard TPC-H benchmark are used in the example.

  • Test method: TPC-H is a commonly used benchmark that is developed and released by the Transaction Processing Performance Council (TPC) to evaluate the analytic query capabilities of databases. The TPC-H benchmark contains eight tables and 22 complex SQL statements. Most of the queries contain JOIN clauses on several tables, subqueries, and GROUP BY clauses.

Note

In this example, a test based on the TPC-H benchmark is implemented, but it does not meet all the requirements of the TPC-H benchmark test. Therefore, the test results cannot be compared with the published results of the TPC-H benchmark test.

  • Data size: 100 GB.

  • Test results:

    • Comparison between IMCI-enabled and IMCI-disabled scenarios

      The following figure shows the response time difference between IMCI-enabled and IMCI-disabled scenarios when 22 complex SQL statements of the TPC-H benchmark are executed.Comparison between IMCI-enabled and IMCI-disabled scenarios

    • Comparison between ClickHouse and IMCI-enabled PolarDB for MySQL

      The following figure shows the response time difference between ClickHouse and IMCI-enabled PolarDB for MySQL when 21 complex SQL statements of the TPC-H benchmark are executed. The two databases have the same amount of data and the same data schema. Query Statement 21 is not executed because ClickHouse does not support Query Statement 21.Comparison between ClickHouse and IMCI-enabled PolarDB for MySQL

  • Conclusions:

    • The IMCI feature notably accelerates most complex queries by up to one hundred folds.

    • Each of the traditional OLAP database service ClickHouse and IMCI-enabled PolarDB for MySQL has their own advantages. IMCI-enabled PolarDB for MySQL excels in scenarios such as the SCAN and AGGREGATE operations on tables and the JOIN operation. In the future, the IMCI feature of PolarDB for MySQL will be tuned on an ongoing basis and make a breakthrough in terms of aggregation acceleration and window functions.