All Products
Search
Document Center

PolarDB:Release notes for IMCIs

Last Updated:Mar 11, 2024

This topic describes the In-Memory Column Index (IMCI) feature that is released for PolarDB for MySQL.

Background information

PolarDB for MySQL is oriented to online transaction processing (OLTP) scenarios that involve online business and large amounts of data. It is difficult for row store-based PolarDB for MySQL to meet the query performance requirements of all scenarios. In most cases, to implement complex analytic queries, you need to export data from PolarDB for MySQL, and then import the data to an online analytical processing (OLAP) system for analysis and queries. In this context, two database systems are required, and costs, architecture complexity, and O&M loads increase.

PolarDB for MySQL releases the IMCI feature oriented to OLAP scenarios that involve large amounts of data and complex query requirements. PolarDB for MySQL provides the IMCI feature to implement integrated real-time transaction processing and data analysis features and offers a one-stop hybrid transaction/analytical processing (HTAP) solution. PolarDB for MySQL allows you to use only one system to meet requirements of OLTP and OLAP scenarios.

Required versions

Your cluster meets one of the following requirements:

  • A cluster of PolarDB for MySQL 8.0.1 whose revision version is 8.0.1.1.22 or later.

  • A cluster of PolarDB for MySQL 8.0.2 whose revision version is 8.0.2.2.12 or later.

Implementation

To isolate computing resources between OLAP and OLTP, you can only implement IMCIs on read-only nodes. OLAP query requests are sent only to the read-only node, not to the primary nodes. You can set request distribution rules to decide whether OLAP query requests are sent to read-only row store nodes or to read-only column store nodes.

How it works

The following figure shows how the IMCI feature works in PolarDB for MySQL.Multi-zone architecture

Three layers are involved: storage engine, operator, and optimizer.

  • Storage engine layer: supports the hybrid row-column storage that ensures real-time transactional consistency.

  • Operator layer: uses the vectorized parallel operator that is oriented to column store. Single-table queries and multi-table queries can be implemented with minimal latency.

  • SQL parser and optimizer layer: uses the cost-based optimizer (CBO) that is oriented to hybrid row-column storage. The optimizer automatically selects row store or column store based on the cost threshold to execute query requests.

This architecture helps PolarDB for MySQL accelerate queries by several orders of magnitude while PolarDB for MySQL is completely compatible with the MySQL protocol.

Benefits

Equipped with the IMCI feature, PolarDB for MySQL has the following benefits:

  • Complete compatibility with MySQL: A system is provided for column store. This system is consistent with the system provided for row store and supports flexible type conversion.

  • Ultimate HTAP performance: PolarDB provides ultimate performance in terms of OLTP. The IMCI feature provides OLAP with performance commensurate with that provided by the OLAP system.

  • Hybrid row-column storage: Both row store and column store are supported, which saves costs. Moreover, transactional consistency is ensured for row store and column store. Column store also has an advantage in lower costs.

Scenarios

The IMCI feature of PolarDB for MySQL provides a one-stop HTAP experience that can be used in a variety of business scenarios.

  • Scenarios where real-time analysis of online data is required, such as real-time reports.

  • Data warehousing scenarios that depend on the large volume data storage capacity of PolarDB to aggregate multiple upstream data sources and use PolarDB as the dedicated data warehouse.

  • Extract, transform, load (ETL)-faced accelerated data computing scenarios that depend on the powerful and flexible computing capability of IMCI provided by PolarDB to implement ETL features by using SQL syntax.Scenarios

Performance improvement

The IMCI feature notably accelerates queries executed by using SQL statements by up to one hundred folds. The following section provides a query test to verify the acceleration effects. The data tables and SQL statements contained in the standard TPC-H benchmark are used in the example.

  • Test method: TPC-H is a commonly used benchmark that is developed and released by the Transaction Processing Performance Council (TPC) to evaluate the analytic query capabilities of databases. The TPC-H benchmark contains eight tables and 22 complex SQL statements. Most of the queries contain JOIN clauses on several tables, subqueries, and GROUP BY clauses.

Note

In this example, a test based on the TPC-H benchmark is implemented, but it does not meet all the requirements of the TPC-H benchmark test. Therefore, the test results cannot be compared with the published results of the TPC-H benchmark test.

  • Data size: 100 GB.

  • Test results:

    • Comparison between IMCI-enabled and IMCI-disabled scenarios

      The following figure shows the response time difference between IMCI-enabled and IMCI-disabled scenarios when 22 complex SQL statements of the TPC-H benchmark are executed.Comparison between IMCI-enabled and IMCI-disabled scenarios

    • Comparison between ClickHouse and IMCI-enabled PolarDB for MySQL

      The following figure shows the response time difference between ClickHouse and IMCI-enabled PolarDB for MySQL when 21 complex SQL statements of the TPC-H benchmark are executed. The two databases have the same amount of data and the same data schema. Query Statement 21 is not executed because ClickHouse does not support Query Statement 21.Comparison between ClickHouse and IMCI-enabled PolarDB for MySQL

  • Conclusions:

    • The IMCI feature notably accelerates most complex queries by up to one hundred folds.

    • Each of the traditional OLAP database service ClickHouse and IMCI-enabled PolarDB for MySQL has their own advantages. IMCI-enabled PolarDB for MySQL excels in scenarios such as the SCAN and AGGREGATE operations on tables and the JOIN operation. In the future, the IMCI feature of PolarDB for MySQL will be tuned on an ongoing basis and make a breakthrough in terms of aggregation acceleration and window functions.

References