ApsaraDB for SelectDB supports a brand-new optimizer, Nereids. Compared with the old optimizer, Nereids provides higher performance for complex queries such as multi-table join queries and multi-layer nested subqueries. In addition, Nereids minimizes the logical errors that may occur in optimization rules, facilitates the extension of optimization rules and processing stages, and delivers more stable and immediate responses to meet your business requirements.
Overview
Modern query optimizers face challenges such as more complex query statements and more diverse query scenarios. At the same time, users are more and more eager to quickly obtain query results. The outdated architecture of the old optimizer fails to meet the requirements of rapid iteration in the future.
To solve the issues, ApsaraDB for SelectDB provides Nereids for modern architectures. Nereids efficiently optimizes query requests in ApsaraDB for SelectDB, improves processing efficiency, and provides excellent extensibility, which lays a solid foundation for Nereids to process more complex queries and further extend the system in the future.
Benefits
More intelligent
Nereids presents the rule-based optimization (RBO) and cost-based optimization (CBO) optimization points in the form of rules. For each rule, Nereids provides a set of patterns used to describe the shape of the query plan, which can exactly match the destination query plan. This way, Nereids can provide higher performance for more complex query statements such as multi-layer nested subqueries.
In addition, the CBO of Nereids is developed based on the advanced Cascades framework, uses various data statistics, and applies a cost model with more scientific dimensions. This makes Nereids more suitable for multi-table join queries.
The following figure shows the comparison between the new and old optimizers in terms of the query speed in a Transaction Processing Performance Council (TPC)-H Scale Factor (SF) 100 (TPC-H SF100) scenario. In this scenario, three backend (BE) nodes are used and each node has 16 CPU cores and 64 GB of memory. Nereids performs the original SQL query, and the statistics is collected before the SQL query is performed. The old optimizer performs the original SQL query and the SQL query that is manually optimized. The query performance of Nereids is much superior to that of the old optimizer before manual optimization, and is similar to the query performance of the old optimizer after manual optimization.
More robust
All optimization rules of Nereids are implemented on the logical execution plan tree. After the syntax and semantics of a query are analyzed, the query is transformed into a tree structure. Compared with the old optimizer, the tree structure presents a more reasonable and unified internal data structure. For example, when Nereids processes a subquery, Nereids prevents the subquery from being separately processed by multiple rules in the old optimizer. This minimizes the logical errors that may occur in optimization rules.
More flexible
The architecture of Nereids is more elegant and reasonable. This allows Nereids to extend the optimization rules and processing stages with ease and immediately respond to your new requirements.
Use Nereids
Enable Nereids.
NoteBy default, Nereids is enabled.
SET enable_nereids_planner=true;Enable the automatic fallback feature to use the old optimizer.
SET enable_fallback_to_original_planner=true;
To make full use of the CBO capabilities of Nereids, we recommend that you execute the ANALYZE statement on the table for which you want to monitor the query performance to collect more statistics information. For more information, see Statistics.
Usage notes
Unsupported features
If the automatic fallback feature is enabled, the old optimizer is used for execution when an issue occurs in Nereids.
The JSON, ARRAY, MAP, and STRUCT data types are not supported. If a table to be queried contains data of the preceding data types, or the functions in query statements return data of the preceding data types, Nereids cannot optimize the query.
Nereids supports only the following DML statements:
INSERT INTO SELECT,UPDATE, andDELETE.Nereids does not support materialized views with filter conditions.
Function alias are not supported.
User-defined functions (UDF) in Java and Hadoop Distributed File System (HDFS) are not supported.
Nereids cannot optimize high-concurrency point queries.
Known issues
Nereids cannot use partition cache to accelerate queries.