All Products
Search
Document Center

PolarDB:Subquery optimization

Last Updated:Dec 13, 2025

Redundant or pre-computable subqueries in SQL statements increase the overhead of query analysis and execution, which degrades database performance. This issue is common when Object-Relational Mapping (ORM) frameworks automatically generate complex nested queries.PolarDB for MySQL provides a subquery optimization feature that rewrites SQL statements during the optimizer stage. It automatically removes unnecessary subqueries and pre-computes constant subqueries to simplify execution plans and significantly improve query performance.

Scope

  • Product series: Cluster Edition and Standard Edition.

  • Kernel version: MySQL 8.0.2, revision 8.0.2.2.19 or later.

Enable subquery optimization

You can set the loose_simplify_subq_mode parameter to control the behavior of this optimization feature.

The method for modifying PolarDB cluster parameters differs between the console and a database session. The differences are as follows:

  • In the PolarDB console

    • Compatibility: Some cluster parameters in the PolarDB console have the loose_ prefix for compatibility with MySQL configuration files.

    • Procedure: Find and modify the parameters that have the loose_ prefix.

  • In a database session (using the command line or a client)

    • Procedure: When you connect to the database and use the SET command to modify a parameter, remove the loose_ prefix and use the original parameter name.

Parameter

Level

Description

loose_simplify_subq_mode

Global/Session

The main switch for this feature. Valid values are:

  • REPLICA_ON (default): Enables this feature only on read-only (RO) nodes.

  • ON: Enables the feature.

  • OFF: Disables the feature.

Optimization scenarios and examples

Scenario 1: Eliminate redundant SELECT nesting

If a subquery only wraps an aggregate function or expression and has no other complex logic, such as in SELECT (SELECT SUM(a) FROM t2) FROM dual, the optimizer removes the outer SELECT and runs the core expression directly.

-- In a projection column
SELECT (SELECT SUM(a) FROM t2) FROM dual;
-- After optimization
SELECT SUM(`test`.`t2`.`a`) AS `sum(a)` FROM `test`.`t2`

-- In a HAVING clause
SELECT SUM(a) FROM t2 HAVING (SELECT(SELECT(SELECT count(b))));
-- After optimization
SELECT SUM(`testdb`.`t2`.`a`) AS `SUM(a)` from `testdb`.`t2` HAVING (0 <> count(`testdb`.`t2`.`b`))
  1. Prepare test data:

    DROP TABLE IF EXISTS t2;
    CREATE TABLE t2 (
        id INT PRIMARY KEY AUTO_INCREMENT,
        a INT,
        b INT
    );
    INSERT INTO t2 (a, b) VALUES (10, 100), (20, NULL), (50, 200), (120, NULL);
  2. Disable optimization and run the query: First, disable the subquery optimization feature in the session to view the execution plan before optimization.

    SET simplify_subq_mode = 'OFF';
    EXPLAIN SELECT SUM(a) FROM t2 HAVING (SELECT(SELECT(SELECT count(b))));
    SHOW warnings;
    
    -- Result
    /* select#1 */ select sum(`testdb`.`t2`.`a`) AS `SUM(a)` from `testdb`.`t2` having (0 <> (/* select#2 */ select (/* select#3 */ select (/* select#4 */ select count(`testdb`.`t2`.`b`)))))
  3. Enable optimization and run the query: Now, enable the subquery optimization feature in the session and run the same query.

    SET simplify_subq_mode = 'ON';
    EXPLAIN SELECT SUM(a) FROM t2 HAVING (SELECT(SELECT(SELECT count(b))));
    SHOW warnings;
    
    -- Result
    /* select#1 */ select sum(`testdb`.`t2`.`a`) AS `SUM(a)` from `testdb`.`t2` having (0 <> count(`testdb`.`t2`.`b`))

Scenario 2: Pre-evaluate [NOT] EXISTS subqueries

If the optimizer determines that a [NOT] EXISTS subquery will always evaluate to true (non-empty result) or false (empty result), it replaces the [NOT] EXISTS clause with TRUE or FALSE. This replacement prevents the subquery from being run.

-- Non-empty set
SELECT * FROM t1 WHERE EXISTS(SELECT MAX(a) FROM t2);
-- After optimization
SELECT * FROM t1

-- Empty set, WHERE/HAVING clause=false, or LIMIT 0
SELECT * FROM t1 WHERE EXISTS(SELECT max(a) FROM t2 HAVING 1=2 );
-- After optimization
SELECT * FROM t1 WHERE false
  1. Prepare test data:

    DROP TABLE IF EXISTS t1;
    DROP TABLE IF EXISTS t2;
    CREATE TABLE t1 (id INT);
    CREATE TABLE t2 (val INT);
    INSERT INTO t1 VALUES (1), (2);
  2. Disable optimization and run the query: First, disable the subquery optimization feature in the session to view the execution plan before optimization.

    SET simplify_subq_mode = 'OFF';
    EXPLAIN SELECT * FROM t1 WHERE EXISTS(SELECT MAX(a) FROM t2);
    SHOW warnings;
    
    -- Result
    /* select#1 */ select `testdb`.`t1`.`id` AS `id` from `testdb`.`t1` where exists(/* select#2 */ select max(`testdb`.`t1`.`id`) from `testdb`.`t2`)
  3. Enable optimization and run the query: Now, enable the subquery optimization feature in the session and run the same query.

    SET simplify_subq_mode = 'ON';
    EXPLAIN SELECT * FROM t1 WHERE EXISTS(SELECT MAX(a) FROM t2);
    SHOW warnings;
    
    -- Result
    /* select#1 */ select `testdb`.`t1`.`id` AS `id` from `testdb`.`t1`

Scenario 3: Add LIMIT 1 to constant projections in ANY/ALL subqueries

If a subquery in an ANY or ALL clause queries only a constant and does not involve any columns, retrieving multiple identical rows is unnecessary. The optimizer automatically adds LIMIT 1 to these subqueries to avoid unnecessary full table scans on the table in the subquery.

-- Before optimization
SELECT * FROM t1 WHERE a > ANY (SELECT 1 FROM t2);
-- After optimization
SELECT * FROM t1 WHERE a > ANY (SELECT 1 FROM t2 LIMIT 1);

Recommendations for production environments

  • Test thoroughly: Although this optimization guarantees logical equivalence in most cases, a risk exists in rare scenarios that depend on a specific execution order or the number of subquery executions. Before you use this feature in a production environment for core business operations, enable it in a staging environment and perform full regression testing.

  • Update statistics regularly: Some optimizations, such as inferring empty or non-empty sets, rely on table statistics. Outdated statistics can cause the optimizer to make suboptimal choices. To prevent this, run ANALYZE TABLE on your tables regularly.