When a query selects only a few columns from a derived table or view with many columns, performance can decrease. This occurs because the query reads and processes unnecessary data. This issue is common when you work with wide tables or reuse large, general-purpose views. The column pruning feature of PolarDB for MySQL automatically optimizes these queries. During the query parsing phase, the optimizer identifies and removes columns from the derived table or view that are not used in the final query. This significantly reduces the amount of data scanned, network transfer, and memory consumption. This also lowers query latency and improves system throughput.
Applicability
Product series: Cluster Edition, Standard Edition.
Kernel version: MySQL 8.0.2 with revision version 8.0.2.2.31.1 or later.
Enable column pruning
You can control the behavior of this optimization feature by setting the loose_derived_table_pruning_mode parameter.
The method for modifying PolarDB cluster parameters differs between the console and a database session. The differences are as follows:
In the PolarDB console
Compatibility: Some cluster parameters in the PolarDB console have the loose_ prefix for compatibility with MySQL configuration files.
Procedure: Find and modify the parameters that have the
loose_prefix.
In a database session (using the command line or a client)
Procedure: When you connect to the database and use the
SETcommand to modify a parameter, remove theloose_prefix and use the original parameter name.
Parameter | Level | Description |
| Global/Session | The main switch for this feature. Valid values are:
|
How it works and examples
The core of column pruning optimization is query rewrite. During the logical optimization phase of a query, the optimizer analyzes the entire search statement. It identifies columns in a derived table or view that are not referenced by the final SELECT list, WHERE clause, or JOIN clause. If a column is redundant, the optimizer rewrites the inner query. This avoids reading and processing the column's data at the source, which reduces I/O and CPU consumption.
Example 1: Optimize a derived table query
This example shows how column pruning optimizes a query on a derived table.
-- Before optimization
SELECT c1 FROM (SELECT c1 FROM (SELECT c1, c2 FROM t1 GROUP BY c1, c2) v) t;
-- After optimization
SELECT c1 FROM (SELECT c1 FROM t1 GROUP BY c1, c2) vPrepare test data: Create a table named
t1with multiple columns in your database.CREATE TABLE t1 (id INT PRIMARY KEY, c1 INT, c2 INT, c3 VARCHAR(100)); INSERT INTO t1 VALUES (1, 10, 100, 'data'), (2, 20, 200, 'data');Run a query with optimization disabled: First, disable the column pruning feature in the session to see the execution plan before optimization.
SET derived_table_pruning_mode = 'OFF'; EXPLAIN SELECT c1 FROM (SELECT c1 FROM (SELECT c1, c2 FROM t1 GROUP BY c1, c2) v) t;SHOW warnings; +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `v`.`c1` AS `c1` from (/* select#3 */ select `testdb`.`t1`.`c1` AS `c1`,`testdb`.`t1`.`c2` AS `c2` from `testdb`.`t1` group by `testdb`.`t1`.`c1`,`testdb`.`t1`.`c2`) `v` | +-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+Run a query with optimization enabled: Now, enable the column pruning feature in the session and run the same query.
SET derived_table_pruning_mode = 'ON'; EXPLAIN SELECT c1 FROM (SELECT c1 FROM (SELECT c1, c2 FROM t1 GROUP BY c1, c2) v) t;SHOW warnings; +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select `v`.`c1` AS `c1` from (/* select#3 */ select `testdb`.`t1`.`c1` AS `c1` from `testdb`.`t1` group by `testdb`.`t1`.`c1`,`testdb`.`t1`.`c2`) `v` | +-------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+Interpret the execution plan: The execution plan changes significantly after optimization. The optimizer detects that the outer query only needs the
c1column, so it rewrites the query.
Example 2: Optimize a view query
Column pruning for views is similar to that for derived tables. It avoids unnecessary calculations.
CREATE VIEW v1 AS SELECT COUNT(*) AS a, (SELECT a FROM t2 WHERE a=FLOOR(COUNT(t1.a)/2)) AS s FROM t1;
-- Before optimization
SELECT a FROM v1;
-- After optimization
SELECT COUNT(*) AS a FROM t1;Optimization limits
Column pruning optimization is triggered only if the rewritten query is guaranteed to be semantically equivalent to the original SQL statement. A column in a derived table or view that appears to be unreferenced is not pruned if it is used in any of the following operations:
It is used in an
ORDER BY,GROUP BY,HAVING, orDISTINCTclause.It is used as a partition key (
PARTITION BY) or sort key (ORDER BY) in a window function.The optimizer can prune columns generated by non-deterministic functions, such as
RAND()orUUID(). In this case, the function is not executed, which avoids unnecessary computational overhead. This behavior is an intended optimization effect, not a limitation.