Use the EXPLAIN statement to inspect execution plans and confirm whether a query runs in parallel on PolarDB for MySQL.
How it works
When elastic parallel query is active, PolarDB splits a query into sub-tasks and assigns them to multiple workers across one or more nodes. The execution plan reflects this by introducing parallel-specific operators alongside the standard EXPLAIN output fields.
Standard EXPLAIN output fields:
| Field | Description |
|---|---|
| Id | The step identifier in the execution plan |
| Select_type | The type of SELECT (for example, SIMPLE, PRIMARY, DEPENDENT SUBQUERY) |
| Table | The table accessed by this step, or a virtual node such as `<gather2>` |
| Type | The join or access type (for example, index, range, ALL) |
| Key | The index used |
| Rows | The estimated number of rows examined |
| Filtered | The estimated percentage of rows that pass the filter condition |
| Extra | Additional execution information, including parallel operators |
Parallel-specific operators:
| Operator | Where it appears | What it does |
|---|---|---|
| Gather | Top-level or sub-tree | Collects partial results from all workers and merges them into a single result set |
| Parallel scan (N workers) | Extra field | Indicates a parallel table or index scan with N workers |
| Parallel pushdown | Extra field | Sends a dependent subquery to workers for parallel execution |
| Broadcast | EXPLAIN FORMAT=TREE output | Sends a copy of a dataset to all workers |
| Shared access | EXPLAIN FORMAT=TREE output | Runs a parallel subquery once and shares the result among workers |
The Gather node in the plan tree marks the boundary between parallel and serial execution. Everything below a Gather node runs in parallel; everything above runs serially on the coordinator.
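Whether these operators appear at all depends on the session's parallelism setting. As a quick check before reading a plan, you can inspect the current degree of parallelism; a minimal sketch using the max_parallel_degree variable that appears in the examples below:

```sql
-- 0 means parallel query is disabled for this session, so plans will
-- contain no Gather node and no Parallel scan markers in Extra.
SHOW VARIABLES LIKE 'max_parallel_degree';
```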
Example table
All examples in this topic use the pq_test table.
Schema:
SHOW CREATE TABLE pq_test\G
*************************** 1. row ***************************
Table: pq_test
Create Table: CREATE TABLE `pq_test` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`help_topic_id` INT(10) UNSIGNED NOT NULL,
`name` CHAR(64) NOT NULL,
`help_category_id` SMALLINT(5) UNSIGNED NOT NULL,
`description` TEXT NOT NULL,
`example` TEXT NOT NULL,
`url` TEXT NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21495809 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Table size:
SHOW TABLE STATUS\G
*************************** 1. row ***************************
Name: pq_test
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 20064988
Avg_row_length: 1898
Data_length: 38085328896
Max_data_length: 0
Index_length: 0
Data_free: 4194304
Auto_increment: 21495809
Create_time: 2019-07-30 01:35:27
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.02 sec)
Test query:
SELECT COUNT(*) FROM pq_test;
View a non-parallel query plan
Set max_parallel_degree to 0 to disable parallel query, then run EXPLAIN:
SET max_parallel_degree=0; EXPLAIN SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: pq_test
Partitions: NULL
Type: index
Possible_keys: NULL
Key: PRIMARY
Key_len: 8
Ref: NULL
Rows: 20064988
Filtered: 100.00
Extra: Using index
1 row in set, 1 warning (0.03 sec)
The plan shows a standard index scan. There is no Gather node, and the Extra field contains no Parallel scan marker, confirming that the query runs serially.
View a parallel query plan
PolarDB for MySQL 8.0.1: EXPLAIN
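The previous example set max_parallel_degree to 0, so restore a non-zero value before inspecting the parallel plan. A sketch: 2 matches the worker count in the output below, but the appropriate degree depends on your cluster:

```sql
-- Re-enable parallel query after the serial-plan example above.
SET max_parallel_degree = 2;
```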
EXPLAIN SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: <gather2>
Partitions: NULL
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20064988
Filtered: 100.00
Extra: NULL
*************************** 2. row ***************************
Id: 2
Select_type: SIMPLE
Table: pq_test
Partitions: NULL
Type: index
Possible_keys: NULL
Key: PRIMARY
Key_len: 8
Ref: NULL
Rows: 10032494
Filtered: 100.00
Extra: Parallel scan (2 workers); Using index
2 rows in set, 1 warning (0.00 sec)
Two rows appear instead of one:
- Row 1 (`<gather2>`) is the `Gather` operation, which collects and merges results from all workers. This row marks the parallel boundary: everything below it runs in parallel.
- Row 2 shows the actual scan on `pq_test`, with `Parallel scan (2 workers)` in the `Extra` field confirming that 2 workers execute the index scan in parallel.
PolarDB for MySQL 8.0.2: EXPLAIN FORMAT=TREE
EXPLAIN FORMAT=TREE returns a tree-shaped plan with richer detail about the parallel execution structure:
EXPLAIN FORMAT=TREE SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
EXPLAIN: -> Count rows in gather (cost=... rows=1)
-> Gather (slice: 1; workers: 2, nodes: 2)
-> Parallel index scan on pq_test using PRIMARY, with parallel partitions: 8 (cost=... rows=20064988)
Reading top to bottom:
- `Count rows in gather`: the coordinator aggregates the final count from worker results.
- `Gather (slice: 1; workers: 2, nodes: 2)`: 2 workers run across 2 nodes. This is the parallel boundary.
- `Parallel index scan on pq_test using PRIMARY, with parallel partitions: 8`: the index is divided into 8 partitions that are scanned in parallel.
View a parallel query plan with subqueries
PolarDB for MySQL 8.0.1: EXPLAIN
EXPLAIN SELECT
o_orderpriority,
COUNT(*) as order_count
FROM
orders
WHERE
o_orderdate >= '1994-04-01'
AND o_orderdate < date_add('1994-04-01', interval '3' month)
AND exists (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <gather1.1>
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1489068
filtered: 100.00
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: orders
partitions: NULL
type: range
possible_keys: i_o_orderdate
key: i_o_orderdate
key_len: 3
ref: NULL
rows: 568369
filtered: 100.00
Extra: Parallel scan (2 workers); Using index condition; Using where; Using temporary
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: lineitem
partitions: NULL
type: ref
possible_keys: PRIMARY,i_l_orderkey
key: PRIMARY
key_len: 4
ref: tpch_10.orders.O_ORDERKEY
rows: 4
filtered: 33.3
Extra: Parallel pushdown; Using where
3 rows in set, 2 warnings (0.01 sec)
Row 3 has `select_type: DEPENDENT SUBQUERY` with `Extra: Parallel pushdown`. This means the subquery is pushed down to each worker for parallel execution rather than being evaluated once on the coordinator.
PolarDB for MySQL 8.0.2: EXPLAIN FORMAT=TREE
Example 1
EXPLAIN FORMAT=TREE SELECT
o_orderpriority,
COUNT(*) as order_count
FROM
orders
WHERE
o_orderdate >= '1994-04-01'
AND o_orderdate < date_add('1994-04-01', interval '3' month)
AND exists (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: <temporary>.o_orderpriority
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1746887.76 rows=1489068)
-> Gather (slice: 1; workers: 2) (cost=1597980.96 rows=1489068)
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1486290.85 rows=744534)
-> Filter: exists(select #2) (cost=772982.43 rows=568369)
-> Parallel index range scan on orders using i_o_orderdate, with index condition: ((orders.O_ORDERDATE >= DATE'1994-04-01') and (orders.O_ORDERDATE < <cache>(('1994-04-01' + interval '3' month)))), with parallel partitions: 89 (cost=772982.43 rows=568369)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s)
-> Filter: (lineitem.L_COMMITDATE < lineitem.L_RECEIPTDATE) (cost=1.14 rows=1)
-> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY) (cost=1.14 rows=4)
1 row in set, 1 warning (0.02 sec)
The Gather (slice: 1; workers: 2) node separates the parallel subtree (below) from the serial aggregation and sort (above). The subquery (Select #2) is marked dependent, meaning it runs inside each worker as Parallel pushdown.
Example 2
EXPLAIN FORMAT=TREE
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'IRAN'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.0000010000
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'IRAN'
)
order by
value desc limit 1;
-> Limit: 1 row(s) (cost=1408498.03 rows=1)
-> Gather (merge sort; slice: 1; workers: 256; actual workers: 32) (cost=1408498.03 rows=256)
-> Limit: 1 row(s) (cost=1408404.20 rows=1)
-> Sort: <temporary>.value DESC, limit input to 1 row(s) per chunk (cost=1408404.20 rows=803182)
-> Filter: (sum((partsupp.PS_SUPPLYCOST * partsupp.PS_AVAILQTY)) > (select #2))
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1408404.20 rows=803182)
-> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY) (cost=829770.18 rows=327820)
-> Parallel table scan on partsupp, with parallel partitions: 2882, partition_keys: 1 (cost=6347528.15 rows=3176912)
-> Hash
-> Broadcast (slice: 2; workers: 256; nodes: 16) (cost=103382.56 rows=1029632)
-> Nested loop inner join (cost=409.36 rows=4022)
-> Filter: (nation.N_NAME = 'IRAN') (cost=2.29 rows=3)
-> Table scan on nation (cost=2.29 rows=25)
-> Parallel index lookup on supplier using SUPPLIER_FK1 (S_NATIONKEY=nation.N_NATIONKEY), with parallel partitions: 9243 (cost=65.94 rows=1609)
-> Select #2 (subquery in condition; run only once; shared access)
-> Aggregate: sum(`<collector>`.tmp_field_0) (cost=825576.85 rows=1)
-> Gather (slice: 1; workers: 256; nodes: 16) (cost=825564.05 rows=256)
-> Aggregate: sum((partsupp.PS_SUPPLYCOST * partsupp.PS_AVAILQTY)) (cost=825541.20 rows=1)
-> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY) (cost=809150.20 rows=327820)
-> Parallel table scan on partsupp, with parallel partitions: 14405 (cost=6147699.35 rows=3176912)
-> Hash
-> Broadcast (slice: 2; workers: 256; nodes: 16) (cost=103382.56 rows=1029632)
-> Nested loop inner join (cost=409.36 rows=4022)
-> Filter: (nation.N_NAME = 'IRAN') (cost=2.29 rows=3)
-> Table scan on nation (cost=2.29 rows=25)
-> Parallel index lookup on supplier using SUPPLIER_FK1 (S_NATIONKEY=nation.N_NATIONKEY), with parallel partitions: 9243 (cost=65.94 rows=1609)
Key points in this plan:
- `Gather (merge sort; slice: 1; workers: 256; actual workers: 32)`: the `merge sort` modifier means the Gather node merges pre-sorted results from workers, avoiding a full re-sort at the top level. `workers: 256` is the configured degree of parallelism; `actual workers: 32` is the number that actually executed.
- `Select #2 (subquery in condition; run only once; shared access)`: the HAVING subquery runs once in parallel, and the result is shared among all outer workers. This is the `Shared access` optimization.
- `Broadcast (slice: 2; workers: 256; nodes: 16)`: sends the `supplier` and `nation` join result to all 16 nodes so each node can perform the hash join locally without cross-node data movement.
- The outer query block cannot use multi-node elastic parallel query due to policy restrictions; its workers run single-node elastic parallel query on the node where the query is delivered.
View per-worker execution time with EXPLAIN ANALYZE
Available in PolarDB for MySQL 8.0.2, EXPLAIN ANALYZE adds actual timing alongside the estimated cost. For parallel queries, it records the execution time of each worker per operator, including the maximum, minimum, and average across all workers.
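Unlike plain EXPLAIN, EXPLAIN ANALYZE executes the statement in order to collect the actual timings, so it can be expensive on large tables. A minimal sketch against the pq_test table from earlier in this topic:

```sql
-- This really runs the query: the full index scan on pq_test happens,
-- and the plan is returned with actual time per operator and per worker.
EXPLAIN ANALYZE SELECT COUNT(*) FROM pq_test\G
```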
EXPLAIN ANALYZE select
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice,
sum(l_quantity)
from
customer,
orders,
lineitem
where
o_orderkey in (
select
l_orderkey
from
lineitem
group by
l_orderkey having
sum(l_quantity) > 313
)
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice
order by
o_totalprice desc,
o_orderdate
LIMIT 100;
-> Limit: 100 row(s) (cost=14.62 rows=1) (actual time=1.252..1.252 rows=0 loops=1)
-> Gather (merge sort; slice: 1; workers: 1; nodes: 2) (cost=14.62 rows=1) (actual time=1.250..1.250 rows=0 loops=1)
-> Limit: 100 row(s) (cost=4.52 rows=1) (actual time=0.084,0.084,0.084..0.084,0.084,0.084 rows=0,0,0 loops=1,1,1)
-> Sort: <temporary>.O_TOTALPRICE DESC, <temporary>.O_ORDERDATE, limit input to 100 row(s) per chunk (cost=4.52 rows=1) (actual time=0.083,0.083,0.083..0.083,0.083,0.083 rows=0,0,0 loops=1,1,1)
-> Table scan on <temporary> (actual time=0.070,0.070,0.070..0.070,0.070,0.070 rows=0,0,0 loops=1,1,1)
-> Aggregate using temporary table (cost=4.52 rows=1) (actual time=0.001,0.001,0.001..0.001,0.001,0.001 rows=0,0,0 loops=1,1,1)
-> Nested loop inner join (cost=2.86 rows=4) (actual time=0.039,0.039,0.039..0.039,0.039,0.039 rows=0,0,0 loops=1,1,1)
-> Nested loop inner join (cost=1.45 rows=1) (actual time=0.037,0.037,0.037..0.037,0.037,0.037 rows=0,0,0 loops=1,1,1)
-> Parallel table scan on customer, with parallel partitions: 1, partition_keys: 1 (cost=0.35 rows=1) (actual time=0.036,0.036,0.036..0.036,0.036,0.036 rows=0,0,0 loops=1,1,1)
-> Filter: <in_optimizer>(orders.O_ORDERKEY,<exists>(select #2)) (cost=1.10 rows=1)
-> Index lookup on orders using ORDERS_FK1 (O_CUSTKEY=customer.C_CUSTKEY) (cost=1.10 rows=1)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s)
-> Filter: ((sum(lineitem.L_QUANTITY) > 313) and (<cache>(orders.O_ORDERKEY) = <ref_null_helper>(lineitem.L_ORDERKEY)))
-> Group aggregate: sum(lineitem.L_QUANTITY)
-> Index scan on lineitem using PRIMARY (cost=41554048.20 rows=380071042)
-> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY) (cost=1.41 rows=4)
The actual time format differs depending on whether an operator runs above or below the Gather node:
| Operator location | Format | Example |
|---|---|---|
| Above Gather (coordinator, serial) | actual time=start..end rows=N loops=N | actual time=1.250..1.250 rows=0 loops=1 |
| Below Gather (workers, parallel) | actual time=v1,v2,v3..v1,v2,v3 rows=N1,N2,N3 loops=N1,N2,N3 | actual time=0.084,0.084,0.084..0.084,0.084,0.084 rows=0,0,0 loops=1,1,1 |
For operators inside the parallel subtree, each comma-separated value corresponds to one worker. This lets you identify which operator is the bottleneck across workers and whether execution time is evenly distributed.