Use the EXPLAIN statement to inspect execution plans and confirm whether a query runs in parallel on PolarDB for MySQL.
How it works
When elastic parallel query is active, PolarDB splits a query into sub-tasks and assigns them to multiple workers across one or more nodes. The execution plan reflects this by introducing parallel-specific operators alongside the standard EXPLAIN output fields.
Standard EXPLAIN output fields:
| Field | Description |
|---|---|
| Id | The step identifier in the execution plan |
| Select_type | The type of SELECT (for example, SIMPLE, PRIMARY, DEPENDENT SUBQUERY) |
| Table | The table accessed by this step, or a virtual node such as `<gather2>` |
| Type | The join or access type (for example, index, range, ALL) |
| Key | The index used |
| Rows | The estimated number of rows examined |
| Filtered | The estimated percentage of rows that pass the filter condition |
| Extra | Additional execution information, including parallel operators |
Parallel-specific operators:
| Operator | Where it appears | What it does |
|---|---|---|
| Gather | Top-level or sub-tree | Collects partial results from all workers and merges them into a single result set |
| Parallel scan (N workers) | Extra field | Indicates a parallel table or index scan with N workers |
| Parallel pushdown | Extra field | Sends a dependent subquery to workers for parallel execution |
| Broadcast | EXPLAIN FORMAT=TREE output | Sends a copy of a dataset to all workers |
| Shared access | EXPLAIN FORMAT=TREE output | Runs a parallel subquery once and shares the result among workers |
The Gather node in the plan tree marks the boundary between parallel and serial execution. Everything below a Gather node runs in parallel; everything above runs serially on the coordinator.
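Whether these operators appear at all depends on the session's parallelism setting. As a quick check before reading a plan, you can inspect the current degree of parallelism; a minimal sketch using the max_parallel_degree variable that appears in the examples below:

```sql
-- 0 means parallel query is disabled for this session, so plans will
-- contain no Gather node and no Parallel scan markers in Extra.
SHOW VARIABLES LIKE 'max_parallel_degree';
```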
Example table
All examples in this topic use the pq_test table.
Schema:
SHOW CREATE TABLE pq_test\G
*************************** 1. row ***************************
Table: pq_test
Create Table: CREATE TABLE `pq_test` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`help_topic_id` INT(10) UNSIGNED NOT NULL,
`name` CHAR(64) NOT NULL,
`help_category_id` SMALLINT(5) UNSIGNED NOT NULL,
`description` TEXT NOT NULL,
`example` TEXT NOT NULL,
`url` TEXT NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=21495809 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Table size:
SHOW TABLE STATUS\G
*************************** 1. row ***************************
Name: pq_test
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 20064988
Avg_row_length: 1898
Data_length: 38085328896
Max_data_length: 0
Index_length: 0
Data_free: 4194304
Auto_increment: 21495809
Create_time: 2019-07-30 01:35:27
Update_time: NULL
Check_time: NULL
Collation: utf8_general_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.02 sec)
Test query:
SELECT COUNT(*) FROM pq_test;
View a non-parallel query plan
Set max_parallel_degree to 0 to disable parallel query, then run EXPLAIN:
SET max_parallel_degree=0; EXPLAIN SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: pq_test
Partitions: NULL
Type: index
Possible_keys: NULL
Key: PRIMARY
Key_len: 8
Ref: NULL
Rows: 20064988
Filtered: 100.00
Extra: Using index
1 row in set, 1 warning (0.03 sec)
The plan shows a standard index scan. There is no Gather node, and the Extra field contains no Parallel scan marker, confirming that the query runs serially.
View a parallel query plan
PolarDB for MySQL 8.0.1: EXPLAIN
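The previous example set max_parallel_degree to 0, so restore a non-zero value before inspecting the parallel plan. A sketch: 2 matches the worker count in the output below, but the appropriate degree depends on your cluster:

```sql
-- Re-enable parallel query after the serial-plan example above.
SET max_parallel_degree = 2;
```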
EXPLAIN SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: <gather2>
Partitions: NULL
Type: ALL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 20064988
Filtered: 100.00
Extra: NULL
*************************** 2. row ***************************
Id: 2
Select_type: SIMPLE
Table: pq_test
Partitions: NULL
Type: index
Possible_keys: NULL
Key: PRIMARY
Key_len: 8
Ref: NULL
Rows: 10032494
Filtered: 100.00
Extra: Parallel scan (2 workers); Using index
2 rows in set, 1 warning (0.00 sec)
Two rows appear instead of one:
- Row 1 (`<gather2>`) is the `Gather` operation, which collects and merges results from all workers. This row marks the parallel boundary: everything below it runs in parallel.
- Row 2 shows the actual scan on `pq_test`, with `Parallel scan (2 workers)` in the `Extra` field confirming that 2 workers execute the index scan in parallel.
PolarDB for MySQL 8.0.2: EXPLAIN FORMAT=TREE
EXPLAIN FORMAT=TREE returns a tree-shaped plan with richer detail about the parallel execution structure:
EXPLAIN FORMAT=TREE SELECT COUNT(*) FROM pq_test\G
*************************** 1. row ***************************
EXPLAIN: -> Count rows in gather (cost=... rows=1)
-> Gather (slice: 1; workers: 2, nodes: 2)
-> Parallel index scan on pq_test using PRIMARY, with parallel partitions: 8 (cost=... rows=20064988)
Reading top to bottom:
- `Count rows in gather`: the coordinator aggregates the final count from worker results.
- `Gather (slice: 1; workers: 2, nodes: 2)`: 2 workers run across 2 nodes. This is the parallel boundary.
- `Parallel index scan on pq_test using PRIMARY, with parallel partitions: 8`: the index is divided into 8 partitions that are scanned in parallel.
View a parallel query plan with subqueries
PolarDB for MySQL 8.0.1: EXPLAIN
EXPLAIN SELECT
o_orderpriority,
COUNT(*) as order_count
FROM
orders
WHERE
o_orderdate >= '1994-04-01'
AND o_orderdate < date_add('1994-04-01', interval '3' month)
AND exists (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: <gather1.1>
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1489068
filtered: 100.00
Extra: Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: orders
partitions: NULL
type: range
possible_keys: i_o_orderdate
key: i_o_orderdate
key_len: 3
ref: NULL
rows: 568369
filtered: 100.00
Extra: Parallel scan (2 workers); Using index condition; Using where; Using temporary
*************************** 3. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: lineitem
partitions: NULL
type: ref
possible_keys: PRIMARY,i_l_orderkey
key: PRIMARY
key_len: 4
ref: tpch_10.orders.O_ORDERKEY
rows: 4
filtered: 33.3
Extra: Parallel pushdown; Using where
3 rows in set, 2 warnings (0.01 sec)
Row 3 has `select_type: DEPENDENT SUBQUERY` with `Extra: Parallel pushdown`. This means the subquery is pushed down to each worker for parallel execution rather than being evaluated once on the coordinator.
PolarDB for MySQL 8.0.2: EXPLAIN FORMAT=TREE
Example 1
EXPLAIN FORMAT=TREE SELECT
o_orderpriority,
COUNT(*) as order_count
FROM
orders
WHERE
o_orderdate >= '1994-04-01'
AND o_orderdate < date_add('1994-04-01', interval '3' month)
AND exists (
SELECT
*
FROM
lineitem
WHERE
l_orderkey = o_orderkey
and l_commitdate < l_receiptdate
)
GROUP BY
o_orderpriority
ORDER BY
o_orderpriority\G
*************************** 1. row ***************************
EXPLAIN: -> Sort: <temporary>.o_orderpriority
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1746887.76 rows=1489068)
-> Gather (slice: 1; workers: 2) (cost=1597980.96 rows=1489068)
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1486290.85 rows=744534)
-> Filter: exists(select #2) (cost=772982.43 rows=568369)
-> Parallel index range scan on orders using i_o_orderdate, with index condition: ((orders.O_ORDERDATE >= DATE'1994-04-01') and (orders.O_ORDERDATE < <cache>(('1994-04-01' + interval '3' month)))), with parallel partitions: 89 (cost=772982.43 rows=568369)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s)
-> Filter: (lineitem.L_COMMITDATE < lineitem.L_RECEIPTDATE) (cost=1.14 rows=1)
-> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY) (cost=1.14 rows=4)
1 row in set, 1 warning (0.02 sec)
The Gather (slice: 1; workers: 2) node separates the parallel subtree (below) from the serial aggregation and sort (above). The subquery (Select #2) is marked dependent, meaning it runs inside each worker as Parallel pushdown.
Example 2
EXPLAIN FORMAT=TREE
select
ps_partkey,
sum(ps_supplycost * ps_availqty) as value
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'IRAN'
group by
ps_partkey having
sum(ps_supplycost * ps_availqty) > (
select
sum(ps_supplycost * ps_availqty) * 0.0000010000
from
partsupp,
supplier,
nation
where
ps_suppkey = s_suppkey
and s_nationkey = n_nationkey
and n_name = 'IRAN'
)
order by
value desc limit 1;
-> Limit: 1 row(s) (cost=1408498.03 rows=1)
-> Gather (merge sort; slice: 1; workers: 256; actual workers: 32) (cost=1408498.03 rows=256)
-> Limit: 1 row(s) (cost=1408404.20 rows=1)
-> Sort: <temporary>.value DESC, limit input to 1 row(s) per chunk (cost=1408404.20 rows=803182)
-> Filter: (sum((partsupp.PS_SUPPLYCOST * partsupp.PS_AVAILQTY)) > (select #2))
-> Table scan on <temporary>
-> Aggregate using temporary table (cost=1408404.20 rows=803182)
-> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY) (cost=829770.18 rows=327820)
-> Parallel table scan on partsupp, with parallel partitions: 2882, partition_keys: 1 (cost=6347528.15 rows=3176912)
-> Hash
-> Broadcast (slice: 2; workers: 256; nodes: 16) (cost=103382.56 rows=1029632)
-> Nested loop inner join (cost=409.36 rows=4022)
-> Filter: (nation.N_NAME = 'IRAN') (cost=2.29 rows=3)
-> Table scan on nation (cost=2.29 rows=25)
-> Parallel index lookup on supplier using SUPPLIER_FK1 (S_NATIONKEY=nation.N_NATIONKEY), with parallel partitions: 9243 (cost=65.94 rows=1609)
-> Select #2 (subquery in condition; run only once; shared access)
-> Aggregate: sum(`<collector>`.tmp_field_0) (cost=825576.85 rows=1)
-> Gather (slice: 1; workers: 256; nodes: 16) (cost=825564.05 rows=256)
-> Aggregate: sum((partsupp.PS_SUPPLYCOST * partsupp.PS_AVAILQTY)) (cost=825541.20 rows=1)
-> Inner hash join (partsupp.PS_SUPPKEY = supplier.S_SUPPKEY) (cost=809150.20 rows=327820)
-> Parallel table scan on partsupp, with parallel partitions: 14405 (cost=6147699.35 rows=3176912)
-> Hash
-> Broadcast (slice: 2; workers: 256; nodes: 16) (cost=103382.56 rows=1029632)
-> Nested loop inner join (cost=409.36 rows=4022)
-> Filter: (nation.N_NAME = 'IRAN') (cost=2.29 rows=3)
-> Table scan on nation (cost=2.29 rows=25)
-> Parallel index lookup on supplier using SUPPLIER_FK1 (S_NATIONKEY=nation.N_NATIONKEY), with parallel partitions: 9243 (cost=65.94 rows=1609)
Key points in this plan:
- `Gather (merge sort; slice: 1; workers: 256; actual workers: 32)`: the `merge sort` modifier means the Gather node merges pre-sorted results from workers, avoiding a full re-sort at the top level. `workers: 256` is the configured degree of parallelism; `actual workers: 32` is the number that actually executed.
- `Select #2 (subquery in condition; run only once; shared access)`: the HAVING subquery runs once in parallel, and the result is shared among all outer workers. This is the `Shared access` optimization.
- `Broadcast (slice: 2; workers: 256; nodes: 16)`: sends the `supplier` and `nation` join result to all 16 nodes so each node can perform the hash join locally without cross-node data movement.
- The outer query block cannot use multi-node elastic parallel query due to policy restrictions; its workers run single-node elastic parallel query on the node where the query is delivered.
View per-worker execution time with EXPLAIN ANALYZE
Available in PolarDB for MySQL 8.0.2, EXPLAIN ANALYZE adds actual timing alongside the estimated cost. For parallel queries, it records the execution time of each worker per operator, including the maximum, minimum, and average across all workers.
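Unlike plain EXPLAIN, EXPLAIN ANALYZE executes the statement in order to collect the actual timings, so it can be expensive on large tables. A minimal sketch against the pq_test table from earlier in this topic:

```sql
-- This really runs the query: the full index scan on pq_test happens,
-- and the plan is returned with actual time per operator and per worker.
EXPLAIN ANALYZE SELECT COUNT(*) FROM pq_test\G
```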
EXPLAIN ANALYZE select
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice,
sum(l_quantity)
from
customer,
orders,
lineitem
where
o_orderkey in (
select
l_orderkey
from
lineitem
group by
l_orderkey having
sum(l_quantity) > 313
)
and c_custkey = o_custkey
and o_orderkey = l_orderkey
group by
c_name,
c_custkey,
o_orderkey,
o_orderdate,
o_totalprice
order by
o_totalprice desc,
o_orderdate
LIMIT 100;
-> Limit: 100 row(s) (cost=14.62 rows=1) (actual time=1.252..1.252 rows=0 loops=1)
-> Gather (merge sort; slice: 1; workers: 1; nodes: 2) (cost=14.62 rows=1) (actual time=1.250..1.250 rows=0 loops=1)
-> Limit: 100 row(s) (cost=4.52 rows=1) (actual time=0.084,0.084,0.084..0.084,0.084,0.084 rows=0,0,0 loops=1,1,1)
-> Sort: <temporary>.O_TOTALPRICE DESC, <temporary>.O_ORDERDATE, limit input to 100 row(s) per chunk (cost=4.52 rows=1) (actual time=0.083,0.083,0.083..0.083,0.083,0.083 rows=0,0,0 loops=1,1,1)
-> Table scan on <temporary> (actual time=0.070,0.070,0.070..0.070,0.070,0.070 rows=0,0,0 loops=1,1,1)
-> Aggregate using temporary table (cost=4.52 rows=1) (actual time=0.001,0.001,0.001..0.001,0.001,0.001 rows=0,0,0 loops=1,1,1)
-> Nested loop inner join (cost=2.86 rows=4) (actual time=0.039,0.039,0.039..0.039,0.039,0.039 rows=0,0,0 loops=1,1,1)
-> Nested loop inner join (cost=1.45 rows=1) (actual time=0.037,0.037,0.037..0.037,0.037,0.037 rows=0,0,0 loops=1,1,1)
-> Parallel table scan on customer, with parallel partitions: 1, partition_keys: 1 (cost=0.35 rows=1) (actual time=0.036,0.036,0.036..0.036,0.036,0.036 rows=0,0,0 loops=1,1,1)
-> Filter: <in_optimizer>(orders.O_ORDERKEY,<exists>(select #2)) (cost=1.10 rows=1)
-> Index lookup on orders using ORDERS_FK1 (O_CUSTKEY=customer.C_CUSTKEY) (cost=1.10 rows=1)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s)
-> Filter: ((sum(lineitem.L_QUANTITY) > 313) and (<cache>(orders.O_ORDERKEY) = <ref_null_helper>(lineitem.L_ORDERKEY)))
-> Group aggregate: sum(lineitem.L_QUANTITY)
-> Index scan on lineitem using PRIMARY (cost=41554048.20 rows=380071042)
-> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY) (cost=1.41 rows=4)
The actual time format differs depending on whether an operator runs above or below the Gather node:
| Operator location | Format | Example |
|---|---|---|
| Above Gather (coordinator, serial) | actual time=start..end rows=N loops=N | actual time=1.250..1.250 rows=0 loops=1 |
| Below Gather (workers, parallel) | actual time=v1,v2,v3..v1,v2,v3 rows=N1,N2,N3 loops=N1,N2,N3 | actual time=0.084,0.084,0.084..0.084,0.084,0.084 rows=0,0,0 loops=1,1,1 |
For operators inside the parallel subtree, each comma-separated value corresponds to one worker. This lets you identify which operator is the bottleneck across workers and whether execution time is evenly distributed.