This topic describes how to use the EXPLAIN statement to view parallel query information in execution plans.

Example table

In the following examples, the pq_test table is used to test parallel queries.

  • Schema:
    SHOW CREATE TABLE pq_test\G
    *************************** 1. row ***************************
           Table: pq_test
    Create Table: CREATE TABLE `pq_test` (
      `id` BIGINT(20) NOT NULL AUTO_INCREMENT,
      `help_topic_id` INT(10) UNSIGNED NOT NULL,
      `name` CHAR(64) NOT NULL,
      `help_category_id` SMALLINT(5) UNSIGNED NOT NULL,
      `description` TEXT NOT NULL,
      `example` TEXT NOT NULL,
      `url` TEXT NOT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB AUTO_INCREMENT=21495809 DEFAULT CHARSET=utf8
    1 row in set (0.00 sec)
  • Table size:
    SHOW TABLE STATUS\G
    *************************** 1. row ***************************
               Name: pq_test
             Engine: InnoDB
            Version: 10
         Row_format: Dynamic
               Rows: 20064988
     Avg_row_length: 1898
        Data_length: 38085328896
    Max_data_length: 0
       Index_length: 0
          Data_free: 4194304
     Auto_increment: 21495809
        Create_time: 2019-07-30 01:35:27
        Update_time: NULL
         Check_time: NULL
          Collation: utf8_general_ci
           Checksum: NULL
     Create_options:
            Comment:
    1 row in set (0.02 sec)
  • SQL statement:
    SELECT COUNT(*) FROM pq_test;

EXPLAIN statements

  • You can use the EXPLAIN statement on non-parallel queries. Sample statement:
    SET max_parallel_degree=0; EXPLAIN SELECT COUNT(*) FROM pq_test\G

    The following result is returned:

    *************************** 1. row ***************************
               Id: 1
      Select_type: SIMPLE
            Table: pq_test
      Partitions: NULL
             Type: index
    Possible_keys: NULL
              Key: PRIMARY
          Key_len: 8
              Ref: NULL
             Rows: 20064988
         Filtered: 100.00
            Extra: Using index
    1 row in set, 1 warning (0.03 sec)
  • In PolarDB for MySQL 8.0.1, you can use the EXPLAIN statement on parallel queries. Sample statement:
    EXPLAIN SELECT COUNT(*) FROM pq_test\G
    The following result is returned:
    *************************** 1. row ***************************
               Id: 1
      Select_type: SIMPLE
            Table: <gather2>
       Partitions: NULL
             Type: ALL
    Possible_keys: NULL
              Key: NULL
          Key_len: NULL
              Ref: NULL
             Rows: 20064988
         Filtered: 100.00
            Extra: NULL
    *************************** 2. row ***************************
               Id: 2
      Select_type: SIMPLE
            Table: pq_test
       Partitions: NULL
             Type: index
    Possible_keys: NULL
              Key: PRIMARY
          Key_len: 8
              Ref: NULL
             Rows: 10032494
         Filtered: 100.00
            Extra: Parallel scan (2 workers); Using index
    2 rows in set, 1 warning (0.00 sec)
    In PolarDB for MySQL 8.0.2, you can use the EXPLAIN FORMAT=TREE statement on parallel queries. This returns a more precise description of query handling. Sample statement:
    EXPLAIN FORMAT=TREE SELECT COUNT(*) FROM pq_test\G
    The following result is returned:
    *************************** 1. row ***************************
    EXPLAIN: -> Count rows in gather  (cost=... rows=1)
        -> Gather (slice: 1; workers: 2)
            -> Parallel index scan on pq_test using PRIMARY, with parallel partitions: 2  (cost=... rows=20064988)
    The result indicates the following information:
    • The EXPLAIN output shows that the parallel plan includes a Gather operation. Gather is implemented to gather the partial results that are produced by all workers.
    • In addition, information in the Extra field shows that a parallel scan is performed on the pq_test table by using two workers.
  • In PolarDB for MySQL 8.0.1, you can use the EXPLAIN statement on parallel queries that include subqueries. Sample statement:
    EXPLAIN SELECT
        o_orderpriority,
        COUNT(*) as order_count
    FROM
        orders
    WHERE
        o_orderdate >= '1994-04-01'
        AND o_orderdate < date_add('1994-04-01', interval '3' month)
        AND exists (
            SELECT
                *
            FROM
                lineitem
            WHERE
                l_orderkey = o_orderkey
                and l_commitdate < l_receiptdate
        )
    GROUP BY
        o_orderpriority
    ORDER BY
        o_orderpriority\G
    The following result is returned:
    *************************** 1. row ***************************
               id: 1
      select_type: PRIMARY
            table: <gather1.1>
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 1489068
         filtered: 100.00
            Extra: Using temporary; Using filesort
    *************************** 2. row ***************************
               id: 1
      select_type: PRIMARY
            table: orders
       partitions: NULL
             type: range
    possible_keys: i_o_orderdate
              key: i_o_orderdate
          key_len: 3
              ref: NULL
             rows: 568369
         filtered: 100.00
            Extra: Parallel scan (2 workers); Using index condition; Using where; Using temporary
    *************************** 3. row ***************************
               id: 2
      select_type: DEPENDENT SUBQUERY
            table: lineitem
       partitions: NULL
             type: ref
    possible_keys: PRIMARY,i_l_orderkey
              key: PRIMARY
          key_len: 4
              ref: tpch_10.orders.O_ORDERKEY
             rows: 4
         filtered: 33.33
            Extra: Parallel pushdown; Using where
    3 rows in set, 2 warnings (0.01 sec)

    In the preceding result, the select_type of the subquery is DEPENDENT SUBQUERY and the Extra field is set to Parallel pushdown, which indicates that the subquery is sent to workers for parallel processing.

    In PolarDB for MySQL 8.0.2, you can use the EXPLAIN FORMAT=TREE statement on parallel queries that include subqueries. This returns a more precise description of query handling.
    • Example 1:
      Sample statement:
      EXPLAIN FORMAT=TREE SELECT
          o_orderpriority,
          COUNT(*) as order_count
      FROM
          orders
      WHERE
          o_orderdate >= '1994-04-01'
          AND o_orderdate < date_add('1994-04-01', interval '3' month)
          AND exists (
              SELECT
                  *
              FROM
                  lineitem
              WHERE
                  l_orderkey = o_orderkey
                  and l_commitdate < l_receiptdate
          )
      GROUP BY
          o_orderpriority
      ORDER BY
          o_orderpriority\G
      The following result is returned:
      *************************** 1. row ***************************
      EXPLAIN: -> Sort: <temporary>.o_orderpriority
          -> Table scan on <temporary>
              -> Aggregate using temporary table  (cost=1746887.76 rows=1489068)
                  -> Gather (slice: 1; workers: 2)  (cost=1597980.96 rows=1489068)
                      -> Table scan on <temporary>
                          -> Aggregate using temporary table  (cost=1486290.85 rows=744534)
                              -> Filter: exists(select #2)  (cost=772982.43 rows=568369)
                                  -> Parallel index range scan on orders using i_o_orderdate, with index condition: ((orders.O_ORDERDATE >= DATE'1994-04-01') and (orders.O_ORDERDATE < <cache>(('1994-04-01' + interval '3' month)))), with parallel partitions: 89  (cost=772982.43 rows=568369)
                                  -> Select #2 (subquery in condition; dependent)
                                      -> Limit: 1 row(s)
                                          -> Filter: (lineitem.L_COMMITDATE < lineitem.L_RECEIPTDATE)  (cost=1.14 rows=1)
                                              -> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY)  (cost=1.14 rows=4)
      
      1 row in set, 1 warning (0.02 sec)
    • Example 2:
      Sample statement:
      EXPLAIN FORMAT=TREE select
          c_name,
          c_custkey,
          o_orderkey,
          o_orderdate,
          o_totalprice,
          sum(l_quantity)
      from
          customer,
          orders,
          lineitem
      where
          o_orderkey in (
              select
                  l_orderkey
              from
                  lineitem
              group by
                  l_orderkey having
                      sum(l_quantity) > 313
          )
          and c_custkey = o_custkey
          and o_orderkey = l_orderkey
      group by
          c_name,
          c_custkey,
          o_orderkey,
          o_orderdate,
          o_totalprice
      order by
          o_totalprice desc,
          o_orderdate
      LIMIT 100;
      The following result is returned:
      *************************** 1. row ***************************
      EXPLAIN: -> Limit: 100 row(s)  (cost=17816010.29 rows=100)
          -> Gather (merge sort; slice: 1; workers: 32)  (cost=17816010.29 rows=3200)
              -> Limit: 100 row(s)  (cost=17815280.77 rows=100)
                  -> Sort: <temporary>.O_TOTALPRICE DESC, <temporary>.O_ORDERDATE, limit input to 100 row(s) per chunk  (cost=17815280.77 rows=1846093)
                      -> Table scan on <temporary>
                          -> Aggregate using temporary table  (cost=16483455.65 rows=1846093)
                              -> Nested loop inner join  (cost=4483831.82 rows=18460929)
                                  -> Parallel inner hash join (customer.C_CUSTKEY = orders.O_CUSTKEY)  (cost=1464180.48 rows=4660127)
                                      -> Parallel table scan on customer, with parallel partitions: 155, partition_keys: 1  (cost=102461.53 rows=461635)
                                      -> Parallel hash
                                          -> Filter: <in_optimizer>(orders.O_ORDERKEY,orders.O_ORDERKEY in (select #2))  (cost=476012.16 rows=4660127)
                                              -> Parallel table scan on orders, with parallel partitions: 1064, partition_keys: 1  (cost=476012.16 rows=4660127)
                                              -> Select #2 (subquery in condition; run only once; shared access)
                                                  -> Filter: ((orders.O_ORDERKEY = `<materialized_subquery>`.l_orderkey))
                                                      -> Limit: 1 row(s)
                                                          -> Index lookup on <materialized_subquery> using <auto_distinct_key> (l_orderkey=orders.O_ORDERKEY)
                                                              -> Materialize with deduplication
                                                                  -> Gather (slice: 1; workers: 32)  (cost=10557896.46 rows=149899005)
                                                                      -> Filter: (sum(lineitem.L_QUANTITY) > 313)
                                                                          -> Group aggregate: sum(lineitem.L_QUANTITY)  (cost=2828719.00 rows=4684344)
                                                                              -> Parallel index scan on lineitem using PRIMARY, with parallel partitions: 6225, partition_keys: 1  (cost=1900875.93 rows=18556862)
                                  -> Index lookup on lineitem using PRIMARY (L_ORDERKEY=orders.O_ORDERKEY)  (cost=0.25 rows=4)

      In the preceding result, the select_type of the subquery is SUBQUERY and the Select #2 field is set to Shared access, which indicates that the PolarDB optimizer runs the parallel subquery in advance and the result is shared among workers.