This topic describes the output structure of EXPLAIN statements and information such as actual execution information, output nodes, Index Only Scan, Bitmap Index Scan, and Bitmap Heap Scan.

Output structure of EXPLAIN statements

The following example shows the output structure of the EXPLAIN statement:

EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=717.34..717.59 rows=101 width=488) (actual time=7.761..7.774 rows=100 loops=1)
  Sort Key: t1.fivethous
  Sort Method: quicksort Memory: 77kB
  -> Hash Join (cost=230.47..713.98 rows=101 width=488) (actual time=0.711..7.427 rows=100 loops=1)
       Hash Cond: (t2.unique2 = t1.unique2)
       -> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.007..2.583 rows=10000 loops=1)
       -> Hash (cost=229.20..229.20 rows=101 width=244) (actual time=0.659..0.659 rows=100 loops=1)
            Buckets: 1024 Batches: 1 Memory Usage: 28kB
            -> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244) (actual time=0.080..0.526 rows=100 loops=1)
                 Recheck Cond: (unique1 < 100)
                 -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1)
                      Index Cond: (unique1 < 100)
Planning time: 0.194 ms
Execution time: 8.008 ms

The output of the EXPLAIN statement is in a tree structure, which can be called a query plan tree. For each node in the tree, the node type, objects, and other attributes such as the cost, rows, and width are returned. If only node types are displayed, the preceding example can be simplified to the following structure:

Sort
└── Hash Join
          ├── Seq Scan
          └── Hash
                     └── Bitmap Heap Scan
                                └── Bitmap Index Scan

The execution of SQL statements in PolarDB-O has the following characteristics:

  • The SQL statement is executed based on the query plan tree from bottom up.
  • The SQL statement can also be executed based on a volcano model. This indicates that if a node except Bitmap Index Scan is executed, a row of data is returned to the parent node of this node.

Based on the preceding characteristics, you can see that the output of an EXPLAIN statement is in a visualized query plan tree. The query plan tree shows the nodes that are executed and the estimated cost of each node. Each node represents an operation.

Actual execution information

If the ANALYZE option is set to TRUE in the EXPLAIN statement, the actual execution information is returned and placed after the estimated cost information. The actual execution information includes the following items:

  • actual time: the execution time in the format xxx..xxx. The value that is placed before the symbol .. indicates the actual startup time of the node. To be more specific, this indicates the actual time taken to find the first record that meets the condition of the node. The value that is placed after the symbol .. indicates the actual execution time of the node.
  • rows: the number of rows returned for the node.
  • loops: the number of times that the node is re-executed. If the parameters of a plan node such as the bound variables are modified during the execution of the plan node, the plan node must be re-executed.

In most cases, the estimated cost information is similar to the actual execution information. The estimated cost is proportional to the actual time, and the number of returned rows for the estimated cost is similar that for the actual time. However, a query plan tree that has the minimum estimated cost may have poor performance, because table statistics based on which the cost is estimated may be outdated. Developers must modify parameters, or run the vacuum analyze command to update table statistics in time. This ensures that the optimizer of PostgreSQL can find the optimal query plan tree.

Output nodes

  • Node type
    The output of an EXPLAIN statement may contain the following types of execution nodes:
    • Control nodes
    • Scan nodes
    • Materialization nodes
    • Join nodes
  • Scan nodes

    Scan nodes scan tuples in tables. Each scan node returns one tuple except Bitmap Index Scan to its parent node at a time. The parent node uses the tuple as the input. Scan nodes can scan tables, function result sets, lists, and subquery result sets.

    PolarDB-O supports the following scan nodes:

    • Seq Scan: sequential scan.
    • Index Scan: scans a table based on an index and returns the values of index columns and other columns.
    • Index Only Scan: scans a table based on an index and returns only the values of the index columns. This type of scan is also known as covering index scan.
    • Bitmap Index Scan: scans the index of a table and returns a bitmap.
    • Bitmap Heap Scan: converts the bitmap returned by the Bitmap Index Scan node into a tuple.
    • TID Scan: scans an array of tuple identifiers (TIDs).
    • Subquery Scan: scans a subquery.
    • Function Scan: scans data based on functions.
    • TableFunc Scan: scans data based on the tablefunc module.
    • Values Scan: scans the results that are returned by the VALUES command.
    • CTE Scan: scans the result set that is returned by a WITH clause.
    • Named Tuple Store Scan: scans named result sets.
    • WorkTable Scan: scans the intermediate table that is generated during a recursive union query.
    • Foreign Scan: scans tables based on foreign keys.
    • Custom Scan: user-defined scans.

The following sections describe common scan nodes: Seq Scan, Index Scan, Index Only Scan, Bitmap Index Scan, and Bitmap Heap Scan.

  • Seq Scan

    Seq Scan is a sequential scan on a full table. Seq Scan is used to scan tables that have no indexes. In the following example, the public.class table does not have an index, and the output of the EXPLAIN statement is returned:

    postgres=> explain(ANALYZE,VERBOSE,BUFFERS) select * from class where st_no=2;
    QUERY PLAN
    --------------------------------------------------------------------------------------------------
    Seq Scan on public.class (cost=0.00..26.00 rows=1 width=35) (actual time=0.136..0.141 rows=1 loops=1)
        Output: st_no, name
        Filter: (class.st_no = 2)
        Rows Removed by Filter: 1199
        Buffers: shared hit=11
    
    Planning time: 0.066 ms
    Execution time: 0.160 ms

    The following table describes the parameters in the preceding output.

    Parameter Description
    Seq Scan on public.class The type and object of the node. In this example, a full table scan is performed on the public.class table.
    (cost=0.00..26.00 rows=1 width=35) The estimated cost of the node.
    (actual time=0.136..0.141 rows=1 loops=1) The actual execution information of the node. The information is returned only after the ANALYZE option in the EXPLAIN statement is set to TRUE.
    Output: st_no, name The columns in the result set of the SQL statement. The information is returned only after the VERBOSE option in the EXPLAIN statement is set to TRUE.
    Filter: (class.st_no = 2) The filter operation that is performed on the Seq Scan node. This indicates that the filter operation is performed on each row in the table during a full table scan. In this example, the filter condition is class.st_no = 2.
    Rows Removed by Filter: 1199 The number of rows that are removed during the filter operation. This refers to the VERBOSE information about the Seq Scan node. The information is returned only after the VERBOSE option in the EXPLAIN statement is set to TRUE.
    Buffers: shared hit=11 Indicates that 11 blocks are hit in the shared buffer. This refers to the BUFFERS information about the Seq Scan node. The information is returned only after the BUFFERS option in the EXPLAIN statement is set to TRUE.
    Planning time: 0.066 ms The time spent in generating the query plan.
    Execution time: 0.160 ms The actual execution time of the SQL statement. The actual execution time excludes the time spent in generating the query plan.
  • Index Scan

    The Index Scan node scans tables based on indexes. It is used when index columns are specified in the WHERE clause. For example, if an index is created for the st_no column in the preceding example of the Seq Scan node, the EXPLAIN statement returns the following result:

    postgres=> explain(ANALYZE,VERBOSE,BUFFERS) select * from class where st_no=2;
    QUERY PLAN
    --------------------------------------------------------------------------------------------------
    Index Scan using no_index on public.class (cost=0.28..8.29 rows=1 width=35) (actual time=0.022..0.023 rows=1 loops=1)
         Output: st_no, name
         Index Cond: (class.st_no = 2)
         Buffers: shared hit=3
    Planning time: 0.119 ms
    Execution time: 0.060 ms (6 rows)

    The following table describes the parameters in the preceding output.

    Parameter Description
    Index Scan using no_index on public.class Indicates that an index scan is performed on the public.class table based on the no_index index that is created on this table.
    Index Cond: (class.st_no = 2) Indicates that the condition of the index scan is class.st_no = 2.

    The preceding example shows that after the index is used, the scan on the same table based on the same condition is faster. This is because the index scan instead of the full table scan is used. The output information Buffers: shared hit=3 indicates that the number of scanned blocks is reduced. This reduces the cost and increases the scan speed.

Index Only Scan

Index Only Scan is a covering index scan. The return result can be covered by the index to be scanned. For example, if you change select * to select st_no in the preceding example of the Index Scan node, the EXPLAIN statement returns the following result:

postgres=> explain(ANALYZE,VERBOSE,BUFFERS) select st_no from class where st_no=2;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Index Only Scan using no_index on public.class (cost=0.28..4.29 rows=1 width=4) (actual time=0.015..0.016 rows=1 loops=1)
    Output: st_no Index Cond: (class.st_no = 2)
    Heap Fetches: 0
    Buffers: shared hit=3
Planning time: 0.058 ms
Execution time: 0.036 ms
(7 rows)
Parameter Description
Index Only Scan using no_index on public.class Indicates that a covering index scan is performed on the public.class table based on the no_index index that is created on this table.
Heap Fetches The number of data blocks that are scanned.

The Index Only Scan node can retrieve data from the index. However, to implement the Multi-version Concurrency Control (MVCC) mechanism, PostgreSQL must check the visibility of the scanned tuples by checking the Visibility Map (VM) file. If the vacuum or autovacuum operation is not performed after a table is created, no VM file exists, and the index does not store the version information. In this case, the Index Only Scan node must scan blocks to retrieve the version information. This may lead to a query speed lower than the query speed of Index Scan.

Bitmap Index Scan and Bitmap Heap Scan

The Bitmap Index Scan node is similar to the Index Scan node. They scan tables based on indexes. The difference is that Bitmap Index Scan returns a bitmap instead of a tuple. In the bitmap, each bit represents a block that is scanned. In most cases, the Bitmap Heap Scan node functions as the parent node of the Bitmap Index Scan node. The Bitmap Heap Scan node converts the bitmap that is returned by the Bitmap Index Scan node into a tuple. Compared with the Index Scan node that randomly reads data blocks, the Bitmap Heap Scan node reads data blocks based on their physical order. This improves the scan performance if a large number of data blocks are read.

You can execute the set enable_indexscan = off statement to disable Index Scan. For example, if the Index Scan feature is disabled in the preceding example of the Index Scan node, the EXPLAIN statement returns the following result:

postgres=> explain(ANALYZE,VERBOSE,BUFFERS) select * from class where st_no=2;
QUERY PLAN
--------------------------------------------------------------------------------------------------
Bitmap Heap Scan on public.class (cost=4.29..8.30 rows=1 width=35) (actual time=0.025..0.025 rows=1 loops=1) Output: st_no, name
Recheck Cond: (class.st_no = 2)
Heap Blocks: exact=1
Buffers: shared hit=3
-> Bitmap Index Scan on no_index (cost=0.00..4.29 rows=1 width=0) (actual time=0.019..0.019 rows=1 loops=1) Index Cond: (class.st_no = 2)
Buffers: shared hit=2
Planning time: 0.088 ms
Execution time: 0.063 ms
(10 rows)

The following table describes the parameters in the preceding output.

Parameter Description
Bitmap Index Scan on no_index Indicates that Bitmap Index Scan is performed on the public.class table based on the no_index index that is created on this table.
Index Cond: (class.st_no = 2) Indicates that the filter condition of Bitmap Index Scan is class.st_no = 2.
Bitmap Heap Scan on public.class Indicates that Bitmap Heap Scan is performed on the public.class table.
Recheck Cond: (class.st_no = 2) Indicates that the Recheck condition of the Bitmap Heap Scan is class.st_no = 2. The Bitmap Heap Scan node receives a bitmap from the Bitmap Index Scan node. In this bitmap, each bit represents a data block that is scanned. The bitmap can be used to locate some data blocks that meet the filter condition. In the preceding example, the output information Buffers: shared hit=3 indicates that the number of scanned blocks is 3. The Bitmap Heap Scan node needs to recheck the tuples in each scanned data block.
Heap Blocks: exact=1 Indicates that the exact number of scanned data blocks is 1.
  • In most cases, Index Scan is faster than Seq Scan. However, if the result set takes a large proportion of all the data, Index Scan is slower than Seq Scan. This is because the Index Scan node must scan the index before it reads data from the table, which takes more time than a full table scan.
  • If the result set takes a small proportion of all the data but a large number of tuples exist, Bitmap Index Scan is faster than Index Scan.
  • If the result set can be covered by the index, Index Only Scan provides optimal performance. This is because the Index Only Scan node scans only the index and does not read data form the table. However, if the VM file is not generated, Index Only Scan is slower than Index Scan.

The preceding conclusions are based on theoretical analysis. In the output of each EXPLAIN statement, the estimated cost information such as the cost, rows, and width shows the estimated cost of the scan nodes and other nodes. You can compare the estimated costs to select the query plan tree that has the minimum cost.