Use EXPLAIN statements to view the execution plan for any SQL query in ApsaraDB for SelectDB. Three variants are available, each returning a different level of detail:
Tip:EXPLAIN GRAPH<EXPLAIN<EXPLAIN VERBOSE— each variant returns more detail than the previous one.
| Variant | Output |
|---|---|
EXPLAIN GRAPH SELECT or DESC GRAPH SELECT | Visual graph of the execution plan tree and its fragments |
EXPLAIN SELECT | Text-format plan with per-node details such as pushed-down filter conditions and scan statistics |
EXPLAIN VERBOSE SELECT | Full detail: everything in EXPLAIN SELECT plus tuple descriptors, slot descriptors, and runtime filter assignments |
EXPLAIN compiles the SQL statement but does not execute it. No compute resources are consumed, and query results are not returned.
How it works
SQL is a declarative language — it describes *what* data to retrieve, not *how* to retrieve it. The query planner determines the execution method, including which join algorithm to use (Hash Join, Sort Merge Join, Shuffle Join, or Broadcast Join), the optimal join order to avoid a Cartesian product, and which nodes run each operation.
The query planner first produces a standalone execution plan tree, then converts it into a distributed execution plan composed of multiple plan fragments. Each fragment handles a portion of the plan. Data moves between fragments via the ExchangeNode operator. Each fragment is further split into multiple instances, which run in parallel to maximize resource utilization and query concurrency.
The following diagram shows a simple standalone execution plan tree:
┌────┐
│Sort│
└────┘
│
┌───────────┐
│Aggregation│
└───────────┘
│
┌────┐
│Join│
└────┘
┌───┴────┐
┌──────┐ ┌──────┐
│Scan-1│ │Scan-2│
└──────┘ └──────┘After the query planner converts this into a distributed plan, it splits the tree into fragments. The following diagram shows the same plan divided into two fragments (F1 and F2), with an ExchangeNode passing data between them:
┌────┐
│Sort│
│F1 │
└────┘
│
┌───────────┐
│Aggregation│
│F1 │
└───────────┘
│
┌────┐
│Join│
│F1 │
└────┘
┌──────┴────┐
┌──────┐ ┌────────────┐
│Scan-1│ │ExchangeNode│
│F1 │ │F1 │
└──────┘ └────────────┘
│
┌──────────────┐
│DataStreamSink│
│F2 │
└──────────────┘
│
┌──────┐
│Scan-2│
│F2 │
└──────┘EXPLAIN GRAPH SELECT or DESC GRAPH SELECT
EXPLAIN GRAPH SELECT and DESC GRAPH SELECT are equivalent. Both display the execution plan as a visual tree, making it easy to see which fragment each node belongs to and how fragments connect to one another.
EXPLAIN GRAPH SELECT tbl1.k1, SUM(tbl1.k2)
FROM tbl1 JOIN tbl2 ON tbl1.k1 = tbl2.k1
GROUP BY tbl1.k1
ORDER BY tbl1.k1;Output:
+---------------------------------------------------------------------------------------------------------------------------------+
| Explain String |
+---------------------------------------------------------------------------------------------------------------------------------+
| |
| ┌───────────────┐ |
| │[9: ResultSink]│ |
| │[Fragment: 4] │ |
| │RESULT SINK │ |
| └───────────────┘ |
| │ |
| ┌─────────────────────┐ |
| │[9: MERGING-EXCHANGE]│ |
| │[Fragment: 4] │ |
| └─────────────────────┘ |
| │ |
| ┌───────────────────┐ |
| │[9: DataStreamSink]│ |
| │[Fragment: 3] │ |
| │STREAM DATA SINK │ |
| │ EXCHANGE ID: 09 │ |
| │ UNPARTITIONED │ |
| └───────────────────┘ |
| │ |
| ┌─────────────┐ |
| │[4: TOP-N] │ |
| │[Fragment: 3]│ |
| └─────────────┘ |
| │ |
| ┌───────────────────────────────┐ |
| │[8: AGGREGATE (merge finalize)]│ |
| │[Fragment: 3] │ |
| └───────────────────────────────┘ |
| │ |
| ┌─────────────┐ |
| │[7: EXCHANGE]│ |
| │[Fragment: 3]│ |
| └─────────────┘ |
| │ |
| ┌───────────────────┐ |
| │[7: DataStreamSink]│ |
| │[Fragment: 2] │ |
| │STREAM DATA SINK │ |
| │ EXCHANGE ID: 07 │ |
| │ HASH_PARTITIONED │ |
| └───────────────────┘ |
| │ |
| ┌─────────────────────────────────┐ |
| │[3: AGGREGATE (update serialize)]│ |
| │[Fragment: 2] │ |
| │STREAMING │ |
| └─────────────────────────────────┘ |
| │ |
| ┌─────────────────────────────────┐ |
| │[2: HASH JOIN] │ |
| │[Fragment: 2] │ |
| │join op: INNER JOIN (PARTITIONED)│ |
| └─────────────────────────────────┘ |
| ┌──────────┴──────────┐ |
| ┌─────────────┐ ┌─────────────┐ |
| │[5: EXCHANGE]│ │[6: EXCHANGE]│ |
| │[Fragment: 2]│ │[Fragment: 2]│ |
| └─────────────┘ └─────────────┘ |
| │ │ |
| ┌───────────────────┐ ┌───────────────────┐ |
| │[5: DataStreamSink]│ │[6: DataStreamSink]│ |
| │[Fragment: 0] │ │[Fragment: 1] │ |
| │STREAM DATA SINK │ │STREAM DATA SINK │ |
| │ EXCHANGE ID: 05 │ │ EXCHANGE ID: 06 │ |
| │ HASH_PARTITIONED │ │ HASH_PARTITIONED │ |
| └───────────────────┘ └───────────────────┘ |
| │ │ |
| ┌─────────────────┐ ┌─────────────────┐ |
| │[0: OlapScanNode]│ │[1: OlapScanNode]│ |
| │[Fragment: 0] │ │[Fragment: 1] │ |
| │TABLE: tbl1 │ │TABLE: tbl2 │ |
| └─────────────────┘ └─────────────────┘ |
+---------------------------------------------------------------------------------------------------------------------------------+The plan splits into five fragments (Fragment 0 through Fragment 4). The [Fragment: N] label on each node indicates which fragment it belongs to. DataStreamSink and ExchangeNode nodes handle data transmission between fragments.
EXPLAIN SELECT
EXPLAIN SELECT returns a text-format plan with per-node details not visible in the graph view, including pushed-down filter conditions and per-scan statistics.
EXPLAIN SELECT tbl1.k1, sum(tbl1.k2)
FROM tbl1 JOIN tbl2 ON tbl1.k1 = tbl2.k1
GROUP BY tbl1.k1
ORDER BY tbl1.k1;Output:
+----------------------------------------------------------------------------------+
| EXPLAIN String |
+----------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 5> <slot 3> `tbl1`.`k1` | <slot 6> <slot 4> sum(`tbl1`.`k2`) |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 9:MERGING-EXCHANGE |
| limit: 65535 |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: <slot 3> `tbl1`.`k1` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 09 |
| UNPARTITIONED |
| |
| 4:TOP-N |
| | order by: <slot 5> <slot 3> `tbl1`.`k1` ASC |
| | offset: 0 |
| | limit: 65535 |
| | |
| 8:AGGREGATE (merge finalize) |
| | output: sum(<slot 4> sum(`tbl1`.`k2`)) |
| | group by: <slot 3> `tbl1`.`k1` |
| | cardinality=-1 |
| | |
| 7:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: `tbl1`.`k1` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 07 |
| HASH_PARTITIONED: <slot 3> `tbl1`.`k1` |
| |
| 3:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(`tbl1`.`k2`) |
| | group by: `tbl1`.`k1` |
| | cardinality=-1 |
| | |
| 2:HASH JOIN |
| | join op: INNER JOIN (PARTITIONED) |
| | runtime filter: false |
| | hash predicates: |
| | colocate: false, reason: table not in the same group |
| | equal join conjunct: `tbl1`.`k1` = `tbl2`.`k1` |
| | cardinality=2 |
| | |
| |----6:EXCHANGE |
| | |
| 5:EXCHANGE |
| |
| PLAN FRAGMENT 3 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 06 |
| HASH_PARTITIONED: `tbl2`.`k1` |
| |
| 1:OlapScanNode |
| TABLE: tbl2 |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: tbl2 |
| tabletRatio=3/3 |
| tabletList=105104776,105104780,105104784 |
| cardinality=1 |
| avgRowSize=4.0 |
| numNodes=6 |
| |
| PLAN FRAGMENT 4 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 05 |
| HASH_PARTITIONED: `tbl1`.`k1` |
| |
| 0:OlapScanNode |
| TABLE: tbl1 |
| PREAGGREGATION: ON |
| partitions=1/1 |
| rollup: tbl1 |
| tabletRatio=3/3 |
| tabletList=105104752,105104763,105104767 |
| cardinality=2 |
| avgRowSize=8.0 |
| numNodes=6 |
+----------------------------------------------------------------------------------+Output fields
The following fields appear in EXPLAIN SELECT output:
| Field | Description |
|---|---|
PARTITION | Data partitioning strategy for the fragment: RANDOM, HASH_PARTITIONED, or UNPARTITIONED |
EXCHANGE ID | Identifier linking a DataStreamSink to its corresponding ExchangeNode |
cardinality | Estimated row count for the node. -1 means statistics are unavailable |
avgRowSize | Average size of a scanned row, in bytes |
numNodes | Number of nodes that run the scan |
tabletRatio | Ratio of scanned tablets to total tablets (scanned/total) |
tabletList | Comma-separated list of tablet IDs to scan |
PREAGGREGATION | Whether storage-layer preaggregation is enabled (ON or OFF) |
rollup | Rollup index used for the scan |
colocate | Whether the join uses colocate mode. If false, the reason is shown |
runtime filter | Whether a runtime filter is applied to this join |
EXPLAIN VERBOSE SELECT
EXPLAIN VERBOSE SELECT returns the most detailed output. In addition to everything in EXPLAIN SELECT, it includes tuple descriptors, slot descriptors, and runtime filter assignments. Operators use the V prefix (for example, VHASH JOIN, VOlapScanNode) to indicate the vectorized execution engine.
EXPLAIN VERBOSE SELECT tbl1.k1, sum(tbl1.k2)
FROM tbl1 JOIN tbl2 ON tbl1.k1 = tbl2.k1
GROUP BY tbl1.k1
ORDER BY tbl1.k1;Output:
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN String |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:<slot 5> <slot 3> `tbl1`.`k1` | <slot 6> <slot 4> sum(`tbl1`.`k2`) |
| PARTITION: UNPARTITIONED |
| |
| VRESULT SINK |
| |
| 6:VMERGING-EXCHANGE |
| limit: 65535 |
| tuple ids: 3 |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: `default_cluster:test`.`tbl1`.`k2` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 06 |
| UNPARTITIONED |
| |
| 4:VTOP-N |
| | order by: <slot 5> <slot 3> `tbl1`.`k1` ASC |
| | offset: 0 |
| | limit: 65535 |
| | tuple ids: 3 |
| | |
| 3:VAGGREGATE (update finalize) |
| | output: sum(<slot 8>) |
| | group by: <slot 7> |
| | cardinality=-1 |
| | tuple ids: 2 |
| | |
| 2:VHASH JOIN |
| | join op: INNER JOIN(BROADCAST)[Tables are not in the same group] |
| | equal join conjunct: CAST(`tbl1`.`k1` AS DATETIME) = `tbl2`.`k1` |
| | runtime filters: RF000[in_or_bloom] <- `tbl2`.`k1` |
| | cardinality=0 |
| | vec output tuple id: 4 | tuple ids: 0 1 |
| | |
| |----5:VEXCHANGE |
| | tuple ids: 1 |
| | |
| 0:VOlapScanNode |
| TABLE: tbl1(null), PREAGGREGATION: OFF. Reason: the type of agg on StorageEngine's Key column should only be MAX or MIN.agg expr: sum(`tbl1`.`k2`) |
| runtime filters: RF000[in_or_bloom] -> CAST(`tbl1`.`k1` AS DATETIME) |
| partitions=0/1, tablets=0/0, tabletList= |
| cardinality=0, avgRowSize=20.0, numNodes=1 |
| tuple ids: 0 |
| |
| PLAN FRAGMENT 2 |
| |
| PARTITION: HASH_PARTITIONED: `default_cluster:test`.`tbl2`.`k2` |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 05 |
| UNPARTITIONED |
| |
| 1:VOlapScanNode |
| TABLE: tbl2(null), PREAGGREGATION: OFF. Reason: null |
| partitions=0/1, tablets=0/0, tabletList= |
| cardinality=0, avgRowSize=16.0, numNodes=1 |
| tuple ids: 1 |
| |
| Tuples: |
| TupleDescriptor{id=0, tbl=tbl1, byteSize=32, materialized=true} |
| SlotDescriptor{id=0, col=k1, type=DATE} |
| parent=0 |
| materialized=true |
| byteSize=16 |
| byteOffset=16 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=1 |
| |
| SlotDescriptor{id=2, col=k2, type=INT} |
| parent=0 |
| materialized=true |
| byteSize=4 |
| byteOffset=0 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=0 |
| |
| |
| TupleDescriptor{id=1, tbl=tbl2, byteSize=16, materialized=true} |
| SlotDescriptor{id=1, col=k1, type=DATETIME} |
| parent=1 |
| materialized=true |
| byteSize=16 |
| byteOffset=0 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=0 |
| |
| |
| TupleDescriptor{id=2, tbl=null, byteSize=32, materialized=true} |
| SlotDescriptor{id=3, col=null, type=DATE} |
| parent=2 |
| materialized=true |
| byteSize=16 |
| byteOffset=16 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=1 |
| |
| SlotDescriptor{id=4, col=null, type=BIGINT} |
| parent=2 |
| materialized=true |
| byteSize=8 |
| byteOffset=0 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=0 |
| |
| |
| TupleDescriptor{id=3, tbl=null, byteSize=32, materialized=true} |
| SlotDescriptor{id=5, col=null, type=DATE} |
| parent=3 |
| materialized=true |
| byteSize=16 |
| byteOffset=16 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=1 |
| |
| SlotDescriptor{id=6, col=null, type=BIGINT} |
| parent=3 |
| materialized=true |
| byteSize=8 |
| byteOffset=0 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=0 |
| |
| |
| TupleDescriptor{id=4, tbl=null, byteSize=48, materialized=true} |
| SlotDescriptor{id=7, col=k1, type=DATE} |
| parent=4 |
| materialized=true |
| byteSize=16 |
| byteOffset=16 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=1 |
| |
| SlotDescriptor{id=8, col=k2, type=INT} |
| parent=4 |
| materialized=true |
| byteSize=4 |
| byteOffset=0 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=0 |
| |
| SlotDescriptor{id=9, col=k1, type=DATETIME} |
| parent=4 |
| materialized=true |
| byteSize=16 |
| byteOffset=32 |
| nullIndicatorByte=0 |
| nullIndicatorBit=-1 |
| slotIdx=2 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------+
160 rows in set (0.00 sec)The Tuples section at the end lists every TupleDescriptor and its SlotDescriptor entries. Each slot descriptor describes one column or expression in the plan, including its data type, byte size, memory offset, and null indicator position.
Additional output fields in EXPLAIN VERBOSE
| Field | Description |
|---|---|
tuple ids | IDs of the tuples produced or consumed by a node |
vec output tuple id | Tuple ID of the vectorized output produced by a join node |
runtime filters | Runtime filter assignments: <- indicates where a filter is built; -> indicates where it is applied |
TupleDescriptor | Describes a row structure (table, total byte size, materialization status) |
SlotDescriptor | Describes one column or expression within a tuple (data type, byte size, memory offset, null indicator) |
Usage notes
EXPLAIN shows the logical execution plan. The actual runtime execution order may differ due to optimizations such as runtime filter pruning.
cardinalityvalues are estimates based on table statistics. If statistics are stale or absent,cardinality=-1appears.tabletRatioandtabletListvalues inEXPLAIN SELECTare upper-bound estimates. Runtime optimizations such as partition pruning may reduce the number of tablets actually scanned.The
V-prefixed operators (VHASH JOIN,VOlapScanNode, and so on) inEXPLAIN VERBOSEoutput indicate the vectorized execution engine. Non-prefixed operators inEXPLAIN SELECToutput indicate the non-vectorized engine.