Use EXPLAIN to inspect the execution plan for a SQL statement before running it. Supported statement types: SELECT, DELETE, INSERT, REPLACE, and UPDATE.
PolarDB for XScale extends the standard EXPLAIN syntax with variants that expose distributed execution details — such as which shards a query touches, how operators are distributed across the compute node (CN) and data nodes (DNs), and what the optimizer estimates for cost and parallelism.
Syntax
EXPLAIN
[LOGICALVIEW | LOGIC | SIMPLE | DETAIL | EXECUTE | PHYSICAL | OPTIMIZER | SHARDING
| COST | ANALYZE | BASELINE | JSON_PLAN]
{SELECT | DELETE | INSERT | REPLACE | UPDATE} statementChoose a variant
| Variant | What it shows | Best for |
|---|---|---|
EXPLAIN | Full logical execution plan on the CN, with operators and plan cache status | First-pass inspection of any query |
EXPLAIN LOGICALVIEW | Pushed-down SQL plan on the DN, including DN-level operators | Understanding what runs on each data node |
EXPLAIN EXECUTE | MySQL-format plan for pushed-down SQL on the DN (identical to MySQL EXPLAIN) | Checking index usage or full table scan on the DN |
EXPLAIN SHARDING | Physical shards scanned by the query | Verifying shard pruning |
EXPLAIN COST | Logical plan with estimated costs per operator and workload type | Understanding optimizer cost estimates |
EXPLAIN ANALYZE | Logical plan with both estimated and actual runtime metrics per operator | Diagnosing plan vs. reality divergence |
EXPLAIN PHYSICAL | Execution mode, plan fragments, fragment dependencies, and degree of parallelism | Analyzing parallel execution |
Output fields
Common fields (EXPLAIN, LOGICALVIEW, COST, ANALYZE)
| Field | Description |
|---|---|
HitCache | Whether the query hit the plan cache. true means the plan was reused from cache; false means it was compiled fresh. |
Source | Plan source. PLAN_CACHE indicates the plan came from the plan cache. |
TemplateId | Globally unique identifier for the plan template. Use this to correlate plans across runs. |
EXPLAIN COST additional fields
| Field | Description |
|---|---|
WorkloadType | Workload type identified by the optimizer. TP indicates transactional processing. |
Each operator line in EXPLAIN COST output includes:
| Metric | Description |
|---|---|
rowcount | Estimated number of output rows for this operator |
cumulative cost | Cumulative cost up to this operator, expressed as a composite value |
cpu | Estimated CPU cost |
memory | Estimated memory usage |
io | Estimated I/O cost |
net | Estimated network transfer cost |
id | Operator ID within the plan |
EXPLAIN ANALYZE additional fields
In addition to estimated cost metrics, each operator line includes:
| Metric | Description |
|---|---|
actual time | Actual elapsed time (in seconds) for this operator: startup + execution |
actual rowcount | Actual number of rows processed by this operator |
actual memory | Actual memory consumed |
instances | Number of parallel instances that ran this operator |
EXPLAIN PHYSICAL fields
| Field | Description |
|---|---|
ExecutorMode | Execution mode for the query. TP_LOCAL means single-node, single-threaded transactional processing. |
Fragment N | A plan fragment — a unit of execution that can run with its own degree of parallelism |
dependency | Fragment IDs that must complete before this fragment can start |
parallelism | Number of parallel threads allocated to this fragment |
EXPLAIN SHARDING output columns
| Column | Description |
|---|---|
LOGICAL_TABLE | The logical table name |
SHARDING | Physical shard locations and shard name pattern |
SHARD_COUNT | Total number of shards scanned |
BROADCAST | Whether the table is a broadcast table |
CONDITION | Shard pruning condition applied |
EXPLAIN EXECUTE output columns
EXPLAIN EXECUTE returns the standard MySQL EXPLAIN format:
id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, Extra
Examples
All examples use the same query:
SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;EXPLAIN
Shows the full logical execution plan on the CN. The plan tree shows operators from outermost to innermost.
EXPLAIN SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTION PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project(count(*)="count(*)") |
| HashAgg(group="L_ORDERKEY", count(*)="SUM(count(*))") |
| Gather(concurrent=true) |
| LogicalView(tables="[000000-000003].lineitem_[00-15]", shardCount=16, sql="SELECT `L_ORDERKEY`, COUNT(*) AS `count(*)` FROM `lineitem` AS `lineitem` GROUP BY `L_ORDERKEY`") |
| HitCache:false |
| TemplateId: 5819c807 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+HitCache: false means this plan was compiled fresh. On subsequent runs, HitCache becomes true and Source: PLAN_CACHE appears in the output.
EXPLAIN LOGICALVIEW
Expands the LogicalView operator to show the execution plan pushed down to the DN, including DN-level operators such as MysqlAgg and MysqlTableScan.
EXPLAIN LOGICALVIEW SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+----------------------------------------------------------+
| LOGICAL EXECUTION PLAN |
+----------------------------------------------------------+
| Project(count(*)="count(*)") |
| HashAgg(group="L_ORDERKEY", count(*)="SUM(count(*))") |
| Gather(concurrent=true) |
| LogicalView |
| MysqlAgg(group="L_ORDERKEY", count(*)="COUNT()") |
| MysqlTableScan(name=[ads, lineitem]) |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: 5819c807EXPLAIN EXECUTE
Returns the MySQL-format plan for the SQL pushed down to the DN. Use this to check whether the DN uses an index or performs a full table scan.
EXPLAIN EXECUTE SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+----+-------------+----------+------------+-------+---------------+---------+---------+-----+------+----------+----------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+---------+---------+-----+------+----------+----------------------------------------------+
| 1 | SIMPLE | lineitem | NULL | index | PRIMARY | PRIMARY | 8 | NULL | 1 | 100 | Using index; Using temporary; Using filesort |
+----+-------------+----------+------------+-------+---------------+---------+---------+-----+------+----------+----------------------------------------------+
1 row in set (0.24 sec)EXPLAIN SHARDING
Shows which physical shards the query scans. Use this to verify that shard pruning is working — a high SHARD_COUNT relative to the total number of shards may indicate a missing or ineffective shard key condition.
EXPLAIN SHARDING SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+---------------+----------------------------------+-------------+-----------+-----------+
| LOGICAL_TABLE | SHARDING | SHARD_COUNT | BROADCAST | CONDITION |
+---------------+----------------------------------+-------------+-----------+-----------+
| lineitem | [000000-000003].lineitem_[00-15] | 16 | false | |
+---------------+----------------------------------+-------------+-----------+-----------+
1 row in set (0.04 sec)EXPLAIN COST
Shows the logical execution plan with estimated cost metrics per operator and the optimizer's identified workload type.
EXPLAIN COST SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTION PLAN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project(count(*)="count(*)"): rowcount = 2508.0, cumulative cost = value = 2.4867663E7, cpu = 112574.0, memory = 88984.0, io = 201.0, net = 4.75, id = 182 |
| HashAgg(group="L_ORDERKEY", count(*)="SUM(count(*))"): rowcount = 2508.0, cumulative cost = value = 2.4867662E7, cpu = 112573.0, memory = 88984.0, io = 201.0, net = 4.75, id = 180 |
| Gather(concurrent=true): rowcount = 2508.0, cumulative cost = value = 2.4860069E7, cpu = 105039.0, memory = 29796.0, io = 201.0, net = 4.75, id = 178 |
| LogicalView(tables="[000000-000003].lineitem_[00-15]", shardCount=16, sql="SELECT `L_ORDERKEY`, COUNT(*) AS `count(*)` FROM `lineitem` AS `lineitem` GROUP BY `L_ORDERKEY`"): rowcount = 2508.0, cumulative cost = value = 2.4860068E7, cpu = 105038.0, memory = 29796.0, io = 201.0, net = 4.75, id = 109 |
| HitCache:true |
| Source:PLAN_CACHE |
| WorkloadType: TP |
| TemplateId: 5819c807EXPLAIN ANALYZE
Executes the query and returns both estimated and actual runtime metrics per operator. Compare rowcount (estimated) against actual rowcount to identify operators where the optimizer's estimates are significantly off — large divergences often point to stale statistics.
EXPLAIN ANALYZE SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LOGICAL EXECUTION PLAN |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Project(count(*)="count(*)"): rowcount = 2508.0, cumulative cost = value = 2.4867663E7, cpu = 112574.0, memory = 88984.0, io = 201.0, net = 4.75, actual time = 0.001 + 0.000, actual rowcount = 2506, actual memory = 0, instances = 1, id = 182 |
| HashAgg(group="L_ORDERKEY", count(*)="SUM(count(*))"): rowcount = 2508.0, cumulative cost = value = 2.4867662E7, cpu = 112573.0, memory = 88984.0, io = 201.0, net = 4.75, actual time = 0.000 + 0.000, actual rowcount = 2506, actual memory = 0, instances = 1, id = 180 |
| Gather(concurrent=true): rowcount = 2508.0, cumulative cost = value = 2.4860069E7, cpu = 105039.0, memory = 29796.0, io = 201.0, net = 4.75, actual time = 0.000 + 0.000, actual rowcount = 0, actual memory = 0, instances = 0, id = 178 |
| LogicalView(tables="[000000-000003].lineitem_[00-15]", shardCount=16, sql="SELECT `L_ORDERKEY`, COUNT(*) AS `count(*)` FROM `lineitem` AS `lineitem` GROUP BY `L_ORDERKEY`"): rowcount = 2508.0, cumulative cost = value = 2.4860068E7, cpu = 105038.0, memory = 29796.0, io = 201.0, net = 4.75, actual time = 0.030 + 0.025, actual rowcount = 10000, actual memory = 0, instances = 0, id = 109 |
| HitCache:true |
| Source:PLAN_CACHE |
| TemplateId: 5819c807 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
7 rows in set (1.08 sec)In this example, LogicalView estimated 2,508 rows (rowcount) but scanned 10,000 rows (actual rowcount). This divergence suggests the optimizer's statistics for the lineitem table may be stale.
EXPLAIN PHYSICAL
Shows the physical execution plan: execution mode, plan fragments, inter-fragment dependencies, and the degree of parallelism for each fragment. Use this to understand how the optimizer parallelizes a query across CN threads.
EXPLAIN PHYSICAL SELECT count(*) FROM lineitem GROUP BY L_ORDERKEY;+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ExecutorMode: TP_LOCAL |
| Fragment 0 dependency: [] parallelism: 4 |
| Gather(concurrent=true) |
| LogicalView(tables="[000000-000003].lineitem_[00-15]", shardCount=16, sql="SELECT `L_ORDERKEY`, COUNT(*) AS `count(*)` FROM `lineitem` AS `lineitem` GROUP BY `L_ORDERKEY`") |
| Fragment 1 dependency: [] parallelism: 8 |
| LocalBuffer |
| RemoteSource(sourceFragmentIds=[0], type=RecordType(INTEGER L_ORDERKEY, BIGINT count(*))) |
| Fragment 2 dependency: [0, 1] parallelism: 8 |
| Project(count(*)="count(*)") |
| HashAgg(group="L_ORDERKEY", count(*)="SUM(count(*))") |
| RemoteSource(sourceFragmentIds=[1], type=RecordType(INTEGER L_ORDERKEY, BIGINT count(*))) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
11 rows in set (0.10 sec)This query runs in TP_LOCAL mode (single-node, single-threaded). The plan splits into three fragments:
Fragment 0 — pulls data from the 16 DN shards (parallelism: 4)
Fragment 1 — buffers results from Fragment 0 (parallelism: 8)
Fragment 2 — performs pre-aggregation then final aggregation; depends on both Fragment 0 and Fragment 1 (parallelism: 8)