Query profiles capture per-node execution details for every query. Use them to pinpoint performance bottlenecks and optimize query performance.
Enable query profiles
Set the enable_profile variable to start collecting profiles:
SET enable_profile=true;
After enable_profile is set to true, SelectDB generates a profile for every query that runs. Each profile records execution details across all nodes.
View query profiles
The SHOW QUERY PROFILE statement is supported only for SelectDB Core versions earlier than V4.0.0. For SelectDB Core V4.0.0 and later, preview or download query profiles from the SelectDB console, or use the HTTP API. For details, see Use the query audit feature and Export query profiles.
Run the following statement to list all stored profiles:
SHOW QUERY PROFILE "/"\G
Each row in the result corresponds to one profile:
SHOW QUERY PROFILE "/"\G
*************************** 1. row ***************************
Profile ID: c257c52f93e149ee-ace8ac14e8c9fef9
Task Type: QUERY
Start Time: 2021-04-08 11:30:50
End Time: 2021-04-08 11:30:50
Total: 9ms
Task State: EOF
User: root
Default Db: default_cluster:db1
Sql Statement: select tbl1.k1, sum(tbl1.k2) from tbl1 join tbl2 on tbl1.k1 = tbl2.k1 group by tbl1.k1 order by tbl1.k1
The Query ID and query_id fields refer to the profile ID.
Analyze query profiles
Use the following three-step workflow to locate the performance bottleneck in a query. Each step drills one level deeper into the execution hierarchy.
Step 1: Identify the slowest fragment
Goal: Find which fragment is the primary bottleneck.
The execution plan tree shows the fragment structure and the execution time of each fragment. The MaxActiveTime value on each sender node is the longest execution time across all execution units in that fragment.
SHOW QUERY PROFILE "/<profile_id>"\G
Example output:
SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9"\G
*************************** 1. row ***************************
Fragments:
┌──────────────────────┐
│[-1: DataBufferSender]│
│Fragment: 0 │
│MaxActiveTime: 6.626 ms│
└──────────────────────┘
│
┌──────────────────┐
│[9: EXCHANGE_NODE]│
│Fragment: 0 │
└──────────────────┘
│
┌──────────────────────┐
│[9: DataStreamSender] │
│Fragment: 1 │
│MaxActiveTime: 5.449 ms│
└──────────────────────┘
│
┌──────────────┐
│[4: SORT_NODE]│
│Fragment: 1 │
└──────────────┘
┌┘
┌─────────────────────┐
│[8: AGGREGATION_NODE]│
│Fragment: 1 │
└─────────────────────┘
└┐
┌──────────────────┐
│[7: EXCHANGE_NODE]│
│Fragment: 1 │
└──────────────────┘
│
┌──────────────────────┐
│[7: DataStreamSender] │
│Fragment: 2 │
│MaxActiveTime: 3.505 ms│
└──────────────────────┘
┌┘
┌─────────────────────┐
│[3: AGGREGATION_NODE]│
│Fragment: 2 │
└─────────────────────┘
│
┌───────────────────┐
│[2: HASH_JOIN_NODE]│
│Fragment: 2 │
└───────────────────┘
┌────────────┴────────────┐
┌──────────────────┐ ┌──────────────────┐
│[5: EXCHANGE_NODE]│ │[6: EXCHANGE_NODE]│
│Fragment: 2 │ │Fragment: 2 │
└──────────────────┘ └──────────────────┘
│ │
┌─────────────────────┐ ┌────────────────────────┐
│[5: DataStreamSender]│ │[6: DataStreamSender] │
│Fragment: 4 │ │Fragment: 3 │
│MaxActiveTime: 1.87 ms│ │MaxActiveTime: 636.767 us│
└─────────────────────┘ └────────────────────────┘
│ ┌┘
┌───────────────────┐ ┌───────────────────┐
│[0: OLAP_SCAN_NODE]│ │[1: OLAP_SCAN_NODE]│
│Fragment: 4 │ │Fragment: 3 │
└──────────────────┘ └──────────────────┘
│ │
┌─────────────┐ ┌─────────────┐
│[OlapScanner]│ │[OlapScanner]│
│Fragment: 4 │ │Fragment: 3 │
└─────────────┘ └─────────────┘
│ │
┌─────────────────┐ ┌─────────────────┐
│[SegmentIterator]│ │[SegmentIterator]│
│Fragment: 4 │ │Fragment: 3 │
└─────────────────┘ └─────────────────┘
1 row in set (0.02 sec)
In this example, Fragment 1 has the highest MaxActiveTime (5.449 ms), which makes it the primary bottleneck. Proceed to Step 2 with Fragment 1.
Step 2: Identify the slowest execution unit
Goal: Find which node instance within the bottleneck fragment is taking the most time.
After identifying the slowest fragment, list all execution units (instances) within it:
SHOW QUERY PROFILE "/<profile_id>/<fragment_id>";
Example output for Fragment 1:
SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9/1";
+-----------------------------------+-------------------+------------+
| Instances | Host | ActiveTime |
+-----------------------------------+-------------------+------------+
| c257c52f93e149ee-ace8ac14e8c9ff03 | 10.200.00.01:9060 | 5.449ms |
| c257c52f93e149ee-ace8ac14e8c9ff05 | 10.200.00.02:9060 | 5.367ms |
| c257c52f93e149ee-ace8ac14e8c9ff04 | 10.200.00.03:9060 | 5.358ms |
+-----------------------------------+-------------------+------------+
The instance with the highest ActiveTime is your target for Step 3.
Step 3: Inspect per-operator metrics
Goal: Determine which operator is consuming the most time and resources on the target execution unit.
SHOW QUERY PROFILE "/<profile_id>/<fragment_id>/<instance_id>"\G
Example output for instance c257c52f93e149ee-ace8ac14e8c9ff03 in Fragment 1:
SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9/1/c257c52f93e149ee-ace8ac14e8c9ff03"\G
*************************** 1. row ***************************
Instance:
┌───────────────────────────────────────┐
│[9: DataStreamSender] │
│(Active: 37.222 us, non-child: 0.40) │
│ - Counters: │
│ - BytesSent: 0.00 │
│ - IgnoreRows: 0 │
│ - OverallThroughput: 0.0 /sec │
│ - PeakMemoryUsage: 8.00 KB │
│ - SerializeBatchTime: 0ns │
│ - UncompressedRowBatchSize: 0.00 │
└───────────────────────────────────────┘
└┐
│
┌──────────────────────────────────┐
│[4: SORT_NODE] │
│(Active: 5.421 ms, non-child: 0.71)│
│ - Counters: │
│ - PeakMemoryUsage: 12.00 KB │
│ - RowsReturned: 0 │
│ - RowsReturnedRate: 0 │
└──────────────────────────────────┘
┌┘
│
┌───────────────────────────────────┐
│[8: AGGREGATION_NODE] │
│(Active: 5.355 ms, non-child: 10.68)│
│ - Counters: │
│ - BuildTime: 3.701us │
│ - GetResultsTime: 0 ns │
│ - HTResize: 0 │
│ - HTResizeTime: 1.211 us │
│ - HashBuckets: 0 │
│ - HashCollisions: 0 │
│ - HashFailedProbe: 0 │
│ - HashFilledBuckets: 0 │
│ - HashProbe: 0 │
│ - HashTravelLength: 0 │
│ - LargestPartitionPercent: 0 │
│ - MaxPartitionLevel: 0 │
│ - NumRepartitions: 0 │
│ - PartitionsCreated: 16 │
│ - PeakMemoryUsage: 34.02 MB │
│ - RowsProcessed: 0 │
│ - RowsRepartitioned: 0 │
│ - RowsReturned: 0 │
│ - RowsReturnedRate: 0 │
│ - SpilledPartitions: 0 │
└───────────────────────────────────┘
└┐
│
┌──────────────────────────────────────────┐
│[7: EXCHANGE_NODE] │
│(Active: 4.360 ms, non-child: 46.84) │
│ - Counters: │
│ - BytesReceived: 0.00 │
│ - ConvertRowBatchTime: 387 ns │
│ - DataArrivalWaitTime: 4.357 ms │
│ - DeserializeRowBatchTimer: 0 ns │
│ - FirstBatchArrivalWaitTime: 4.356 ms│
│ - PeakMemoryUsage: 0.00 │
│ - RowsReturned: 0 │
│ - RowsReturnedRate: 0 │
│ - SendersBlockedTotalTimer(*): 0 ns │
└──────────────────────────────────────────┘
A high Active percentage combined with a high non-child percentage on a node indicates that the node itself is the bottleneck (not a child operator). Focus your optimization efforts there.
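As an illustration of reading these numbers, the sketch below computes a node's own time as its Active time minus its child's Active time, using the values from the example instance above. This is only one plausible reading for a single-child node; the profile's own non-child percentage may be computed differently.

```python
# Hypothetical interpretation: a node's "own" time is its Active time
# minus its (single) child's Active time. Values in ms are taken from
# the example instance above.
active_ms = {
    "SORT_NODE": 5.421,
    "AGGREGATION_NODE": 5.355,
    "EXCHANGE_NODE": 4.360,
}
# AGGREGATION_NODE's only child in this instance is EXCHANGE_NODE.
own_agg_ms = active_ms["AGGREGATION_NODE"] - active_ms["EXCHANGE_NODE"]
print(f"AGGREGATION_NODE own time: {own_agg_ms:.3f} ms")
```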
Export query profiles
Exporting query profiles requires SelectDB Core V3.0.4 or later.
Export a profile to a file for offline analysis or sharing.
1. Enable query profiles.

   SET enable_profile=true;

2. Run the query you want to profile.

   -- Run your target query. SelectDB generates a profile for it.
   SELECT count(1) FROM test_table LIMIT 10;
3. Get the profile ID.

   SelectDB 4.0.x versions

   Send an HTTP request to get query information. The Query ID field in the response is the profile ID.

   curl -u'<userName>:<userPassword>' \
       "http://<selectdbAddress>:<httpPort>/rest/v2/manager/query/query_info?is_all_node=true"

   Example response:

   {
       "msg": "success",
       "code": 0,
       "data": {
           "column_names": [
               "Query ID",
               "FE Node",
               "Query User",
               "Execution Database",
               "Sql",
               "Query Type",
               "Start Time",
               "End Time",
               "Execution Duration",
               "Status"
           ],
           "rows": [
               [ ... ]
           ]
       },
       "count": 0
   }

   SelectDB 3.0.x versions

   Run the following statement:

   SHOW QUERY PROFILE "/";

   Example output:

   +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+
   | Profile ID                        | Task Type | Start Time          | End Time            | Total | Task State | User  | Default Db                         | Sql Statement                 |
   +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+
   | b9c9ba063d9d4365-97878361371e757a | QUERY     | 2024-02-07 17:40:04 | 2024-02-07 17:40:04 | 32ms  | EOF        | admin | default_cluster:information_schema | select * from user_privileges |
   | 8800c306137e4072-9bb1ed419f4ac9f2 | QUERY     | 2024-02-07 17:40:00 | 2024-02-07 17:40:00 | 3ms   | ERR        | admin | default_cluster:information_schema | select * from user_priveleges |
   | e2efdd1a996c4de2-ab939ad49b409990 | QUERY     | 2024-02-07 17:39:51 | 2024-02-07 17:39:51 | 13ms  | EOF        | admin |                                    | SELECT DATABASE()             |
   +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+
4. Export the profile using its ID.

   curl -u'<userName>:<userPassword>' \
       "http://<selectdbAddress>:<httpPort>/api/profile?query_id=<query_id>"

   To save the output to a file, redirect the response:

   curl -u'<userName>:<userPassword>' \
       "http://<selectdbAddress>:<httpPort>/api/profile?query_id=<query_id>" > res.txt

   Replace the following placeholders with your actual values:

   | Placeholder | Description | Default |
   |---|---|---|
   | <userName> | Username for accessing the SelectDB instance | None |
   | <userPassword> | Password for accessing the SelectDB instance | None |
   | <selectdbAddress> | Endpoint of the SelectDB instance | None |
   | <httpPort> | HTTP port number | 8080 |
   | <query_id> | Profile ID from step 3 | None |
5. (Optional) Format the exported file for readability.

   sudo sed -r 's/\\n/\n/g' res.txt | sed -r 's/\\t/\t/g' > new.txt
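For SelectDB 4.0.x, the profile ID can also be extracted from the query_info response programmatically. The sketch below is a Python illustration with a trimmed, hypothetical response body in the shape shown in step 3; it looks up the "Query ID" column by name rather than by position, so it keeps working if the column order changes.

```python
import json

# Trimmed, hypothetical query_info response (shape as in step 3).
raw = """{
  "msg": "success", "code": 0,
  "data": {
    "column_names": ["Query ID", "FE Node", "Query User", "Status"],
    "rows": [["b9c9ba063d9d4365-97878361371e757a", "fe-1", "admin", "EOF"]]
  },
  "count": 0
}"""

response = json.loads(raw)
cols = response["data"]["column_names"]
idx = cols.index("Query ID")  # locate the column by name, not position
profile_ids = [row[idx] for row in response["data"]["rows"]]
print(profile_ids[0])  # b9c9ba063d9d4365-97878361371e757a
```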
Profile parameter reference
The following tables describe all parameters collected in a query profile.
Fragment
| Parameter | Description |
|---|---|
| AverageThreadTokens | Number of threads used for fragment execution, excluding thread pool usage. |
| Buffer Pool PeakReservation | Peak memory usage of the buffer pool. |
| MemoryLimit | Memory limit for the query. |
| PeakMemoryUsage | Peak memory usage of an execution unit during the query. |
| RowsProduced | Number of rows whose columns are processed. |
BlockMgr
| Parameter | Description |
|---|---|
| BlocksCreated | Number of blocks created by BlockManager. |
| BlocksRecycled | Number of blocks that are reused. |
| BytesWritten | Total size of data written to disk. |
| MaxBlockSize | Size of a single block. |
| TotalReadBlockTime | Total time consumed to read blocks. |
DataStreamSender
| Parameter | Description |
|---|---|
| BytesSent | Total size of data sent. Calculated as: number of receivers x size of data sent. |
| IgnoreRows | Number of rows filtered out. |
| LocalBytesSent | Size of data sent by the local node and received by the local node during data exchange. |
| OverallThroughput | Total throughput. Calculated as: BytesSent / execution time. |
| SerializeBatchTime | Time consumed to serialize the data being sent. |
| UncompressedRowBatchSize | Size of the sent RowBatch data before compression. |
ODBC_TABLE_SINK
| Parameter | Description |
|---|---|
| NumSentRows | Total number of rows written to an external table. |
| TupleConvertTime | Time consumed to serialize the sent data into INSERT statements. |
| ResultSendTime | Time consumed to write data using the Open Database Connectivity (ODBC) driver. |
EXCHANGE_NODE
| Parameter | Description |
|---|---|
| BytesReceived | Size of data received over the network. |
| MergeGetNext | If sorting is performed on the child node, the exchange node performs unified merging and sorting. This is the total time for that merging and sorting, including MergeGetNextBatch. |
| MergeGetNextBatch | Time consumed by the merge node to fetch data. For single-level merging and sorting, data is fetched from a network queue. For multi-level merging and sorting, data is fetched from a child merger. |
| ChildMergeGetNext | When many senders are sending data, single-thread merging becomes a bottleneck. ApsaraDB for SelectDB starts multiple threads on the child merge node to merge and sort in parallel. This is the cumulative time across all threads. |
| ChildMergeGetNextBatch | Time consumed for the child merge node to fetch data. High values may indicate bottlenecks in the child data sending node. |
| DataArrivalWaitTime | Total time waiting for the sender to send data. |
| FirstBatchArrivalWaitTime | Time to wait for the first batch of data from the sender. |
| DeserializeRowBatchTimer | Time consumed to deserialize network data. |
| SendersBlockedTotalTimer(*) | Total waiting time on the sender side when the DataStreamRecv queue memory is fully occupied. |
| ConvertRowBatchTime | Time consumed to convert received data into RowBatch. |
| RowsReturned | Number of rows received. |
| RowsReturnedRate | Rate at which rows are received. |
SORT_NODE
| Parameter | Description |
|---|---|
| InMemorySortTime | Time consumed for in-memory sorting. |
| InitialRunsCreated | Number of times sorting is initialized. The value is 1 for in-memory sorting. |
| SortDataSize | Total size of data sorted. |
| MergeGetNext | Time consumed by MergeSort to get the next batch from multiple sorted runs. Measured only when data is written to disk. |
| MergeGetNextBatch | Time consumed by MergeSort to extract the batch of the next sorted run. Measured only when data is written to disk. |
| TotalMergesPerformed | Number of times external merging is performed. |
AGGREGATION_NODE
| Parameter | Description |
|---|---|
| PartitionsCreated | Number of partitions the aggregate query is split into. |
| GetResultsTime | Time consumed to get aggregate results from partitions. |
| HTResizeTime | Time consumed to resize the hash table. |
| HTResize | Number of times the hash table is resized. |
| HashBuckets | Number of buckets in the hash table. |
| HashBucketsWithDuplicate | Number of buckets with duplicate nodes in the hash table. |
| HashCollisions | Number of hash collisions in the hash table. |
| HashDuplicateNodes | Number of duplicate nodes belonging to the same bucket in the hash table. |
| HashFailedProbe | Number of failed probe operations on the hash table. |
| HashFilledBuckets | Number of buckets filled with data in the hash table. |
| HashProbe | Number of times the hash table is queried. |
| HashTravelLength | Number of steps moved during a hash table query. |
HASH_JOIN_NODE
| Parameter | Description |
|---|---|
| ExecOption | Hash table construction method for the right child node: synchronous or asynchronous. |
| BuildBuckets | Number of buckets in the hash table. |
| BuildRows | Number of rows in the hash table. |
| BuildTime | Time consumed to build the hash table. |
| LoadFactor | Load factor of the hash table, which indicates the number of non-empty buckets. |
| ProbeRows | Number of rows on which the hash probe operation is performed after traversing the left child nodes. |
| ProbeTime | Time consumed to traverse the left child nodes for hash probe, excluding time to call GetNext on RowBatch from the left child node. |
| PushDownComputeTime | Time consumed to compute predicate pushdown conditions. |
| PushDownTime | Total time consumed for predicate pushdown. Join queries on right child nodes that meet the conditions are converted to IN queries on left child nodes. |
CROSS_JOIN_NODE
| Parameter | Description |
|---|---|
| ExecOption | RowBatchList construction method for the right child node: synchronous or asynchronous. |
| BuildRows | Number of RowBatchList rows (rows in the right child node). |
| BuildTime | Time consumed to build the RowBatchList. |
| LeftChildRows | Number of rows in the left child node. |
| LeftChildTime | Time consumed to traverse left and right child nodes to generate the Cartesian product, excluding time to call GetNext on RowBatch from the left child node. |
UNION_NODE
| Parameter | Description |
|---|---|
| MaterializeExprsEvaluateTime | Time consumed to evaluate type conversion expressions and materialize results when field types differ on both sides of a UNION. |
ANALYTIC_EVAL_NODE
| Parameter | Description |
|---|---|
| EvaluationTime | Total time consumed for analytic function (window function) computation. |
| GetNewBlockTime | Time consumed to allocate a new block during initialization. The block caches the Rows window or an entire partition for analytic function computation. |
| PinTime | Time consumed to allocate a new block or re-read a block that was written to disk back into memory. |
| UnpinTime | Time consumed to write a block's data to disk when the block is no longer needed or when the current operator is under high memory pressure. |
OLAP_SCAN_NODE
OLAP_SCAN_NODE handles data scan tasks. It creates one or more OLAP scanners, each responsible for scanning a portion of the data. The profile is organized in three layers that reflect the call hierarchy: OLAP_SCAN_NODE -> OlapScanner -> SegmentIterator.
Some or all predicate conditions are pushed to OLAP_SCAN_NODE. Conditions that can use storage-engine indexes are pushed further to the storage engine for filtering; the remainder are evaluated in OLAP_SCAN_NODE itself.
The following example shows a typical OLAP_SCAN_NODE profile:
OLAP_SCAN_NODE (id=0):(Active: 1.2ms, % non-child: 0.00%)
- BytesRead: 265.00 B # Size of data read from data files. For example, 10 x 32-bit integers = 40 bytes. This reflects data fully loaded into memory, not raw I/O size.
- NumDiskAccess: 1 # Number of disks accessed by the ScanNode.
- NumScanners: 20 # Number of scanners created by the ScanNode.
- PeakMemoryUsage: 0.00 # Peak memory usage during the query.
- RowsRead: 7 # Rows returned from the storage engine to the scanner, excluding rows filtered by the scanner.
- RowsReturned: 7 # Rows returned from the ScanNode to the parent node.
- RowsReturnedRate: 6.979K /sec # RowsReturned / ActiveTime.
- TabletCount: 20 # Number of tablets accessed by the ScanNode.
- TotalReadThroughput: 74.70 KB/sec # BytesRead divided by total execution time. For I/O-bound queries, this approaches total disk throughput.
- ScannerBatchWaitTime: 426.886us # Time for the transfer thread to wait for the scanner thread to return RowBatch data.
- ScannerWorkerWaitTime: 17.745us # Time for the scanner thread to wait for available worker threads in the thread pool.
OlapScanner:
- BlockConvertTime: 8.941us # Time to convert a vectorized block to a row-structured RowBlock. The vectorized block is VectorizedRowBatch (V1) or RowBlockV2 (V2).
- BlockFetchTime: 468.974us # Time for the rowset reader to get blocks.
- ReaderInitTime: 5.475ms # Time for the OLAP scanner to initialize readers. Includes MergeHeap creation time (V1) and the time to generate all iterator levels and read the first block group (V2).
- RowsDelFiltered: 0 # Rows filtered based on DELETE information in the tablet and rows marked for deletion in the Unique Key model.
- RowsPushedCondFiltered: 0 # Rows filtered by predicates pushed down from a join operation (e.g., from BuildTable to ProbeTable). This value is approximate; filtering may be skipped if the effect is poor.
- ScanTime: 39.24us # Time for data to travel from ScanNode to the parent node.
- ShowHintsTime_V1: 0ns # V1 only: time to read some data for ScanRange splitting.
SegmentIterator:
- BitmapIndexFilterTimer: 779ns # Time to filter data using bitmap indexes.
- BlockLoadTime: 415.925us # Time for the segment reader (V1) or segment iterator (V2) to get blocks.
- BlockSeekCount: 12 # Number of block seeks during segment reading.
- BlockSeekTime: 222.556us # Time consumed for block seeks during segment reading.
- BlocksLoad: 6 # Number of blocks read.
- CachedPagesNum: 30 # V2 only: number of pages that hit the page cache.
- CompressedBytesRead: 0.00 # V1: size of data before decompression read from file. V2: size of pages before compression that did not hit the page cache.
- DecompressorTimer: 0ns # Time consumed for data decompression.
- IOTimer: 0ns # Actual I/O time for reading data from the operating system.
- IndexLoadTime_V1: 0ns # V1 only: time to read index streams.
- NumSegmentFiltered: 0 # Number of segments completely filtered out based on column statistics and query conditions.
- NumSegmentTotal: 6 # Total number of segments involved in the query.
- RawRowsRead: 7 # Source data rows read from the storage engine after all preceding filtering.
- RowsBitmapIndexFiltered: 0 # V2 only: rows filtered using bitmap indexes.
- RowsBloomFilterFiltered: 0 # V2 only: rows filtered using BloomFilter indexes.
- RowsKeyRangeFiltered: 0 # V2 only: rows filtered using SortkeyIndex indexes.
- RowsStatsFiltered: 0 # V2: rows filtered using ZoneMap indexes, including delete conditions. V1: includes rows filtered using BloomFilter indexes.
- RowsConditionsFiltered: 0 # V2 only: rows filtered by all column indexes combined.
- RowsVectorPredFiltered: 0 # Rows filtered by vectorized condition operations.
- TotalPagesNum: 30 # V2 only: total number of pages read.
- UncompressedBytesRead: 0.00 # V1: size of data after decompression (or file size if not compressed). V2: size of pages that did not hit the page cache after decompression (or page size if not compressed).
- VectorPredEvalTime: 0ns # Time to perform filtering by vectorized condition.
- ShortPredEvalTime: 0ns # Time to perform filtering by short-circuit predicate.
- PredColumnReadTime: 0ns # Time consumed for predicate column reading.
- LazyReadTime: 0ns # Time consumed to read non-predicate columns.
- OutputColumnTime: 0ns # Time consumed for column materialization.
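As a sanity check on these counters, TotalReadThroughput can be reproduced from BytesRead and the total execution time. The sketch below assumes a total execution time of about 3.5 ms, which is consistent with the example figures; the exact elapsed time is not shown in the profile.

```python
# Reproduce TotalReadThroughput = BytesRead / total execution time.
bytes_read = 265          # BytesRead from the example profile (bytes)
elapsed_s = 0.00346       # assumed total execution time in seconds
throughput_kb_per_s = bytes_read / elapsed_s / 1024
print(f"{throughput_kb_per_s:.2f} KB/sec")  # close to the 74.70 KB/sec shown
```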
How filtering metrics relate to each other
The row-count metrics in an OLAP_SCAN_NODE profile reveal how effectively predicates and indexes reduce the data volume at each stage. The following describes the V2-format filtering sequence. V1-format metrics have slightly different meanings.
Initialization phase
Filtering happens in this order:
1. SortkeyIndex: If the query has key ranges (prefix key ranges), data is filtered first. Rows eliminated are recorded in RowsKeyRangeFiltered.
2. Bitmap indexes: Accurate column filtering is applied for conditions that match bitmap indexes. Rows eliminated are recorded in RowsBitmapIndexFiltered.
3. BloomFilter indexes: Equality conditions (=, IN, IS) are filtered using BloomFilter indexes. Rows eliminated are recorded in RowsBloomFilterFiltered.
   Note: RowsBloomFilterFiltered is calculated as the total number of segment rows minus the number of rows remaining after BloomFilter filtering. It may overlap with rows already eliminated by bitmap indexes.
4. ZoneMap indexes: Data is filtered based on query and delete conditions. Rows eliminated are recorded in RowsStatsFiltered.

RowsConditionsFiltered is the total number of rows filtered by all column indexes combined, including RowsBloomFilterFiltered and RowsStatsFiltered.
Next phase
After initialization, the remaining data passes through additional stages:
- RowsDelFiltered: Rows eliminated by delete conditions. These rows are also counted in RowsStatsFiltered.
- RawRowsRead: Rows remaining after all preceding filtering.
- RowsRead: Rows returned from the storage engine to the scanner. This is typically less than RawRowsRead because the storage engine may aggregate rows. A large gap between RawRowsRead and RowsRead indicates heavy aggregation, which can be a performance concern.
- RowsReturned: Rows finally returned by the ScanNode to the parent node. This is typically less than RowsRead because predicates that are not pushed down to the storage engine are applied at the scanner level. A large gap between RowsRead and RowsReturned means that many high-selectivity predicates are not reaching the storage engine, where filtering is more efficient.
Together, the Rows***Filtered metrics let you assess whether query predicates are being pushed to the storage engine and how effective each index type is.
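The relationships above can be turned into quick effectiveness ratios. The sketch below uses hypothetical counter values to quantify the two gaps called out in the list: storage-engine aggregation (RawRowsRead vs. RowsRead) and scanner-level predicate filtering (RowsRead vs. RowsReturned).

```python
# Hypothetical counter values from an OLAP_SCAN_NODE profile.
raw_rows_read = 1_000_000   # rows left after index filtering
rows_read = 800_000         # rows handed to the scanner after aggregation
rows_returned = 10_000      # rows returned to the parent node

aggregated_away = 1 - rows_read / raw_rows_read
filtered_at_scanner = 1 - rows_returned / rows_read
print(f"aggregated in storage engine: {aggregated_away:.0%}")
print(f"filtered at scanner level:    {filtered_at_scanner:.1%}")
# A large scanner-level share suggests predicates that could be pushed
# down to the storage engine, where filtering is cheaper.
```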
Other metrics
- Metrics under OlapScanner, such as IOTimer and BlockFetchTime, are cumulative across all scanner threads. Because scanner threads read data asynchronously, these accumulated values reflect total scanner working time, not ScanNode wall-clock time. The Active percentage records the actual proportion of time the ScanNode contributed to the overall execution plan. It is normal to see IOTimer in the tens of seconds while Active is only a few seconds, for two reasons:
  - IOTimer accumulates across many scanners running in parallel.
  - If the parent node is slow (for example, 100 seconds), the ScanNode (10 seconds) finishes early and prepares data in the background. The parent then retrieves prepared data, so the ScanNode's Active time is short.
- NumScanners is the number of tasks the scanner submits to the thread pool, which is managed by RuntimeState. The doris_scanner_thread_pool_thread_num and doris_scanner_thread_pool_queue_size parameters control the thread pool size and queue length. An excessive or insufficient number of threads affects query efficiency. You can also divide some metrics by the number of threads to roughly estimate the time consumed by each thread.
- TabletCount is the number of tablets to scan. A high tablet count means more random reads and data merge operations.
- UncompressedBytesRead reflects the volume of data read. A high value may indicate heavy I/O.
- Compare CachedPagesNum against TotalPagesNum to assess the page cache hit rate. A higher hit rate reduces both I/O and decompression overhead.
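For the last point, the page cache hit rate follows directly from the two counters. A minimal sketch using the values from the example profile above:

```python
cached_pages = 30   # CachedPagesNum from the example profile
total_pages = 30    # TotalPagesNum from the example profile
hit_rate = cached_pages / total_pages if total_pages else 0.0
print(f"page cache hit rate: {hit_rate:.0%}")  # 100%: every page came from cache
```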
Buffer pool
| Parameter | Description |
|---|---|
| AllocTime | Time consumed for memory allocation. |
| CumulativeAllocationBytes | Cumulative size of memory allocated. |
| CumulativeAllocations | Cumulative number of memory allocations. |
| PeakReservation | Peak reservation size. |
| PeakUnpinnedBytes | Size of memory data that is unpinned. |
| PeakUsedReservation | Memory usage of reservations. |
| ReservationLimit | Reservation limit of the buffer pool. |