A query profile captures execution information from all worker nodes involved in a query. You can use the query profile to quickly identify query performance bottlenecks in a StarRocks instance through visual analysis.
Enable query profile
To enable the query profile for your instance, set the enable_profile variable to true.
SET enable_profile = true;
You can globally enable the query profile without affecting query performance if your instance uses kernel version 2.5.13 or later (2.x series) or 3.1.5 or later (3.x series).
For kernel versions earlier than those listed above, upgrade the kernel before enabling the query profile to avoid degrading query performance.
Query profile structure
A query profile consists of the following five parts:
Fragment: An execution tree. A query consists of one or more fragments.
Fragment instance: A fragment can have multiple instances, each called a fragment instance. Each runs on a different compute node.
Pipeline: An execution chain. A pipeline consists of a series of operators. A fragment instance breaks down into one or more pipelines.
Pipeline driver: A pipeline can have multiple instances, each called a pipeline driver, to fully utilize multiple cores.
Operator: A component of a pipeline driver. A pipeline driver consists of multiple operators.
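The five levels nest inside one another, which can be easier to see as a data model. The following is a minimal sketch, not the format StarRocks uses internally: the class names mirror the terms above, while the `node` field, the operator names, and the `build_example` helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    name: str


@dataclass
class PipelineDriver:
    # One running instance of a pipeline, made of a series of operators.
    operators: List[Operator] = field(default_factory=list)


@dataclass
class Pipeline:
    # A pipeline can have several drivers to use multiple cores.
    drivers: List[PipelineDriver] = field(default_factory=list)


@dataclass
class FragmentInstance:
    node: str  # hypothetical compute node name
    pipelines: List[Pipeline] = field(default_factory=list)


@dataclass
class Fragment:
    instances: List[FragmentInstance] = field(default_factory=list)


def count_operators(fragments: List[Fragment]) -> int:
    """Walk all five levels and count every operator instance."""
    return sum(
        len(driver.operators)
        for fragment in fragments
        for instance in fragment.instances
        for pipeline in instance.pipelines
        for driver in pipeline.drivers
    )


def build_example() -> List[Fragment]:
    """One fragment on two nodes; each node runs one pipeline with two drivers."""
    def make_pipeline() -> Pipeline:
        return Pipeline(drivers=[
            PipelineDriver(operators=[Operator("OLAP_SCAN"), Operator("AGGREGATE")])
            for _ in range(2)
        ])

    return [Fragment(instances=[
        FragmentInstance(node=name, pipelines=[make_pipeline()])
        for name in ("node-1", "node-2")
    ])]


if __name__ == "__main__":
    # 2 nodes x 1 pipeline x 2 drivers x 2 operators
    print(count_operators(build_example()))
```

In this toy layout the same two operators appear eight times across nodes and drivers, which is why a visual profile typically shows each operator once with aggregated metrics.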
Visualize a query profile
StarRocks Manager visualizes the Query Profile as a tree structure, where each node represents an aggregated Operator. Click an Operator to view its details in the tabs on the right side of the page. If no Operator is selected, the right side displays the Query overview.
Execution overview
The Execution Details page shows a visual execution plan tree on the left. Each operator node is labeled with its execution time and percentage. Nodes that consume a lot of time, such as HASH_JOIN, are highlighted in red. The lines between nodes indicate the number of rows processed. On the right, the top section shows a sorted list of nodes by time percentage, and the bottom section provides an overview of summary metrics.
The Execution Details page displays key summary metrics, including execution time, I/O performance, and network data transfer volume.
Total
Metric
Description
Total
The total time consumed by the query. This includes the planning, execution, and profile collection phases.
PlannerTotalTime
The planning time. This includes parsing the SQL, analysis, optimization, and generating the execution plan.
ExecutionWallTime
The query execution time, measured as wall-clock time on the BE side.
CollectProfileTime
The profile collection time. This is the time spent collecting profile metrics from BEs.
Execution time
Metric
Description
ExecutionWallTime
The total time consumed by the query execution.
CumulativeCpuTime
The sum of CPU time for all BE nodes. This reflects the total CPU resource consumption.
CpuUtilization
CPU utilization = CumulativeCpuTime / ExecutionWallTime. This indicates the average number of CPU cores used.
CumulativeWaitTime
The cumulative wait time of the pipeline. This includes scheduling wait time (ScheduleTime) and blocking time (PendingTime) due to conditions such as an empty input queue, a full output queue, or unready dependencies.
OperatorCumulativeTime
The sum of the execution times for all operators. This includes I/O, network, and computation.
IO
The cumulative I/O time for all Scan operators. This includes local disk reads, remote reads, and data cache access.
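The CpuUtilization formula in the table above can be checked by hand. The sketch below assumes both values have already been converted to seconds; the function name and the sample numbers are illustrative and do not come from a real profile.

```python
def cpu_utilization(cumulative_cpu_time_s: float, execution_wall_time_s: float) -> float:
    """CpuUtilization = CumulativeCpuTime / ExecutionWallTime."""
    if execution_wall_time_s <= 0:
        raise ValueError("ExecutionWallTime must be positive")
    return cumulative_cpu_time_s / execution_wall_time_s


if __name__ == "__main__":
    # Illustrative numbers: 12 s of CPU time summed over all BE nodes,
    # for a query that ran for 3 s of wall-clock time.
    print(cpu_utilization(12.0, 3.0))
```

With these sample values the result is 4.0, which reads as an average of four CPU cores kept busy for the duration of the execution.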
I/O
Metric
Description
RawRowsRead
The total number of records scanned by all SCAN nodes.
DiskReadBytes
The total amount of compressed data read by all SCAN execution nodes.
LocalDiskReadBytes
The total size of compressed data read from the local cache by all Connector Scan execution nodes. This metric applies only to shared-data instances.
RemoteReadBytes
The total size of compressed data read from OSS by all Connector Scan execution nodes. This metric applies only to shared-data instances.
ResultRows
The total number of records output by all SCAN execution nodes.
ResultBytes
The total amount of data read by all SCAN execution nodes.
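For shared-data instances, one derived figure that may be useful is the share of bytes served from the local cache. This ratio is not a metric shown in the profile. The sketch assumes that LocalDiskReadBytes and RemoteReadBytes together account for all bytes read by the Connector Scan nodes, and the function name is hypothetical.

```python
def local_read_ratio(local_disk_read_bytes: int, remote_read_bytes: int) -> float:
    """Fraction of compressed bytes read from the local cache.

    Assumption: local and remote reads together cover all bytes read.
    """
    total = local_disk_read_bytes + remote_read_bytes
    if total == 0:
        return 0.0  # nothing was read, so there is no ratio to report
    return local_disk_read_bytes / total


if __name__ == "__main__":
    # Illustrative numbers: 750 bytes from the local cache, 250 bytes from OSS.
    print(local_read_ratio(750, 250))
```

A low ratio on a repeated query suggests that most data still comes from remote storage rather than the local cache.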
Network
Metric
Description
Bytes sent over network
The total number of bytes transferred by all Exchange execution nodes. This corresponds to the BytesSent metric.
Operator
Click an operator to view the following information on the right side of the page:
CoreMetrics tab: Displays the core metrics for the operator.
NodeMetrics tab: Displays all metrics for the operator.
Pipeline tab: Displays metrics for the pipeline that contains the operator. These metrics are related only to scheduling and do not require close attention.
The greater the proportion of time an operator spends, the darker its color. The top three nodes with the longest execution times are color-coded. This helps you easily identify query bottlenecks.
Scroll the mouse wheel or click the zoom buttons to zoom in and out of the profile tree.
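The idea behind highlighting the slowest nodes can be reproduced outside the console. The following is a rough sketch of ranking operators by their share of total time; it is not how StarRocks Manager computes its colors, and the operator names and timings are made-up inputs.

```python
from typing import Dict, List, Tuple


def top_operators(operator_times_ms: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n slowest operators with their percentage of total time."""
    total = sum(operator_times_ms.values())
    if total <= 0:
        return []
    ranked = sorted(operator_times_ms.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(100.0 * time_ms / total, 1)) for name, time_ms in ranked[:n]]


if __name__ == "__main__":
    # Hypothetical per-operator execution times in milliseconds.
    sample = {"HASH_JOIN": 600, "OLAP_SCAN": 250, "AGGREGATE": 100, "EXCHANGE": 50}
    for name, percent in top_operators(sample):
        print(f"{name}: {percent}%")
```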
Key metrics
Query level
Summary metrics
Metric
Description
Total
The query's total elapsed time, including time spent in the Planning, Executing, and Profiling phases.
QueryCpuCost
The cumulative CPU time consumed by the query. This value is the sum of CPU time from all concurrent processes, so it can exceed the actual execution time.
QueryMemCost
The total memory consumed by the query.
Variables
Variables associated with the query.
Pipeline level
Metric
Description
ActiveTime
The driver execution time.
DriverTotalTime
The total time consumed by the driver.
PendingTime
The time a driver waits for its inputs and prerequisites.
Operator level
General operator metrics
Metric
Description
OperatorTotalTime
Total execution time of the operator.
PushRowNum
Total rows output by the operator.
PullRowNum
Total rows input to the operator.
PullChunkNum
Total chunks input to the operator.
PushChunkNum
Total chunks output by the operator.
PeakMemoryUsage
The operator's peak memory usage.
OLAP Scan Operator
Metric
Description
Table
The table name.
ScanTime
Total time spent on scan operations, which are performed in an asynchronous I/O thread pool.
TabletCount
Number of tablets scanned.
PushdownPredicates
The number of pushed-down predicates.
BytesRead
The amount of data read.
CompressedBytesRead
The amount of compressed data read.
IOTime
Total I/O time.
BitmapIndexFilterRows
The number of rows filtered by bitmap indexes.
BloomFilterFilterRows
The number of rows filtered by the Bloom filter.
SegmentRuntimeZoneMapFilterRows
The number of rows filtered by the runtime zone map.
SegmentZoneMapFilterRows
The number of rows filtered by the zone map.
ShortKeyFilterRows
The number of rows filtered by the short key.
ZoneMapIndexFilterRows
The number of rows filtered by the zone map index.
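When reading the filter metrics above, it can help to see which index removed the most rows. The sketch below assumes the metric values have already been copied out of the profile as plain integers; the `filter_breakdown` helper and the sample values are hypothetical.

```python
from typing import Dict, List, Tuple

# Metric names taken from the OLAP Scan Operator table.
FILTER_METRICS = (
    "BitmapIndexFilterRows",
    "BloomFilterFilterRows",
    "SegmentRuntimeZoneMapFilterRows",
    "SegmentZoneMapFilterRows",
    "ShortKeyFilterRows",
    "ZoneMapIndexFilterRows",
)


def filter_breakdown(metrics: Dict[str, int]) -> List[Tuple[str, int]]:
    """List the filters that removed rows, most effective first."""
    rows = [(name, int(metrics.get(name, 0))) for name in FILTER_METRICS]
    effective = [row for row in rows if row[1] > 0]
    return sorted(effective, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical values for a single scan operator.
    sample = {
        "ZoneMapIndexFilterRows": 900000,
        "ShortKeyFilterRows": 50000,
        "BloomFilterFilterRows": 0,
    }
    for name, filtered in filter_breakdown(sample):
        print(name, filtered)
```

An empty result means no index pruned any rows, which is worth checking against the predicates reported in PushdownPredicates.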
Connector Scan Operator
These metrics are reported in addition to those of the OLAP Scan Operator and apply only to shared-data instances, which separate compute from storage.
Metric
Description
CompressedBytesReadLocalDisk
The amount of compressed data read from the local cache on the compute node.
CompressedBytesReadRemote
The total amount of compressed data read from OSS.
IOTimeLocalDisk
The time spent on I/O operations when reading data from the local cache.
IOTimeRemote
The time spent on I/O operations when reading data from OSS.
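Dividing the bytes read by the matching I/O time gives a rough way to compare local cache reads with remote reads. This is a sketch under two assumptions: the inputs have been converted to bytes and seconds, and the I/O time is cumulative, so the result is an average per unit of I/O time and not the aggregate bandwidth of the instance. The function name and sample numbers are illustrative.

```python
def read_throughput_mib_s(compressed_bytes: int, io_time_s: float) -> float:
    """Average read throughput in MiB per second of I/O time."""
    if io_time_s <= 0:
        return 0.0  # no I/O time recorded, so no throughput to report
    return compressed_bytes / io_time_s / (1024 * 1024)


if __name__ == "__main__":
    # Illustrative pairs: (CompressedBytesReadLocalDisk, IOTimeLocalDisk)
    # and (CompressedBytesReadRemote, IOTimeRemote).
    local = read_throughput_mib_s(512 * 1024 * 1024, 2.0)
    remote = read_throughput_mib_s(128 * 1024 * 1024, 4.0)
    print(f"local: {local} MiB/s, remote: {remote} MiB/s")
```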
Exchange Operator
Sink
Metric
Description
PartType
The data distribution mode. Valid values: UNPARTITIONED, RANDOM, HASH_PARTITIONED, and BUCKET_SHUFFLE_HASH_PARTITIONED.
BytesSent
The amount of data sent.
OverallThroughput
The overall data transfer throughput.
NetworkTime
The time spent transmitting packets, excluding post-reception processing time.
WaitTime
The time spent waiting because the sender-side queue is full.
NetworkBandwidth
The calculated network bandwidth during data transfer.
Source
Metric
Description
SenderWaitLockTime
Total time the sender waited for a lock.
BytesReceived
The amount of data received.
DecompressChunkTime
Total time spent decompressing data chunks.
DeserializeChunkTime
Total time spent deserializing data chunks.
SenderTotalTime
The total time spent by the sender.
Aggregate Operator
Metric
Description
GroupingKeys
Columns used in the GROUP BY clause.
AggregateFunctions
The aggregate functions applied.
AggComputeTime
The time spent computing aggregate functions.
ExprComputeTime
The time spent computing expressions.
HashTableSize
Size of the hash table used for aggregation.
Join Operator
Probe
Metric
Description
DistributionMode
The data distribution mode.
JoinType
The join type.
OtherJoinConjunctEvaluateTime
The time spent evaluating other join conjuncts.
ProbeConjunctEvaluateTime
The time spent evaluating probe conjuncts.
SearchHashTableTime
The time spent searching the hash table.
WhereConjunctEvaluateTime
The time spent evaluating WHERE conjuncts.
Build
Metric
Description
JoinPredicates
Predicates used in the join condition.
JoinType
The join type.
BuildBuckets
Number of buckets in the join's hash table.
BuildHashTableTime
Time spent building the hash table for the join.
RuntimeFilterBuildTime
The time spent building the runtime filter.
RuntimeFilterNum
The number of runtime filters.
DistributionMode
The data distribution mode.
Window Function Operator
Metric
Description
ComputeTime
The time spent computing window functions.
PartitionKeys
Keys used to partition the window.
AggregateFunctions
The aggregate functions applied.
Sort Operator
Metric
Description
SortKeys
The sort keys.
SortType
The sorting method used, such as full sort or top-N sort.
MergingTime
Time spent merging sorted runs of data.
SortingTime
The time spent sorting data.
TableFunction Operator
Metric
Description
TableFunctionExecTime
The time spent computing the table function.
TableFunctionExecCount
The number of times the table function was executed.
Project Operator
Metric
Description
ExprComputeTime
The time spent computing expressions.
CommonSubExprComputeTime
The time spent computing common subexpressions.
LocalExchange Operator
Metric
Description
Type
The type of local exchange. Valid values: Passthrough, Partition, and Broadcast.
ShuffleNum
The number of shuffles. This metric applies only when Type is Partition.