E-MapReduce: Query profile

Last Updated: Mar 18, 2025

A query profile records the execution information of all nodes involved in a query. You can use query profiles to perform visualized analysis and quickly identify bottlenecks that affect the query performance of a StarRocks instance.

Enable the query profile feature

To enable the query profile feature for a StarRocks instance, run the following command to set the enable_profile parameter to true.

SET enable_profile = true;
Note
  • Enabling the query profile feature globally can affect query performance on earlier kernel versions. Globally enable the feature for a StarRocks instance only if the kernel version of the instance is 2.5.13 or later, or 3.1.5 or later.

  • If the kernel version of a StarRocks instance is earlier than these versions, we recommend that you upgrade the kernel to a required version before you enable the query profile feature. Otherwise, query performance may be affected.
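
The version requirement in the note can be checked before the feature is enabled globally. The following is a minimal sketch, not part of the product: the function names are hypothetical, the version string is assumed to be purely numeric in the form major.minor.patch, and the rule is read as "2.5.x needs 2.5.13 or later, every other release line needs 3.1.5 or later", which is one possible interpretation of the note.

```python
def parse_version(version: str) -> tuple:
    """Turn a string such as '3.1.5' into a comparable tuple (3, 1, 5)."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return (major, minor, patch)


def supports_global_profile(version: str) -> bool:
    """Return True if the kernel version meets the requirement in the note.

    Assumption: versions on the 2.5 line qualify from 2.5.13 onward,
    and all other versions qualify only from 3.1.5 onward.
    """
    parsed = parse_version(version)
    if parsed[:2] == (2, 5):
        return parsed >= (2, 5, 13)
    return parsed >= (3, 1, 5)


if __name__ == "__main__":
    for candidate in ("2.5.12", "2.5.13", "3.1.4", "3.1.5"):
        print(candidate, supports_global_profile(candidate))
```

If the check returns False, the note above suggests upgrading the kernel first.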

Query profile structure

A query profile consists of the following parts:

  • Fragment: A query consists of one or more fragments.

  • Fragment instance: A fragment can contain multiple fragment instances. If a fragment contains multiple fragment instances, the fragment instances are executed by different compute nodes.

  • Pipeline: A pipeline consists of a group of operators. A fragment instance is split into multiple pipelines.

  • Pipeline driver: A pipeline can contain multiple pipeline drivers. The pipeline drivers of a pipeline run on different cores to improve core utilization.

  • Operator: A pipeline driver consists of multiple operators.
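
The hierarchy above can be pictured as nested containers. The sketch below is only an illustration of that nesting: the class names, field names, and the operator names in the sample are hypothetical and do not reflect how StarRocks stores a profile internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Operator:
    name: str


@dataclass
class PipelineDriver:
    driver_id: int
    operators: List[Operator] = field(default_factory=list)


@dataclass
class Pipeline:
    pipeline_id: int
    drivers: List[PipelineDriver] = field(default_factory=list)


@dataclass
class FragmentInstance:
    instance_id: str
    pipelines: List[Pipeline] = field(default_factory=list)


@dataclass
class Fragment:
    fragment_id: int
    instances: List[FragmentInstance] = field(default_factory=list)


@dataclass
class QueryProfile:
    fragments: List[Fragment] = field(default_factory=list)


def count_operators(profile: QueryProfile) -> int:
    """Count operator entries across every driver of every pipeline."""
    return sum(
        len(driver.operators)
        for fragment in profile.fragments
        for instance in fragment.instances
        for pipeline in instance.pipelines
        for driver in pipeline.drivers
    )


def build_example() -> QueryProfile:
    """One fragment, two instances, one pipeline each, two drivers per pipeline."""
    def make_driver(driver_id: int) -> PipelineDriver:
        # Placeholder operator names, chosen only for the example.
        return PipelineDriver(driver_id, [Operator("SCAN"), Operator("SINK")])

    instances = [
        FragmentInstance(node, [Pipeline(0, [make_driver(0), make_driver(1)])])
        for node in ("node-a", "node-b")
    ]
    return QueryProfile([Fragment(0, instances)])


if __name__ == "__main__":
    print(count_operators(build_example()))  # 2 instances x 2 drivers x 2 operators = 8
```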

Visualized query profiles

E-MapReduce (EMR) StarRocks Manager provides visualized query profiles. A query profile is presented as a tree structure. Tree nodes represent aggregated operators. On the Execution Details tab of a query, you can click a tree node to view the details of the corresponding operator in the right-side area of the tab. If you do not click a node, the overview information of the query is displayed in the Overview section of the Execution Details tab.

Execution overview

In the Overview section, you can view the metrics related to execution time, I/O performance, and amount of transmitted data.

  • Execution time

    • I/O: The total I/O time consumed by all SCAN nodes.

    • LocalDiskReadIOTime: The I/O time consumed to read data from the local cache. This metric is available only for StarRocks shared-data instances.

    • RemoteReadIOTime: The I/O time consumed to read data from Object Storage Service (OSS). This metric is available only for StarRocks shared-data instances.

    • IoSeekTime: The total I/O seek time. This metric is available only for StarRocks shared-data instances.

    • Processing: The time consumed by an operator to perform computing operations.

    • ExecutionWallTime: The time consumed in the query execution phase.

  • I/O

    • RawRowsRead: The total number of rows that are scanned by all SCAN nodes.

    • DiskReadBytes: The total amount of compressed data that is read by all SCAN nodes.

    • LocalDiskReadBytes: The total amount of compressed data that is read from the local cache by all CONNECTOR_SCAN nodes. This metric is available only for StarRocks shared-data instances.

    • RemoteReadBytes: The total amount of compressed data that is read from OSS by all CONNECTOR_SCAN nodes. This metric is available only for StarRocks shared-data instances.

    • ResultRows: The total number of output records of all SCAN nodes.

    • ResultBytes: The total amount of data that is read by all SCAN nodes.

  • Network

    • Bytes sent over network: The total number of bytes transmitted by all Exchange nodes. This metric corresponds to the BytesSent metric.
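
For a shared-data instance, the LocalDiskReadBytes and RemoteReadBytes values in the I/O metrics can be combined into a rough share of data served from the local cache. The ratio below is a derived figure for illustration only; it is not a metric that the Overview section displays, and the function name is hypothetical.

```python
def local_cache_read_ratio(local_disk_read_bytes: int, remote_read_bytes: int) -> float:
    """Fraction of compressed bytes read from the local cache rather than OSS.

    Returns 0.0 when nothing was read, to avoid dividing by zero.
    """
    total = local_disk_read_bytes + remote_read_bytes
    if total == 0:
        return 0.0
    return local_disk_read_bytes / total


if __name__ == "__main__":
    # Sample values chosen for the example, not taken from a real profile.
    print(local_cache_read_ratio(750, 250))  # 0.75
```

A low ratio suggests that most data is read from OSS, which typically corresponds to a larger RemoteReadIOTime.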

Operator

You can click an operator in the query profile to view the details of the operator on the following tabs:

  • CoreMetrics: This tab displays the core metrics of the operator.

  • NodeMetrics: This tab displays all metrics of the operator.

  • Pipeline: This tab displays the metrics of the pipeline to which the operator belongs. The metrics on this tab are related only to scheduling. In most cases, you can ignore these metrics.

Note
  • The system highlights the top three time-consuming nodes in the profile tree by using different colors. The color of the bar for an execution time-related metric darkens as the time consumed by the operator increases. This helps you easily identify the bottlenecks in your query.

  • You can scroll the mouse wheel or click the Zoom In or Zoom Out icon to zoom in or out on the profile tree.
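
The "top three time-consuming nodes" idea from the note can be reproduced on exported numbers. This is a sketch under assumptions: the page does not state which metric the highlighting uses, so the example ranks nodes by a per-node time in milliseconds, and the node names and values are invented sample data.

```python
import heapq
from typing import Dict, List, Tuple


def top_time_consuming(node_times_ms: Dict[str, float], n: int = 3) -> List[Tuple[str, float]]:
    """Return the n nodes with the largest time, slowest first."""
    return heapq.nlargest(n, node_times_ms.items(), key=lambda item: item[1])


if __name__ == "__main__":
    sample = {
        "OLAP_SCAN": 120.0,
        "HASH_JOIN": 340.0,
        "AGGREGATE": 95.0,
        "EXCHANGE": 15.0,
        "PROJECT": 3.0,
    }
    for name, time_ms in top_time_consuming(sample):
        print(f"{name}: {time_ms} ms")
```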

Important metrics

Query-related metrics

  • Summary metrics

    • Total: The time consumed by a query, including the time consumed in the planning, execution, and profiling phases.

    • QueryCpuCost: The total CPU time consumed by a query, summed across all concurrent processes. Because of this aggregation, the value can be greater than the actual execution time of the query.

    • QueryMemCost: The total amount of memory consumed by a query.

    • Variables: The queried variables.

  • Pipeline-related metrics

    • ActiveTime: The execution time of a driver.

    • DriverTotalTime: The total time consumed by a driver.

    • PendingTime: The time that a driver spends waiting because its input or another prerequisite is not ready.
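
One way to read the pipeline-related metrics together is to express PendingTime as a share of DriverTotalTime. The sketch below assumes that PendingTime is counted inside DriverTotalTime, which this page does not state explicitly; the function name and the sample values are hypothetical.

```python
def pending_share(driver_total_time_ms: float, pending_time_ms: float) -> float:
    """Share of a driver's total time that was spent waiting.

    Assumption: PendingTime is included in DriverTotalTime.
    Returns 0.0 for a non-positive total to avoid dividing by zero.
    """
    if driver_total_time_ms <= 0:
        return 0.0
    return pending_time_ms / driver_total_time_ms


if __name__ == "__main__":
    # A driver that ran for 200 ms in total and waited for 50 ms of that.
    print(pending_share(200.0, 50.0))  # 0.25
```

A high share indicates that the driver mostly waited for input or prerequisites, so the bottleneck is more likely upstream than in the operators of this driver.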

Operator-related metrics

  • General-purpose operator metrics and chunk-related metrics

    • OperatorTotalTime: The total time consumed by an operator.

    • PushRowNum: The total number of input rows pushed to an operator.

    • PullRowNum: The total number of output rows pulled from an operator.

    • PullChunkNum: The total number of output chunks pulled from an operator.

    • PushChunkNum: The total number of input chunks pushed to an operator.

    • PeakMemoryUsage: The maximum amount of memory used by a query.

  • Metrics of OLAP scan operators

    • Table: The name of the table.

    • ScanTime: The total scan time. Scan operations are performed in an asynchronous I/O thread pool.

    • TabletCount: The number of tablets.

    • PushdownPredicates: The number of predicates that are pushed down.

    • BytesRead: The number of bytes read by the query.

    • CompressedBytesRead: The size of compressed data that is read.

    • IOTime: The total I/O time.

    • BitmapIndexFilterRows: The number of data rows that are filtered by using a bitmap index.

    • BloomFilterFilterRows: The number of data rows that are filtered by using a bloom filter.

    • SegmentRuntimeZoneMapFilterRows: The number of data rows that are filtered by using a runtime zone map.

    • SegmentZoneMapFilterRows: The number of data rows that are filtered by using a zone map.

    • ShortKeyFilterRows: The number of data rows that are filtered by using a short key.

    • ZoneMapIndexFilterRows: The number of data rows that are filtered by using a zone map index.

  • Metrics of connector scan operators

    The following metrics are available only for StarRocks shared-data instances.

    • CompressedBytesReadLocalDisk: The amount of compressed data that is read from the local cache of a compute node.

    • CompressedBytesReadRemote: The total amount of compressed data that is read from OSS.

    • IOTimeLocalDisk: The I/O time consumed to read data from the local cache.

    • IOTimeRemote: The I/O time consumed to read data from OSS.

  • Metrics of exchange operators

    • Sink

      • PartType: The distribution mode of data. Valid values: UNPARTITIONED, RANDOM, HASH_PARTITIONED, and BUCKET_SHUFFLE_HASH_PARTITIONED.

      • BytesSent: The size of data that is sent.

      • OverallThroughput: The throughput.

      • NetworkTime: The time consumed to transmit a data packet. The processing time for the data packet is not included.

      • WaitTime: The waiting time due to a full queue of the sender.

      • NetworkBandwidth: The network bandwidth.

    • Source

      • SenderWaitLockTime: The time consumed to wait for a lock.

      • BytesReceived: The size of received data.

      • DecompressChunkTime: The time consumed to decompress data.

      • DeserializeChunkTime: The time consumed to deserialize data.

      • SenderTotalTime: The total time consumed to send data.

  • Metrics of aggregate operators

    • GroupingKeys: The GROUP BY columns.

    • AggregateFunctions: The aggregate function.

    • AggComputeTime: The time consumed by the calculation performed based on an aggregate function.

    • ExprComputeTime: The time consumed by the calculation performed based on an expression.

    • HashTableSize: The size of a hash table.

  • Metrics of join operators

    • Probe

      • DistributionMode: The data distribution mode.

      • JoinType: The join type.

      • OtherJoinConjunctEvaluateTime: The time consumed to evaluate other join conjuncts.

      • ProbeConjunctEvaluateTime: The time consumed by conjunct evaluation in the probe phase.

      • SearchHashTableTime: The time consumed to search a hash table.

      • WhereConjunctEvaluateTime: The time consumed to evaluate WHERE conjuncts.

    • Build

      • JoinPredicates: The join predicate.

      • JoinType: The join type.

      • BuildBuckets: The number of buckets in a hash table.

      • BuildHashTableTime: The time consumed to create a hash table.

      • RuntimeFilterBuildTime: The time consumed to create a runtime filter.

      • RuntimeFilterNum: The number of runtime filters.

      • DistributionMode: The data distribution mode.

  • Metrics of window function operators

    • ComputeTime: The time consumed by the calculation performed based on a window function.

    • PartitionKeys: The partition key column.

    • AggregateFunctions: The aggregate function.

  • Metrics of sort operators

    • SortKeys: The sort key.

    • SortType: The sort mode of the query results: a Top N sort or a full sort of all results.

    • MergingTime: The time consumed to merge data.

    • SortingTime: The time consumed to sort data.

  • Metrics of TableFunction operators

    • TableFunctionExecTime: The time consumed by the calculation performed based on a table function.

    • TableFunctionExecCount: The number of times that a table function is executed.

  • Metrics of project operators

    • ExprComputeTime: The time consumed by the calculation performed based on an expression.

    • CommonSubExprComputeTime: The time consumed by the calculation performed based on a common sub-expression.

  • Metrics of Local Exchange operators

    • Type: The type of Local Exchange. Valid values: Passthrough, Partition, and Broadcast.

    • ShuffleNum: The amount of data that is shuffled. This metric is valid only if the value of the Type metric is Partition.
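
As an illustration of working with the operator metrics listed above, the filter counters of an OLAP scan operator can be ranked to see which index removed the most rows. This is a minimal sketch: the metric names come from the OLAP scan metrics on this page, while the function name, the dictionary input format, and the sample values are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Row-filter counters of OLAP scan operators, as named on this page.
FILTER_METRICS = (
    "BitmapIndexFilterRows",
    "BloomFilterFilterRows",
    "SegmentRuntimeZoneMapFilterRows",
    "SegmentZoneMapFilterRows",
    "ShortKeyFilterRows",
    "ZoneMapIndexFilterRows",
)


def filter_breakdown(metrics: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return the filter counters that removed rows, largest first.

    Counters that are missing or zero are left out of the result.
    """
    rows = [(name, metrics.get(name, 0)) for name in FILTER_METRICS]
    effective = [entry for entry in rows if entry[1] > 0]
    return sorted(effective, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    # Sample counters invented for the example.
    sample = {
        "ZoneMapIndexFilterRows": 5000,
        "BloomFilterFilterRows": 120,
        "ShortKeyFilterRows": 0,
    }
    for name, filtered_rows in filter_breakdown(sample):
        print(f"{name}: {filtered_rows}")
```

An empty result means that none of the listed indexes filtered rows for the scan.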