ApsaraDB for SelectDB: Query profiles

Last Updated: Jun 18, 2025

You can analyze query profiles to help improve query performance. This topic describes how to obtain the query profiles of ApsaraDB for SelectDB.

Enable the query profile feature

Set the enable_profile variable to true to enable the query profile feature. For more information, see Variable management.

SET enable_profile=true;
Note
  • After you enable the query profile feature, SelectDB generates a profile for the execution of each query.

  • A query profile contains the execution details of a query on each node. This helps you analyze query performance bottlenecks.
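
To verify that the variable takes effect in the current session, you can query it. The following statement is a minimal check; the exact output format depends on your kernel version:

SHOW VARIABLES LIKE 'enable_profile';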

View query profiles

Execute the following statement to view all query profiles that are stored:

Important

The following statement is supported only for SelectDB instances whose kernel versions are earlier than SelectDB Core V4.0.0. If your instance runs SelectDB Core V4.0.0 or later, we recommend that you preview or download query profiles in the SelectDB console. You can also send HTTP requests to obtain the query profiles. For more information, see Use the query audit feature and Export query profiles.

SHOW QUERY PROFILE "/"\G

In the returned result set, each row of data corresponds to a query profile. You can view the details of a query profile based on the profile ID.

Note

In HTTP requests that are used to obtain query profiles, fields such as Query ID and query_id indicate the profile ID.

SHOW QUERY PROFILE "/"\G
*************************** 1. row ***************************
   Profile ID: c257c52f93e149ee-ace8ac14e8c9fef9
    Task Type: QUERY
   Start Time: 2021-04-08 11:30:50
     End Time: 2021-04-08 11:30:50
        Total: 9ms
   Task State: EOF
         User: root
   Default Db: default_cluster:db1
Sql Statement: select tbl1.k1, sum(tbl1.k2) from tbl1 join tbl2 on tbl1.k1 = tbl2.k1 group by tbl1.k1 order by tbl1.k1

Use query profiles

Perform the following steps to troubleshoot the performance bottlenecks of an SQL query statement:

  1. View the execution plan tree.

    This step provides an overall view of the execution plan and shows the execution time of each fragment. In the returned result, each node is marked with the fragment to which it belongs, and the sender node of each fragment is marked with the execution time of that fragment. This time is the longest execution time among all execution units of the fragment. The execution plan tree helps you find the most time-consuming fragment. The following sample code shows a sample execution plan tree:

    SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9"\G
    *************************** 1. row ***************************
    Fragments:
                 ┌───────────────────────┐
                 │[-1: DataBufferSender] │
                 │Fragment: 0            │
                 │MaxActiveTime: 6.626 ms│
                 └───────────────────────┘
                             │
                   ┌──────────────────┐
                   │[9: EXCHANGE_NODE]│
                   │Fragment: 0       │
                   └──────────────────┘
                             │
                 ┌───────────────────────┐
                 │[9: DataStreamSender]  │
                 │Fragment: 1            │
                 │MaxActiveTime: 5.449 ms│
                 └───────────────────────┘
                             │
                     ┌──────────────┐
                     │[4: SORT_NODE]│
                     │Fragment: 1   │
                     └──────────────┘
                            ┌┘
                 ┌─────────────────────┐
                 │[8: AGGREGATION_NODE]│
                 │Fragment: 1          │
                 └─────────────────────┘
                            └┐
                   ┌──────────────────┐
                   │[7: EXCHANGE_NODE]│
                   │Fragment: 1       │
                   └──────────────────┘
                             │
                 ┌───────────────────────┐
                 │[7: DataStreamSender]  │
                 │Fragment: 2            │
                 │MaxActiveTime: 3.505 ms│
                 └───────────────────────┘
                            ┌┘
                 ┌─────────────────────┐
                 │[3: AGGREGATION_NODE]│
                 │Fragment: 2          │
                 └─────────────────────┘
                            │
                  ┌───────────────────┐
                  │[2: HASH_JOIN_NODE]│
                  │Fragment: 2        │
                  └───────────────────┘
               ┌────────────┴────────────┐
     ┌──────────────────┐      ┌──────────────────┐
     │[5: EXCHANGE_NODE]│      │[6: EXCHANGE_NODE]│
     │Fragment: 2       │      │Fragment: 2       │
     └──────────────────┘      └──────────────────┘
               │                         │
    ┌──────────────────────┐ ┌─────────────────────────┐
    │[5: DataStreamSender] │ │[6: DataStreamSender]    │
    │Fragment: 4           │ │Fragment: 3              │
    │MaxActiveTime: 1.87 ms│ │MaxActiveTime: 636.767 us│
    └──────────────────────┘ └─────────────────────────┘
               │                        ┌┘
     ┌───────────────────┐    ┌───────────────────┐
     │[0: OLAP_SCAN_NODE]│    │[1: OLAP_SCAN_NODE]│
     │Fragment: 4        │    │Fragment: 3        │
      └───────────────────┘    └───────────────────┘
               │                        │
        ┌─────────────┐          ┌─────────────┐
        │[OlapScanner]│          │[OlapScanner]│
        │Fragment: 4  │          │Fragment: 3  │
        └─────────────┘          └─────────────┘
               │                        │
      ┌─────────────────┐      ┌─────────────────┐
      │[SegmentIterator]│      │[SegmentIterator]│
      │Fragment: 4      │      │Fragment: 3      │
      └─────────────────┘      └─────────────────┘
     
    1 row in set (0.02 sec)
  2. View the execution units in a specific fragment.

    In the preceding example, Fragment 1 is the most time-consuming fragment. Therefore, you can view the execution units in Fragment 1. The following sample code shows how to query the execution nodes and execution time of all execution units in Fragment 1:

    SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9/1";
    +-----------------------------------+-------------------+------------+
    | Instances                         | Host              | ActiveTime |
    +-----------------------------------+-------------------+------------+
    | c257c52f93e149ee-ace8ac14e8c9ff03 | 10.200.00.01:9060 | 5.449ms    |
    | c257c52f93e149ee-ace8ac14e8c9ff05 | 10.200.00.02:9060 | 5.367ms    |
    | c257c52f93e149ee-ace8ac14e8c9ff04 | 10.200.00.03:9060 | 5.358ms    |
    +-----------------------------------+-------------------+------------+ 
  3. View a specific execution unit.

    You can view the profile of each operator on a specific execution unit. The following sample code shows how to query the profile of each operator on the execution unit c257c52f93e149ee-ace8ac14e8c9ff03 in Fragment 1:

    SHOW QUERY PROFILE "/c257c52f93e149ee-ace8ac14e8c9fef9/1/c257c52f93e149ee-ace8ac14e8c9ff03"\G
    *************************** 1. row ***************************
    Instance:
     ┌───────────────────────────────────────┐
     │[9: DataStreamSender]                  │
     │(Active: 37.222 us, non-child: 0.40)   │
     │  - Counters:                          │
     │      - BytesSent: 0.00                │
     │      - IgnoreRows: 0                  │
     │      - OverallThroughput: 0.0 /sec    │
     │      - PeakMemoryUsage: 8.00 KB       │
     │      - SerializeBatchTime: 0ns        │
     │      - UncompressedRowBatchSize: 0.00 │
     └───────────────────────────────────────┘
                         └┐
                          │
        ┌───────────────────────────────────┐
        │[4: SORT_NODE]                     │
        │(Active: 5.421 ms, non-child: 0.71)│
        │  - Counters:                      │
        │      - PeakMemoryUsage: 12.00 KB  │
        │      - RowsReturned: 0            │
        │      - RowsReturnedRate: 0        │
        └───────────────────────────────────┘
                         ┌┘
                         │
       ┌────────────────────────────────────┐
       │[8: AGGREGATION_NODE]               │
       │(Active: 5.355 ms, non-child: 10.68)│
       │  - Counters:                       │
       │      - BuildTime: 3.701us          │
       │      - GetResultsTime: 0 ns        │
       │      - HTResize: 0                 │
       │      - HTResizeTime: 1.211 us      │
       │      - HashBuckets: 0              │
       │      - HashCollisions: 0           │
       │      - HashFailedProbe: 0          │
       │      - HashFilledBuckets: 0        │
       │      - HashProbe: 0                │
       │      - HashTravelLength: 0         │
       │      - LargestPartitionPercent: 0  │
       │      - MaxPartitionLevel: 0        │
       │      - NumRepartitions: 0          │
       │      - PartitionsCreated: 16       │
       │      - PeakMemoryUsage: 34.02 MB   │
       │      - RowsProcessed: 0            │
       │      - RowsRepartitioned: 0        │
       │      - RowsReturned: 0             │
       │      - RowsReturnedRate: 0         │
       │      - SpilledPartitions: 0        │
       └────────────────────────────────────┘
                         └┐
                          │
    ┌───────────────────────────────────────────┐
    │[7: EXCHANGE_NODE]                         │
    │(Active: 4.360 ms, non-child: 46.84)       │
    │  - Counters:                              │
    │      - BytesReceived: 0.00                │
    │      - ConvertRowBatchTime: 387 ns        │
    │      - DataArrivalWaitTime: 4.357 ms      │
    │      - DeserializeRowBatchTimer: 0 ns     │
    │      - FirstBatchArrivalWaitTime: 4.356 ms│
    │      - PeakMemoryUsage: 0.00              │
    │      - RowsReturned: 0                    │
    │      - RowsReturnedRate: 0                │
    │      - SendersBlockedTotalTimer(*): 0 ns  │
    └───────────────────────────────────────────┘

Export query profiles

Perform the following steps to export query profiles for subsequent analysis:

Important

Only SelectDB instances that run SelectDB Core V3.0.4 or later support this feature.

  1. Enable the query profile feature.

    SET enable_profile=true;
  2. Execute a query statement.

    The following code shows a sample query statement. Replace it with the SQL statement that you want to analyze.

    -- Send an SQL request. In this case, the profile of the query statement is generated.
    SELECT count(1) FROM test_table LIMIT 10;
  3. Query profile IDs.

    SelectDB Core V3.0.x

    The following sample code shows how to query the profile IDs of a SelectDB instance that runs SelectDB Core V3.0.x:

    SHOW QUERY PROFILE "/";
    +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+
    | Profile ID                        | Task Type | Start Time          | End Time            | Total | Task State | User  | Default Db                         | Sql Statement                 |
    +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+
    | b9c9ba063d9d4365-97878361371e757a | QUERY     | 2024-02-07 17:40:04 | 2024-02-07 17:40:04 | 32ms  | EOF        | admin | default_cluster:information_schema | select * from user_privileges |
    | 8800c306137e4072-9bb1ed419f4ac9f2 | QUERY     | 2024-02-07 17:40:00 | 2024-02-07 17:40:00 | 3ms   | ERR        | admin | default_cluster:information_schema | select * from user_priveleges |
    | e2efdd1a996c4de2-ab939ad49b409990 | QUERY     | 2024-02-07 17:39:51 | 2024-02-07 17:39:51 | 13ms  | EOF        | admin |                                    | SELECT DATABASE()             |
    +-----------------------------------+-----------+---------------------+---------------------+-------+------------+-------+------------------------------------+-------------------------------+

    SelectDB Core V4.0.x

    The following sample code shows how to query the profile IDs of an ApsaraDB for SelectDB instance that runs SelectDB Core V4.0.x. The Query ID field in the response indicates the profile ID.

    curl -u'<userName>:<userPassword>' "http://<selectdbAddress>:<httpPort>/rest/v2/manager/query/query_info?is_all_node=true" 
    
    # Response
    {
        "msg": "success",
        "code": 0,
        "data": {
            "column_names": [
                "Query ID",
                "FE Node",
                "Query User",
                "Execution Database",
                "Sql",
                "Query Type",
                "Start Time",
                "End Time",
                "Execution Duration",
                "Status"
            ],
            "rows": [
                [
                    ...
                ]
            ]
        },
        "count": 0
    }
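
    If the jq tool is installed on your client, you can extract the profile IDs from this response. The following command is a sketch that assumes the Query ID is the first element of each row, as indicated by the column_names array in the preceding response:

    curl -s -u'<userName>:<userPassword>' "http://<selectdbAddress>:<httpPort>/rest/v2/manager/query/query_info?is_all_node=true" | jq -r '.data.rows[][0]'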
  4. Run a curl command to export the profile based on the profile ID:

    curl -u'<userName>:<userPassword>' "http://<selectdbAddress>:<httpPort>/api/profile?query_id=<query_id>" 
    
    # You can use redirection to export the profile to a file.
    curl -u'<userName>:<userPassword>' "http://<selectdbAddress>:<httpPort>/api/profile?query_id=<query_id>" > res.txt

    Parameters

    • userName: The username that is used to access the SelectDB instance.

    • userPassword: The password that is used to access the SelectDB instance.

    • selectdbAddress: The endpoint of the SelectDB instance.

    • httpPort: The HTTP port number of the SelectDB instance. Default value: 8080.

    • query_id: The query ID, which is the profile ID that you obtained in the previous step.

  5. Optional. If you exported the profile to a file, you can format the content to make it more readable.

    sed -r 's/\\n/\n/g' res.txt | sed -r 's/\\t/\t/g' > new.txt
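
    You can also combine the export and the formatting into a single pipeline. The following command is a sketch that uses the same placeholders as the preceding steps. The file name profile.txt is arbitrary:

    curl -u'<userName>:<userPassword>' "http://<selectdbAddress>:<httpPort>/api/profile?query_id=<query_id>" | sed -r 's/\\n/\n/g; s/\\t/\t/g' > profile.txt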

Profile parameters

The following sections describe the parameters in the collected statistics.

Fragment

  • AverageThreadTokens: The number of threads that are used for fragment execution, excluding the usage of the thread pool.

  • Buffer Pool PeakReservation: The peak memory usage of the buffer pool.

  • MemoryLimit: The memory limit for the query.

  • PeakMemoryUsage: The peak memory usage of an execution unit during the query.

  • RowsProduced: The number of rows that are produced.

BlockMgr

  • BlocksCreated: The number of blocks that are created by BlockManager.

  • BlocksRecycled: The number of blocks that are reused.

  • BytesWritten: The total size of data that is written to the disk.

  • MaxBlockSize: The size of a single block.

  • TotalReadBlockTime: The total time consumed to read blocks.

DataStreamSender

  • BytesSent: The total size of data that is sent. If the data is sent to multiple receivers, this value equals the number of receivers multiplied by the size of the data.

  • IgnoreRows: The number of rows that are filtered out.

  • LocalBytesSent: The size of data that is sent by the local node and received by the same node during data exchange.

  • OverallThroughput: The total throughput, which is calculated by dividing the value of BytesSent by the execution time.

  • SerializeBatchTime: The time consumed to serialize the data that is sent.

  • UncompressedRowBatchSize: The size of the sent RowBatch data before compression.
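
For example, if 1 MB of data is sent to three receivers, the value of BytesSent is 3 MB.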

ODBC_TABLE_SINK

  • NumSentRows: The total number of rows that are written to the external table.

  • TupleConvertTime: The time consumed to serialize the sent data into INSERT statements.

  • ResultSendTime: The time consumed to write data by using the Open Database Connectivity (ODBC) driver.

EXCHANGE_NODE

  • BytesReceived: The size of data that is received over the network.

  • MergeGetNext: If sorting is performed on the child nodes, the exchange node performs a unified merge sort and outputs ordered results. This metric records the total time consumed for merge sorting and includes the time recorded in MergeGetNextBatch.

  • MergeGetNextBatch: The time consumed by the merge node to fetch data. For single-level merge sorting, data is fetched from a network queue. For multi-level merge sorting, data is fetched from a child merger.

  • ChildMergeGetNext: If too many senders are sending data, single-thread merging becomes a performance bottleneck. In this case, ApsaraDB for SelectDB starts multiple child merger threads to perform merge sorting in parallel. This metric records the time consumed by the child mergers for sorting, which is the sum of the time consumed by all threads.

  • ChildMergeGetNextBatch: The time consumed by the child mergers to fetch data. If this time is excessively long, the child data sending nodes may have bottlenecks.

  • DataArrivalWaitTime: The total time spent waiting for the senders to send data.

  • FirstBatchArrivalWaitTime: The time spent waiting for the first batch of data from the senders.

  • DeserializeRowBatchTimer: The time consumed to deserialize network data.

  • SendersBlockedTotalTimer(*): The total time that the senders wait if the memory of the DataStreamRecv queue is fully occupied.

  • ConvertRowBatchTime: The time consumed to convert the received data into RowBatch data.

  • RowsReturned: The number of rows that are received.

  • RowsReturnedRate: The rate at which rows are received.

SORT_NODE

  • InMemorySortTime: The time consumed for in-memory sorting.

  • InitialRunsCreated: The number of sorting runs that are initialized. The value is 1 for pure in-memory sorting.

  • SortDataSize: The total size of data that is sorted.

  • MergeGetNext: The time consumed by MergeSort to obtain the next batch from multiple sorted runs. This metric is timed only when data spills to disk.

  • MergeGetNextBatch: The time consumed by MergeSort to fetch the next batch of a sorted run. This metric is timed only when data spills to disk.

  • TotalMergesPerformed: The number of times that external merging is performed.

AGGREGATION_NODE

  • PartitionsCreated: The number of partitions into which the aggregate query is split.

  • GetResultsTime: The time consumed to obtain aggregate results from partitions.

  • HTResizeTime: The time consumed to resize the hash table.

  • HTResize: The number of times that the hash table is resized.

  • HashBuckets: The number of buckets in the hash table.

  • HashBucketsWithDuplicate: The number of buckets with duplicate nodes in the hash table.

  • HashCollisions: The number of hash collisions that occur in the hash table.

  • HashDuplicateNodes: The number of duplicate nodes that belong to the same bucket in the hash table.

  • HashFailedProbe: The number of failed probe operations that are performed on the hash table.

  • HashFilledBuckets: The number of buckets that are filled with data in the hash table.

  • HashProbe: The number of times that the hash table is queried.

  • HashTravelLength: The number of steps that are moved during hash table queries.

HASH_JOIN_NODE

  • ExecOption: The way in which the hash table is constructed for the right child node, which can be synchronous or asynchronous. The right child node in a join may be a table or a subquery, and the same applies to the left child node.

  • BuildBuckets: The number of buckets in the hash table.

  • BuildRows: The number of rows in the hash table.

  • BuildTime: The time consumed to construct the hash table.

  • LoadFactor: The load factor of the hash table, which indicates the number of non-empty buckets.

  • ProbeRows: The number of rows of the left child node on which hash probe operations are performed.

  • ProbeTime: The time consumed to traverse the left child node for hash probe operations, excluding the time consumed to call GetNext on the RowBatch of the left child node.

  • PushDownComputeTime: The time consumed to compute the predicate pushdown conditions.

  • PushDownTime: The total time consumed for predicate pushdown. When a join query meets the conditions, the query on the right child node is converted into an IN query on the left child node.
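
Conceptually, the pushdown that PushDownTime measures rewrites a join whose right child is small into an IN filter on the left child. The following SQL, which reuses the tbl1 and tbl2 tables from the earlier example, is only an illustration of the rewrite; the conversion happens internally, and the conditions that trigger it depend on the kernel version:

-- Original query: tbl2 is the small right (build) side.
SELECT tbl1.k1 FROM tbl1 JOIN tbl2 ON tbl1.k1 = tbl2.k1;

-- After the pushdown, the left (probe) side is effectively filtered as follows:
SELECT tbl1.k1 FROM tbl1 WHERE tbl1.k1 IN (SELECT k1 FROM tbl2);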

CROSS_JOIN_NODE

  • ExecOption: The way in which the RowBatchList is constructed for the right child node, which can be synchronous or asynchronous.

  • BuildRows: The number of rows in the RowBatchList, which is the number of rows of the right child node.

  • BuildTime: The time consumed to construct the RowBatchList.

  • LeftChildRows: The number of rows of the left child node.

  • LeftChildTime: The time consumed to traverse the left child node and generate the Cartesian product with the right child node, excluding the time consumed to call GetNext on the RowBatch of the left child node.

UNION_NODE

  • MaterializeExprsEvaluateTime: The time consumed to evaluate type conversion expressions and materialize the results when the field types on the two sides of the UNION are inconsistent.

ANALYTIC_EVAL_NODE

  • EvaluationTime: The total time consumed to compute analytic (window) functions.

  • GetNewBlockTime: The time consumed to apply for a new block during initialization. The block is used to cache the rows in the window or the entire partition for analytic function computation.

  • PinTime: The time consumed to apply for a new block or to read a block that was written to disk back into memory.

  • UnpinTime: The time consumed to write the data of a block to disk when the block is not needed for the time being or when the current operator is under high memory pressure.

OLAP_SCAN_NODE

OLAP_SCAN_NODE is responsible for specific data scan tasks. An OLAP_SCAN_NODE node generates one or more OlapScanner instances, and each scanner thread scans part of the data.

Some or all of the predicate conditions in a query are pushed down to OLAP_SCAN_NODE. Some of these predicate conditions are further pushed down to the storage engine so that data can be filtered by using the indexes of the storage engine. The remaining conditions are retained in OLAP_SCAN_NODE and are used to filter the data that is returned from the storage engine.
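
As a hypothetical illustration, consider the following query against a table in which k1 is a key column and v1 is a value column. A simple equality condition on k1 can typically be pushed down to the storage engine and filtered by using its indexes, whereas a condition that applies a function to a column usually stays in OLAP_SCAN_NODE and is evaluated after the scan. The table name example_table is a placeholder:

SELECT k1, v1
FROM example_table
WHERE k1 = 10
  AND length(v1) > 5;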

The profile of OLAP_SCAN_NODE is typically used to analyze the efficiency of data scanning. Based on the call relationships, the profile is divided into three layers: OLAP_SCAN_NODE, OlapScanner, and SegmentIterator.

The following sample code shows a typical OLAP_SCAN_NODE node. The meanings of some metrics may vary with the storage format (V1 or V2).

OLAP_SCAN_NODE (id=0):(Active: 1.2ms, % non-child: 0.00%)
  - BytesRead: 265.00 B                 # The size of data that is read from the data file. For example, if 10 32-bit integers are read, the data size is 10 × 4 bytes = 40 bytes. This value indicates only the size of the data that is fully expanded in memory, not the actual I/O size.
  - NumDiskAccess: 1                    # The number of disks that are involved in this ScanNode node.
  - NumScanners: 20                     # The number of scanners that are generated by this ScanNode node.
  - PeakMemoryUsage: 0.00               # The peak memory usage during the query. This metric is not in use yet.
  - RowsRead: 7                         # The number of rows that are returned from the storage engine to the scanner, excluding the rows that are filtered out by the scanner.
  - RowsReturned: 7                     # The number of rows that are returned from this ScanNode node to its parent node.
  - RowsReturnedRate: 6.979K /sec       # RowsReturned/ActiveTime
  - TabletCount: 20                     # The number of tablets that are involved in this ScanNode node.
  - TotalReadThroughput: 74.70 KB/sec   # The total read throughput, which is calculated by dividing BytesRead by the total execution time on this node from the initiation to the completion of the query. For I/O-bound queries, the value is close to the total throughput of the disks.
  - ScannerBatchWaitTime: 426.886us     # The time for the transfer thread to wait for the scanner thread to return RowBatch data.
  - ScannerWorkerWaitTime: 17.745us     # The time for the scanner thread to wait for an available worker thread in the thread pool.
  OlapScanner:
    - BlockConvertTime: 8.941us         # The time consumed to convert a vectorized block to a row-structured RowBlock. A vectorized block is VectorizedRowBatch in V1 and RowBlockV2 in V2.
    - BlockFetchTime: 468.974us         # The time for the rowset reader to obtain blocks.
    - ReaderInitTime: 5.475ms           # The time for the OlapScanner to initialize readers. In V1, this includes the time to create the MergeHeap. In V2, this includes the time to generate iterators at all levels and to read the first group of blocks.
    - RowsDelFiltered: 0                # The number of rows that are filtered out, including rows that are filtered based on the DELETE information in the tablet and rows that are marked for deletion in the Unique Key model.
    - RowsPushedCondFiltered: 0         # The number of rows that are filtered out based on predicates that are pushed down, such as conditions that are passed from BuildTable to ProbeTable in a join operation. This value is inaccurate, because filtering is skipped when its effect is poor.
    - ScanTime: 39.24us                 # The time for data to be returned from ScanNode to its parent node.
    - ShowHintsTime_V1: 0ns             # Meaningless in V2. In V1, some data is read to perform ScanRange splitting.
    SegmentIterator:
      - BitmapIndexFilterTimer: 779ns   # The time consumed to filter data by using bitmap indexes.
      - BlockLoadTime: 415.925us        # The time for the segment reader (V1) or segment iterator (V2) to obtain blocks.
      - BlockSeekCount: 12              # The number of block seek operations that are performed during segment reading.
      - BlockSeekTime: 222.556us        # The time consumed for block seek operations during segment reading.
      - BlocksLoad: 6                   # The number of blocks that are read.
      - CachedPagesNum: 30              # V2 only. The number of pages that hit the page cache when the page cache is enabled.
      - CompressedBytesRead: 0.00       # In V1, the size of data that is read from the file before decompression. In V2, the size of the read pages that do not hit the page cache, before decompression.
      - DecompressorTimer: 0ns          # The time consumed for data decompression.
      - IOTimer: 0ns                    # The actual I/O time for reading data from the operating system.
      - IndexLoadTime_V1: 0ns           # V1 only. The time consumed to read index streams.
      - NumSegmentFiltered: 0           # The number of segments that are completely filtered out based on column statistics and query conditions.
      - NumSegmentTotal: 6              # The number of segments that are involved in the query.
      - RawRowsRead: 7                  # The number of raw data rows that are read from the storage engine. For more information, see the following section.
      - RowsBitmapIndexFiltered: 0      # V2 only. The number of rows that are filtered out by using bitmap indexes.
      - RowsBloomFilterFiltered: 0      # V2 only. The number of rows that are filtered out by using BloomFilter indexes.
      - RowsKeyRangeFiltered: 0         # V2 only. The number of rows that are filtered out by using SortkeyIndex indexes.
      - RowsStatsFiltered: 0            # In V2, the number of rows that are filtered out by using ZoneMap indexes, including rows that match delete conditions. In V1, this value also includes the rows that are filtered out by using BloomFilter indexes.
      - RowsConditionsFiltered: 0       # V2 only. The number of rows that are filtered out by using various column indexes.
      - RowsVectorPredFiltered: 0       # The number of rows that are filtered out by vectorized condition filtering.
      - TotalPagesNum: 30               # V2 only. The total number of pages that are read.
      - UncompressedBytesRead: 0.00     # In V1, the decompressed size of the data file that is read, or the file size if decompression is not required. In V2, the decompressed size of the pages that do not hit the page cache, or the page size if decompression is not required.
      - VectorPredEvalTime: 0ns         # The time consumed for filtering by vectorized conditions.
      - ShortPredEvalTime: 0ns          # The time consumed for filtering by short-circuit predicates.
      - PredColumnReadTime: 0ns         # The time consumed to read predicate columns.
      - LazyReadTime: 0ns               # The time consumed to read non-predicate columns.
      - OutputColumnTime: 0ns           # The time consumed for column materialization.

Predicate condition pushdown and index usage can be inferred from the metrics related to the number of data rows in the profile. The following section describes the profile information during the reading of segments in the V2 format. These metrics have slightly different meanings for segments in the V1 format.

Initialization phase

  1. When a segment in the V2 format is read, if the query specifies key_ranges (query ranges that consist of prefix keys), the data is first filtered by using SortkeyIndex indexes. The number of rows that are filtered out is recorded in RowsKeyRangeFiltered.

  2. If columns in the query conditions have bitmap indexes, exact filtering is performed by using the bitmap indexes. The number of rows that are filtered out is recorded in RowsBitmapIndexFiltered.

  3. Data is filtered by using BloomFilter indexes based on the equivalence conditions (eq, in, and is) in the query conditions. The number of rows that are filtered out is recorded in RowsBloomFilterFiltered. This value is the difference between the total number of rows in the segment (not the number of rows that remain after bitmap index filtering) and the number of rows that remain after BloomFilter filtering. Therefore, the rows that are filtered by using BloomFilter indexes may overlap with the rows that are filtered by using bitmap indexes.

  4. Data is filtered by using ZoneMap indexes based on query and delete conditions. The number of rows that are filtered out is recorded in RowsStatsFiltered.

  5. RowsConditionsFiltered specifies the number of rows that are filtered out by using various indexes, including the values of RowsBloomFilterFiltered and RowsStatsFiltered.

Next phase

  • The number of rows that are filtered out by using the delete conditions in the next phase is recorded in RowsDelFiltered. The numbers of rows that are actually filtered out by the delete conditions are recorded in RowsStatsFiltered and RowsDelFiltered.

  • RawRowsRead specifies the number of rows to be read after the preceding filtering operations are performed.

  • RowsRead specifies the number of rows that are returned to the scanner. Generally, the value of RowsRead is less than that of RawRowsRead, because the data that is returned from the storage engine to the scanner may be aggregated. If the gap between RawRowsRead and RowsRead is larger than expected, a large number of rows are aggregated, and the aggregation may be time-consuming.

  • RowsReturned specifies the number of rows that ScanNode finally returns to its parent node. Generally, the value of RowsReturned is less than that of RowsRead, because the scanner performs an additional round of filtering based on the predicate conditions that are not pushed down to the storage engine. If the gap between RowsRead and RowsReturned is larger than expected, a large number of rows are filtered on the scanner. This indicates that multiple predicate conditions with high selectivity were not pushed down to the storage engine, and filtering on the scanner is less efficient than filtering inside the storage engine.
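
For example, assume the following hypothetical values: RawRowsRead is 1,000,000, RowsRead is 100,000, and RowsReturned is 90,000. In this case, most rows are merged by aggregation inside the storage engine, which may be time-consuming, whereas scanner-side predicate filtering removes relatively few rows.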

Based on the preceding metrics, you can roughly analyze the number of rows that are processed by the storage engine and the number of rows in the final results after the filtering. This set of metrics (Rows***Filtered) also allows you to analyze whether the query conditions are pushed down to the storage engine and the filtering effect of different indexes.

Other metrics

You can also perform analysis based on the following metrics:

  • Metrics under OlapScanner, such as IOTimer and BlockFetchTime, are accumulated across all scanner threads, so their values may be relatively large. Because scanner threads read data asynchronously, these accumulated values reflect only the cumulative working time of the scanners and do not directly represent the time consumed by ScanNode. The proportion of the time consumed by ScanNode in the entire execution plan is recorded in Active. In some cases, the value of IOTimer is tens of seconds, whereas the value of Active is only a few seconds. The following causes are possible:

    • IOTimer is the accumulated time of multiple scanners, and a large number of scanners are running.

    • The parent node is time-consuming. For example, if the parent node consumes 100 seconds and ScanNode consumes only 10 seconds, the value of Active may be only a few milliseconds. While the parent node is processing data, ScanNode has already scanned and prepared data asynchronously. As a result, the parent node obtains data that is already prepared in ScanNode, and the recorded active time is short.

  • NumScanners indicates the number of tasks that OlapScanner submits to the thread pool. The tasks are scheduled by the thread pool in RuntimeState. The doris_scanner_thread_pool_thread_num and doris_scanner_thread_pool_queue_size parameters specify the size and the queue length of the thread pool, respectively. A thread count that is too high or too low affects query efficiency. You can also divide some metrics by the number of threads to roughly estimate the time that each thread consumes.

  • TabletCount indicates the number of tablets to be scanned. An excessively large number of tablets indicates that a large number of random read and data merge operations may be required.

  • UncompressedBytesRead indirectly reflects the size of data that is read. If the value is large, a large number of I/O operations may be performed.

  • You can check the page cache hit ratio based on CachedPagesNum and TotalPagesNum. A higher hit ratio indicates that I/O and decompression operations consume less time.
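
For example, in the sample OLAP_SCAN_NODE profile above, CachedPagesNum and TotalPagesNum are both 30. The page cache hit ratio is 30/30 = 100%, which is consistent with the IOTimer and DecompressorTimer values of 0ns in that sample.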

Buffer pool

  • AllocTime: The time consumed for memory allocation.

  • CumulativeAllocationBytes: The cumulative size of allocated memory.

  • CumulativeAllocations: The cumulative number of memory allocations.

  • PeakReservation: The peak size of the reservation.

  • PeakUnpinnedBytes: The peak size of unpinned memory data.

  • PeakUsedReservation: The peak size of the used reservation.

  • ReservationLimit: The reservation limit of the buffer pool.