All Products
Search
Document Center

Tenant diagnosis

Last Updated: Jun 25, 2021

The diagnosis feature is used to help you understand the current load and health status of cluster tenants and provide you with the function to bind execution plans. By default, the performance data of tenants in the last hour is displayed. Meanwhile, the options of the last two hours and the last one day are also provided. You can also customize the time range.

How do I view diagnostic information

  1. Log on to the OceanBase cluster list.
  2. Find the target cluster and click the cluster name to go to the cluster management page.
  3. Click diagnose in the left-side navigation pane. The diagnose button is displayed in the section on the right.
  4. Select the target tenant to be diagnosed, and click the diagnostics button next to it to go to the diagnostics details page.

Components of diagnostic modules

The diagnosis details page consists of three parts: problem category, session data, exception metric, and SQL.

  1. Diagnostic problem classification

    This module helps you quickly identify the problems of the current tenant and the problem types. There are two types of problems, namely, the number of abnormal metrics and the number of suspicious SQL statements. If no exception is found, the system prompts "no abnormal metrics and suspicious SQL." otherwise, the system displays the proportion of abnormal metrics and suspicious SQL statements and the specific number of SQL statements. Indicates a green mark for the abnormal metric and a blue mark for the suspicious SQL statement.

  2. Session Data

    Session status consists of three parts: response time, number of requests, and number of sessions. The response time includes TPS response time (TPS RT) and QPS response time (QPS RT). The unit is milliseconds. The number of requests includes tps and qps, and the number of sessions includes the maximum number of active sessions. The data is collected every 5s.

  3. Exception indicator:

    Exception metrics cover 10 categories and 38 items that are important to the database, as described in the following table.

    cache(OB cache hit rate):

    • block_cache_hit_percent: BlockCache cache hit rate
    • block_index_cache_hit_percent: IndexBlockCache the cache hit rate.
    • bloom_filter_cache_hit_percent: the cache hit rate of BloomFilter.
    • location_cache_hit_percent: LocationCache cache hit rate
    • row_cache_hit_percent: the cache hit rate of RowCache.

    clog(OB transaction log operation response time):

    • clog_rpc_rt: transaction log RPC response time
    • clog_write_rt: transaction log Write disk response time

    cpu:

    • cpu_usage_max: the number of active threads per tenant

    IO:

    • cpu_usage_max: the number of active threads per tenant
    • io_read_count: I /O reads
    • io_read_rt: I /O read response time
    • io_write_count: Number of I /O writes
    • io_write_rt: I /O write response time

    network:

    • rpc_packet_in: the number of RPC inbound packages.
    • rpc_packet_out: the number of RPC outbound packages.

    request:

    • active_sessions: Number of active sessions per second
    • qps: the number of queries per second
    • tps: DML statements per second
    • trans_commit_count: transactions per second

    request_rt (request response time):

    • qps_rt: queries the average response time.
    • tps_rt: the average response time of DML statements
    • trans_commit_rt: the average response time of transactions.

    sql_count(SQL request details):

    • request_enqueue_count: the number of times the queue has waited.
    • sql_delete_count : Delete times
    • sql_distributed_count: the number of distributed requests.
    • sql_insert_count: the number of Insert operations.
    • sql_update_count: number of updates
    • trans_rollback_count: Number of rollbacks
    • trans_timeout_count: Number of request timeouts

    sql_row (number of rows affected by SQL):

    • storage_delete_row_count: the number of rows affected by the Delete operation.
    • storage_dml_row_count: the number of rows affected by DML statements
    • storage_insert_row_count: the number of rows affected by an Insert statement
    • storage_read_row_count: the number of rows returned
    • storage_update_row_count: the number of rows affected by the Update.

    sql_rt(SQL response time details):

    • sql_delete_rt: the response time of the Delete operation.
    • sql_insert_rt: response time of Insert
    • sql_queue_rt: Queue waiting time
    • sql_update_rt: The Update response time.
    • trans_rollback_rt: the response time of rollback
  4. SQL

    The SQL statement consists of the suspicious SQL statement and Top SQL statements. You can filter SQL statements by database, node, keyword, and advanced index. By default, database and node are displayed. Click the expand button on the right side to display all filters. Advanced indexes support 37 query conditions, such as database, number of executions, and CPU time (in ms). In addition, eight matching rules, such as=,>=, and <, are also supported.

    • Suspicious SQL: SQL statements that are filtered out by the platform based on rules and algorithms to cause performance problems. The common suspicious items include SQL execution plan changes (rules) and SQL performance degradation (algorithms).
    • Top SQL: sort SQL statements by each dimension and retrieve the top N SQL statements. The SQL statements include detailed SQL records.

    The suspicious SQL statement includes the sequence number, database, request time, number of executions, CPU time, and problem diagnosis. You can sort the SQL statements by sequence number, request time, number of executions, and cpu utilization. The display columns for the Top SQL are user-defined. You can click manage columns to display all supported columns, totaling 34 items. You can filter data by using SQL statements. For example, you can exclude data from the display of other types of data in the select statement, and sort data by other conditions.

    You can click the SQL sequence number to view the corresponding SQL text, diagnostic details, historical SQL data, and physical execution plan. The SQL History data includes the line chart of metrics such as the SQL time trend chart and cpu utilization time, and the list of SQL execution history records.

    To view the physical execution plan list, click physical execution plan on the SQL execution details page. The execution plan list contains the execution plan generation time, hit times, average execution time, and status. You can choose to bind or unbind an execution plan. You can also bind or unbind all indexes under the table. An intelligent recommendation solution is provided by default.