
ApsaraMQ for RabbitMQ: Generating second-level metric data

Last updated: Dec 19, 2024

This topic describes how to use the log management feature to generate second-level metric data.

Background information

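The charts currently provided by Cloud Monitor show averages of minute-level statistics and cannot display second-level TPS statistics. In ApsaraMQ for RabbitMQ, TPS counts the number of AMQP protocol method requests that clients initiate per second.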

The following AMQP request methods are counted toward TPS:

  • ConnectionOpen, ChannelOpen

  • QueueDeclare, QueueDelete, QueueBind, QueueUnbind

  • ExchangeDeclare, ExchangeDelete

  • ExchangeBind, ExchangeUnBind

  • SendMessage, BasicConsume, BasicGet, BasicAck, BasicReject, BasicNack, BasicRecover

For a detailed description of the request methods, see Request methods.

Procedure

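  1. Enable the log management feature and configure indexes.

  2. Create a Metricstore to store the cleansed metric data.

    1. On the Project details page of the Log Service console, choose [icon] > Create Now.

    2. In the Create MetricStore panel, configure the basic settings of the Metricstore.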

  3. Create a cleansing job.

    1. Enter a query statement in the Logstore. The following statement uses instance error codes as an example.

      * | SELECT Code, count(*) as num, microtime / 1000 / 1000 as timeSecond group by Code, timeSecond limit 1000000

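      The statement has the format query statement|analytic statement: the part before the vertical bar filters logs by condition, and the part after it is standard SQL. To write the result to the Metricstore, cleanse three items out of the query result: the labels that you need, the metric value under each label, and the time. In this statement, Code is the label and represents the return code of each request, num is the value for each Code, and timeSecond is the time in seconds.

      [Figure: query result]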

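    2. In the query result, click Save as Scheduled SQL on the Chart tab, configure the parameters on the Compute Settings tab, and then click Next.

      Note

      For the target store, select the Metricstore that you created in step 2.

    3. On the Scheduling Settings tab, set the scheduling interval, and then click OK.

  4. Query the distribution of metric values in the Metricstore.

    [Figure: query result]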

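  5. Optional: Use the data in the Metricstore as a data source for a visualization dashboard. The dashboard can be built with Grafana or with the visualization capabilities of Simple Log Service.

Note

This tutorial cleanses instance error code data as an example. You can cleanse other data in the same way, such as the message sending and receiving rate of each Channel of each RemoteAddress, the per-second activity of each queue, the total number of messages sent and received per second, and the number of calls to each API per second.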
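To make the cleansing step concrete, the following Python sketch reproduces offline what the example statement in step 3 computes: it groups log entries by Code and by second, and emits one row per label and time. It is only an illustration, not part of the product. The sample entries, the error code 530, and the function name bucket_by_second are hypothetical, and the sketch assumes that microtime is an integer number of microseconds, so that dividing by 1000 twice is integer division.

```python
from collections import Counter


def bucket_by_second(logs):
    """Count log entries per (Code, second).

    Mirrors: SELECT Code, count(*) AS num, microtime / 1000 / 1000 AS timeSecond
             GROUP BY Code, timeSecond
    """
    counts = Counter()
    for entry in logs:
        # Assumption: microtime is in microseconds, so two integer
        # divisions by 1000 yield whole seconds.
        time_second = entry["microtime"] // 1000 // 1000
        counts[(entry["Code"], time_second)] += 1
    # One row per (label, time): label=Code, value=num, time=timeSecond.
    ordered = sorted(counts.items(), key=lambda kv: (kv[0][1], kv[0][0]))
    return [
        {"Code": code, "num": num, "timeSecond": ts}
        for (code, ts), num in ordered
    ]


# Hypothetical sample entries; real logs contain many more fields.
sample_logs = [
    {"Code": 200, "microtime": 1_700_000_000_100_000},
    {"Code": 200, "microtime": 1_700_000_000_900_000},
    {"Code": 530, "microtime": 1_700_000_000_500_000},
    {"Code": 200, "microtime": 1_700_000_001_000_000},
]
rows = bucket_by_second(sample_logs)
for row in rows:
    print(row)
```

Each output row carries the three items that the Metricstore needs: the label (Code), the value (num), and the time (timeSecond).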

Common statements

Query the second-level TPS metric data of an instance

* | select microtime/1000/1000 as time, sum(count) as tps 
from 
  (SELECT  microtime, if(Action!='SendMessage', 1, tps) as count 
   from log 
   Where  InstanceId='amqp-xx-xxx' 
     and Action in ('SendMessage', 'ConnectionOpen', 'ChannelOpen', 'ExchangeDeclare', 'QueueBind', 'QueueDeclare', 'QueueDelete', 'ExchangeDelete', 'QueueUnBind', 'ExchangeBind', 'ExchangeUnBind', 'BasicConsume', 'BasicReject', 'BasicRecover', 'BasicAck', 'BasicNAck', 'PullMessage') 
   limit 90000000) 
  
GROUP by time ORDER by time limit 90000000

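[Figure: query result]

  • Before you run the query, replace the instance ID amqp-xx-xxx in the statement above with the ID of the instance that you want to query.

  • BasicNack(multiple=false) is counted as TPS=1, and BasicNack(multiple=true) is counted as TPS=N. The statement above counts every BasicNack log entry as 1, so the TPS value derived from SLS logs is lower than the number of requests actually initiated.

  • When you query the TPS traffic chart for a client with heavy traffic, limit the query time range to 1 hour or less, and append limit 90000000 to the SQL statement, or set limit to a value that is as large as possible.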
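The weighting rule in the TPS statement can be sketched in Python as follows. The sketch follows the SQL literally: a SendMessage entry contributes the value of its tps field, every other counted action contributes 1, and the contributions are summed per second. The sample entries and the function name tps_per_second are hypothetical, and microtime is again assumed to be an integer number of microseconds.

```python
from collections import defaultdict

# Action names copied from the IN (...) list of the TPS statement.
COUNTED_ACTIONS = {
    "SendMessage", "ConnectionOpen", "ChannelOpen", "ExchangeDeclare",
    "QueueBind", "QueueDeclare", "QueueDelete", "ExchangeDelete",
    "QueueUnBind", "ExchangeBind", "ExchangeUnBind", "BasicConsume",
    "BasicReject", "BasicRecover", "BasicAck", "BasicNAck", "PullMessage",
}


def tps_per_second(logs, instance_id):
    """Return sorted (second, tps) pairs for one instance."""
    totals = defaultdict(int)
    for entry in logs:
        if entry["InstanceId"] != instance_id:
            continue
        if entry["Action"] not in COUNTED_ACTIONS:
            continue
        # if(Action != 'SendMessage', 1, tps): only SendMessage uses the tps field.
        weight = entry["tps"] if entry["Action"] == "SendMessage" else 1
        totals[entry["microtime"] // 1000 // 1000] += weight
    return sorted(totals.items())


# Hypothetical sample entries.
sample_logs = [
    {"InstanceId": "amqp-xx-xxx", "Action": "SendMessage", "tps": 5, "microtime": 2_000_000},
    {"InstanceId": "amqp-xx-xxx", "Action": "BasicAck", "microtime": 2_400_000},
    {"InstanceId": "amqp-xx-xxx", "Action": "QueueDeclare", "microtime": 3_100_000},
    {"InstanceId": "amqp-other", "Action": "BasicAck", "microtime": 2_500_000},
]
result = tps_per_second(sample_logs, "amqp-xx-xxx")
print(result)
```

In this sample, second 2 totals 6 (5 from the SendMessage entry plus 1 from BasicAck), and the entry from the other instance is ignored.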

Query the total number of messages sent to each exchange and routing key

* and Action : SendMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  split_part(ResourceName,',',2) as exchange_name, 
  split_part(ResourceName,',',3) as routing_key, 
  count(*) as send_total_num 
group by 
  instance_id,
  virtual_host, 
  exchange_name, 
  routing_key 
order by 
  send_total_num 
limit 10000000

[Figure: query result]
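Several statements in this topic use split_part to extract fields from ResourceName. The Python sketch below imitates its 1-based indexing so that the extraction can be tried on a sample string. The sample ResourceName value is hypothetical: its layout is only inferred from the statements in this topic (a key=value message ID first, then the exchange, then the routing key), and the actual log format may differ.

```python
def split_part(text, delimiter, index):
    """Return the field at a 1-based index, or None when it does not exist."""
    parts = text.split(delimiter)
    if 1 <= index <= len(parts):
        return parts[index - 1]
    return None


# Hypothetical value; the real ResourceName layout may differ.
resource_name = "msgId=abc123,order.exchange,order.created"

exchange_name = split_part(resource_name, ",", 2)
routing_key = split_part(resource_name, ",", 3)
# Nested call, as in the statement that counts pushes per message ID.
msg_id = split_part(split_part(resource_name, ",", 1), "=", 2)

print(exchange_name, routing_key, msg_id)
```

Testing the extraction on a real log line before scheduling a job helps confirm that the indexes 2 and 3 point at the fields you expect.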

Query the per-second message sending rate of each exchange and routing key

* and Action : SendMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  split_part(ResourceName,',',2) as exchange_name, 
  split_part(ResourceName,',',3) as routing_key, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as send_qps 
group by 
  instance_id,
  virtual_host, 
  exchange_name, 
  routing_key,
  time_second 
order by 
  time_second, 
  send_qps 
limit 10000000

[Figure: query result]

Query the number of messages consumed from each queue

* and Action : PushMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  Queue as queue_name, 
  count(*) as push_total_num 
group by 
  instance_id,
  virtual_host, 
  queue_name 
order by 
  push_total_num 
limit 10000000

[Figure: query result]

Query the per-second message consumption rate of each queue

* and Action : PushMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  Queue as queue_name, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as push_qps 
group by 
  instance_id,
  virtual_host, 
  queue_name, 
  time_second 
order by 
  time_second, 
  push_qps 
limit 10000000

[Figure: query result]

Query the number of messages sent per second by each client

* and Action : SendMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  RemoteAddress as client_ip_port, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as send_qps 
group by 
  instance_id,
  virtual_host, 
  client_ip_port, 
  time_second 
order by 
  time_second, 
  send_qps 
limit 10000000

[Figure: query result]

Query the number of messages consumed per second by each client

* and Action : PushMessage and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  RemoteAddress as client_ip_port, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as push_qps 
group by 
  instance_id,
  virtual_host, 
  client_ip_port, 
  time_second 
order by 
  time_second, 
  push_qps 
limit 10000000

[Figure: query result]

Query the per-second rate of a specific action for each client

To query the QPS of a specific action for each client, copy the following statement and replace {action_name} with the name of the Action that you want to query. The available Action names are:

  • ConnectionOpen, ChannelOpen

  • QueueDeclare, QueueDelete, QueueBind, QueueUnbind

  • ExchangeDeclare, ExchangeDelete

  • ExchangeBind, ExchangeUnBind

  • SendMessage, BasicConsume, BasicGet, BasicAck, BasicReject, BasicNack, BasicRecover

* and Action : {action_name} and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  RemoteAddress as client_ip_port, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as {action_name}_qps 
group by 
  instance_id,
  virtual_host, 
  client_ip_port, 
  time_second 
order by 
  time_second, 
  {action_name}_qps 
limit 10000000

For example, to query the QPS at which a client opens connections, use the following statement:

* and Action : ConnectionOpen and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host, 
  RemoteAddress as client_ip_port, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as connection_open_qps 
group by 
  instance_id,
  virtual_host, 
  client_ip_port, 
  time_second 
order by 
  time_second, 
  connection_open_qps 
limit 10000000

[Figure: query result]
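If you generate these statements for several actions, the {action_name} substitution can be scripted. The following Python sketch fills the template for one Action name and derives a snake_case column alias, as the ConnectionOpen example does. The helper names build_action_qps_query and to_snake_case are hypothetical, and the allow list simply repeats the Action names listed in this section.

```python
import re

# Action names listed in this section.
ALLOWED_ACTIONS = {
    "ConnectionOpen", "ChannelOpen",
    "QueueDeclare", "QueueDelete", "QueueBind", "QueueUnbind",
    "ExchangeDeclare", "ExchangeDelete",
    "ExchangeBind", "ExchangeUnBind",
    "SendMessage", "BasicConsume", "BasicGet", "BasicAck",
    "BasicReject", "BasicNack", "BasicRecover",
}

QUERY_TEMPLATE = """* and Action : {action} and Code : 200 |
select
  InstanceId as instance_id,
  VHost as virtual_host,
  RemoteAddress as client_ip_port,
  microtime / 1000 / 1000 as time_second,
  count(*) as {alias}
group by
  instance_id,
  virtual_host,
  client_ip_port,
  time_second
order by
  time_second,
  {alias}
limit 10000000"""


def to_snake_case(name):
    """Convert CamelCase to snake_case, e.g. ConnectionOpen -> connection_open."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def build_action_qps_query(action_name):
    """Fill the template for one Action; reject names outside the allow list."""
    if action_name not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported Action name: {action_name}")
    alias = to_snake_case(action_name) + "_qps"
    return QUERY_TEMPLATE.format(action=action_name, alias=alias)


query = build_action_qps_query("ConnectionOpen")
print(query)
```

Restricting the input to the allow list keeps arbitrary text out of the generated statement.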

Query the QPS of each Action

This statement collects the QPS of all Actions for every client in a single query.

* and Code : 200 | 
select 
  InstanceId as instance_id,
  VHost as virtual_host,
  Action as action_type,
  RemoteAddress as client_ip_port, 
  microtime / 1000 / 1000 as time_second, 
  count(*) as action_qps
group by 
  instance_id,
  virtual_host,
  client_ip_port,
  action_type,
  time_second 
order by
  time_second, 
  action_qps
limit 10000000

[Figure: query result]

Query the frequency of each error

* and not Code = 200 | 
select 
  Code as error_code,
  VHost as virtual_host,
  split_part(split_part(Info, '[', 1), 'Req', 1) as error_info,
  microtime / 1000 / 1000 as time_second,
  count(*) as error_num
group by 
  virtual_host,
  error_code,
  time_second,
  error_info
order by
  time_second, 
  error_num
limit 10000000

[Figure: query result]

Query the average message body size

* and Action : SendMessage and Code: 200 | 
select 
  InstanceId as instance_id, 
  VHost as virtual_host, 
  split_part(Queue, ';', 1) as queue_name, 
  microtime / 1000 / 1000 as time_second, 
  avg(cast(split_part(ResourceName, 'bodySize=', 2) as bigint)) as avg_body_size 
group by 
  instance_id, 
  virtual_host, 
  queue_name, 
  time_second 
order by 
  time_second, 
  avg_body_size 
limit 10000000

[Figure: query result]

Query the number of times each message ID is pushed

* and Action : PushMessage and Code : 200 | 
select 
  InstanceId as instance_id, 
  VHost as virtual_host, 
  split_part(split_part(ResourceName, ',', 1), '=', 2) as msg_id, 
  count(*) as push_times 
group by 
  instance_id, 
  virtual_host, 
  msg_id 
order by 
  push_times desc 
limit 1000000

[Figure: query result]