Ingest Datadog alerts

Last updated: Jan 14, 2025

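Datadog is a monitoring and analytics platform for cloud applications. It automatically collects and analyzes logs, metrics, and traces, and monitors infrastructure events and cloud service events, giving you observability over servers, applications, and the data it collects. To send Datadog alert messages to Log Service, configure the Log Service alert ingestion endpoint URL in the Datadog Webhooks integration.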

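Prerequisites

An alert ingestion application whose protocol is set to Datadog has been created. For more information, see Configure an alert ingestion endpoint.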

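Configure Datadog

  1. Log on to the Datadog console.
  2. Configure a webhook.
    1. In the top navigation bar, choose the Integrations icon > Integrations.
    2. On the Integrations tab, find webhooks, hover over the webhooks tile, and click Install.
    3. After the installation completes, hover over the webhooks tile and click Configure.
    4. In the Webhooks section, click New.
    5. In the New Webhook section, configure the following parameters, and then click Save.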
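      Parameter | Description
      Name | The name of the webhook.
      URL | The receiver of alert messages. Set this to the endpoint information (the full URL) that is generated after you create the alert ingestion service and application in Log Service. For how to obtain it, see Obtain endpoint information.
      Payload | Defines the content of alert messages. Datadog generates the alert message content from this configuration. For more information about the alert message variables that Datadog provides, see the official Datadog documentation.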

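      When you configure Payload, note the following:

      • The labels field must contain a tags field.
      • The annotations field must contain the title, event_msg, and text_only_msg fields.
      • Any other variable that Datadog provides but that is not yet used can be added to either the labels field or the annotations field, as you choose.
      • All fields other than labels and annotations must be configured exactly as in the following sample.

      You can configure Payload as follows: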

      {
          "alert_instance_id": "$ID",
          "alert_id": "$ALERT_ID",
          "alert_name": "$ALERT_TITLE",
          "alert_time": "$LAST_UPDATED",
          "fire_time": "$DATE",
          "resolve_time": "$DATE",
          "status": "$ALERT_TRANSITION",
          "labels": {
              "tags": "$TAGS"
          },
          "annotations": {
              "title": "$EVENT_TITLE",
              "event_msg": "$EVENT_MSG",
              "text_only_msg": "$TEXT_ONLY_MSG",
              "alert_metric": "$ALERT_METRIC",
              "alert_query": "$ALERT_QUERY",
              "alert_scope": "$ALERT_SCOPE",
              "alert_status": "$ALERT_STATUS",
              "alert_type": "$ALERT_TYPE",
              "email": "$EMAIL",
              "event_type": "$EVENT_TYPE",
              "hostname": "$HOSTNAME",
              "logs_sample": "$LOGS_SAMPLE",
              "metric_namespace": "$METRIC_NAMESPACE",
              "priority": "$PRIORITY",
              "user": "$USER",
              "username": "$USERNAME",
              "__aggreg_key__": "$AGGREG_KEY",
              "__alert_cycle_key__": "$ALERT_CYCLE_KEY",
              "__incident_attachments__": "$INCIDENT_ATTACHMENTS",
              "__incident_commander__": "$INCIDENT_COMMANDER",
              "__incident_customer_impact__": "$INCIDENT_CUSTOMER_IMPACT",
              "__incident_fildes__": "$INCIDENT_FIELDS",
              "__incident_public_id__": "$INCIDENT_PUBLIC_ID",
              "__incident_title": "$INCIDENT_TITLE",
              "__incident_url__": "$INCIDENT_URL",
              "__org_id__": "$ORG_ID",
              "__org_name__": "$ORG_NAME",
              "__security_rule_name__": "$SECURITY_RULE_NAME",
              "__security_signal_id__": "$SECURITY_SIGNAL_ID",
              "__security_signal_severity__": "$SECURITY_SIGNAL_SEVERITY",
              "__security_signal_title__": "$SECURITY_SIGNAL_TITLE",
              "__security_signal_msg__": "$SECURITY_SIGNAL_MSG",
              "__security_signal_attributes__": "$SECURITY_SIGNAL_ATTRIBUTES",
              "__security_rule_id__": "$SECURITY_RULE_ID",
              "__security_rule_query__": "$SECURITY_RULE_QUERY",
              "__security_rule_group_by_fields__": "$SECURITY_RULE_GROUP_BY_FIELDS",
              "__security_rule_type__": "$SECURITY_RULE_TYPE",
              "__link_snapshot_url__": "$SNAPSHOT",
              "__synthetics_test_name__": "$SYNTHETICS_TEST_NAME",
              "__synthetics_first_failing_step_name__": "$SYNTHETICS_FIRST_FAILING_STEP_NAME"   
          },
          "severity": "$ALERT_PRIORITY",
          "drill_down_query": "$LINK"     
      }
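  3. Configure the notification channel.
    1. In the top navigation bar, choose the notification channel icon > Manage Monitors.
    2. Click the edit icon for the target monitor.
    3. Set Notify your team to the webhook that you created in Step 2.
    4. Click Save.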
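
The Payload rules in Step 2 can be checked before you paste a template into Datadog. The following Python sketch is only an illustration of those rules: check_payload, FIXED_FIELDS, and REQUIRED_ANNOTATIONS are hypothetical names, not part of Datadog or Log Service, and the fixed top-level values are copied from the sample Payload in this document.

```python
import json

# Top-level fields and the Datadog variables they must carry,
# copied from the sample Payload in this document.
FIXED_FIELDS = {
    "alert_instance_id": "$ID",
    "alert_id": "$ALERT_ID",
    "alert_name": "$ALERT_TITLE",
    "alert_time": "$LAST_UPDATED",
    "fire_time": "$DATE",
    "resolve_time": "$DATE",
    "status": "$ALERT_TRANSITION",
    "severity": "$ALERT_PRIORITY",
    "drill_down_query": "$LINK",
}
# Fields that the annotations object must contain.
REQUIRED_ANNOTATIONS = ("title", "event_msg", "text_only_msg")


def check_payload(payload_text):
    """Return a list of problems found in a Payload template (empty if none)."""
    payload = json.loads(payload_text)
    problems = []
    # Fields outside labels and annotations must match the sample.
    for field, variable in FIXED_FIELDS.items():
        if payload.get(field) != variable:
            problems.append(f"{field} should be {variable!r}")
    # labels must contain tags.
    if "tags" not in payload.get("labels", {}):
        problems.append("labels.tags is missing")
    # annotations must contain title, event_msg, and text_only_msg.
    annotations = payload.get("annotations", {})
    for field in REQUIRED_ANNOTATIONS:
        if field not in annotations:
            problems.append(f"annotations.{field} is missing")
    return problems
```

An empty list means the template satisfies the rules listed above. The sketch does not contact Datadog or Log Service and does not verify that the variable names themselves are valid.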

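Datadog alert message

If you add every variable that Datadog provides but that is not otherwise used to the annotations field, Log Service receives a Datadog alert message similar to the following.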
{
    "alert_instance_id": "123456",
    "alert_id": "123456",
    "alert_name": "STOP on host:abcdefgh",
    "alert_time": "1628647425000",
    "fire_time": "1628647425000",
    "resolve_time": "1627561306000",
    "status": "Triggered",
    "labels": {
        "tags": "ali,host:abcdefgh,monitor"
    },
    "annotations": {
        "title": "[P1] [Triggered on {host:abcdefgh}] STOP",
        "event_msg": "%%%\nwarning\nhost stop\n @webhook-webhook-test-all\n\nThe monitor was last triggered at Thu Jul 29 2021 12:21:45 UTC.\n\n- - -\n\n[[Monitor Status](https://app.datadoghq.com/monitors/1234?to_ts=1234&group=host%3Aabcdefgh&from_ts=1627560405000)] \u00b7 [[Edit Monitor](https://app.datadoghq.com/monitors#1234/edit)] \u00b7 [[View abcdefgh](https://app.datadoghq.com/infrastructure?filter=abcdefgh)] \u00b7 [[Show Processes](https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=1234&tags=host%abcdefgh&from_ts=1627560405000&live=false&showSummaryGraphs=true)]\n%%%",
        "text_only_msg": "\nwarning\nhost stop\n @webhook-webhook-test-all\n\nMetric Graph: https://app.datadoghq.com/monitors/1234?to_ts=1627561365000&group=host%abcdefgh&from_ts=1627557705000 \u00b7 Monitor Status: https://app.datadoghq.com/monitors/1234?group=host%abcdefgh \u00b7 Edit Monitor: https://app.datadoghq.com/monitors#42655965/edit \u00b7 Event URL: https://app.datadoghq.com/event/event?id=1234 \u00b7 View abcdefgh: https://app.datadoghq.com/infrastructure?filter=abcdefgh \u00b7 Show Processes: https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
        "alert_metric": "null",
        "alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
        "alert_scope": "host:abcdefgh",
        "alert_status": "",
        "alert_type": "error",
        "email": "",
        "event_type": "service_check",
        "hostname": "abcdefgh",
        "logs_sample": "null",
        "metric_namespace": "",
        "priority": "normal",
        "user": "null",
        "username": "",
        "__aggreg_key__": "a1b2c3",
        "__alert_cycle_key__": "123456789",
        "__incident_attachments__": "null",
        "__incident_commander__": "null",
        "__incident_customer_impact__": "null",
        "__incident_fildes__": "null",
        "__incident_public_id__": "null",
        "__incident_title": "null",
        "__incident_url__": "null",
        "__org_id__": "123",
        "__org_name__": "ali",
        "__security_rule_name__": "null",
        "__security_signal_id__": "null",
        "__security_signal_severity__": "null",
        "__security_signal_title__": "null",
        "__security_signal_msg__": "null",
        "__security_signal_attributes__": "null",
        "__security_rule_id__": "null",
        "__security_rule_query__": "$SECURITY_RULE_QUERY",
        "__security_rule_group_by_fields__": "null",
        "__security_rule_type__": "null",
        "__link_snapshot_url__": "null",
        "__synthetics_test_name__": "null",
        "__synthetics_first_failing_step_name__": "null"   
    },
    "severity": "P1",
    "drill_down_query": "https://app.datadoghq.com/event/event?id=123456"     
}

Field mapping

After a Datadog alert message is ingested into Log Service, it is mapped to Log Service alert content. Example:
{
    "aliuid": "aliuid1",
    "alert_instance_id": "123456",
    "alert_id": "123456",
    "alert_type": "sls_pub",
    "alert_name": "STOP on host:abcdefgh",
    "region": "",
    "project": "",
    "project_id": 0,
    "next_eval_interval": 0,
    "alert_time": 1628647425,
    "fire_time": 1628647425,
    "fire_results": null,
    "fire_results_count": 0,
    "resolve_time": 0,
    "status": "firing",
    "results": null,
    "labels":{
        "__ali__": "ali",
        "__host__": "abcdefgh",
        "__monitor__": "monitor"
    },
    "annotations":{
        "__aggreg_key__": "1a2b3c4d",
        "__alert_cycle_key__": "123456",
        "__config_app__": "sls_pub_alert",
        "__link_edit_monitor__": "https://app.datadoghq.com/monitors#1234/edit",
        "__link_metric_graph__": "https://app.datadoghq.com/monitors/1234?to_ts=1628647485000&group=host%abcdefgh&from_ts=1628643825000",
        "__link_monitor_status__": "https://app.datadoghq.com/monitors/123?group=host%abcdefgh",
        "__link_show_processes__": "https://app.datadoghq.com/process?sort=memory%2CASC&to_ts=None&tags=host%abcdefgh&from_ts=None&live=false&showSummaryGraphs=true",
        "__link_view_izbp****hqpwt26z__": "https://app.datadoghq.com/infrastructure?filter=abcdefgh",
        "__org_id__": "579186",
        "__org_name__": "ali",
        "__pub_alert_app__": "",
        "__pub_alert_protocol__": "datadog",
        "__pub_alert_region__": "",
        "__pub_alert_service__": "",
        "alert_query": "\"datadog.agent.up\".over(\"host:abcdefgh\").by(\"host\").last(2).count_by_status()",
        "alert_scope": "host:izbp1cerzh0yyvrhqpwt26z",
        "alert_type": "error",
        "desc": "warning\nhost stop\n@webhook-test\nThe monitor was last triggered at Wed Aug 11 2021 02:03:45 UTC.\n- - -\n",
        "event_type": "service_check",
        "hostname": "abcdefgh",
        "priority": "normal",
        "title": "[P1] [Triggered on {host:abcdefgh}] STOP"
    },
    "severity": 10,
    "policy":{
        "alert_policy_id": "",
        "action_policy_id": "",
        "use_default": false,
        "repeat_interval": "0s"
    },
    "template": null,
    "drill_down_query": "https://app.datadoghq.com/event/event?id=123456"
}
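
Comparing this sample with the Datadog alert message shown earlier, alert_time and fire_time arrive as millisecond strings ("1628647425000") and appear in the Log Service alert as integer seconds (1628647425). The Python sketch below reproduces that conversion. The division by 1000 is inferred from the two samples rather than stated in this document, and to_epoch_seconds and to_utc are hypothetical helper names.

```python
from datetime import datetime, timezone


def to_epoch_seconds(ms_text):
    """Convert a millisecond timestamp string to integer epoch seconds."""
    return int(ms_text) // 1000


def to_utc(epoch_seconds):
    """Render epoch seconds as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


seconds = to_epoch_seconds("1628647425000")
print(seconds)                      # 1628647425
print(to_utc(seconds).isoformat())  # 2021-08-11T02:03:45+00:00
```

The UTC rendering agrees with the "last triggered at Wed Aug 11 2021 02:03:45 UTC" text in the desc annotation of the sample.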
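Log Service field | Datadog field | Description
aliuid | - | ID of the Alibaba Cloud account that owns the alert ingestion application used to ingest the alert.
alert_id | alert_id | ID of the alert monitoring rule.
alert_instance_id | alert_instance_id | ID of the alert message.
alert_type | - | Alert type. Fixed to sls_pub.
alert_name | alert_name | Name of the alert monitoring rule.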
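status | status | Alert status.
  • If the status field in the Datadog alert message is Triggered, Re-Triggered, No Data, Re-No Data, Warn, Re-Warn, or Renotify, status is firing.
  • If the status field in the Datadog alert message is Recovered, status is resolved.
next_eval_interval | - | Alert evaluation interval. Fixed to 0.
alert_time | alert_time | Time when the alert was triggered.
fire_time | fire_time | Time when the alert was first triggered.
resolve_time | resolve_time | Time when the alert was resolved.
  • If status is resolved, resolve_time is the value of the resolve_time field in the Datadog alert message.
  • If status is firing, resolve_time is 0.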
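labels | labels | Alert label information.
The value of the tags field under labels in the Datadog alert message is split on commas (,) into multiple strings.
  • If a string is in Key:Value format, two underscores (__) are added before and after the Key.
  • If a string is not in Key:Value format, the system turns it into a Key:Value pair in which the Key is the string wrapped in double underscores (__string__) and the Value is the string itself.
For example, "ali,host:1a2b3c4d" is parsed into the following format.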
{
    "__ali__": "ali",
    "__host__": "1a2b3c4d"
}

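In addition, every other unused field in the labels field of the Datadog alert message that has a non-empty value is added, together with its value, to the labels field of the Log Service alert message.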

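annotations | annotations | After a Datadog alert is ingested into Log Service, the following extra fields are added to the annotations field of the Log Service alert.
  • desc: Description of the alert content. Parsed from the event_msg field in the Datadog alert message.
  • title: Title of the alert message. Corresponds to the value of the title field in the Datadog alert message, which the sample Payload fills from the $EVENT_TITLE variable.

The following fields are parsed from the text_only_msg field in the Datadog alert message.

  • __link_metric_graph__: URL of the metric graph.
  • __link_monitor_status__: URL for viewing the status of the alert rule.
  • __link_edit_monitor__: URL for editing the alert rule.
  • __link_view_{$hostname}__: URL for viewing the status of the monitored host, where {$hostname} is the name of the monitored host.
  • __link_show_processes__: URL for viewing the processes currently running on the monitored host.

In addition, every other unused field in the annotations field of the Datadog alert message that has a non-empty value is added, together with its value, to the annotations field of the Log Service alert message.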

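severity | severity | Alert severity. Datadog severities map to Log Service severities as follows:
  • P1: critical
  • P2: high
  • P3: medium
  • P4: low
  • P5: report
Note: If no severity is defined in the Datadog alert, the Log Service alert severity is mapped to medium.
policy | - | Alert policy configured in the alert ingestion application. For more information, see Policy structure.
project | - | Project to which the alert center belongs. For more information, see Project.
drill_down_query | drill_down_query | Click the link in the field value to go to the management page of the Datadog alert event.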
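
The status, resolve_time, labels, and severity rules in the table above can be sketched in Python as follows. This is only an illustration of the documented rules, not the Log Service implementation. All function and constant names are hypothetical. Splitting a tag on its first colon, skipping empty tags, and raising an error for an unlisted status are assumptions that this document does not specify. Severities are returned as names because this document confirms only one numeric code (10 for P1, in the sample).

```python
# Datadog status values that the table maps to "firing".
FIRING_STATUSES = {
    "Triggered", "Re-Triggered", "No Data", "Re-No Data",
    "Warn", "Re-Warn", "Renotify",
}

# Datadog priority to Log Service severity name, as listed in the table.
SEVERITY_NAMES = {
    "P1": "critical", "P2": "high", "P3": "medium", "P4": "low", "P5": "report",
}


def map_status(datadog_status):
    """Map a Datadog status to the Log Service status."""
    if datadog_status in FIRING_STATUSES:
        return "firing"
    if datadog_status == "Recovered":
        return "resolved"
    # Assumption: the document does not say what happens to other values.
    raise ValueError(f"unlisted Datadog status: {datadog_status!r}")


def map_resolve_time(status, datadog_resolve_time):
    """Keep resolve_time only for resolved alerts; firing alerts get 0."""
    return datadog_resolve_time if status == "resolved" else 0


def map_tags(tags):
    """Split a comma-separated tags string into Log Service labels."""
    labels = {}
    for item in tags.split(","):
        item = item.strip()
        if not item:
            continue  # Assumption: empty tags are ignored.
        # Assumption: only the first colon separates Key from Value.
        key, sep, value = item.partition(":")
        # Key:Value -> __Key__: Value; bare string -> __string__: string.
        labels[f"__{key}__"] = value if sep else item
    return labels


def map_severity(priority):
    """Map a Datadog priority to a severity name; undefined maps to medium."""
    return SEVERITY_NAMES.get(priority, "medium")


print(map_tags("ali,host:abcdefgh,monitor"))
# {'__ali__': 'ali', '__host__': 'abcdefgh', '__monitor__': 'monitor'}
```

The printed labels match the labels object in the Log Service sample under Field mapping.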