Amazon CloudWatch monitors Amazon Web Services (AWS) resources and the applications that run on AWS in real time. CloudWatch can use Amazon Simple Notification Service (Amazon SNS) to send alerts. To send CloudWatch alerts to Log Service, you only need to configure the webhook URL that is provided by the alert ingestion system of Log Service as a subscription endpoint in Amazon SNS. The alerting system of Log Service then processes the alerts, for example, by denoising them and sending alert notifications.

Prerequisites

An alert ingestion application whose Protocol is CloudWatch is created. For more information, see Configure webhook URLs for alert ingestion.

Configure CloudWatch

  1. Log on to the AWS Management Console.
  2. Create an SNS topic.
    Configure the following required parameters in the Amazon SNS console. For more information, see Creating an Amazon SNS topic.
    • Type: The type of the topic. Select Standard.
    • Name: The name of the topic.
  3. Subscribe to the SNS topic.
    Configure the following required parameters in the Amazon SNS console. For more information, see Subscribing to an Amazon SNS topic.
    • Topic ARN: Enter the Amazon Resource Name (ARN) of the topic that you created in Step 2.
    • Protocol: Select HTTP.
    • Endpoint: Enter the full webhook URL that is generated after you create an alert ingestion service and an alert ingestion application in the alert ingestion system of Log Service. For more information, see Obtain webhook URLs.
    • Enable raw message delivery: Select Enable raw message delivery.
    After you create the subscription, it enters the Pending confirmation state, and Amazon SNS sends a subscription confirmation message to Log Service. When Log Service receives the message, it automatically accesses the link in the message. The subscription then changes to the Confirmed state.
    Note: If the subscription is not confirmed, select the subscription and click Request confirmation to send another subscription confirmation message. If the subscription still fails, view the error log in Troubleshooting Center of Alert Center in the Log Service console.
  4. Select the alert that you want to ingest into Log Service and configure the notification methods.
    You must configure two notification methods on the alert editing page in the Amazon CloudWatch console. For more information, see To edit an alarm.
    • Alarm state trigger: Select the state at which alert notifications are triggered.
      • For one notification method, select In alarm or Insufficient data for Alarm state trigger. If an alert is in the selected state, CloudWatch sends an alert notification.
      • For the other notification method, select OK for Alarm state trigger. If an alert is cleared, CloudWatch sends a recovery notification.
    • Select an SNS topic: Select Select an existing SNS topic.
    • Send a notification to…: Select the topic that you created in Step 2.
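The console steps above can also be scripted. The following is a minimal sketch, not an official procedure: it only builds request parameters shaped for the boto3 calls `sns.subscribe()` and `cloudwatch.put_metric_alarm()`. The function names, the topic ARN, and the webhook URL are hypothetical placeholders, and the protocol is set to HTTP because that is what Step 3 selects.

```python
def build_subscribe_params(topic_arn: str, webhook_url: str) -> dict:
    """Keyword arguments shaped for boto3's sns.subscribe() (Step 3)."""
    return {
        "TopicArn": topic_arn,
        "Protocol": "http",  # Step 3 selects HTTP
        "Endpoint": webhook_url,  # webhook URL obtained from Log Service
        # Equivalent of selecting "Enable raw message delivery".
        "Attributes": {"RawMessageDelivery": "true"},
        "ReturnSubscriptionArn": True,
    }


def build_alarm_action_params(topic_arn: str,
                              include_insufficient_data: bool = False) -> dict:
    """Action lists to merge into an existing alarm definition (Step 4).

    One notification covers the firing state and one covers recovery (OK).
    """
    params = {
        "AlarmActions": [topic_arn],  # In alarm
        "OKActions": [topic_arn],     # recovery notification
    }
    if include_insufficient_data:
        params["InsufficientDataActions"] = [topic_arn]
    return params


if __name__ == "__main__":
    # Placeholder values for illustration only.
    arn = "arn:aws:sns:us-east-2:123456:my-topic"
    print(build_subscribe_params(arn, "http://example.com/webhook"))
    print(build_alarm_action_params(arn, include_insufficient_data=True))
```

With boto3 installed and credentials configured, the first dictionary could be passed as `boto3.client("sns").subscribe(**params)`. The action lists are not a complete `put_metric_alarm()` request; they would need to be merged with the rest of the alarm definition.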

CloudWatch alerts

CloudWatch alerts are classified into two types: alerts that are created based on static thresholds and alerts that are created based on anomaly detection. The values of the Trigger field vary based on the type of an alert. For more information, see AWS::CloudWatch::Alarm.
  • For alerts that are created based on static thresholds, the value of the Trigger field contains fields such as MetricName and Dimensions.
  • For alerts that are created based on anomaly detection, the value of the Trigger field contains fields such as Metrics. The value of the Metrics field is a list of metrics.
  • Alerts that are created based on static thresholds
    {
        "AlarmName": "test-alert",
        "AlarmDescription": "this is a test alert",
        "AWSAccountId": "123456",
        "NewStateValue": "ALARM",
        "NewStateReason": "Threshold Crossed: 1 out of the last 1 datapoints [1.0 (04/08/21 03:06:00)] was greater than or equal to the threshold (1.0) (minimum 1 datapoint for OK -> ALARM transition).",
        "StateChangeTime": "2021-08-04T03:10:10.215+0000",
        "Region": "US East (Ohio)",
        "AlarmArn": "arn:aws:cloudwatch:us-east-2:123456:alarm:test-alert",
        "OldStateValue": "OK",
        "Trigger":
        {
            "MetricName": "NumberOfMessagesPublished",
            "Namespace": "AWS/SNS",
            "StatisticType": "Statistic",
            "Statistic": "SUM",
            "Unit": null,
            "Dimensions":
            [
                {
                    "value": "my-topic",
                    "name": "TopicName"
                }
            ],
            "Period": 60,
            "EvaluationPeriods": 1,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "Threshold": 1.0,
            "TreatMissingData": "- TreatMissingData:                    missing",
            "EvaluateLowSampleCountPercentile": ""
        }
    }
  • Alerts that are created based on anomaly detection
    {
        "AlarmName": "cpu alrm",
        "AlarmDescription": "this is a cpu alarm",
        "AWSAccountId": "123456",
        "NewStateValue": "INSUFFICIENT_DATA",
        "NewStateReason": "Threshold Crossed: no datapoints were received for 2 periods and 2 missing datapoints were treated as [Breaching].",
        "StateChangeTime": "2021-08-05T08:38:47.104+0000",
        "Region": "US East (Ohio)",
        "AlarmArn": "arn:aws:cloudwatch:us-east-2:123456:alarm:cpu alrm",
        "OldStateValue": "OK",
        "Trigger":
        {
            "Period": 60,
            "EvaluationPeriods": 2,
            "ComparisonOperator": "GreaterThanUpperThreshold",
            "ThresholdMetricId": "ad1",
            "TreatMissingData": "- TreatMissingData:                    breaching",
            "EvaluateLowSampleCountPercentile": "",
            "Metrics":
            [
                {
                    "Id": "m1",
                    "MetricStat":
                    {
                        "Metric":
                        {
                            "Dimensions":
                            [
                                {
                                    "value": "i-1a2b3c4d",
                                    "name": "InstanceId"
                                }
                            ],
                            "MetricName": "CPUUtilization",
                            "Namespace": "AWS/EC2"
                        },
                        "Period": 60,
                        "Stat": "Average"
                    },
                    "ReturnData": true
                },
                {
                    "Expression": "ANOMALY_DETECTION_BAND(m1, 0.1)",
                    "Id": "ad1",
                    "Label": "CPUUtilization (expected)",
                    "ReturnData": true
                }
            ]
        }
    }
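
The two payload shapes above can be told apart by inspecting the Trigger field. The following is a minimal sketch under the assumption that the presence of a Metrics list is enough to identify an anomaly detection alert, as described in this section. The function name is hypothetical, and the returned strings mirror the values of the __cloud_watch_alert_type__ annotation.

```python
def cloudwatch_alert_type(alert: dict) -> str:
    """Classify a CloudWatch alert payload by the shape of its Trigger field."""
    trigger = alert.get("Trigger") or {}
    # Anomaly detection alerts carry a "Metrics" list; static threshold
    # alerts carry "MetricName" and "Dimensions" directly in Trigger.
    if "Metrics" in trigger:
        return "AnomalyDetection"
    return "StaticThreshold"


if __name__ == "__main__":
    static_alert = {"Trigger": {"MetricName": "NumberOfMessagesPublished",
                                "Dimensions": []}}
    anomaly_alert = {"Trigger": {"ThresholdMetricId": "ad1",
                                 "Metrics": [{"Id": "m1"}, {"Id": "ad1"}]}}
    print(cloudwatch_alert_type(static_alert))
    print(cloudwatch_alert_type(anomaly_alert))
```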

Field mappings

After a CloudWatch alert is ingested into Log Service, the alert is converted to a Log Service alert based on field mappings. The following samples show the Log Service alerts that result from the preceding CloudWatch alerts:

  • Alerts that are created based on static thresholds
    {
        "aliuid": "aliuid1",
        "alert_instance_id": "{Automatically generated}",
        "alert_id": "CloudWatch_test-alert",
        "alert_type": "sls_pub",
        "alert_name": "test-alert",
        "region": "{The region of the project to which Alert Center belongs}",
        "project": "{The project to which Alert Center belongs}",
        "project_id": 0,
        "next_eval_interval": 60,
        "alert_time": 1628046610,
        "fire_time": 1628046610,
        "fire_results": null,
        "fire_results_count": 0,
        "resolve_time": 0,
        "status": "firing",
        "results": null,
        "labels":
        {
            "TopicName": "my-topic",
            "__comparison_operator__": "GreaterThanOrEqualToThreshold",
            "__statistic__": "SUM",
            "__statistic_type__": "Statistic",
            "__threshold__": "1",
            "metric_name": "NumberOfMessagesPublished"
        },
        "annotations":
        {
            "__alarm_arn__": "arn:aws:cloudwatch:us-east-2:123456:alarm:test-alert",
            "__aws_accountId__": "123456",
            "__aws_region__": "US East (Ohio)",
            "__cloud_watch_alert_type__": "StaticThreshold",
            "__config_app__": "sls_pub_alert",
            "__pub_alert_app__": "{The ID of the alert ingestion application}",
            "__pub_alert_protocol__": "cloud_watch",
            "__pub_alert_region__": "{The region of the endpoint to which the alert is sent}",
            "__pub_alert_service__": "{The ID of the alert ingestion service}",
            "desc": "this is a test alert",
            "title": "Threshold Crossed: 1 out of the last 1 datapoints [1.0 (04/08/21 03:06:00)] was greater than or equal to the threshold (1.0) (minimum 1 datapoint for OK -> ALARM transition)."
        },
        "severity": 10,
        "policy":
        {
            "alert_policy_id": "{The ID of the alert policy that is specified for the alert ingestion application}",
            "action_policy_id": "{The ID of the action policy that is specified for the alert ingestion application}",
            "use_default": false,
            "repeat_interval": "{The cycle that is specified for the alert ingestion application}"
        },
        "template": null,
        "drill_down_query": "https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#alarmsV2:alarm/test-alert"
    }
  • Alerts that are created based on anomaly detection
    {
        "aliuid": "aliuid1",
        "alert_instance_id": "{Automatically generated}",
        "alert_id": "CloudWatch_cpu alrm",
        "alert_type": "sls_pub",
        "alert_name": "cpu alrm",
        "region": "{The region of the project to which Alert Center belongs}",
        "project": "{The project to which Alert Center belongs}",
        "project_id": 0,
        "next_eval_interval": 120,
        "alert_time": 1628152727,
        "fire_time": 1628152727,
        "fire_results": null,
        "fire_results_count": 0,
        "resolve_time": 0,
        "status": "firing",
        "results": null,
        "labels":
        {
            "__comparison_operator__": "GreaterThanUpperThreshold",
            "__threshold_metricId__": "ad1"
        },
        "annotations":
        {
            "__alarm_arn__": "arn:aws:cloudwatch:us-east-2:123456:alarm:cpu alrm",
            "__aws_accountId__": "123456",
            "__aws_region__": "US East (Ohio)",
            "__cloud_watch_alert_type__": "AnomalyDetection",
            "__config_app__": "sls_pub_alert",
            "__pub_alert_app__": "{The ID of the alert ingestion application}",
            "__pub_alert_protocol__": "cloud_watch",
            "__pub_alert_region__": "{The region of the endpoint to which the alert is sent}",
            "__pub_alert_service__": "{The ID of the alert ingestion service}",
            "desc": "this is a cpu alarm",
            "title": "Threshold Crossed: no datapoints were received for 2 periods and 2 missing datapoints were treated as [Breaching]."
        },
        "severity": 8,
        "policy":
        {
            "alert_policy_id": "{The ID of the alert policy that is specified for the alert ingestion application}",
            "action_policy_id": "{The ID of the action policy that is specified for the alert ingestion application}",
            "use_default": false,
            "repeat_interval": "{The cycle that is specified for the alert ingestion application}"
        },
        "template": null,
        "drill_down_query": "https://us-east-2.console.aws.amazon.com/cloudwatch/home?region=us-east-2#alarmsV2:alarm/cpu%20alrm"
    }
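
The time fields and the drill_down_query URL in the samples above can be reproduced from the CloudWatch payload. The following sketch is inferred from the sample values only and is not the Log Service implementation: it assumes that the time fields are Unix timestamps in seconds and that the URL is built from the region segment of AlarmArn and the URL-encoded alarm name. Both function names are hypothetical.

```python
from datetime import datetime
from urllib.parse import quote


def state_change_epoch(state_change_time: str) -> int:
    """Convert a StateChangeTime string to a Unix timestamp in seconds."""
    dt = datetime.strptime(state_change_time, "%Y-%m-%dT%H:%M:%S.%f%z")
    return int(dt.timestamp())  # milliseconds are truncated


def drill_down_url(alarm_arn: str, alarm_name: str) -> str:
    """Build a CloudWatch console URL in the shape shown in the samples."""
    # ARN layout: arn:aws:cloudwatch:<region>:<account>:alarm:<name>
    region = alarm_arn.split(":")[3]
    return (f"https://{region}.console.aws.amazon.com/cloudwatch/home"
            f"?region={region}#alarmsV2:alarm/{quote(alarm_name)}")


if __name__ == "__main__":
    print(state_change_epoch("2021-08-04T03:10:10.215+0000"))  # 1628046610
    print(drill_down_url(
        "arn:aws:cloudwatch:us-east-2:123456:alarm:cpu alrm", "cpu alrm"))
```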

The following table describes the field mappings between Log Service and CloudWatch alerts.

Log Service field CloudWatch field Description
aliuid None The ID of the Alibaba Cloud account to which the alert ingestion application belongs.
alert_id None The ID of the alert monitoring rule.

The value of the alert_id field is in the CloudWatch_{$alert_name} format. {$alert_name} is the name of the alert monitoring rule.

alert_type None The type of the alert. The value is fixed as sls_pub.
alert_name AlarmName The name of the alert monitoring rule.
status NewStateValue The status of the alert.
  • If the value of the NewStateValue field in the CloudWatch alert is ALARM or INSUFFICIENT_DATA, the value of the status field in the Log Service alert is firing.
  • If the value of the NewStateValue field in the CloudWatch alert is OK, the value of the status field in the Log Service alert is resolved.
next_eval_interval Period and EvaluationPeriods The interval at which the alert is evaluated. The value is the product of the values of the Period field and the EvaluationPeriods field in the CloudWatch alert.
alert_time StateChangeTime The time at which the alert is triggered.
fire_time StateChangeTime The time at which the alert is first triggered.
resolve_time StateChangeTime The time at which the alert is cleared.
  • If the value of the status field is firing, the value of the resolve_time field is 0.
  • If the value of the status field is resolved, the value of the resolve_time field is the value of the StateChangeTime field in the CloudWatch alert.
labels None The labels of the alert.
  • Alerts that are created based on static thresholds
    • The following fields and their values are added to the labels field. The added fields are renamed in the Log Service alert.
      • ComparisonOperator is renamed __comparison_operator__.
      • MetricName is renamed __metric_name__.
      • StatisticType is renamed __statistic_type__.
      • Statistic is renamed __statistic__.
      • Threshold is renamed __threshold__.
    • The value of each name field in the Dimensions field is added as a field to the labels field, and the value of each value field in the Dimensions field is added as a value to the labels field. The added fields and values form key-value pairs in the labels field.
  • Alerts that are created based on anomaly detection
    The following fields and their values are added to the labels field. The added fields are renamed in the Log Service alert.
    • ComparisonOperator is renamed __comparison_operator__.
    • ThresholdMetricId is renamed __threshold_metricId__.
annotations None The annotations of the alert. The following fields are added to the annotations field:
  • desc: the description of the alert. The description is the value of the AlarmDescription field in the CloudWatch alert.
  • title: the title of the alert. The title is the value of the NewStateReason field in the CloudWatch alert.
  • __cloud_watch_alert_type__: the type of the CloudWatch alert.
    • For alerts that are created based on static thresholds, the field value is StaticThreshold.
    • For alerts that are created based on anomaly detection, the field value is AnomalyDetection.
  • All unused fields except the Trigger field are added to the annotations field.

    The added fields are renamed by adding two underscores (__) before and after the field names. The new names are in lowercase. If the name of an added field contains multiple words, the words are separated by underscores (_). For example, the AlarmArn field is renamed __alarm_arn__.

severity NewStateValue The severity of the alert.
  • If the value of the NewStateValue field in the CloudWatch alert is ALARM, the value of the severity field is 10, which indicates the Critical severity in Log Service alerts.
  • If the value of the NewStateValue field in the CloudWatch alert is INSUFFICIENT_DATA, the value of the severity field is 8, which indicates the High severity in Log Service alerts.
  • If the value of the NewStateValue field in the CloudWatch alert is OK, the value of the severity field varies based on the value of the OldStateValue field in the CloudWatch alert.
policy None The alert policy that is specified for the alert ingestion application. For more information, see Description of the policy variable.
project None The project to which Alert Center belongs. For more information, see Project.
drill_down_query None The URL to the CloudWatch alert.
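
The mappings in the table can be summarized in code. The following is a minimal sketch of the documented rules for the status, severity, next_eval_interval, and labels fields, not the actual Log Service converter. The function names are hypothetical. The severity for the OK state is left as None because the table only states that it depends on OldStateValue. The threshold is formatted as a compact string to match the "1" shown in the sample, which is an assumption. The label key for MetricName follows the table (`__metric_name__`), whereas the sample above shows `metric_name`.

```python
def build_labels(trigger: dict) -> dict:
    """Build the labels field from the Trigger field of a CloudWatch alert."""
    labels = {"__comparison_operator__": trigger["ComparisonOperator"]}
    if "Metrics" in trigger:  # anomaly detection alert
        labels["__threshold_metricId__"] = trigger["ThresholdMetricId"]
        return labels
    # Static threshold alert: renamed fields plus one label per dimension.
    labels["__metric_name__"] = trigger["MetricName"]
    labels["__statistic_type__"] = trigger["StatisticType"]
    labels["__statistic__"] = trigger["Statistic"]
    labels["__threshold__"] = format(trigger["Threshold"], "g")  # assumed format
    for dim in trigger.get("Dimensions", []):
        labels[dim["name"]] = dim["value"]
    return labels


def map_alert(alert: dict) -> dict:
    """Map a CloudWatch alert to a subset of Log Service alert fields."""
    state = alert["NewStateValue"]
    if state in ("ALARM", "INSUFFICIENT_DATA"):
        status = "firing"
    elif state == "OK":
        status = "resolved"
    else:
        raise ValueError(f"unexpected NewStateValue: {state}")
    trigger = alert["Trigger"]
    return {
        "alert_id": f"CloudWatch_{alert['AlarmName']}",
        "alert_type": "sls_pub",
        "alert_name": alert["AlarmName"],
        "status": status,
        # 10 = Critical, 8 = High; the OK case is not specified in the table.
        "severity": {"ALARM": 10, "INSUFFICIENT_DATA": 8}.get(state),
        "next_eval_interval": trigger["Period"] * trigger["EvaluationPeriods"],
        "labels": build_labels(trigger),
    }


if __name__ == "__main__":
    sample = {
        "AlarmName": "test-alert",
        "NewStateValue": "ALARM",
        "Trigger": {
            "MetricName": "NumberOfMessagesPublished",
            "StatisticType": "Statistic",
            "Statistic": "SUM",
            "Dimensions": [{"value": "my-topic", "name": "TopicName"}],
            "Period": 60,
            "EvaluationPeriods": 1,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
            "Threshold": 1.0,
        },
    }
    print(map_alert(sample))
```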