Fully managed Flink allows you to configure alert rules for running jobs. If an alert rule is triggered while a job is running, the system sends you an alert so that you can detect and handle exceptions promptly. This topic describes how to configure alert rules in the console of fully managed Flink.

Prerequisites

Application Real-Time Monitoring Service (ARMS) is activated. For more information, see Activate and upgrade ARMS.

Background information

You can configure alert rules in the console of fully managed Flink, which requires only a few parameters. You can also configure alert rules in the ARMS console, but there you must specify additional information, such as the cluster, type, dashboard, and a Prometheus Query Language (PromQL) statement, which makes that method more complex and time-consuming. For more information, see Configure monitoring alerts (in the ARMS console).

Create a custom rule

  1. Log on to the Realtime Compute for Apache Flink console.
  2. On the Fully Managed Flink tab, find the workspace that you want to manage, and click Console in the Actions column.
  3. In the left-side navigation pane, click Deployments.
  4. Click the name of the job for which you want to create a custom alert rule.
  5. Click the Alarm Configuration tab.
  6. Click the Alarm Rules tab.
  7. Choose Add Rule > Custom Rule.
    If you have configured an alert template in the Administration section of the Realtime Compute for Apache Flink console, you can instead select Create Rule by Template from the Add Rule drop-down list and then perform the subsequent steps. This speeds up alert rule configuration. For more information about how to create an alert template, see Create an alert template.
  8. Enter information about the alert rule.
    Rule information
    Rule
      Rule Name: The name of the rule. The name must be 3 to 64 characters in length and can contain lowercase letters, digits, and underscores (_). It must start with a letter.
      Description: The description of the rule.
      Content: The condition that triggers the alert. After the condition is configured, the system compares the value of the specified metric with the threshold at the specified interval. If the result meets the condition, the alert is automatically triggered.
        • Metrics: The following three metrics are supported:
          • Restart Count in 1 Minute: the number of times that the JobManager restarts within 1 minute.
          • Checkpoint Count in 5 Minutes: the number of checkpoints that succeed within 5 minutes.
          • Emit Delay: the business delay, which is the difference between the time when data is generated and the time when the data leaves the source operator. Unit: seconds.
        • Time interval N: A check is performed once every N minutes. N must be less than or equal to 60. Unit: minutes.
        • Comparator: The greater-than-or-equal-to (>=) and less-than-or-equal-to (<=) operators are supported.
        • Thresholds: The value with which the metric is compared.
      Effective Time: The period during which the alert rule is effective. You can specify a period, such as 09:00 to 18:00. By default, the alert rule is effective throughout the day.
      Alarm Rate: The frequency at which alert notifications are sent. Valid values: 1 Min and 1 Day. An alert notification is sent once per minute or once per day.
    Notification
      Notify Way: The following methods are supported:
        • DingTalk
        • Email
        • SMS
        Note: You can configure the phone number, email address, and DingTalk ID of a contact in contacts.
      Contact Group: The contact group or contact that receives alert notifications. You can add or edit contact groups and contacts.
        To send alert notifications to a DingTalk chatbot, perform the following steps:
        1. Click Edit Contacts Group and Contacts.
        2. On the Contact tab, click Add Contact.
        3. In the DingRobot field, enter the webhook address of the DingTalk chatbot.
          Before you specify the DingRobot parameter, you must create a DingTalk chatbot and obtain its webhook address. For more information, see Add a custom DingTalk chatbot and obtain the webhook URL.
          Notice: In the Security Settings section of the DingTalk chatbot, you must select at least Custom Keywords and add Alarm as a keyword. Otherwise, you cannot receive alert notifications. You can add more than one keyword.
  9. Click Save.
    By default, the saved alert rule is enabled and appears in the alert rule list. You can stop, edit, or delete the alert rule.
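
The constraints on a rule described in step 8 can be sketched in code. The following Python example is only an illustration under stated assumptions: the AlertRule class, its field names, and the helper is_valid_rule_name are hypothetical and are not part of any fully managed Flink API. The lower bound of 1 minute for the time interval is also an assumption, because the console description states only the upper bound of 60 minutes.

```python
import re
from dataclasses import dataclass

# 3 to 64 characters: lowercase letters, digits, and underscores,
# starting with a letter (only lowercase letters are allowed).
RULE_NAME_RE = re.compile(r"[a-z][a-z0-9_]{2,63}")


def is_valid_rule_name(name: str) -> bool:
    """Return True if the name satisfies the Rule Name constraints."""
    return RULE_NAME_RE.fullmatch(name) is not None


@dataclass
class AlertRule:
    """Hypothetical in-memory model of a custom alert rule."""
    name: str
    metric: str            # for example, "Restart Count in 1 Minute"
    interval_minutes: int  # "Time interval N"
    comparator: str        # ">=" or "<="
    threshold: float

    def validate(self) -> None:
        """Raise ValueError if a field violates the documented limits."""
        if not is_valid_rule_name(self.name):
            raise ValueError("invalid rule name")
        # Upper bound is documented; the lower bound of 1 is assumed.
        if not 1 <= self.interval_minutes <= 60:
            raise ValueError("interval must be at most 60 minutes")
        if self.comparator not in (">=", "<="):
            raise ValueError("comparator must be >= or <=")

    def is_triggered(self, metric_value: float) -> bool:
        """Compare a sampled metric value with the threshold."""
        if self.comparator == ">=":
            return metric_value >= self.threshold
        return metric_value <= self.threshold


rule = AlertRule("restart_alert_1", "Restart Count in 1 Minute", 1, ">=", 3)
rule.validate()
```

With this sketch, rule.is_triggered(4) evaluates to True because 4 >= 3, whereas a rule named 1restart or a rule with a 61-minute interval is rejected by validate().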

Create an alert template

  1. Log on to the Realtime Compute for Apache Flink console.
  2. On the Fully Managed Flink tab, find the workspace that you want to manage, and click Console in the Actions column.
  3. In the left-side navigation pane, click Deployments.
  4. Click the name of the job for which you want to create an alert template.
  5. Click the Alarm Configuration tab.
  6. Click the Alarm Rules tab.
  7. Choose Add Rule > Create Rule by Template > Add Rule Template.
  8. Enter information about the rule.
    Alert template
    Rule
      Rule Name: The name of the rule. The name must be 3 to 64 characters in length and can contain lowercase letters, digits, and underscores (_). It must start with a letter.
      Description: The description of the rule.
      Content: The condition that triggers the alert. After the condition is configured, the system compares the value of the specified metric with the threshold at the specified interval. If the result meets the condition, the alert is automatically triggered.
        • Metrics: The following three metrics are supported:
          • Restart Count in 1 Minute: the number of times that the JobManager restarts within 1 minute.
          • Checkpoint Count in 5 Minutes: the number of checkpoints that succeed within 5 minutes.
          • Emit Delay: the business delay, which is the difference between the time when data is generated and the time when the data leaves the source operator. Unit: seconds.
        • Time interval N: A check is performed once every N minutes. N must be less than or equal to 60. Unit: minutes.
        • Comparator: The greater-than-or-equal-to (>=) and less-than-or-equal-to (<=) operators are supported.
        • Thresholds: The value with which the metric is compared.
      Effective Time: The period during which the alert rule is effective. You can specify a period, such as 09:00 to 18:00. By default, the alert rule is effective throughout the day.
      Alarm Rate: The frequency at which alert notifications are sent. Valid values: 1 Min and 1 Day. An alert notification is sent once per minute or once per day.
    Notification
      Notify Way: The following methods are supported:
        • DingTalk
        • Email
        • SMS
        Note: You can configure the phone number, email address, and DingTalk ID of a contact in contacts.
      Contact Group: The contact group or contact that receives alert notifications. You can add or edit contact groups and contacts.
        To send alert notifications to a DingTalk chatbot, perform the following steps:
        1. Click Edit Contacts Group and Contacts.
        2. On the Contact tab, click Add Contact.
        3. In the DingRobot field, enter the webhook address of the DingTalk chatbot.
          Before you specify the DingRobot parameter, you must create a DingTalk chatbot and obtain its webhook address. For more information, see Add a custom DingTalk chatbot and obtain the webhook URL.
          Notice: In the Security Settings section of the DingTalk chatbot, you must select at least Custom Keywords and add Alarm as a keyword. Otherwise, you cannot receive alert notifications. You can add more than one keyword.
  9. Click Confirm.
    The saved alert template appears in the alert template list. You can edit or delete the template.
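
To check that a DingTalk chatbot accepts messages that contain the Alarm keyword, you can send it a test message yourself. The following Python sketch only builds such a request and does not send it. It assumes that the chatbot accepts a JSON body with a text message, which is the commonly used format for custom DingTalk chatbots; confirm the format in the DingTalk documentation. The function name build_dingtalk_request and the webhook URL shown are placeholders, not values from this topic.

```python
import json
import urllib.request

# Keyword that must be configured under Custom Keywords for the chatbot.
KEYWORD = "Alarm"


def build_dingtalk_request(webhook_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a text test message.

    The keyword is prepended when the message does not already contain it,
    because a chatbot with Custom Keywords enabled is expected to reject
    messages that contain none of its keywords.
    """
    if KEYWORD not in message:
        message = f"{KEYWORD}: {message}"
    # Assumed payload shape for a plain-text message.
    payload = {"msgtype": "text", "text": {"content": message}}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; replace it with the webhook address of your chatbot.
request = build_dingtalk_request(
    "https://example.invalid/robot/send?access_token=PLACEHOLDER",
    "test message from the alert configuration check",
)
```

To actually send the message, pass the request to urllib.request.urlopen after you replace the placeholder URL with the real webhook address.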