DataWorks: Intelligent monitoring

Last Updated: Feb 26, 2024

This topic provides answers to some frequently asked questions about intelligent monitoring.

What do I do if I cannot receive alert notifications after I configure an alert rule in Operation Center?

Check whether an alert is triggered. If an alert is triggered but you cannot receive an alert notification, troubleshoot the issue based on the notification method that you specify. The notification methods include text message, email, and DingTalk group message.

  • Check whether an alert is triggered.

    • If an alert rule is configured for an auto triggered node, view the status of the instances that are generated for the node on the Cycle Instance page in Operation Center and check whether an alert can be triggered for the node.

      For information about the conditions for triggering an alert based on a custom alert rule, see Create a custom alert rule. For more information about the conditions for triggering a baseline alert, see Overview.

    • If an alert rule is configured for a real-time synchronization task, view the status of the real-time synchronization task. To view the status of the task, go to Operation Center. In the left-side navigation pane, choose RealTime Task > Real Time DI.

  • No alert is triggered.

    To check incomplete tasks, the system scans the previous 100 incomplete tasks. If a large number of tasks are frozen, your task may not be included in the scan. In this case, no alert is triggered for the task.

  • Alert notifications are not received over text message or email after an alert is triggered.

    Check whether the specified mobile phone numbers and email addresses of alert contacts in DataWorks are correct.

    In the left-side navigation pane of the DataWorks console, choose More > Alert Contacts. On the Alert Contacts page, you can view and specify alert contacts. To specify alert contacts, perform the steps as shown in the following figure.

    [Figure: Configure alert contact information]

    If the specified alert contacts cannot receive alert notifications after an alert is triggered, perform the following checks on the Alert Contacts page:

    • Check whether the specified mobile phone numbers and email addresses of the alert contacts are correct.

    • Check whether the alert contacts have activated the specified mobile phone numbers and email addresses.

    Note
    • Alibaba Cloud accounts and RAM users to which the AliyunDataWorksFullAccess policy is attached can specify contact information for other RAM users. For more information, see Configure and view alert contacts.

    • If the specified mobile phone numbers or email addresses of alert contacts are incorrect, the system sends alert notifications that are related to overdue payments, service suspension, and release information to the recipients on the Common Settings page. In this case, the specified alert contacts cannot receive the alert notifications.

  • Alert notifications are not received in a DingTalk group after an alert is triggered.

    Perform the following checks:

    • Check whether the webhook URL of the DingTalk chatbot is correct on the alert configuration page.

      • If baseline alert information or a custom alert rule is configured for an auto triggered node, check whether the webhook URL is correct. For example, check for extra spaces.

      • If the alert rule is configured for a real-time synchronization node, check whether the token information of the DingTalk chatbot is correct.

        [Figure: Real-time synchronization alert]

    • Check whether the DingTalk chatbot is correctly configured.

      When you add a chatbot to a DingTalk group for receiving alert notifications, set the Security Settings parameter to Custom Keywords and make sure that the keywords include DataWorks (case-sensitive). For more information, see the "Send alert notifications to a DingTalk group" section in the Create a custom alert rule topic.
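
The DingTalk checks above can be partly automated before you paste values into the alert configuration page. The following Python sketch is only an illustration: the function names are hypothetical, the example URL is a placeholder, and the assumption that the token is carried in an `access_token` query parameter is not stated in this topic, so verify it against your own chatbot settings.

```python
from urllib.parse import parse_qs, urlsplit

# The keyword that the chatbot's Custom Keywords setting must include (case-sensitive).
REQUIRED_KEYWORD = "DataWorks"


def webhook_url_problems(url: str) -> list[str]:
    """Return a list of likely problems with a webhook URL; empty means none found."""
    problems = []
    trimmed = url.strip()
    if url != trimmed:
        problems.append("leading or trailing whitespace")
    if any(ch.isspace() for ch in trimmed):
        problems.append("whitespace inside the URL")
    # Assumption: the chatbot token is passed as an access_token query parameter.
    query = parse_qs(urlsplit(trimmed).query)
    if not query.get("access_token"):
        problems.append("missing access_token query parameter")
    return problems


def keywords_include_required(keywords: list[str]) -> bool:
    """Case-sensitive check that the custom keywords include DataWorks."""
    return REQUIRED_KEYWORD in keywords


if __name__ == "__main__":
    # Placeholder URL with a trailing space, which is easy to miss when copying.
    print(webhook_url_problems("https://example.com/robot/send?access_token=PLACEHOLDER "))
    print(keywords_include_required(["dataworks"]))  # wrong case, so the check fails
```

A clean result from this sketch does not prove that the chatbot works. It only rules out the copy-and-paste mistakes that this section describes.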

What do I do if I want to disable alerting for a node?

After a baseline is created and enabled, the intelligent monitoring service monitors all nodes in the baseline and their ancestor nodes. If a node in the baseline or an ancestor node of the node affects data generation of the monitored nodes in the baseline, the intelligent monitoring service generates an event alert and sends a notification to the node owner by default. For more information, see Overview.

In the preceding figure, DataWorks has six nodes, and Nodes D and E belong to a baseline. The intelligent monitoring service monitors Nodes D and E and all their ancestor nodes. In this case, the intelligent monitoring service detects errors or slowdowns on Nodes A, B, D, and E. Nodes C and F are not monitored by the intelligent monitoring service.

  • If you want to disable alerting for Nodes D and E, contact the baseline owner to remove Nodes D and E from the baseline.

  • Nodes A and B are ancestor nodes of Nodes D and E and may affect data generation of the monitored nodes in the baseline. If an error or a slowdown occurs on Node A or B, the intelligent monitoring service generates an event alert and sends a notification to the node owner by default.

    If you want to disable alerting for Node A or B, contact the owners of Nodes D and E to delete the dependencies of Nodes D and E on Node A or B.
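
To make the monitoring scope concrete, the following Python sketch computes which nodes are monitored for a given baseline: the baseline nodes plus all of their ancestor nodes. The dependency edges are an assumption chosen to match the six-node example (the original figure is not available here), and `monitored_nodes` is a hypothetical helper, not a DataWorks API.

```python
def monitored_nodes(upstream: dict[str, list[str]], baseline_nodes: set[str]) -> set[str]:
    """Return the baseline nodes together with all of their ancestor nodes."""
    monitored = set(baseline_nodes)
    stack = list(baseline_nodes)
    while stack:
        node = stack.pop()
        # Walk upward through the direct parents of the current node.
        for parent in upstream.get(node, ()):
            if parent not in monitored:
                monitored.add(parent)
                stack.append(parent)
    return monitored


# Assumed dependencies: each key depends on the nodes in its list.
# D depends on A, E depends on B, and F depends on C and E.
UPSTREAM = {"D": ["A"], "E": ["B"], "F": ["C", "E"]}

if __name__ == "__main__":
    print(sorted(monitored_nodes(UPSTREAM, {"D", "E"})))  # → ['A', 'B', 'D', 'E']
```

In this assumed graph, Node F is a descendant of Node E and Node C is an ancestor of Node F only, so neither is monitored. Deleting the dependency of Node D on Node A (removing `"A"` from `UPSTREAM["D"]`) takes Node A out of the result, which mirrors the way alerting is disabled for an ancestor node.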

Why is a baseline in the Empty Baseline state?

In the following scenarios, a baseline may enter the Empty Baseline state:

  • Scenario 1: A node can belong to only one baseline. If you add a node to another baseline, the system removes the node from the current baseline and adds the node to the specified baseline. If all nodes are removed from a baseline, the baseline enters the Empty Baseline state.

  • Scenario 2: On the day when a baseline is created, the baseline is in the Empty Baseline state. After you enable the baseline, a baseline instance is generated on the next day.

  • Scenario 3: You specify an invalid scheduling cycle for an auto triggered node instance in an hour-level baseline.

    Note

    For example, the node is scheduled to run at 6:00 and 18:00 every day. The node has two cycles. When you configure the baseline, you need to specify 6:00 as the execution time of the node in the first cycle and 18:00 as the execution time of the node in the second cycle.
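
The example in the note can be expressed as a simple consistency check between the times at which the node is scheduled to run and the times specified for each cycle in the baseline. The following Python sketch is illustrative only: `find_cycle_mismatches` and its inputs are hypothetical and do not correspond to a DataWorks API or configuration format.

```python
from datetime import time


def find_cycle_mismatches(node_run_times: list[time], baseline_cycle_times: list[time]) -> list[str]:
    """Return readable problems; an empty list means the cycles line up."""
    problems = []
    expected = sorted(node_run_times)
    if len(baseline_cycle_times) != len(expected):
        problems.append(
            f"node has {len(expected)} cycles per day but the baseline defines "
            f"{len(baseline_cycle_times)}"
        )
    # Compare cycle by cycle: first cycle with first run time, and so on.
    for index, (want, got) in enumerate(zip(expected, baseline_cycle_times), start=1):
        if want != got:
            problems.append(
                f"cycle {index}: node runs at {want:%H:%M} but the baseline uses {got:%H:%M}"
            )
    return problems


if __name__ == "__main__":
    node_times = [time(6, 0), time(18, 0)]
    print(find_cycle_mismatches(node_times, [time(6, 0), time(18, 0)]))
    print(find_cycle_mismatches(node_times, [time(6, 0), time(12, 0)]))
```

For the node in the note, the first call reports no problems, and the second call reports that cycle 2 does not match the 18:00 run.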

Why is no alert notification sent for a baseline in the Overtime state?

Baseline monitoring is controlled by the baseline switch and takes effect on the nodes in the baseline. Overtime is a baseline state that indicates that nodes in the baseline are still incomplete when the committed completion time is reached. If all nodes in the baseline run as expected, no alert is triggered even if the baseline enters the Overtime state, because the intelligent monitoring service cannot determine which node has an error.

If the baseline enters the Overtime state while all nodes run as expected, check for the following possible causes:

  • The time information configured for the baseline is inappropriate.

  • The node dependencies are not configured properly.

Can I disable alerting about the slowdowns of nodes?

The intelligent monitoring service notifies you of a node slowdown only if a node meets both of the following conditions:

  • The node is an ancestor node of an important node in the baseline.

  • The node runs significantly slower than its historical performance.

On the Events tab in Operation Center, you can view the descendant nodes of your node that belong to a baseline. Then, confirm the impact of the slowdown with the owner of that baseline.

  • If the node slowdown has a minor impact, you can ignore the alert.

  • If the node slowdown has a major impact, maintain your node to resolve the slowdown.

Why am I unable to receive an alert notification for a node error?

The intelligent monitoring service notifies you of a node error only if a node meets one of the following conditions:

  • The node is an ancestor node of a node in a baseline that is enabled. For more information about baselines, see Manage baselines.

  • A custom alert rule is configured. For more information about how to configure a custom alert rule, see Create a custom alert rule.

What do I do if I receive an alert notification at night?

  1. Go to the DataStudio page.

    1. Log on to the DataWorks console.

    2. In the left-side navigation pane, click Workspaces.

    3. In the top navigation bar, select the region in which the workspace that you want to manage resides. On the Workspaces page, find the desired workspace and choose Shortcuts > Data Development in the Actions column.

  2. In the upper-left corner of the DataStudio page, click the icon and choose All Products > Data Development and Task Operation > Operation Center.

  3. In the left-side navigation pane, choose Alarm > Smart Baseline. On the page that appears, click the Events tab.

  4. On the Events tab, disable alerting. You can disable alerting by using one of the following methods:

    • Handle the event that triggers the alert. Then, alerting is temporarily disabled for the event.

      1. Find the event and click Handle in the Actions column.

      2. In the Handle Event dialog box, configure the Handling Time parameter.

      3. Click OK.

        Note

        DataWorks records the event handling operation and temporarily disables alerting for the event while the event is being handled.

    • Ignore the event that triggers the alert. Then, alerting is permanently disabled for the event.

      1. Find the event and click Ignore in the Actions column.

      2. In the Ignore Event message, click OK.

        Note

        DataWorks records the event ignoring operation and permanently disables alerting for the event.