DataWorks: Intelligent diagnosis

Last Updated: Apr 03, 2025

The intelligent diagnosis feature allows you to perform end-to-end diagnosis on task instances. If a task instance does not run as expected, you can use this feature to identify the problem.

Background information

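You can use the intelligent diagnosis feature to diagnose and analyze task instances from the following dimensions, each of which is described in a section of this topic: the running status of the current instance, the basic information about the instance, the affected baselines, and the status of historical instances.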

Limits

  • Only users of DataWorks Professional Edition or a more advanced edition can use the intelligent diagnosis feature. If you use another edition, you can try the feature for free. However, we recommend that you upgrade to DataWorks Professional Edition to use more features. For more information, see Differences among DataWorks editions.

  • The intelligent diagnosis feature is supported only in the following regions: China (Hangzhou), China (Shanghai), China (Beijing), China (Zhangjiakou), China (Shenzhen), China (Hong Kong), Japan (Tokyo), Singapore, Malaysia (Kuala Lumpur), Indonesia (Jakarta), Germany (Frankfurt), US (Silicon Valley), US (Virginia), and UAE (Dubai).
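
If you script environment checks before you direct users to this feature, the region limit above can be expressed as a simple lookup. The following is a minimal sketch and not a DataWorks API: the function name is_diagnosis_supported is hypothetical, and the set holds the region display names exactly as they are listed above, not region IDs.

```python
# Region display names copied from the Limits list above.
# Illustrative lookup only; this is not part of any DataWorks SDK.
SUPPORTED_REGIONS = frozenset({
    "China (Hangzhou)",
    "China (Shanghai)",
    "China (Beijing)",
    "China (Zhangjiakou)",
    "China (Shenzhen)",
    "China (Hong Kong)",
    "Japan (Tokyo)",
    "Singapore",
    "Malaysia (Kuala Lumpur)",
    "Indonesia (Jakarta)",
    "Germany (Frankfurt)",
    "US (Silicon Valley)",
    "US (Virginia)",
    "UAE (Dubai)",
})


def is_diagnosis_supported(region_name: str) -> bool:
    """Return True if the region display name appears in the supported list."""
    return region_name.strip() in SUPPORTED_REGIONS
```

The list reflects this topic as of its last update, so a check like this needs to be kept in sync with the documentation.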

Go to the Intelligent Diagnosis page

  1. Go to the Operation Center page.

    Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose Data Development and O&M > Operation Center. On the page that appears, select the desired workspace from the drop-down list and click Go to Operation Center.

  2. On the Operation Center page, use one of the following methods to go to the Intelligent Diagnosis page:

    • Method 1: Go to the Intelligent Diagnosis page of an instance.

      • In the left-side navigation pane, choose Auto Triggered Node O&M > Auto Triggered Instances. On the Instance Perspective tab, find the desired instance and click Perform Diagnostics in the Actions column to go to the Intelligent Diagnosis page.

      • In the left-side navigation pane, choose Auto Triggered Node O&M > Auto Triggered Instances. On the Instance Perspective tab, find the desired instance and click DAG in the Actions column. On the DAG page of the desired instance, right-click the instance and select Instance Diagnose.

    • Method 2: In the left-side navigation pane, choose O&M Assistant > Intelligent Diagnosis.

      Note

      The intelligent diagnosis feature allows you to search for instances only by instance ID.

View the status of the current instance

On the Running Details tab, DataWorks checks the conditions required for running an instance in sequence: the status of the ancestor instances of the current instance, the scheduling time configured for the current instance, the usage of scheduling resources, and the status of the current instance.

  • Upstream Nodes

    In the Upstream Nodes step on the Running Details tab of the Intelligent Diagnosis page, you can view the status of ancestor instances. If an ancestor instance fails to run, the current instance is blocked. You can click Instance Diagnose in the Operation column of the ancestor instance to identify the reason for the failure.

    Note

    If some ancestor instances of the current instance have not run and the dependencies between the current instance and its ancestor instances are complex, we recommend that you use the upstream analysis feature on the Upstream Analysis tab of the DAG page to identify the key ancestor instances that block the current instance. Then, you can use the intelligent diagnosis feature to identify why those ancestor instances have not run. This improves O&M efficiency.

  • Timing Check

    In the Timing Check step, you can check whether the scheduling time configured for the current instance has arrived. The check is triggered only when the upstream dependency check is successful.

    Note

    When you configure scheduling properties on the DataStudio page for the task that generates the current instance, you must specify the time at which the scheduling system is to run the task. However, the task may actually start later than its scheduling time because of issues such as the failure of an ancestor task.

  • Resources

    In the Resources step, you can view the resource usage and the list of instances that occupy resources while the current instance waits for resources. If the current instance fails the resource usage check, the scheduling resources required to run it are insufficient, and the instance can start to run only after scheduling resources are released. Based on the information in the Resources step, you can adjust the scheduling time of the current instance to avoid peak hours.

    The Resources step contains the following sections:

    • Scheduling resource information

      Allows you to view the name of the resource group for scheduling that is used by the current instance, the number of instances that are running on the resource group for scheduling, and the number of instances that are waiting to be run on the resource group for scheduling.

      Note

      We recommend that you use serverless resource groups to ease scheduling resource constraints.

      The peak hours for DataWorks tasks are from 00:00 to 09:00 every day. If you use the shared resource group for scheduling during the peak hours, resources in the resource group may be insufficient, and tasks may wait for resources.

    • Diagnosis Results

      Allows you to view the execution status of the current task.

    • Resource Usage Trends

      Allows you to view the resource usage of the current resource group for scheduling within each time period and the time consumed by the current instance to wait for resources if you use shared resource groups for scheduling.

  • Execution

    In the Execution step, you can view the run logs of the current instance, details of the associated data quality monitoring rules, and the code of the node for which the current instance is generated. For an instance that fails to run, the intelligent diagnosis feature provides diagnosis results and suggestions based on log information. This helps you identify the cause of the error.

    The Execution step contains the following tabs:

    • Log

      Allows you to view the running details of the current instance.

      In the Execution step of the Running Details tab, you can click the URL of the web UI for EMR on the Log tab to go to the web page of EMR and view the resource information of EMR. You can also click Intelligent Diagnostics in the lower-right corner to go to the Intelligent Diagnostics tab and analyze errors.

    • Intelligent Diagnostics

      Allows you to analyze error logs of the current instance by using LLMs. Tongyi Qianwen, DeepSeek, and DW Knowledge Base are supported.

      You can use Tongyi Qianwen and DeepSeek to analyze and parse error logs to generate analysis results and suggestions. You can also view suggestions in DW Knowledge Base.

      Note

      After the analysis is complete, you can also perform the following operations: edit the instance code, rerun the instance, set the instance status to success, change the resource group for scheduling and resource group for Data Integration, submit a ticket, and apply for table permissions.

    • DQC

      Allows you to view details of the data quality monitoring rule. If you associate a data quality monitoring rule with the task for which the current instance is generated, the data quality monitoring rule is triggered after the task is run.

    • Code details

      Allows you to view the code details of the task for which the current instance is generated.
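
The order of the four steps in this section can be read as a sequence of gates: a later check matters only after the earlier ones pass. The following is a minimal sketch of that reading and not DataWorks code. The InstanceSnapshot class, its fields, and the first_blocking_step function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class InstanceSnapshot:
    """Hypothetical summary of one instance; not a DataWorks data structure."""
    upstream_succeeded: bool       # all ancestor instances ran successfully
    scheduled_time: datetime       # scheduling time configured for the instance
    resources_available: bool      # the resource group for scheduling has capacity
    run_succeeded: Optional[bool]  # None while the instance has not finished


def first_blocking_step(snapshot: InstanceSnapshot, now: datetime) -> Optional[str]:
    """Return the name of the first step that has not passed, or None."""
    if not snapshot.upstream_succeeded:
        return "Upstream Nodes"
    # The timing check is meaningful only after the upstream check passes.
    if now < snapshot.scheduled_time:
        return "Timing Check"
    if not snapshot.resources_available:
        return "Resources"
    if snapshot.run_succeeded is not True:
        return "Execution"
    return None


# Example: upstream is done and the scheduling time has arrived,
# but the instance is still waiting for scheduling resources.
snapshot = InstanceSnapshot(
    upstream_succeeded=True,
    scheduled_time=datetime(2025, 4, 3, 1, 0),
    resources_available=False,
    run_succeeded=None,
)
print(first_blocking_step(snapshot, now=datetime(2025, 4, 3, 2, 0)))  # prints: Resources
```

Returning only the first blocking step mirrors how you would read the Running Details tab from top to bottom: resolve the earliest failed check before you look at the later ones.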

View the basic information

On the General tab, you can view key points in time for the current instance and basic information about the current instance. For more information about scheduling properties that are configured for the node for which the current instance is generated, see Configure basic properties.

View the affected baselines

On the Impact baseline tab, you can view the baseline that contains the task for which the current instance is generated within the monitoring scope and the status of the baseline. For more information about the intelligent baseline feature, see Overview.

View the status of the historical instances

On the Historical instance tab, you can view the following information:

  • Charts that show the trends of the following metrics for the current node over the last 15 days: Running time, Start run time, Time consumption of waiting for scheduling resources, and Completed At.

  • Running details of the instances that were generated for the current node over a historical period, shown in the Historical instance list. The details include the time when an instance started to run, the time when the instance was completed, the running duration, and the time spent waiting for resources. You can click Instance Diagnose in the Operation column of an instance to go to the diagnosis details page of the instance.
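
If you copy values from the Historical instance list into your own notes or scripts, the metrics above can be summarized with a few lines of code. The following is a minimal sketch under stated assumptions: the HistoricalRun class and the summarize function are hypothetical, no export API is implied, and the running duration is assumed to be the completion time minus the start time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class HistoricalRun:
    """Hypothetical record with the fields shown in the Historical instance list."""
    started_at: datetime      # time when the instance started to run
    completed_at: datetime    # time when the instance was completed
    resource_wait: timedelta  # time spent waiting for resources

    @property
    def running_duration(self) -> timedelta:
        # Assumption: running duration = completion time - start time.
        return self.completed_at - self.started_at


def summarize(runs: list[HistoricalRun]) -> dict[str, float]:
    """Return the average running time and the average and longest wait, in seconds."""
    if not runs:
        raise ValueError("at least one run is required")
    waits = [r.resource_wait.total_seconds() for r in runs]
    return {
        "avg_running_seconds": mean(r.running_duration.total_seconds() for r in runs),
        "avg_wait_seconds": mean(waits),
        "max_wait_seconds": max(waits),
    }
```

A rising average or maximum wait across recent runs is one signal that the scheduling time of the node overlaps with peak hours on its resource group.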