The dashboard displays overall operations and maintenance (O&M) information, including the metrics that require your attention, the overall running status of nodes, and trends in scheduling resource usage. It also displays data integration information, including the running status distribution and data synchronization progress of batch sync nodes and real-time sync nodes. The dashboard helps improve O&M efficiency.

View the dashboard

  1. Log on to the DataWorks console.
  2. In the left-side navigation pane, click Workspaces.
  3. In the top navigation bar, select the region where the target workspace resides. Find the target workspace and click Data Analytics in the Actions column.
  4. Click the icon in the upper-left corner and choose All Products > Task Operation > Operation Center. The Workbench Overview tab of the Overview page appears.

View the overall O&M information

The Workbench Overview tab displays statistics on auto triggered nodes and auto triggered node instances. Other types of nodes or node instances are not included. You can view the following information on the Overview page:
  • The Focus on section displays the following information:
    • The numbers of auto triggered node instances that require your special attention, including failed node instances, slow node instances, and node instances waiting for resources. The statistics cover node instances whose data timestamp is the day before the current date. A node instance is considered a slow node instance when it meets the following conditions:
      • The node instance is running.
      • The running time of the node instance exceeds 30 minutes.
      • The running time of the node instance is at least 15 minutes longer than the average running time of the past 10 days.
    • The numbers of isolated nodes, paused nodes, and expired nodes.
      • An isolated node is a node that does not have an ancestor node. An isolated node cannot be run. For example, if you change the output name of the parent node of a node, the dependency becomes invalid and the node becomes isolated.
      • After a node is paused, the node no longer generates node instances and cannot be scheduled.
      • If a node is not triggered at the specified time, the node becomes expired.

    The statistics in this section are updated when you load the page. You can click a type of node or node instance to go to the details page and view the specific nodes or node instances. We recommend that you fix these nodes and node instances at the earliest opportunity to avoid impacts on your business.

  • The RUNNING state distribution section displays the distribution of auto triggered node instances in different states. The data timestamp of these node instances is the day before the current date. The statistics in this section are updated when you load the page. You can click a sector in the pie chart to view the node instances in the specific state.
  • The Task Completion section displays the completion status of node instances between 00:00 and 23:00 of the current date. You can view the number of node instances that are completed or not run today and yesterday, and the historical average. You can also select a node type to view the status of specific node instances.
    The line chart displays the numbers of auto triggered node instances that are completed today and yesterday, and the historical average. A large deviation among the three numbers suggests that an exception occurred during the corresponding period of time and that further analysis is required.
    Note: The statistical aggregation method used by the Operation Center service has been changed. Only node instances in the production environment are counted. Therefore, the line that represents the number of node instances completed today shows obvious fluctuations.
  • The Scheduling resource allocation section displays the usage of a specific resource group and the number of node instances that were running at different points in time in the last 24 hours. You can select a resource group from the Resource Group selection drop-down list in the upper-right corner.
    The Number of instances line shows only the number of node instances in the current workspace. The Resource Group usage line shows the resource group usage consumed by node instances in all workspaces under the current Alibaba Cloud account.
    Note: The resource group usage consumed by data integration nodes is not counted.
  • The Runtime ranking section ranks nodes based on their running time, time spent waiting for resources, or excess running time. The statistics in this section are updated every day. Nodes that were completed on the day before the current date are ranked in this section.
  • The Error ranking in recent month section ranks nodes by the number of errors in the last month and displays the top 10 nodes. The statistics in this section are updated every day. You can view the name, ID, and number of errors of each node.
  • The Instances Run in the Last Month section displays the trends in the numbers of nodes and node instances that are scheduled in the specified time range in the production environment. The statistics in this section are updated every day. You can specify a time range of up to one year.
  • The Node Types section displays the distribution of nodes in a pie chart. The statistics in this section are updated when you load the page. The pie chart displays a maximum of eight node types. If you have created more than eight types of nodes, some node types are merged for display.
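
The slow node instance conditions listed for the Focus on section can be sketched as a small check. This is a minimal illustration, not the Operation Center implementation: the `InstanceSnapshot` record and its field names are hypothetical, and the sketch assumes that the three conditions must all hold at the same time.

```python
from dataclasses import dataclass

# Thresholds taken from the conditions listed in the Focus on section.
MIN_RUNNING_MINUTES = 30
MIN_EXCESS_OVER_AVERAGE_MINUTES = 15


@dataclass
class InstanceSnapshot:
    """Hypothetical record; the field names are illustrative, not a DataWorks API."""
    is_running: bool
    running_minutes: float
    avg_running_minutes_10d: float  # average running time of the past 10 days


def is_slow(instance: InstanceSnapshot) -> bool:
    """Return True when all three conditions hold (assumed to be combined with AND)."""
    excess = instance.running_minutes - instance.avg_running_minutes_10d
    return (
        instance.is_running
        and instance.running_minutes > MIN_RUNNING_MINUTES  # "exceeds 30 minutes"
        and excess >= MIN_EXCESS_OVER_AVERAGE_MINUTES  # "at least 15 minutes longer"
    )
```

Under these assumptions, an instance that has been running for 45 minutes against a 10-day average of 20 minutes counts as slow, whereas the same instance against an average of 35 minutes does not, because the excess is only 10 minutes.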

View the O&M information about offline synchronization

On the Overview page, click the Data integration tab. The information about offline synchronization within a specific time range is displayed. You can specify the time range of the statistics in the upper-right corner.
You can view the following information about batch sync nodes:
  • The RUNNING state distribution section displays the distribution of auto triggered node instances in different states. The data timestamp of these node instances is in the specified time range. The statistics in this section are updated when you load the page. You can click a sector in the pie chart to view the node instances in the specific state.
  • The Data Synchronization progress section displays information about the data that is involved in offline synchronization within the specified time range. The information includes the total amount of data, total amount of Internet traffic, and total number of records.
  • The Synchronize data volume statistics section displays curves of the data that is pulled from or written to different data stores within the specified time range.
  • The Latest list Top10 section displays the latest 10 node instances that failed and the latest 10 node instances that succeeded. These lists give you an overview of the latest node status.
  • The Synchronization task execution details section allows you to search for a node instance by conditions such as the submission time, node status, and node name. You can also click the ID of a node instance to view the running details of the node instance.

View the O&M information about real-time synchronization

On the Data integration tab of the Overview page, click Real-time synchronization. You can view the following information about real-time sync nodes:
  • The RUNNING state distribution section displays the distribution of real-time sync node instances in different states. The statistics in this section are updated when you load the page. You can click a sector in the pie chart to view the node instances in the specific state.
  • The Overview section displays the total data transmission speed and total record transmission speed of all real-time sync nodes in the current workspace.
  • The Task delay Top10 section displays the top 10 nodes with the highest latency. This section helps you find the nodes with high latency.
  • The Task alarm information section displays information about the latest alerts. This section helps you identify exceptions quickly.
  • The Failover information section displays information about failovers within a specified period. This section gives you an overview of node failovers.