The Operations and Maintenance (O&M) dashboard displays the O&M stability assessment for auto triggered tasks, key O&M metrics, an overview of scheduling resource usage, and the running details of one-time tasks and data integration sync tasks. The dashboard gives you a high-level overview of all tasks in your workspace so that you can quickly find and handle abnormal tasks and improve your O&M efficiency.
Usage notes
You can view the O&M overview for your workspace from the following perspectives: O&M for auto triggered tasks, O&M for one-time tasks, and O&M for data integration tasks.
Specific project: View the O&M overview for the selected workspace. In this view, you can see the O&M overview for the workspace and for data integration sync tasks.
All Projects: View the O&M overview of all workspaces in the current account. In this view, you cannot separately view the O&M overview of Data Integration sync tasks.
Limits
The O&M dashboard feature is not supported in the development environment of a standard mode workspace.
Note: In the top menu bar of the Operation Center, you can click to switch between the Production and Development environments.
Auto Triggered Task tab: Collects O&M information for only auto triggered tasks and their instances. Other types of tasks and instances are not included.
One-time Task tab: Collects O&M information for only manually triggered workflows and their inner node instances.
Data Integration Task tab: Collects O&M information for only offline and real-time data integration sync tasks.
Go to the O&M dashboard
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, choose . On the page that appears, select the desired workspace from the drop-down list and click Go to Operation Center.
View O&M information for auto triggered tasks
On the Auto Triggered Task tab, you can view the O&M overview, which includes the O&M stability assessment, key concerns, recurring instance status distribution, recurring instance completion status, and scheduling resource group usage.
O&M stability assessment
The O&M stability of your workspace is assessed based on the overall running status of tasks in the workspace.
Workspace | Single workspace | All my workspaces |
Stability diagram | | |
Stability description | The stability health status is categorized into four levels: Excellent, Good, Fair, and Poor. A high-risk or low-risk tag indicates that the workspace health is poor and requires immediate optimization. |
View key concerns
The Key Concerns section displays abnormal items based on smart baselines and auto triggered task exceptions. You can view these items from a workspace or personal perspective. This lets you view abnormal issues for the entire workspace or only for the tasks you own. Find and fix these issues immediately to prevent them from affecting your business.
Abnormal issue type | Problem description | References | Diagram |
Baseline instance breach | The number of baseline instances that breached their committed completion time today. A baseline instance breach means that the estimated completion time of a task on the baseline exceeds the committed time, and an alert is triggered because the task did not complete on time. | ||
Baseline instance warning | The number of baseline instances with warnings today. The warning margin ensures that important data in complex dependency scenarios is generated on time. Exceeding this margin may cause tasks to fail to complete on time, leading to an exception. | ||
Error event | The number of error events today. When a task is monitored by a baseline, an error event is generated if the task fails. A failed task can block its descendant nodes. Handle the failed task promptly to ensure its descendant nodes can run normally. | ||
Slowdown event | The number of slowdown events today. When a task is monitored by a baseline, a slowdown event is generated if the task runs slowly. A slowdown means the current runtime of the task is significantly longer than its average runtime over a past period. | ||
Isolated task | The number of auto triggered tasks that have no upstream dependencies. When a node has no upstream dependencies, it becomes an isolated node and can no longer be automatically scheduled. | ||
Frozen task | The number of frozen (paused) auto triggered tasks. After an auto triggered task is frozen, the instances it generates are also in the Frozen state. Frozen instances do not run and block their descendant nodes. | ||
Expired task | The number of auto triggered tasks whose scheduling validity period has expired. A node automatically generates and runs recurring instances within its scheduling validity period. Outside this period, it cannot generate recurring instances or be automatically scheduled. | None | |
Modified task | The number of auto triggered tasks modified today. Note: When you switch to the My view, the system counts only the modified tasks that you own. | None | |
O&M overview for recurring instances and auto triggered tasks
The following table describes the O&M overview for recurring instances and auto triggered tasks.
O&M category | Description | Diagram |
Recurring instance status distribution | Note: Only Normal tasks are counted here. Dry-run and frozen tasks are not included. | |
Recurring instance completion status | | |
Recurring instance and auto triggered task trends | Scope: Collects statistics on the change trends in the number of auto triggered tasks and recurring instances in the production environment over a specified range of data timestamps. You can view data for up to the last year. | |
Auto triggered task distribution | Note: In the All my workspaces view, you can view the distribution of auto triggered tasks by workspace. | |
Scheduling resource group usage
This section shows two trends for the selected scheduling resource group over a specified period: the usage rate (the percentage of the resource group's resources used by running instances) and the number of instances running on the resource group.
You can view data for up to 7 days.
If resource group usage exceeds 80%, you should scale out the resource group to prevent resource shortages from affecting task execution.
The statistics for resource group usage and the number of running instances are at the resource group level. For example, if the exclusive resource group for scheduling that you use is shared by multiple workspaces, the statistics show the total resource usage rate and instance number trend for that resource group across all workspaces.
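The 80% guideline and the resource-group-level aggregation described above can be illustrated with a minimal Python sketch. It assumes hypothetical usage readings expressed in "slots"; the `UsageSample` type, the field names, and the slot counts are illustrative assumptions, not part of any DataWorks API.

```python
from dataclasses import dataclass

SCALE_OUT_THRESHOLD = 0.80  # the 80% guideline from the text above


@dataclass
class UsageSample:
    """One hypothetical usage reading (not a DataWorks data structure)."""
    resource_group: str
    workspace: str
    used_slots: int


def usage_rate(samples, resource_group, total_slots):
    """Sum usage across every workspace that shares the resource group."""
    used = sum(s.used_slots for s in samples if s.resource_group == resource_group)
    return used / total_slots


def needs_scale_out(rate, threshold=SCALE_OUT_THRESHOLD):
    """Return True when the usage rate exceeds the threshold."""
    return rate > threshold


# Two workspaces share "rg_shared"; their usage is counted together.
samples = [
    UsageSample("rg_shared", "workspace_a", 30),
    UsageSample("rg_shared", "workspace_b", 55),
    UsageSample("rg_other", "workspace_a", 10),
]
rate = usage_rate(samples, "rg_shared", total_slots=100)
print(f"{rate:.0%}", needs_scale_out(rate))  # 85% True
```

Because the rate is computed per resource group rather than per workspace, a workspace with light usage can still see a high rate if it shares the group with busier workspaces.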

Recurring instance runtime and error rankings

Yesterday's recurring instance rankings
This section ranks the top 30 recurring instances from the previous day by runtime, resource wait time, and slowdown duration. You can use the rankings to quickly find time-consuming tasks. Click an instance ID to go to the instance details page and view the running details in the run diagnostics.
Note: Slowdown duration is the difference between the previous day's runtime and the historical average runtime. Instances are sorted by this value in descending order.
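As an illustration of how the slowdown ranking is defined, the following Python sketch computes the slowdown duration as the previous day's runtime minus the historical average and sorts the results in descending order. The record layout (`instance_id`, `runtime_s`, `avg_runtime_s`) is a hypothetical assumption for this example and does not reflect how the dashboard stores its data.

```python
def rank_by_slowdown(instances, top_n=30):
    """Return (instance_id, slowdown_seconds) pairs, largest slowdown first."""
    scored = [
        (inst["instance_id"], inst["runtime_s"] - inst["avg_runtime_s"])
        for inst in instances
    ]
    # Sort in descending order of slowdown and keep the top N entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical runtimes in seconds for three recurring instances.
instances = [
    {"instance_id": "inst_1", "runtime_s": 900, "avg_runtime_s": 600},
    {"instance_id": "inst_2", "runtime_s": 400, "avg_runtime_s": 380},
    {"instance_id": "inst_3", "runtime_s": 1500, "avg_runtime_s": 700},
]
ranking = rank_by_slowdown(instances, top_n=2)
print(ranking)  # [('inst_3', 800), ('inst_1', 300)]
```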
Recurring instance error rankings for the last month
This section ranks the top 30 recurring instances with errors over the last month. You can quickly locate tasks with high error rates in the last month, view task details, and identify the cause of the errors.
View O&M information for one-time tasks
On the One-time Task tab, you can view the running status of manually triggered workflows and inner node instances.
One-time task overview
This section shows the total number of manually triggered workflows and inner node instances that have run since a specified date and the percentage of successful runs.
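The success percentage shown in this overview is a simple ratio. The sketch below assumes a hypothetical list of run statuses with the literal value `"SUCCESS"` for successful runs; the actual status names used by the Operation Center may differ.

```python
def success_rate(statuses):
    """Return the percentage of runs whose status is 'SUCCESS'."""
    if not statuses:
        return 0.0  # no runs yet; avoid dividing by zero
    succeeded = sum(1 for s in statuses if s == "SUCCESS")
    return 100.0 * succeeded / len(statuses)


# Hypothetical statuses of runs since the specified date.
runs = ["SUCCESS", "FAILED", "SUCCESS", "SUCCESS"]
print(f"{len(runs)} runs, {success_rate(runs):.1f}% successful")
# 4 runs, 75.0% successful
```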

Workflow instance status
O&M category | Description | Diagram |
Workflow instance status distribution | A pie chart shows the status distribution of manually triggered workflow instances for the specified run date. | |
Workflow rankings | Ranks workflows with long runtimes and high failure rates for a specified run date. | |
Internal task instance status
O&M category | Description | Diagram |
Internal Task Distribution | A pie chart shows the real-time distribution of inner node instances in the Operation Center, categorized by Node Type and Owner. | |
Internal Task Leaderboard | Ranks inner node instances with long runtimes and high failure rates for a specified run date. | |
View O&M information for data integration tasks
On the Data Integration tab, you can view the overview and resource group usage for data integration sync tasks from Yesterday or Today.
Data Integration resource group usage
This section shows the resource details for all data integration tasks in the current workspace, including Running Tasks, Resource Usage, and Expired At. Based on the resource group usage and task volume, you can decide whether to perform operations, such as scaling, to allocate resources reasonably.
For more information about operations on exclusive resource groups for Data Integration, see Billing for exclusive resource groups for Data Integration.
For more information about operations on Serverless resource groups, see Use a Serverless resource group.
The page only collects O&M statistics for exclusive resource groups for Data Integration.
Data Integration sync task status distribution
A pie chart shows the status distribution of sync tasks in the current workspace. You can click a segment to go to the details page for tasks in that state to view and handle any issues. Pay close attention to Abnormal and Failed tasks because they usually block downstream task execution.
Offline sync task status
The following table describes the offline sync task status.
O&M category | Description | Diagram |
Data synchronization progress | Shows the total data volume and total traffic usage for offline synchronization within the selected data timestamp. | |
Data synchronization volume statistics | Shows the data pull and write curves for the synchronized data volume by data source type for the selected data timestamp. You can quickly view DPI engine tasks with large data synchronization volumes and consider allocating more resources to them. | |
Latest Top 10 rankings | Shows the 10 most recent Latest Failed Instances and Latest Successful Instances so that you can get a global view of the latest sync task statuses. Use the error messages to quickly find the cause of instance failures and resolve them. | |
Data synchronization task execution details | You can filter by conditions such as Commit Time, Task Status, and Task Name to quickly search for task instances and view their running details. | |
Real-time sync task status
The following table describes the real-time sync task status.
O&M category | Description | Diagram |
Data synchronization overview | Shows the sum of the data speed and record speed for all real-time sync tasks in the current workspace. | |
Top 10 task latency | Shows the 10 real-time sync tasks with the highest latency so that you can quickly locate and optimize them. | |
Alert information | Shows the alert information recently generated by real-time sync tasks so that you can quickly catch and resolve exceptions. | |
Failover information | This shows the |
|


















