Baselines let you group critical pipeline nodes under a committed completion deadline and receive proactive alerts before that deadline is breached. DataWorks estimates when each node in the baseline will finish — based on historical run data — and alerts you if it predicts a miss, giving your team time to act before data is late.
Prerequisites
Before you begin, ensure that:
Your workspace runs DataWorks Standard Edition or a more advanced edition. DataWorks Basic Edition does not support baseline management.
You have the workspace administrator or tenant administrator role. Only these roles can create baselines.
You have added alert recipients as contacts on the Alert Contacts page if you plan to use text message or phone call notifications for RAM users.
How it works
When a baseline is enabled, DataWorks calculates an estimated completion time for each node in the baseline using the average run duration over the most recent 10 days. It then compares this estimate against the alert time, which is derived from the parameters you set:
Alert time = Committed finish time − Alert margin threshold
DataWorks checks node status in two phases:
Before a node starts (pre-run warning): If the estimated completion time, calculated from historical data, is later than the alert time, a baseline alert is sent immediately. This early warning is especially valuable when node dependencies are complex and change frequently.
While a node is running (in-run alert): If the actual completion time is on track to exceed the alert time, a baseline alert is sent.
If any ancestor node of a baseline node is also predicted to finish late, the alert is triggered based on that ancestor's historical average as well.
A baseline takes effect the day after it is created and enabled. View execution results on the Auto Triggered Instances page.
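The alert-time arithmetic and the pre-run comparison described in this section can be sketched as follows. This is a minimal illustration, not DataWorks code: the function names, the use of a plain arithmetic mean over the last 10 runs, and the sample durations are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import mean


def alert_time(committed_finish: datetime, margin: timedelta) -> datetime:
    """Alert time = committed finish time - alert margin threshold."""
    return committed_finish - margin


def estimated_completion(start: datetime, recent_durations_min: list[float]) -> datetime:
    """Estimate completion as the start time plus the mean recent run duration.

    Assumption: a plain arithmetic mean over at most the 10 most recent runs.
    """
    window = recent_durations_min[-10:]
    return start + timedelta(minutes=mean(window))


def should_alert(estimate: datetime, alert_at: datetime) -> bool:
    """Pre-run check: alert when the estimate falls after the alert time."""
    return estimate > alert_at


# Hypothetical sample: committed finish 03:30 with a 10-minute margin.
committed = datetime(2024, 1, 2, 3, 30)
alert_at = alert_time(committed, timedelta(minutes=10))

# Hypothetical run durations in minutes for the last 10 days (mean is 43).
durations = [40, 42, 45, 43, 41, 44, 46, 42, 43, 44]
estimate = estimated_completion(datetime(2024, 1, 2, 2, 45), durations)

print(alert_at.strftime("%H:%M"), estimate.strftime("%H:%M"), should_alert(estimate, alert_at))
# prints: 03:20 03:28 True
```

In this sample the node is expected to finish at 03:28, which is before the committed time of 03:30 but after the alert time of 03:20, so the pre-run check would raise an alert.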
Permissions
| Action | Required role |
|---|---|
| Create a baseline | Workspace administrator or tenant administrator |
| Enable, disable, delete, or modify a baseline | Tenant administrator or baseline owner |
To assign roles, see Add workspace members and assign roles to them.
Alert notification channels
DataWorks supports the following alert notification methods. Review the limits before configuring alerts.
| Method | Supported regions | Supported editions | Notes |
|---|---|---|---|
| Email | All regions | All editions | None |
| Text message | All regions | Standard Edition and higher | To enable in regions where text message is not yet available, join the DataWorks DingTalk group (pre-sales or after-sales). |
| Phone call | All regions | All editions | Supports phone numbers in the Chinese mainland only. Calls are rate-limited to one per recipient per 20 minutes; excess calls are downgraded to text messages. |
| DingTalk chatbot | All regions | All editions | Sends alerts to DingTalk groups. |
| Webhook URL | All regions | Basic Edition and higher | Basic Edition: DingTalk, Lark, and WeCom group webhooks. Enterprise Edition: also supports custom webhook URLs. For custom webhook setup, see Formats of alert messages sent by using a custom webhook and contact DataWorks technical support. |
Baseline and event alerts only monitor cycle instances with a business date of yesterday or the day before yesterday.
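The phone call limit in the table above (one call per recipient per 20 minutes, with excess calls downgraded to text messages) can be pictured with a small sketch. The class name, the channel labels, and the choice to measure the window from the last successful call are assumptions for illustration; the source does not describe how DataWorks implements the limit.

```python
from datetime import datetime, timedelta

# From the table: at most one call per recipient per 20 minutes.
CALL_WINDOW = timedelta(minutes=20)


class PhoneAlertLimiter:
    """Hypothetical limiter that downgrades excess calls to text messages."""

    def __init__(self) -> None:
        # Maps each recipient to the time of their last phone call.
        self._last_call: dict[str, datetime] = {}

    def choose_channel(self, recipient: str, now: datetime) -> str:
        last = self._last_call.get(recipient)
        if last is not None and now - last < CALL_WINDOW:
            # Still inside the window: downgrade instead of calling again.
            return "text_message"
        self._last_call[recipient] = now
        return "phone_call"


limiter = PhoneAlertLimiter()
t0 = datetime(2024, 1, 2, 3, 0)
first = limiter.choose_channel("alice", t0)
second = limiter.choose_channel("alice", t0 + timedelta(minutes=5))
third = limiter.choose_channel("alice", t0 + timedelta(minutes=20))
print(first, second, third)
# prints: phone_call text_message phone_call
```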
Create a baseline
Go to Operation Center. Log on to the DataWorks console. In the top navigation bar, select a region. In the left-side navigation pane, choose Data Development and O&M > Operation Center. Select a workspace from the drop-down list, then click Go to Operation Center.
In the left-side navigation pane, choose Node Alarm > Smart Baseline.
On the Baselines tab, click + Create Baseline.
In the Create Baseline panel, configure the basic information.
| Parameter | Description |
|---|---|
| Baseline Name | A name for the baseline. |
| Workspace | The workspace whose nodes you want to monitor. Node and workflow selection is scoped to this workspace. |
| Owner | The name or ID of the baseline owner. |
| Baseline Type | Day-level Baseline: monitors nodes scheduled to run daily. Hour-level Baseline: monitors nodes scheduled to run hourly. For hour-level baselines, specify committed finish times for each hour-level instance. |
| Nodes | The nodes or workflows to add. Adding a node migrates it from its current baseline to this one. Adding a workflow adds all nodes in that workflow. To minimize scope, add only the most downstream node in a workflow; DataWorks automatically monitors all ancestor nodes that affect that node's output. |
| Priority | A numeric value where higher numbers mean higher priority. When compute resources are insufficient, higher-priority baselines are scheduled first. Priority takes effect for instances scheduled on the next day. Priority propagates upstream: if a node is in scope for multiple baselines with different priorities, it inherits the highest priority. Cross-cycle upstream dependencies are not affected. |
| Estimated finish time | Read-only. Calculated from the average completion time of baseline nodes over the most recent 10 days. For example, if nodes consistently finish around 02:45, the estimated finish time is 02:45. If historical data is insufficient, the field shows: *The completion time cannot be estimated due to the lack of historical data*. |
| Committed finish time | The deadline by which all nodes in the baseline must complete. Accepts values in the range 00:00 to 47:59 to support nodes that run across midnight. For example, a node with a 1.5-day run duration can have a committed finish time of 36:00. Set this value with enough buffer that the resulting alert time is later than the estimated completion time. |
| Alert margin threshold | The buffer between the alert time and the committed finish time. Minimum value: 5 minutes. For example, with a committed finish time of 03:30 and a threshold of 10 minutes, the alert time is 03:20, and DataWorks sends an alert if it predicts any node will not finish by 03:20. Size this threshold based on how long your team needs to respond. For guidance, see Configure an appropriate committed point in time and an appropriate alert margin threshold for a baseline. |
For MaxCompute nodes, baseline priority maps to MaxCompute job priority using the formula MaxCompute priority = 9 − baseline priority, provided that the priority feature is enabled for the MaxCompute project and the project uses subscription computing resources. For E-MapReduce (EMR) nodes, configure priority mappings between baselines and YARN queues. See Configure mappings between baseline priorities and YARN queue priorities.
Configure alert settings.
Toggle Enable Alerting to turn alerts on or off. When disabled, no notifications are sent, but baseline instances are still generated and priority settings still apply.
Two types of alerts are available. Both share the same notification channels and throttle controls, but trigger on different conditions:
| Alert type | Triggers when |
|---|---|
| Baseline alert | The estimated or actual completion time of any node (or its ancestor nodes) in the baseline exceeds the alert time. |
| Event alert | A node in the baseline, or an ancestor node on the key path, fails to run (Error) or runs significantly slower than its historical average (Slow). View triggered events on the Events tab. |
Configure the following parameters for each alert type:
| Parameter | Description |
|---|---|
| Alert notification method | One or more of: email, text message, phone call, DingTalk chatbot, or webhook URL. Recipients can be the baseline or node owner, on-duty engineers from a shift schedule, or other specified contacts. To set up a shift schedule, see Create and manage a shift schedule. To set up DingTalk chatbot notifications, see Send alert notifications to a DingTalk group. Use Check Contact Information or Send Test Message to verify delivery before saving. Phone call alerts require DataWorks Professional Edition or higher. |
| Maximum alerts | The maximum number of notifications sent per alert event. No additional notifications are sent once this limit is reached. |
| Minimum alert interval | The minimum time between consecutive notifications. |
| Alerting do-not-disturb period | A time window during which no notifications are sent. If an exception occurs during this window, notifications are held and sent after the window ends, provided the exception persists. For example, a do-not-disturb period of 00:00 to 08:00 suppresses overnight alerts; if the issue is still active at 08:00, an alert is sent. |
Click OK to create the baseline.
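The three throttle controls (Maximum alerts, Minimum alert interval, and the do-not-disturb period) interact, and a short sketch may make the interplay easier to follow. The `AlertThrottle` class and its field names are hypothetical. The sketch also assumes a do-not-disturb window that starts and ends on the same day and treats the end of the window as outside it.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta
from typing import Optional


@dataclass
class AlertThrottle:
    """Hypothetical model of the per-event throttle settings."""

    max_alerts: int            # Maximum alerts
    min_interval: timedelta    # Minimum alert interval
    dnd_start: time            # Do-not-disturb window start
    dnd_end: time              # Do-not-disturb window end (assumed later than start)
    sent: int = 0
    last_sent: Optional[datetime] = None

    def in_dnd(self, now: datetime) -> bool:
        return self.dnd_start <= now.time() < self.dnd_end

    def try_send(self, now: datetime, exception_active: bool) -> bool:
        """Return True and record the send if a notification may go out now."""
        if not exception_active or self.in_dnd(now):
            return False  # nothing to report, or held until the window ends
        if self.sent >= self.max_alerts:
            return False  # limit for this alert event reached
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            return False  # too soon after the previous notification
        self.sent += 1
        self.last_sent = now
        return True


throttle = AlertThrottle(
    max_alerts=2,
    min_interval=timedelta(minutes=30),
    dnd_start=time(0, 0),
    dnd_end=time(8, 0),
)
day = datetime(2024, 1, 2)
results = [
    throttle.try_send(day.replace(hour=3), True),              # inside 00:00 to 08:00
    throttle.try_send(day.replace(hour=8), True),              # window ended, issue persists
    throttle.try_send(day.replace(hour=8, minute=10), True),   # under the 30-minute interval
    throttle.try_send(day.replace(hour=8, minute=30), True),   # second and last allowed alert
    throttle.try_send(day.replace(hour=9, minute=30), True),   # over the maximum
]
print(results)
# prints: [False, True, False, True, False]
```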
Add nodes to a baseline
A node belongs to exactly one baseline at a time. Adding a node to a new baseline automatically removes it from its previous baseline.
If an enabled baseline has no nodes, it enters the Empty Baseline state and generates baseline instances in that state. See Why is a baseline in the Empty Baseline state?
Two methods are available:
Method 1: During baseline creation
On the Baselines tab, click + Create Baseline and add nodes in the Nodes field.
Method 2: From the Auto Triggered Tasks page
On the Auto Triggered Tasks page, find the target nodes, then do one of the following:
To add a single node: choose More > Add Baseline in the Actions column.
To add multiple nodes: select nodes, then choose Actions > Add Baseline at the bottom of the page.
When adding multiple nodes using Method 2, the nodes can only be added to a new baseline, not to an existing one.
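The one-baseline-per-node rule can be pictured as a simple mapping from node to baseline. The sketch below is a hypothetical in-memory model, not a DataWorks API; the class, method, node, and baseline names are invented for illustration.

```python
from typing import Optional


class BaselineRegistry:
    """Hypothetical model in which each node maps to at most one baseline."""

    def __init__(self) -> None:
        self._baseline_of: dict[str, str] = {}

    def add_node(self, node: str, baseline: str) -> Optional[str]:
        """Assign the node to a baseline and return the baseline it left, if any."""
        previous = self._baseline_of.get(node)
        self._baseline_of[node] = baseline
        return previous if previous != baseline else None

    def nodes_in(self, baseline: str) -> set[str]:
        return {n for n, b in self._baseline_of.items() if b == baseline}

    def is_empty(self, baseline: str) -> bool:
        """True when no node currently belongs to the baseline."""
        return not self.nodes_in(baseline)


registry = BaselineRegistry()
registry.add_node("ods_orders", "baseline_a")
moved_from = registry.add_node("ods_orders", "baseline_b")  # migrates the node
print(moved_from, registry.is_empty("baseline_a"))
# prints: baseline_a True
```

After the second call, `baseline_a` holds no nodes, which corresponds to the situation in which an enabled baseline would enter the Empty Baseline state.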
Manage baselines
On the Baselines tab, filter by Owner, Workspace, Baseline Name, or Priority to find a baseline. The following operations are available:
| Operation | Description |
|---|---|
| View details | View the basic configuration of the baseline. |
| Modify | Update the baseline configuration. Only tenant administrators and the baseline owner can modify a baseline. |
| View change records | Review the history of configuration changes. |
| Enable / Disable | Enable or disable the baseline. Baseline instances are generated daily only when the baseline is enabled. View daily instance details on the Baseline Instances tab. |
| Delete | Delete the baseline. Only tenant administrators and the baseline owner can delete a baseline. |
What's next
Manage baseline instances: A baseline instance is generated daily when the baseline is enabled. View run details on the Baseline Instances tab.
Configure YARN queue priority mappings for EMR nodes: Map baseline priorities to YARN queues to control EMR node scheduling order.
View baseline operation records: Review a complete audit trail of baseline operations in Operation Center.