This topic describes how to configure alerts to monitor data transformation tasks.
Background information
When you run a data transformation task, a dashboard named Data Transformation Troubleshooting is created for the task. The dashboard displays the metrics that indicate the execution
status of the task. You can subscribe to the dashboard and configure alerts to monitor
the metrics on the dashboard. This allows you to detect and troubleshoot exceptions
such as data traffic exceptions, transformation logic exceptions, and system operation
exceptions more efficiently.
We recommend that you take note of the following metrics on the
Data Transformation Troubleshooting dashboard:
- System metrics: the data consumption delay and relevant exceptions
- Application metrics: the number of log entries that are read and the number of output
log entries
Procedure
- Log on to the Log Service console.
- In the Projects section, click the target project.
- In the left-side navigation pane, click
. A list of dashboards appears.
- Click the dashboard named Data Transformation Troubleshooting.
- On the Data Transformation Troubleshooting page, filter jobs and configure alerts for metrics.
Consumption delay
- In the shard consumption delay chart, choose .
- Configure an alert.
For example, if you set Trigger Condition to [delay (s)] > 120, the alert is triggered if the data consumption delay is greater than 120 seconds.
For information about other parameters, see Configure an alert.

- Configure the notification method.
- View alert notifications in the specified DingTalk group.
Exception reporting
- In the Exception detail chart, choose .
- Configure an alert.
For example, you can set Trigger Condition to level == 'ERROR'.
For information about other parameters, see Configure an alert.

- Configure the notification method.
- View alert notifications in the specified DingTalk group.

Note: In most cases, error logs are caused by an invalid transformation script in
a task. You can modify the transformation script. After you modify the script, the
task is restarted. You can then check whether new error logs are generated.
Transformation traffic (absolute value)
- In the Transform speed chart, choose .
- Configure an alert.
For example, if you set Trigger Condition to accept < 40000, the alert is triggered if the number of log entries that are transformed every second is less than 40,000.
For information about other parameters, see Configure an alert.

- Configure the notification method.
- View alert notifications in the specified DingTalk group.
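The three trigger conditions used in the preceding examples can be read as simple predicates over the rows of a chart's query result. The following Python sketch only illustrates that reading. The names CONDITIONS and is_triggered are hypothetical, and the rule that an alert fires when any row matches is an assumption made for this sketch, not a description of how Log Service evaluates alerts.

```python
# Illustrative only: hypothetical helpers, not the Log Service alert engine.
# Each predicate mirrors one Trigger Condition from the examples above.
CONDITIONS = {
    "consumption_delay": lambda row: row["delay (s)"] > 120,   # [delay (s)] > 120
    "exception": lambda row: row["level"] == "ERROR",          # level == 'ERROR'
    "transform_speed": lambda row: row["accept"] < 40000,      # accept < 40000
}


def is_triggered(name, rows):
    """Return True if any row of a chart's query result meets the condition.

    Assumption: one matching row is enough to fire the alert.
    """
    condition = CONDITIONS[name]
    return any(condition(row) for row in rows)


if __name__ == "__main__":
    # Made-up sample rows, for demonstration only.
    print(is_triggered("consumption_delay", [{"delay (s)": 35}, {"delay (s)": 150}]))
    print(is_triggered("transform_speed", [{"accept": 52000}]))
```

Writing each condition as a predicate makes the threshold easy to check against sample values before you enter it in the Trigger Condition field.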
Transformation traffic (day-on-day comparison)
- Customize monitoring metrics.
- Choose . Click the Logstore named internal-etl-log.
- Enter the following query statement in the search box and then click Search & Analyze.
This query statement groups the delivered log entries into 5-minute windows and, for each window,
calculates the percentage change between the current day and the same window on the previous day.
__topic__: __etl-log-status__ AND __tag__:__schedule_type__: Resident and event_id: "shard_worker:metrics:checkpoint" |
select dt, today, yesterday, round((today - yesterday) * 100.0 / yesterday, 3) as inc_ration from
(select dt, (case when diff[1] is null then 0 else diff[1] end) as today, (case when diff[2] is null then 0 else diff[2] end) as yesterday from
(select dt, compare("delivered lines", 86400) as diff from
(select date_format(__time__ - __time__ % 300, '%H:%i') as dt, sum("progress.delivered") as "delivered lines" from log group by dt order by dt asc limit 5000)
group by dt order by dt asc limit 5000))
Note: You can modify the query statement to create a more fine-grained alert. For example,
you can set an alert only for the task whose ID is 06f239b7362ad238e613abb3f7fe3c87.
__topic__: __etl-log-status__ AND __tag__:__schedule_type__: Resident and event_id: "shard_worker:metrics:checkpoint" and __tag__:__schedule_id__: 06f239b7362ad238e613abb3f7fe3c87 |
select dt, today, yesterday, round((today - yesterday) * 100.0 / yesterday, 3) as inc_ration from
(select dt, (case when diff[1] is null then 0 else diff[1] end) as today, (case when diff[2] is null then 0 else diff[2] end) as yesterday from
(select dt, compare("delivered lines", 86400) as diff from
(select date_format(__time__ - __time__ % 300, '%H:%i') as dt, sum("progress.delivered") as "delivered lines" from log group by dt order by dt asc limit 5000)
group by dt order by dt asc limit 5000))
- On the Graph tab, select the line chart and click Add to Dashboard. In this example, the query statement is saved as a chart on the dashboard named etl-monitor.
- In the left-side navigation pane, click
. A list of dashboards appears.
- Click the dashboard named etl-monitor.
- On the etl-monitor dashboard, find the target chart, and choose .
- Configure an alert.
For example, if you set Trigger Condition to inc_ration < (-40), the alert is triggered if the log transformation speed is more than 40% lower than that of the previous day.
For information about other parameters, see Configure an alert.

- Configure the notification method.
- View alert notifications in the specified DingTalk group.
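To make the day-on-day calculation easier to follow, the following Python sketch reproduces the arithmetic of the query statement on in-memory data: it sums delivered entries per 5-minute window, pairs each window with the same window one day earlier, and computes inc_ration. All function names are hypothetical, the window labels are formatted in UTC as an assumption, and returning None when the previous day has no data is a choice made for this sketch rather than the behavior of the query.

```python
from collections import defaultdict
from datetime import datetime, timezone

WINDOW = 300  # seconds; mirrors __time__ - __time__ % 300 in the query


def window_label(ts):
    """Format the start of the 5-minute window containing ts as HH:MM (UTC assumed)."""
    start = ts - ts % WINDOW
    return datetime.fromtimestamp(start, tz=timezone.utc).strftime("%H:%M")


def sum_by_window(events):
    """Sum delivered entries per window; events are (unix_time, delivered) pairs."""
    totals = defaultdict(int)
    for ts, delivered in events:
        totals[window_label(ts)] += delivered
    return dict(totals)


def day_on_day(today_events, yesterday_events):
    """Build rows with the same columns as the query: dt, today, yesterday, inc_ration."""
    today = sum_by_window(today_events)
    yesterday = sum_by_window(yesterday_events)
    rows = []
    for dt in sorted(today):
        t = today[dt]
        y = yesterday.get(dt, 0)  # a missing value is treated as 0, as in the query
        # Sketch-only choice: None when there is no baseline to divide by.
        inc = round((t - y) * 100.0 / y, 3) if y else None
        rows.append({"dt": dt, "today": t, "yesterday": y, "inc_ration": inc})
    return rows


def should_alert(row, threshold=-40):
    """Mirror the example trigger condition inc_ration < (-40)."""
    return row["inc_ration"] is not None and row["inc_ration"] < threshold
```

For example, a window that delivered 500 entries today and 1,000 entries in the same window yesterday yields an inc_ration of -50.0, which satisfies the example trigger condition.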
Alert-related operations
You can delete, modify, or disable an alert on the Alert Overview page of the alert.
For more information, see Manage an alert.