This topic describes the monitoring and alerting process in Realtime Compute for Apache Flink and how to create alert rules.
Introduction to CloudMonitor
CloudMonitor collects monitoring metrics for cloud resources as well as custom monitoring metrics, checks service availability, and lets you configure alerts based on these metrics. You can use CloudMonitor to view cloud resource usage, business information, and service health status. You can also receive alerts and respond to them promptly to keep your applications running properly.
Create alert rules
For more information about how to create an alert rule, see Configure alert rules.
Monitoring items of Realtime Compute for Apache Flink
Monitoring item | Unit | Metric | Dimensions | Statistics |
---|---|---|---|---|
Service delay | s | inputDelay | userId, regionId, projectName, and jobName | Average |
Read records per second (RPS) | RPS | ParserTpsRate | userId, regionId, projectName, and jobName | Average |
Write RPS | RPS | SinkOutTpsRate | userId, regionId, projectName, and jobName | Average |
Failover rate (Note: The failover rate is the average number of failovers per second in the last minute. For example, if one failover occurred in the last minute, the failover rate is 0.01667, because 1/60 ≈ 0.01667.) | % | TaskFailoverRate | userId, regionId, projectName, and jobName | Average |
Processing delay | s | FetchedDelay | userId, regionId, projectName, and jobName | Average |
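The failover rate arithmetic described in the note for the failover rate item can be sketched as follows. This is a minimal illustration of the stated definition (failovers in the last minute divided by 60 seconds), not code from Realtime Compute for Apache Flink or CloudMonitor. The function name `failover_rate` and the `window_seconds` parameter are hypothetical and are used here for illustration only.

```python
def failover_rate(failovers_in_window: int, window_seconds: int = 60) -> float:
    """Return the average number of failovers per second over the window.

    Hypothetical helper: it only mirrors the definition in the note
    (failovers in the last minute / 60 seconds).
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    if failovers_in_window < 0:
        raise ValueError("failovers_in_window must not be negative")
    return failovers_in_window / window_seconds


# One failover in the last minute, as in the example in the note.
print(round(failover_rate(1), 5))  # 0.01667
```

Under this definition, a job with no failovers in the last minute reports a rate of 0, and each additional failover in that minute raises the rate by about 0.01667.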
View monitoring metrics
- Log on to the Realtime Compute development platform.
- In the top navigation bar, click Administration.
- On the Administration page, click the name of the job for which you want to view monitoring metrics.
- In the upper-right corner of the page, choose .
- On the page that appears, view the monitoring metrics of the job.