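This topic describes the alert rule templates supported by Kubernetes monitoring. When you create an alert, you can select the template that fits your needs.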

| Alert template | Alert expression | Description |
| --- | --- | --- |
| DaemonSet DNS error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="daemonset", protocol="dns"}[5m]))) > 0` | The number of DaemonSet DNS errors in the last 5 minutes is greater than 0. |
| DaemonSet DNS average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="daemonset", protocol="dns"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="daemonset", protocol="dns"}[5m]))) / 1000000)) > 500` | The average DaemonSet DNS response time over the last 5 minutes exceeds 500 ms. |
| DaemonSet HTTP error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="daemonset", protocol="http"}[5m]))) > 0` | The number of DaemonSet HTTP errors in the last 5 minutes is greater than 0. |
| DaemonSet HTTP average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="daemonset", protocol="http"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="daemonset", protocol="http"}[5m]))) / 1000000)) > 500` | The average DaemonSet HTTP response time over the last 5 minutes exceeds 500 ms. |
| DaemonSet MySQL error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="daemonset", protocol="mysql"}[5m]))) > 0` | The number of DaemonSet MySQL errors in the last 5 minutes is greater than 0. |
| DaemonSet MySQL average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="daemonset", protocol="mysql"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="daemonset", protocol="mysql"}[5m]))) / 1000000)) > 500` | The average DaemonSet MySQL response time over the last 5 minutes exceeds 500 ms. |
| DaemonSet Redis error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="daemonset", protocol="redis"}[5m]))) > 0` | The number of DaemonSet Redis errors in the last 5 minutes is greater than 0. |
| DaemonSet Redis average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="daemonset", protocol="redis"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="daemonset", protocol="redis"}[5m]))) / 1000000)) > 500` | The average DaemonSet Redis response time over the last 5 minutes exceeds 500 ms. |
| Deployment DNS error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="deployment", protocol="dns"}[5m]))) > 0` | The number of Deployment DNS errors in the last 5 minutes is greater than 0. |
| Deployment DNS average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="deployment", protocol="dns"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="deployment", protocol="dns"}[5m]))) / 1000000)) > 500` | The average Deployment DNS response time over the last 5 minutes exceeds 500 ms. |
| Deployment HTTP error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="deployment", protocol="http"}[5m]))) > 0` | The number of Deployment HTTP errors in the last 5 minutes is greater than 0. |
| Deployment HTTP average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="deployment", protocol="http"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="deployment", protocol="http"}[5m]))) / 1000000)) > 500` | The average Deployment HTTP response time over the last 5 minutes exceeds 500 ms. |
| Deployment MySQL error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="deployment", protocol="mysql"}[5m]))) > 0` | The number of Deployment MySQL errors in the last 5 minutes is greater than 0. |
| Deployment MySQL average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="deployment", protocol="mysql"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="deployment", protocol="mysql"}[5m]))) / 1000000)) > 500` | The average Deployment MySQL response time over the last 5 minutes exceeds 500 ms. |
| Deployment Redis error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="deployment", protocol="redis"}[5m]))) > 0` | The number of Deployment Redis errors in the last 5 minutes is greater than 0. |
| Deployment Redis average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="deployment", protocol="redis"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="deployment", protocol="redis"}[5m]))) / 1000000)) > 500` | The average Deployment Redis response time over the last 5 minutes exceeds 500 ms. |
| Service DNS error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="service", protocol="dns"}[5m]))) > 0` | The number of Service DNS errors in the last 5 minutes is greater than 0. |
| Service DNS average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="service", protocol="dns"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="service", protocol="dns"}[5m]))) / 1000000)) > 500` | The average Service DNS response time over the last 5 minutes exceeds 500 ms. |
| Service HTTP error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="service", protocol="http"}[5m]))) > 0` | The number of Service HTTP errors in the last 5 minutes is greater than 0. |
| Service HTTP average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="service", protocol="http"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="service", protocol="http"}[5m]))) / 1000000)) > 500` | The average Service HTTP response time over the last 5 minutes exceeds 500 ms. |
| Service MySQL error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="service", protocol="mysql"}[5m]))) > 0` | The number of Service MySQL errors in the last 5 minutes is greater than 0. |
| Service MySQL average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="service", protocol="mysql"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="service", protocol="mysql"}[5m]))) / 1000000)) > 500` | The average Service MySQL response time over the last 5 minutes exceeds 500 ms. |
| Service Redis error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="service", protocol="redis"}[5m]))) > 0` | The number of Service Redis errors in the last 5 minutes is greater than 0. |
| Service Redis average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="service", protocol="redis"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="service", protocol="redis"}[5m]))) / 1000000)) > 500` | The average Service Redis response time over the last 5 minutes exceeds 500 ms. |
| StatefulSet DNS error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="statefulset", protocol="dns"}[5m]))) > 0` | The number of StatefulSet DNS errors in the last 5 minutes is greater than 0. |
| StatefulSet DNS average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="statefulset", protocol="dns"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="statefulset", protocol="dns"}[5m]))) / 1000000)) > 500` | The average StatefulSet DNS response time over the last 5 minutes exceeds 500 ms. |
| StatefulSet HTTP error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="statefulset", protocol="http"}[5m]))) > 0` | The number of StatefulSet HTTP errors in the last 5 minutes is greater than 0. |
| StatefulSet HTTP average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="statefulset", protocol="http"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="statefulset", protocol="http"}[5m]))) / 1000000)) > 500` | The average StatefulSet HTTP response time over the last 5 minutes exceeds 500 ms. |
| StatefulSet MySQL error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="statefulset", protocol="mysql"}[5m]))) > 0` | The number of StatefulSet MySQL errors in the last 5 minutes is greater than 0. |
| StatefulSet MySQL average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="statefulset", protocol="mysql"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="statefulset", protocol="mysql"}[5m]))) / 1000000)) > 500` | The average StatefulSet MySQL response time over the last 5 minutes exceeds 500 ms. |
| StatefulSet Redis error count threshold alert | `sum by (namespace, name) (floor(increase(agg_npm_entity_requests_error_total{type="statefulset", protocol="redis"}[5m]))) > 0` | The number of StatefulSet Redis errors in the last 5 minutes is greater than 0. |
| StatefulSet Redis average response time threshold alert | `floor(((avg by (namespace, name) (increase(agg_npm_entity_requests_duration_nanoseconds_total{type="statefulset", protocol="redis"}[5m])) / avg by (namespace, name) (increase(agg_npm_entity_requests_total{type="statefulset", protocol="redis"}[5m]))) / 1000000)) > 500` | The average StatefulSet Redis response time over the last 5 minutes exceeds 500 ms. |
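
The templates above appear to follow two expression patterns, parameterized only by the `type` and `protocol` labels: an error count summed per `(namespace, name)`, and an average response time obtained by dividing the accumulated duration in nanoseconds by the request count and then by 1000000 to convert to milliseconds. The following Python sketch regenerates the expression strings from those two patterns, which can help when checking a rule against the table. The function names, the `window` and `threshold_ms` parameters, and the `all_templates` helper are hypothetical and not part of the product; the metric names, labels, 5-minute window, and thresholds are taken from the table.

```python
from itertools import product

# Workload types and protocols that appear in the template table.
WORKLOAD_TYPES = ["daemonset", "deployment", "service", "statefulset"]
PROTOCOLS = ["dns", "http", "mysql", "redis"]


def _selector(workload_type: str, protocol: str) -> str:
    """Build the label selector, e.g. {type="service", protocol="dns"}."""
    return f'{{type="{workload_type}", protocol="{protocol}"}}'


def error_count_expr(workload_type: str, protocol: str, window: str = "5m") -> str:
    """Error count pattern: fires when any error occurred in the window."""
    sel = _selector(workload_type, protocol)
    return (
        "sum by (namespace, name) "
        f"(floor(increase(agg_npm_entity_requests_error_total{sel}[{window}]))) > 0"
    )


def avg_response_time_expr(
    workload_type: str,
    protocol: str,
    window: str = "5m",
    threshold_ms: int = 500,
) -> str:
    """Average response time pattern: duration (ns) / requests / 1e6 gives ms."""
    sel = _selector(workload_type, protocol)
    duration = (
        "avg by (namespace, name) "
        f"(increase(agg_npm_entity_requests_duration_nanoseconds_total{sel}[{window}]))"
    )
    requests = (
        "avg by (namespace, name) "
        f"(increase(agg_npm_entity_requests_total{sel}[{window}]))"
    )
    return f"floor((({duration} / {requests}) / 1000000)) > {threshold_ms}"


def all_templates() -> list[dict]:
    """Return one entry per (type, protocol, kind) combination in the table."""
    templates = []
    for workload_type, protocol in product(WORKLOAD_TYPES, PROTOCOLS):
        templates.append(
            {
                "type": workload_type,
                "protocol": protocol,
                "kind": "error_count",
                "expr": error_count_expr(workload_type, protocol),
            }
        )
        templates.append(
            {
                "type": workload_type,
                "protocol": protocol,
                "kind": "avg_response_time",
                "expr": avg_response_time_expr(workload_type, protocol),
            }
        )
    return templates


if __name__ == "__main__":
    for template in all_templates():
        print(template["type"], template["protocol"], template["kind"])
        print("  " + template["expr"])
```

As a sanity check on the unit conversion: if a workload accumulated 3,000,000,000 ns of request time across 5 requests in the window, the average is 600,000,000 ns per request, or 600 ms, which exceeds the 500 ms threshold.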