ApsaraDB for ClickHouse provides a monitoring and alerting feature that you can use to monitor cluster status. You can configure alert rules for key metrics so that abnormal metric data is detected at the earliest opportunity, which helps you identify and handle faults.
Background information
The monitoring and alerting feature is implemented by using CloudMonitor. CloudMonitor allows you to configure metrics. If the alert rule of a metric is triggered, CloudMonitor notifies all the contacts within the alert contact group. You can maintain alert contact groups for metrics to ensure that the contacts receive alerts at the earliest opportunity.
The ApsaraDB for ClickHouse console has been optimized to improve user experience. This topic describes how to configure an alert rule before and after optimization.
In this topic, the console before optimization is called the old console. The console after optimization is called the new console. This classification is applicable to only the monitoring and alerting feature.
If your ApsaraDB for ClickHouse cluster meets the following conditions, configure an alert rule for the cluster based on the instructions in the Procedure in the new console section of this topic.
The ApsaraDB for ClickHouse cluster was created after December 1, 2021.
The region where the ApsaraDB for ClickHouse cluster is deployed is not China (Qingdao) or China (Hohhot).
If your ApsaraDB for ClickHouse cluster does not meet the preceding conditions, configure an alert rule for the cluster based on the instructions in the Procedure in the old console section of this topic.
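The two conditions above can be expressed as a small check. The following is a minimal sketch, not part of any ApsaraDB SDK: the function name `console_for_cluster` is hypothetical, and the region IDs `cn-qingdao` and `cn-huhehaote` are assumed to correspond to China (Qingdao) and China (Hohhot). "Created after December 1, 2021" is read here as strictly later than that date.

```python
from datetime import date

# Assumption: region IDs for China (Qingdao) and China (Hohhot).
OLD_CONSOLE_REGIONS = {"cn-qingdao", "cn-huhehaote"}
# Clusters must be created after this date to use the new console.
CUTOFF = date(2021, 12, 1)


def console_for_cluster(created_on: date, region_id: str) -> str:
    """Return "new" or "old" to indicate which procedure applies to a cluster."""
    if created_on > CUTOFF and region_id not in OLD_CONSOLE_REGIONS:
        return "new"
    return "old"


if __name__ == "__main__":
    print(console_for_cluster(date(2022, 3, 1), "cn-hangzhou"))  # new
    print(console_for_cluster(date(2022, 3, 1), "cn-qingdao"))   # old
```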
Procedure in the new console
Log on to the ApsaraDB for ClickHouse console.
In the top navigation bar, select the region where the cluster that you want to manage is deployed.
On the Clusters page, click the Default Instances tab, find the cluster that you want to manage, and then click the ID of the cluster.
In the left-side navigation pane, click Monitoring and Alerting.
On the Monitoring and Alerting page, choose .
On the Create ClickHouseAlert Rule page, set the parameters described in the following table.
When you create an alert rule, you can set Check Type to Static Threshold or Custom PromQL.
If you set Check Type to Static Threshold, you can select a preset metric and create an alert rule by using the metric.
To monitor a metric other than the preset metrics, you can use a custom PromQL statement to create an alert rule.
The following table describes the parameters that you need to set when you set Check Type to Static Threshold.
Parameter
Description
Example
Alert Rule Name
Enter a name for the alert rule.
CPU utilization alert
Check Type
Select Static Threshold.
Static Threshold
Cluster
Select the cluster for which you want to create an alert rule.
cc-bp1lxbo89u95****
Alert Contact Group
Select an alert contact group.
ClickHouse
Alert Metric
Select the metric that you want to monitor by using the alert rule. Different alert contact groups provide different metrics.
cpu_usage
Alert Condition
Specify the condition based on which alert events are generated.
when cpu usage > 80%, trigger alert
Filter Conditions
Specify the applicable scope of the alert rule.
No Filter
Data Preview
The Data Preview section displays the PromQL statement that corresponds to the alert condition. The section also displays the values of the specified metric in a time series graph.
By default, only the real-time values of one resource are displayed. You can specify filter conditions to view the metric values of different resources in different time ranges.
Note: The threshold in the time series graph is represented by a red line. The part of the curve that meets the alert condition is displayed in dark red, and the part of the curve that does not meet the alert condition is displayed in blue.
You can move the pointer over the curve to view resource details at a specific point in time.
You can also select a time period on the time series curve to view the time series curve of the selected time period.
N/A
Duration
If the alert condition is met, an alert event is generated: If a data point reaches the threshold, an alert event is generated.
If the alert condition is met continuously for N minutes, an alert event is generated: An alert event is generated only if the duration for which the threshold is reached is greater than or equal to N minutes.
1
Alert Level
Specify the alert level. Valid values: Default, P4, P3, P2, and P1. Default value: Default. The preceding values are listed in ascending order of severity.
P2
Alert Message
Specify the alert message that you want to send to the end users. You can specify custom variables in the alert message based on the Go template syntax.
node: {{$labels.pod_name}} CPU usage {{$labels.metrics_params_opt_label_value}} {{$labels.metrics_params_value}}%, current value {{ printf "%.2f" $value }}%
Advanced Settings
Alert Check Cycle
The interval at which the alert rule is evaluated. The rule is checked every N minutes to determine whether the alert conditions are met. Default value: 1. Minimum value: 1.
1 Minute
Specify Notification Policy
Do Not Specify Notification Policy: If you select this option, you can create a notification policy on the Notification Policy page after you create the alert rule. On the Notification Policy page, you can specify match rules and match conditions. For example, you can specify an alert rule name as the match condition. When the alert rule is triggered, an alert event is generated and an alert notification is sent to the contacts or contact groups that are specified in the notification policy. For more information, see Create and manage a notification policy.
You can also select a notification policy from the drop-down list. Application Real-Time Monitoring Service (ARMS) automatically adds a match rule to the selected notification policy and specifies the ID of the alert rule as the match condition. The name of the alert rule is displayed on the Notification Policy page. This way, the alert events that are generated based on the alert rule can be matched by the selected notification policy.
Important: After you select a notification policy, the alert events that are generated based on the alert rule can be matched by the notification policy and alerts can be generated. The alert events may also be matched by other notification policies that use fuzzy match, and alerts may be generated. One or more alert events can be matched by one or more notification policies.
Do Not Specify Notification Policy
Tags
Specify tags for the alert rule. The specified tags can be used to match notification policies.
N/A
Annotations
Specify annotations for the alert rule.
N/A
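To illustrate how the two Duration options in the preceding table differ, the following sketch evaluates a series of metric samples against a static threshold. It is a simplified model, not the service's actual implementation: the function name `alert_fires` is hypothetical, one sample is assumed per check cycle, and "met continuously for N minutes" is assumed to mean that the first and last consecutive breaching samples are at least N minutes apart.

```python
import math
from typing import Sequence


def alert_fires(
    samples: Sequence[float],
    threshold: float,
    duration_minutes: int = 0,
    check_cycle_minutes: int = 1,
) -> bool:
    """Decide whether an alert event is generated for the latest samples.

    samples: metric values, oldest first, one per check cycle (assumption).
    duration_minutes: 0 models the first option (a single breaching data
    point is enough); N > 0 models "met continuously for N minutes".
    """
    if duration_minutes <= 0:
        needed = 1
    else:
        # Consecutive samples required so that they span at least N minutes.
        needed = math.ceil(duration_minutes / check_cycle_minutes) + 1
    if len(samples) < needed:
        return False
    return all(value > threshold for value in samples[-needed:])


if __name__ == "__main__":
    cpu = [70.0, 85.0, 90.0, 95.0]
    print(alert_fires(cpu, threshold=80))                      # True
    print(alert_fires(cpu, threshold=80, duration_minutes=2))  # True
    print(alert_fires(cpu, threshold=80, duration_minutes=3))  # False
```

With a duration greater than zero, a short spike above the threshold does not generate an alert event, which reduces noise at the cost of a delayed notification.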
The following table describes the parameters that you need to set when you set Check Type to Custom PromQL.
Parameter
Description
Example
Alert Rule Name
Enter a name for the alert rule.
Pod CPU utilization exceeds 8%
Check Type
Select Custom PromQL.
Custom PromQL
Cluster
Select the cluster for which you want to create an alert rule.
cc-bp1lxbo89u95****
Reference Alert Contact Group
Select an alert contact group.
ClickHouse
Reference Metrics
Optional. The Reference Metrics drop-down list displays common metrics. After you select a metric, the PromQL statement of the metric is displayed in the Custom PromQL Statements field. You can modify the statement based on your business requirements.
The values in the Reference Metrics drop-down list vary based on the type of the Prometheus instance.
cpu_usage
Custom PromQL Statements
Specify the PromQL statement based on which alert events are generated.
max(container_fs_usage_bytes{pod!="", namespace!="arms-prom",namespace!="monitoring"}) by (pod_name, namespace, device)/max(container_fs_limit_bytes{pod!=""}) by (pod_name,namespace, device) * 100 > 90
Duration
If the alert condition is met, an alert event is generated: If a data point reaches the threshold, an alert event is generated.
If the alert condition is met continuously for N minutes, an alert event is generated: An alert event is generated only if the duration for which the threshold is reached is greater than or equal to N minutes.
1
Alert Level
Specify the alert level. Valid values: Default, P4, P3, P2, and P1. Default value: Default. The preceding values are listed in ascending order of severity.
Default
Alert Message
Specify the alert message that you want to send to the end users. You can specify custom variables in the alert message based on the Go template syntax.
namespace: {{$labels.namespace}}/pod: {{$labels.pod_name}}/disk: {{$labels.device}} usage exceeds 90%, current value {{ printf "%.2f" $value }}%
Advanced Settings
Alert Check Cycle
The interval at which the alert rule is evaluated. The rule is checked every N minutes to determine whether the alert conditions are met. Default value: 1. Minimum value: 1.
1 Minute
Specify Notification Policy
Do Not Specify Notification Policy: If you select this option, you can create a notification policy on the Notification Policy page after you create the alert rule. On the Notification Policy page, you can specify match rules and match conditions. For example, you can specify an alert rule name as the match condition. When the alert rule is triggered, an alert event is generated and an alert notification is sent to the contacts or contact groups that are specified in the notification policy. For more information, see Create and manage a notification policy.
You can also select a notification policy from the drop-down list. ARMS automatically adds a match rule to the selected notification policy and specifies the ID of the alert rule as the match condition. The name of the alert rule is displayed on the Notification Policy page. This way, the alert events that are generated based on the alert rule can be matched by the selected notification policy.
Important: After you select a notification policy, the alert events that are generated based on the alert rule can be matched by the notification policy and alerts can be generated. The alert events may also be matched by other notification policies that use fuzzy match, and alerts may be generated. One or more alert events can be matched by one or more notification policies.
Do Not Specify Notification Policy
Tags
Specify tags for the alert rule. The specified tags can be used to match notification policies.
N/A
Annotations
Specify annotations for the alert rule.
N/A
Click Save. The alert rule automatically takes effect.
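The Alert Message examples in the preceding tables use Go template variables such as `{{$labels.pod_name}}` and `{{ printf "%.2f" $value }}`. To preview what a message might look like before you save a rule, you can imitate the substitution locally. The following sketch is a rough approximation written in Python, not the Go template engine that the service uses: the function name `render_alert_message` is hypothetical, only the two placeholder forms shown in this topic are handled, and a missing label is rendered as an empty string for simplicity.

```python
import re


def render_alert_message(template: str, labels: dict, value: float) -> str:
    """Substitute {{$labels.<name>}} and {{ printf "<fmt>" $value }} placeholders."""

    def printf_sub(match: re.Match) -> str:
        # Apply the printf-style format string (for example "%.2f") to the value.
        return match.group(1) % value

    def label_sub(match: re.Match) -> str:
        # Simplification: unknown labels become an empty string.
        return str(labels.get(match.group(1), ""))

    out = re.sub(r'\{\{\s*printf\s+"([^"]*)"\s+\$value\s*\}\}', printf_sub, template)
    out = re.sub(r"\{\{\s*\$labels\.(\w+)\s*\}\}", label_sub, out)
    return out


if __name__ == "__main__":
    template = (
        'namespace: {{$labels.namespace}}/pod: {{$labels.pod_name}}/disk: '
        '{{$labels.device}} usage exceeds 90%, current value {{ printf "%.2f" $value }}%'
    )
    # Sample label values are made up for illustration.
    labels = {"namespace": "default", "pod_name": "ch-0", "device": "/dev/vdb"}
    print(render_alert_message(template, labels, 93.4567))
```

The label names available in a real alert depend on the labels that the PromQL statement returns, so check the Data Preview section or the query result before you reference a label in the message.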
Procedure in the old console
Log on to the ApsaraDB for ClickHouse console.
In the top navigation bar, select the region where the cluster that you want to manage is deployed.
On the Clusters page, click the Default Instances tab, find the cluster that you want to manage, and then click the ID of the cluster.
In the left-side navigation pane, click Monitoring Details.
In the upper-right corner, click Alert Monitoring.
In the left-side navigation pane of the CloudMonitor console, choose Alerts > Alert Rules.
On the Threshold Value Alert tab, click Create Alert Rule.
On the Create Alert Rule page, set the following parameters.
In the Relate Resource section, set the following parameters.
Parameter
Description
Product
Select Clickhouse from the drop-down list.
Resource Range
All Resources: If you set this parameter to All Resources and a cluster in ApsaraDB for ClickHouse meets the conditions specified in Rule Description, the system sends alert notifications.
Cluster: If you set this parameter to Cluster and the selected clusters meet the conditions specified in Rule Description, the system sends alert notifications.
Region
This parameter is required if you set the Resource Range parameter to Cluster.
Select the region where the cluster for which you want to set the alert rule is deployed.
Cluster
This parameter is required if you set the Resource Range parameter to Cluster.
Select the ID of the cluster. You can select multiple cluster IDs.
Configure the alert rule. For more information about how to configure an alert rule, see Create an alert rule.
Configure a notification method for the alert rule.
Note: You must create a contact group before you configure an alert rule. For more information about how to create a contact group, see Create an alert contact or alert contact group.
After the preceding parameters are set, click Confirm. The alert rule automatically takes effect.