ACK One Fleet monitoring uses Managed Service for Prometheus to collect metrics from your Fleet instances. You can define custom Prometheus alert rules that watch Argo CD pod resource usage in real time and send notifications when thresholds are breached.
Prerequisites
Before you begin, make sure that you have:
- Fleet monitoring enabled. See Enable Fleet monitoring.
- A notification object configured. See Notification objects. If your notification object is DingTalk, add the custom keywords used by your alerts in the security settings of the DingTalk chatbot before you proceed.
Create an Argo CD alert rule
- Log on to the ACK One console. In the left-side navigation pane, choose Fleet > Fleet Observability > Fleet Monitoring.
- In the upper-right corner of the Fleet Monitoring page, click Alert Settings to open the Prometheus Alert Rules page.
- Click Create Prometheus Alert Rule and fill in the fields described in the following table.
| Parameter | Description | Default | Example |
| --- | --- | --- | --- |
| Alert rule name | A name for the alert rule. | — | ACK One Argo CD pod memory alert |
| Check type | The detection method. Static Threshold compares a metric against a fixed value. Custom PromQL lets you write a PromQL expression directly. | — | Static Threshold |
| Prometheus instance | The ACK One Fleet instance to monitor. | — | text-XXXX |
| Alert contact group | The Kubernetes application group to be monitored within your environment. | — | Kubernetes workload |
| Alert metric | The metric to evaluate. For Argo CD pods, Container Memory Usage and Container CPU Utilization are the most important metrics to watch. | — | Container Memory Usage |
| Alert condition | The threshold condition that triggers an alert event. | — | CPU utilization greater than 80% |
| Filter conditions | Narrows the scope of the alert rule. See the filter condition types below. | Traverse | Namespace: Equal argocd; Pod: Traverse |
| Duration | Controls how long the alert condition must hold before an alert event fires. See the duration options below. | — | Alert condition continuously met for 2 minutes |
| Alert level | The severity level. Default is the lowest; P1 is the highest. Valid values: Default, P4, P3, P2, P1. | Default | P1 |
| Alert message | The message sent to recipients when an alert fires. You can specify custom variables in the alert message based on the Go template syntax. | — | Namespace: {{$labels.namespace}} / Pod: {{$labels.pod_name}} / Container: {{$labels.container}} CPU utilization: {{$labels.metrics_params_opt_label_value}} {{$labels.metrics_params_value}}%. Current value: {{ printf "%.2f" $value }}% |
| Alert notification | The notification format. Valid values: Simple Mode and Standard Mode. | — | Simple Mode |
| Notification objects | The channels that receive alert messages, such as a DingTalk group. | — | DingTalk alert |
| Notification period | The time window during which alert notifications are sent. | — | 23:00–01:00 |
| Whether to resend notifications | How often to resend the alert if it is not cleared. | — | Every 10 minutes |

Filter condition types
| Type | Scope | Additional input required | Supports multiple values |
| --- | --- | --- | --- |
| Traverse (default) | All resources in the Prometheus instance | No | N/A |
| Equal | The specified resource only | Resource name | No |
| Not equal | All resources except the specified one | Resource name | No |
| Regex match | Resources whose names match the expression | Regular expression | Yes (via regex) |
| Regex not match | Resources whose names do not match the expression | Regular expression | Yes (via regex) |

Duration options
| Option | When an alert fires |
| --- | --- |
| If the alert condition is met | As soon as a single data point crosses the threshold |
| If the alert condition is continuously met for N minutes | Only after the threshold has been exceeded for at least N minutes continuously |

- Click Completed to save the alert rule.
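Under the hood, the console fields map onto standard Prometheus alerting-rule concepts. The following sketch shows what a roughly equivalent rule looks like in open-source Prometheus rule syntax: it fires when an Argo CD container's working-set memory stays above 80% of its memory limit for 2 minutes. The metric names (`container_memory_working_set_bytes` from cAdvisor and `kube_pod_container_resource_limits` from kube-state-metrics) and the rule generated by the console are assumptions about your environment and may differ.

```yaml
groups:
  - name: argocd-pod-alerts
    rules:
      - alert: ArgoCDPodHighMemory
        # Working-set memory as a percentage of the container's memory limit.
        expr: |
          100 * sum by (namespace, pod, container) (
            container_memory_working_set_bytes{namespace="argocd", container!=""}
          )
          /
          sum by (namespace, pod, container) (
            kube_pod_container_resource_limits{namespace="argocd", resource="memory"}
          ) > 80
        # Corresponds to "Alert condition continuously met for 2 minutes".
        for: 2m
        labels:
          severity: P1
        annotations:
          # Go template variables, as in the Alert message field.
          message: >-
            Namespace: {{ $labels.namespace }} / Pod: {{ $labels.pod }} /
            Container: {{ $labels.container }}.
            Memory usage: {{ printf "%.2f" $value }}%
```

The `for: 2m` clause is what distinguishes the two duration options: omitting it makes the rule fire on a single data point that crosses the threshold.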
Verify the alert rule
After saving the rule, simulate the alert condition to confirm that notifications reach the intended recipients:
- Temporarily lower the alert threshold to a value that your Argo CD pods currently exceed, or generate a load spike on the pods.
- Wait for the configured Duration to elapse.
- Check the notification object (for example, the DingTalk group) for an alert message.
- Confirm that the message content matches the Alert message template you configured.
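Before lowering the threshold, it helps to know what your Argo CD pods currently consume so you can pick a value they already exceed. A query like the following, run against the Fleet's Prometheus instance, returns the current working-set memory per pod. The metric name `container_memory_working_set_bytes` is a cAdvisor convention and an assumption about your environment; adjust the labels as needed.

```promql
# Current working-set memory, in bytes, of each pod in the argocd namespace
sum by (pod) (container_memory_working_set_bytes{namespace="argocd", container!=""})
```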
To review historical alert events, open the Prometheus console. See View historical alerts.
What's next
- Create and manage an alert rule template: reuse alert configurations across multiple Fleet instances.
- Enable Fleet monitoring: set up the monitoring foundation if you have not done so yet.