By default, alert rules propagated from a Fleet instance are identical across all associated clusters. When different clusters need different alert configurations—such as separate thresholds, contact groups, or GPU-specific alerts—use override policies to deliver cluster-specific alert rules.
Prerequisites
Before you begin, ensure that you have:
- The Fleet management feature enabled
- Two clusters associated with the Fleet instance: the service provider cluster and the service consumer cluster
How it works
ACK uses KubeVela to define and propagate override policies at the Fleet level, following the same model as application differentiated configuration. Define a baseline set of alert rules on the Fleet instance, then apply override policies for specific clusters.
The following figure shows how alert rules are differentiated across clusters. An override policy is created on the Fleet instance. The cluster-specific configuration is delivered to ACK Cluster 2, while ACK Cluster 1 keeps the original alert rules.
YAML resource types
The ackalertrule-app-override.yaml file in this topic defines four types of KubeVela resources:
| Resource type | Kind | Purpose |
|---|---|---|
| Topology policy | `Policy` (type: `topology`) | Specifies which clusters receive the alert rules |
| Override policy | `Policy` (type: `override`) | Defines the cluster-specific configuration changes |
| Workflow | `Workflow` | Orchestrates the deployment steps: deploys baseline rules to some clusters and overridden rules to others |
| Application | `Application` | Ties all components, policies, and the workflow together |
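In skeleton form (names taken from the example file later in this topic, with most details elided), the four resources reference one another like this:

```yaml
apiVersion: core.oam.dev/v1alpha1
kind: Policy                     # Topology policy: selects target clusters.
metadata: {name: cluster-gpu, namespace: kube-system}
type: topology
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy                     # Override policy: patches components by name.
metadata: {name: override-gpu, namespace: kube-system}
type: override
---
apiVersion: core.oam.dev/v1alpha1
kind: Workflow                   # Pairs an override policy with a topology policy per step.
metadata: {name: deploy-ackalertrules, namespace: kube-system}
steps:
  - type: deploy
    properties:
      policies: ["override-gpu", "cluster-gpu"]  # Apply the override, then deliver to the selected clusters.
---
apiVersion: core.oam.dev/v1beta1
kind: Application                # Top-level resource: components plus workflow.
metadata: {name: alertrules, namespace: kube-system}
spec:
  workflow:
    ref: deploy-ackalertrules    # References the Workflow above by name.
```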
Step 1: Create a contact and a contact group
Step 2: Propagate differentiated alert rules
- Get the IDs of the clusters to which you want to propagate the alert rules:

  ```shell
  kubectl get managedcluster
  ```

  Expected output:

  ```
  NAME         HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
  c565e4****   true                                  True     True        12d
  cbaa12****   true                                  True     True        12d
  ```

  Note: To select clusters by label instead of by ID, see Select a cluster to distribute applications.

- Create a file named `ackalertrule-app-override.yaml` with the following content. In this example, `ack-cluster-1` is a CPU-accelerated cluster and `ack-cluster-2` is a GPU-accelerated cluster. The override policy targets `ack-cluster-2`, enabling GPU alerting, modifying the alert threshold, and changing the contact group.

  ```yaml
  apiVersion: core.oam.dev/v1alpha1 # Topology policy: routes baseline alert rules to ack-cluster-1.
  kind: Policy
  metadata:
    name: cluster-cpu
    namespace: kube-system
  type: topology
  properties:
    clusters: ["<ack-cluster-1>"] # Replace <ack-cluster-1> with the cluster ID of ack-cluster-1.
  ---
  apiVersion: core.oam.dev/v1alpha1 # Topology policy: routes overridden alert rules to ack-cluster-2.
  kind: Policy
  metadata:
    name: cluster-gpu
    namespace: kube-system
  type: topology
  properties:
    clusters: ["<ack-cluster-2>"] # Replace <ack-cluster-2> with the cluster ID of ack-cluster-2.
  ---
  apiVersion: core.oam.dev/v1alpha1 # Override policy: defines the cluster-specific changes for ack-cluster-2.
  kind: Policy
  metadata:
    name: override-gpu
    namespace: kube-system
  type: override
  properties:
    components:
      - name: ackalertrules # Must match the component name in the Application.
        traits:
          - type: alert-rule # The alert-rule trait modifies alert rules on the target cluster.
            properties:
              groups: # Override configurations. The structure mirrors that of the alert rules.
                - name: res-exceptions # Name of the alert group to override.
                  rules:
                    - contactGroups: # Override the contact group.
                        - arms_contact_group_id: "12345"
                          cms_contact_group_name: ack_Default Contact Group
                          id: "1234"
                      enable: enable # Set to enable.
                      name: node_cpu_util_high # Name of the alert rule to override.
                      thresholds: # Override the alert threshold.
                        - key: CMS_ESCALATIONS_CRITICAL_Threshold
                          unit: percent
                          value: "60"
                - name: cluster-error # Name of the alert group to override.
                  rules:
                    - enable: enable # Set to enable.
                      name: gpu-xid-error # Name of the alert rule to override.
  ---
  apiVersion: core.oam.dev/v1alpha1 # Workflow: defines the deployment steps.
  kind: Workflow
  metadata:
    name: deploy-ackalertrules
    namespace: kube-system
  steps:
    - type: deploy
      name: deploy-cpu
      properties:
        policies: ["cluster-cpu"] # Deploy baseline alert rules to ack-cluster-1.
    - type: deploy
      name: deploy-gpu
      properties:
        policies: ["override-gpu", "cluster-gpu"] # Apply the override policy and deploy to ack-cluster-2.
  ---
  apiVersion: core.oam.dev/v1beta1 # Application: the top-level KubeVela resource.
  kind: Application
  metadata:
    name: alertrules
    namespace: kube-system
    annotations:
      app.oam.dev/publishVersion: version1 # Increment this value each time you update the alert rules to trigger re-propagation.
  spec:
    components:
      - name: ackalertrules
        type: ref-objects
        properties:
          objects:
            - resource: ackalertrules # References the alert rules created in Step 3.
              name: default
    workflow:
      ref: deploy-ackalertrules
  ```

- Apply the override policy:

  ```shell
  kubectl apply -f ackalertrule-app-override.yaml
  ```

- Check the propagation status:

  ```shell
  kubectl amc appstatus alertrules -n kube-system --tree --detail
  ```

  Expected output:

  ```
  CLUSTER                      NAMESPACE        RESOURCE                  STATUS   APPLY_TIME            DETAIL
  c565e4**** (ack-cluster-1)   ─── kube-system  ─── AckAlertRule/default  updated  2022-**-** **:**:**   Age: **
  cbaa12**** (ack-cluster-2)   ─── kube-system  ─── AckAlertRule/default  updated  2022-**-** **:**:**   Age: **
  ```

  Both clusters show `updated`, confirming that the override policy was applied and the differentiated alert rules were propagated successfully.
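To roll out changes later, note the publishVersion annotation on the Application: increment it each time you update the alert rules or override policy to trigger re-propagation. A minimal update, assuming the file from this topic:

```yaml
# In ackalertrule-app-override.yaml, after editing the rules or overrides,
# change the annotation to any new value:
metadata:
  annotations:
    app.oam.dev/publishVersion: version2 # was version1
```

Then run `kubectl apply -f ackalertrule-app-override.yaml` again and re-check the propagation status with `kubectl amc appstatus`.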