Container Service for Kubernetes: Override multi-cluster alert configurations

Last Updated: Mar 26, 2026

By default, alert rules propagated from a Fleet instance are identical across all associated clusters. When different clusters need different alert configurations—such as separate thresholds, contact groups, or GPU-specific alerts—use override policies to deliver cluster-specific alert rules.

Prerequisites

Before you begin, ensure that you have:

How it works

ACK uses KubeVela to define and propagate override policies at the Fleet level, following the same model used for differentiated application configuration. You define a baseline set of alert rules on the Fleet instance, then apply override policies that change those rules for specific clusters only.

The following figure shows how alert rules are differentiated across clusters. An override policy is created on the Fleet instance. The cluster-specific configuration is delivered to ACK Cluster 2, while ACK Cluster 1 keeps the original alert rules.

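[Figure: Differentiated alert rule propagation from the Fleet instance to ACK Cluster 1 and ACK Cluster 2]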

YAML resource types

The ackalertrule-app-override.yaml file in this topic defines four types of KubeVela resources:

Resource type     Kind                       Purpose
Topology policy   Policy (type: topology)    Specifies which clusters receive the alert rules
Override policy   Policy (type: override)    Defines the cluster-specific configuration changes
Workflow          Workflow                   Orchestrates deployment steps: deploys baseline rules to some clusters and overridden rules to others
Application       Application                Ties all components, policies, and the workflow together
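
As a quick sanity check before applying the file, the resource types in the table above can be listed straight from the YAML. The following is a minimal sketch, not part of ACK or KubeVela tooling: list_resources is a hypothetical helper name, and it assumes that the file separates documents with --- and keeps the kind: and type: keys unindented at the top level of each document, as in the example in this topic.

```shell
# Hypothetical helper: print "<kind> <type>" for every YAML document in a file.
# Only top-level (unindented) kind: and type: keys are read; nested keys such as
# the "type: deploy" entries under a Workflow's steps are ignored.
list_resources() {
  awk '
    function emit() {
      if (kind != "") print kind, (ptype == "" ? "-" : ptype)
      kind = ""; ptype = ""
    }
    /^---/   { emit(); next }   # document separator: flush the previous document
    /^kind:/ { kind = $2 }
    /^type:/ { ptype = $2 }
    END      { emit() }         # flush the last document
  ' "$1"
}
```

Running list_resources ackalertrule-app-override.yaml against the file from this topic should report two topology policies, one override policy, one Workflow, and one Application. A dash in the second column means the resource has no top-level type key.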

Step 1: Create a contact, a contact group, and alert rules

  1. Create a contact and a contact group.

  2. Get the contact group ID.

  3. Create alert rules.

Step 2: Propagate differentiated alert rules

  1. Get the IDs of the clusters to which you want to propagate the alert rules:

    kubectl get managedcluster

    Expected output:

    NAME            HUB ACCEPTED   MANAGED CLUSTER URLS   JOINED   AVAILABLE   AGE
    c565e4****      true                                  True     True        12d
    cbaa12****      true                                  True     True        12d

    Note: To select clusters by label instead of by ID, see Select a cluster to distribute applications.

  2. Create a file named ackalertrule-app-override.yaml with the following content. In this example, ack-cluster-1 is a CPU cluster and ack-cluster-2 is a GPU-accelerated cluster. The override policy targets ack-cluster-2: it enables the GPU alert rule, changes the threshold of the node CPU utilization alert, and changes the contact group.

    apiVersion: core.oam.dev/v1alpha1  # Topology policy: routes baseline alert rules to ack-cluster-1.
    kind: Policy
    metadata:
      name: cluster-cpu
      namespace: kube-system
    type: topology
    properties:
      clusters: ["<ack-cluster-1>"] # Replace <ack-cluster-1> with the cluster ID of ack-cluster-1.
    ---
    apiVersion: core.oam.dev/v1alpha1  # Topology policy: routes overridden alert rules to ack-cluster-2.
    kind: Policy
    metadata:
      name: cluster-gpu
      namespace: kube-system
    type: topology
    properties:
      clusters: ["<ack-cluster-2>"] # Replace <ack-cluster-2> with the cluster ID of ack-cluster-2.
    ---
    apiVersion: core.oam.dev/v1alpha1  # Override policy: defines the cluster-specific changes for ack-cluster-2.
    kind: Policy
    metadata:
      name: override-gpu
      namespace: kube-system
    type: override
    properties:
      components:
      - name: ackalertrules  # Must match the component name in the Application.
        traits:
        - type: alert-rule   # The alert-rule trait modifies alert rules on the target cluster.
          properties:
            groups:           # Override configurations. The structure mirrors that of the alert rules.
            - name: res-exceptions      # Name of the alert group to override.
              rules:
              - contactGroups:           # Override the contact group.
                - arms_contact_group_id: "12345"
                  cms_contact_group_name: ack_Default Contact Group
                  id: "1234"
                enable: enable           # Set to enable.
                name: node_cpu_util_high # Name of the alert rule to override.
                thresholds:              # Override the alert threshold.
                - key: CMS_ESCALATIONS_CRITICAL_Threshold
                  unit: percent
                  value: "60"
            - name: cluster-error    # Name of the alert group to override.
              rules:
              - enable: enable       # Set to enable.
                name: gpu-xid-error  # Name of the alert rule to override.
    ---
    apiVersion: core.oam.dev/v1alpha1  # Workflow: defines the deployment steps.
    kind: Workflow
    metadata:
      name: deploy-ackalertrules
      namespace: kube-system
    steps:
      - type: deploy
        name: deploy-cpu
        properties:
          policies: ["cluster-cpu"]   # Deploy baseline alert rules to ack-cluster-1.
      - type: deploy
        name: deploy-gpu
        properties:
          policies: ["override-gpu", "cluster-gpu"]  # Apply the override policy and deploy to ack-cluster-2.
    ---
    apiVersion: core.oam.dev/v1beta1   # Application: the top-level KubeVela resource.
    kind: Application
    metadata:
      name: alertrules
      namespace: kube-system
      annotations:
        app.oam.dev/publishVersion: version1  # Increment this value each time you update the alert rules to trigger re-propagation.
    spec:
      components:
        - name: ackalertrules
          type: ref-objects
          properties:
            objects:
              - resource: ackalertrules    # References the alert rules created in substep 3 of Step 1.
                name: default
      workflow:
        ref: deploy-ackalertrules

  3. Apply the file to the Fleet instance to create the policies, the workflow, and the application:

    kubectl apply -f ackalertrule-app-override.yaml

  4. Check the propagation status:

    kubectl amc appstatus alertrules -n kube-system --tree --detail

    Expected output:

    CLUSTER                       NAMESPACE       RESOURCE             STATUS    APPLY_TIME          DETAIL
    c565e4**** (ack-cluster-1)─── kube-system─── AckAlertRule/default updated   2022-**-** **:**:** Age: **
    cbaa12**** (ack-cluster-2)─── kube-system─── AckAlertRule/default updated   2022-**-** **:**:** Age: **

    Both clusters show updated, confirming that the override policy was applied and the differentiated alert rules were propagated successfully.
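
After you later change the alert rules or the override policy, the app.oam.dev/publishVersion annotation has to change before the workflow runs again. The following is a minimal sketch of changing it in place with kubectl annotate, assuming that your kubeconfig points at the Fleet instance. republish_alert_rules is a hypothetical helper name, and the fully qualified resource name applications.core.oam.dev is used on the assumption that other Application CRDs might be installed. Editing the annotation in ackalertrule-app-override.yaml and re-running kubectl apply is an equivalent route.

```shell
# Hypothetical helper: set a new publishVersion on the alertrules Application.
# Usage: republish_alert_rules <new-version>
republish_alert_rules() {
  new_version="$1"
  # --overwrite is required because the annotation already exists.
  kubectl annotate applications.core.oam.dev alertrules -n kube-system \
    "app.oam.dev/publishVersion=${new_version}" --overwrite
}
```

For example, run republish_alert_rules version2, then repeat the kubectl amc appstatus command from the last step to confirm that both clusters were updated.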