Container Service for Kubernetes:Perform canary releases with ACK One GitOps and Argo Rollouts

Last Updated:Mar 26, 2026

ACK One integrates Argo CD GitOps with Argo Rollouts to automate canary releases triggered by Git commits. This tutorial walks you through deploying the required components, setting up a GitOps-managed application, and running a canary release — either with manual promotion or automated promotion based on Prometheus metrics.

Prerequisites

Before you begin, make sure you have:

If you want to use GitHub repositories, avoid creating your ACK cluster in a region in the Chinese mainland. If your cluster is already in the Chinese mainland, use a GitHub service provider. This tutorial uses a Fleet instance and an associated ACK cluster deployed in the China (Hong Kong) region.

Key concepts

GitOps is a framework that uses Git repositories as the single source of truth to manage application configuration and drive continuous deployment. For more information, see GitOps overview.

GitOps workflow diagram

Argo Rollouts is a Kubernetes controller that provides advanced deployment strategies, including blue-green deployment, canary releases, and progressive delivery. For more information, see Argo Rollouts documentation.

Argo Rollouts architecture diagram

Canary release is a deployment strategy that gradually shifts traffic to a new application version, starting with a small subset of users. Because traffic is controlled via the Ingress controller, you can verify the new version in production and roll back instantly by redirecting traffic — without affecting all users.

Canary release workflow diagram

How it works

Any change to spec.template in a Rollout resource, typically an image tag update committed to Git, triggers a new canary rollout. Argo CD detects the commit, syncs the updated manifest to the cluster, and Argo Rollouts starts shifting traffic according to the steps defined in the Rollout spec.

During a canary release:

  • The canary service routes traffic to the new version.

  • The stable service routes traffic to the current version.

  • The NGINX Ingress controller splits traffic based on the weight defined at each step.

Only changes to spec.template trigger a new rollout. Changes to labels, annotations, or other fields outside spec.template do not start one.
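As a rough illustration of the weight-based split described above, the following sketch prints the share of traffic each Service would receive for a given canary weight. The helper name traffic_split is hypothetical, and the arithmetic models only the configured weight, not how the NGINX Ingress controller distributes individual requests.

```shell
# traffic_split WEIGHT: print the canary/stable traffic shares for a setWeight value.
# Hypothetical helper for illustration; it models only the configured weight.
traffic_split() (
  weight=$1
  if [ "$weight" -lt 0 ] || [ "$weight" -gt 100 ]; then
    echo "weight must be between 0 and 100" >&2
    exit 1
  fi
  echo "canary=${weight}% stable=$((100 - weight))%"
)

# Shares at each setWeight step used later in this tutorial.
for w in 20 40 60 80; do
  traffic_split "$w"
done
```

The stable share is always the remainder, which is why a Rollout only needs to declare the canary weight at each step.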

Step 1: Deploy Argo Rollouts in the ACK cluster

Run the following commands to create the argo-rollouts namespace and deploy the Argo Rollouts controller:

kubectl create namespace argo-rollouts
kubectl apply -n argo-rollouts -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml

For full installation options, see Controller Installation.
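Before moving on, it can help to confirm that the controller is actually running. The sketch below wraps kubectl rollout status in a small function; it assumes the default install.yaml, which creates a Deployment named argo-rollouts in the argo-rollouts namespace. The function name and the 120-second default timeout are illustrative choices, not part of the original procedure.

```shell
# wait_for_rollouts_controller [TIMEOUT]: block until the controller Deployment is available.
# Assumes the default install.yaml, which creates a Deployment named argo-rollouts.
wait_for_rollouts_controller() {
  kubectl rollout status deployment/argo-rollouts \
    --namespace argo-rollouts \
    --timeout="${1:-120s}"
}
```

Call wait_for_rollouts_controller (or wait_for_rollouts_controller 300s on a slow cluster) with your kubeconfig pointing at the ACK cluster. A non-zero exit status means the Deployment did not become available in time.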

Step 2: Deploy the ack-arms-prometheus add-on in the ACK cluster

Managed Service for Prometheus (the ack-arms-prometheus add-on) collects Ingress metrics used for automated canary promotion in Step 4.

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, find your cluster and click its name. In the left-side pane, choose Operations > Add-ons.

  3. On the Add-ons page, click the Logs and Monitoring tab and find ack-arms-prometheus.

    • If Installed is displayed, the add-on is already active.

    • If Install is displayed, click Install.

Step 3: Deploy an application with ACK One GitOps

Use the Argo CD CLI to register a Git repository and create an application. Alternatively, use the GitOps console — see Work with GitOps.

  1. Add the demo Git repository:

    argocd repo add https://github.com/AliyunContainerService/gitops-demo.git --name gitops-demo

    Expected output:

    Repository 'https://github.com/AliyunContainerService/gitops-demo.git' added
  2. Verify the repository was added:

    argocd repo list

    Expected output:

    TYPE  NAME         REPO                                                       INSECURE  OCI    LFS    CREDS  STATUS      MESSAGE  PROJECT
    git   gitops-demo  https://github.com/AliyunContainerService/gitops-demo.git  false     false  false  false  Successful
  3. List the clusters registered with Argo CD:

    argocd cluster list

    Expected output:

    SERVER                          NAME                                                                 VERSION  STATUS      MESSAGE                                                  PROJECT
    https://192.168.XX.XX:6443      c76073b011afb4de2a8****-ack-gitops-demo-192-10-110-0-0-16  1.26+    Successful
    https://kubernetes.default.svc  in-cluster                                                                    Unknown     Cluster has no applications and is not being monitored.
  4. Create the application. Use the cluster server address from the previous step as --dest-server:

    argocd app create rollouts-demo \
      --repo https://github.com/AliyunContainerService/gitops-demo.git \
      --project default \
      --sync-policy automated \
      --revision rollouts \
      --path . \
      --dest-namespace default \
      --dest-server https://192.168.XX.XX:6443
  5. Confirm the application is synced and healthy:

    argocd app list

    Expected output:

    NAME           CLUSTER                     NAMESPACE  PROJECT  STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                            PATH  TARGET
    rollouts-demo  https://192.168.XX.XX:6443  default    default  Synced  Healthy  Auto        <none>      https://github.com/AliyunContainerService/gitops-demo.git  .     rollouts
  6. Watch the initial rollout complete:

    kubectl argo rollouts get rollout rollouts-demo --watch

    Expected output:

    Initial rollout progress

    Tip: Run kubectl argo rollouts dashboard to open a browser-based UI showing all rollouts, ReplicaSets, Pods, and AnalysisRuns.
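In a script or CI job, polling argocd app list by hand is awkward. A minimal sketch, assuming you are already logged in with the Argo CD CLI: argocd app wait blocks until the application reaches the requested state. The function name wait_for_app and the 300-second default are hypothetical choices for this example.

```shell
# wait_for_app APP [TIMEOUT_SECONDS]: block until the Argo CD application is Synced and Healthy.
# Assumes an authenticated argocd CLI session; the default timeout is an arbitrary choice.
wait_for_app() {
  argocd app wait "$1" --sync --health --timeout "${2:-300}"
}
```

For this tutorial, the call would be wait_for_app rollouts-demo. The command exits with a non-zero status if the application is not synced and healthy before the timeout.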

Step 4: Perform a canary release

Trigger a canary release by updating the container image tag in rollout.yaml and pushing the change to Git. Argo CD detects the commit and Argo Rollouts starts shifting traffic.

Choose one of the following promotion methods:

  • Manual promotion — review the canary yourself before advancing each traffic step.

  • Automated promotion with Prometheus metrics — Argo Rollouts advances the canary automatically when the success rate threshold is met.

Option 1: Manual promotion

This approach pauses the canary after the first traffic step so you can verify the new version before proceeding.

Traffic steps: 20% → pause (indefinite) → 40% (5 min) → 60% (5 min) → 80% (5 min) → 100%

With three timed steps of 5 minutes each, promotion after the first manual approval takes approximately 15 minutes.

  1. Update rollout.yaml with the new image tag and a manual pause after the first step, then commit and push:

    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    metadata:
      name: rollouts-demo
    spec:
      replicas: 4
      strategy:
        canary:
          canaryService: rollouts-demo-canary
          stableService: rollouts-demo-stable
          trafficRouting:
            nginx:
              stableIngress: rollouts-demo-stable
          steps:
            - setWeight: 20
            - pause: {}          # Indefinite pause — advance manually by replacing {} with a duration
            - setWeight: 40
            - pause: {duration: 5m}
            - setWeight: 60
            - pause: {duration: 5m}
            - setWeight: 80
            - pause: {duration: 5m}
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: rollouts-demo
      template:
        metadata:
          labels:
            app: rollouts-demo
        spec:
          containers:
          - name: rollouts-demo
            image: argoproj/rollouts-demo:yellow   # New image tag
            ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            resources:
              requests:
                memory: 32Mi
                cpu: 5m
  2. Watch the rollout pause at 20% traffic:

    kubectl argo rollouts get rollout rollouts-demo --watch

    Expected output: The rollout stops at the first pause: {} step because no duration is set. It advances only after you resume it.

    Rollout paused at 20% traffic

  3. Resume the canary release by updating the pause duration in rollout.yaml, then commit and push:

    steps:
      - setWeight: 20
      - pause: {duration: 10s}   # Replace {} with a duration to resume

    Then watch the release complete:

    kubectl argo rollouts get rollout rollouts-demo --watch

    Expected output during promotion:

    Rollout promotion in progress

    Expected output after completion:

    Rollout promotion complete
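As an alternative to editing the pause step in Git, the Argo Rollouts kubectl plugin provides a promote command that resumes a paused Rollout. The sketch below is one possible wrapper; the function name is hypothetical, and the default namespace matches the --dest-namespace used in Step 3. Because this acts on the cluster directly, the pause: {} step stays in rollout.yaml, so the next release would pause at 20% again.

```shell
# promote_rollout NAME [NAMESPACE]: resume a paused Rollout with the kubectl plugin.
# Acts on the cluster directly; rollout.yaml in Git is left unchanged.
promote_rollout() {
  kubectl argo rollouts promote "$1" --namespace "${2:-default}"
}
```

Run promote_rollout rollouts-demo after you have verified the canary at 20% traffic, then watch the remaining timed steps with the get rollout --watch command shown above.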

Option 2: Automated promotion with Prometheus metrics

This approach uses Managed Service for Prometheus to evaluate the canary's HTTP success rate at a fixed interval. If the success rate stays at or above 95% throughout the analysis window, the canary is promoted automatically. If more than 10 measurements fall below the threshold (failureLimit: 10), the analysis fails and the release is rolled back automatically.

Traffic steps: 20% (5 min) → 40% (5 min, analysis starts here) → 60% (5 min) → 80% (5 min) → 100%

Total promotion time: approximately 20 minutes.
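The estimate comes from adding up the timed pauses in the step list. A minimal sketch of that arithmetic follows; total_pause_minutes is a hypothetical helper, and it ignores the time spent in indefinite pauses and in scaling Pods.

```shell
# total_pause_minutes MIN...: add up the timed pause durations (in minutes) of a step list.
# Hypothetical helper; indefinite pauses and Pod startup time are not counted.
total_pause_minutes() (
  total=0
  for minutes in "$@"; do
    total=$((total + minutes))
  done
  echo "$total"
)

# Four 5-minute pauses, after the 20%, 40%, 60%, and 80% steps.
total_pause_minutes 5 5 5 5
```

The same helper with three arguments (total_pause_minutes 5 5 5) gives the 15 minutes quoted for Option 1, where the first pause is manual and therefore not counted.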

Step 4a: Configure the Rollout with metric analysis

Update rollout.yaml with the new image tag and the analysis configuration, then commit and push:

strategy:
  canary:
    analysis:
      templates:
      - templateName: success-rate
      startingStep: 2   # Start analysis at the 40% step, after initial traffic stabilizes
      args:
        - name: service-name
          value: rollouts-demo-stable
    canaryService: rollouts-demo-canary
    stableService: rollouts-demo-stable
    trafficRouting:
      nginx:
        stableIngress: rollouts-demo-stable
    steps:
      - setWeight: 20
      - pause: {duration: 5m}
      - setWeight: 40
      - pause: {duration: 5m}
      - setWeight: 60
      - pause: {duration: 5m}
      - setWeight: 80
      - pause: {duration: 5m}
revisionHistoryLimit: 2
selector:
  matchLabels:
    app: rollouts-demo
template:
  metadata:
    labels:
      app: rollouts-demo
  spec:
    containers:
    - name: rollouts-demo
      image: argoproj/rollouts-demo:blue   # New image tag

Step 4b: Get the Managed Service for Prometheus endpoint

Managed Service for Prometheus is exposed as a Kubernetes Service at:

http://{ServiceName}.{Namespace}.svc.{ClusterDomain}:{ServicePort}

For the ack-arms-prometheus add-on deployed in the arms-prom namespace with the default cluster domain, the endpoint is:

http://arms-prom-server.arms-prom.svc.cluster.local:9090
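If your add-on uses a different Service name, namespace, or cluster domain, the endpoint can be assembled from the same template. The following sketch only builds the string; prometheus_endpoint is a hypothetical helper, and cluster.local is assumed as the default cluster domain.

```shell
# prometheus_endpoint SERVICE NAMESPACE PORT [CLUSTER_DOMAIN]: print the in-cluster URL.
# Follows the http://{ServiceName}.{Namespace}.svc.{ClusterDomain}:{ServicePort} template.
prometheus_endpoint() {
  printf 'http://%s.%s.svc.%s:%s\n' "$1" "$2" "${4:-cluster.local}" "$3"
}

# Values used in this tutorial.
prometheus_endpoint arms-prom-server arms-prom 9090
```

The printed value is what goes into the address field of the AnalysisTemplate. The URL resolves only from inside the cluster.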

Step 4c: Create the AnalysisTemplate

Create analysis.yaml with the following content. Each measurement passes if the ratio of 1xx and 2xx responses to all canary requests is 95% or higher over a 5-minute window.

apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
  - name: service-name
  metrics:
  - name: success-rate
    interval: 5m
    successCondition: result[0] >= 0.95   # Measurement passes if 95%+ of requests succeed
    failureLimit: 10                       # Analysis fails once more than 10 measurements fail
    provider:
      prometheus:
        address: http://arms-prom-server.arms-prom.svc.cluster.local:9090
        query: |
          sum(irate(nginx_ingress_controller_requests{status=~"(1|2).*",canary!="",service="{{args.service-name}}"}[5m]))
          /
          sum(irate(nginx_ingress_controller_requests{canary!="",service="{{args.service-name}}"}[5m]))

The PromQL query divides the rate of 1xx/2xx canary requests by the total canary request rate. The canary!="" label selector filters for traffic routed through the canary Ingress annotation — this ensures only canary traffic is evaluated, not stable traffic. The service label scopes the query to the specific application.
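To make the pass/fail rule concrete, the sketch below mirrors the comparison with plain request counts. Argo Rollouts evaluates result[0] itself, so this is only an illustration: success_rate_ok is a hypothetical helper, and treating an empty window (no canary requests) as a separate exit code is a choice made for this example.

```shell
# success_rate_ok SUCCESSFUL TOTAL [THRESHOLD]: exit 0 when SUCCESSFUL/TOTAL >= THRESHOLD.
# Exits 1 when the ratio is below the threshold and 2 when there were no requests.
success_rate_ok() {
  awk -v ok="$1" -v total="$2" -v threshold="${3:-0.95}" \
    'BEGIN { if (total <= 0) exit 2; exit !((ok / total) >= threshold) }'
}

# 980 of 1000 canary requests returned 1xx/2xx.
if success_rate_ok 980 1000; then
  echo "measurement passes"
else
  echo "measurement fails"
fi
```

With 940 successful requests out of 1000 the ratio is 0.94, so the same call would report a failed measurement.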

Step 4d: Generate continuous traffic for metric collection

Prometheus needs a steady request stream to evaluate the success rate. Run the following commands in a separate terminal.

  1. Get the Ingress external IP:

    kubectl get ingress

    Expected output:

    NAME                                        CLASS   HOSTS                 ADDRESS         PORTS   AGE
    rollouts-demo-rollouts-demo-stable-canary   nginx   rollouts-demo.local   8.217.XX.XX   80      9h
    rollouts-demo-stable                        nginx   rollouts-demo.local   8.217.XX.XX   80      9h
  2. Add the host mapping to your local hosts file (for example, /etc/hosts on Linux or macOS):

    8.217.XX.XX  rollouts-demo.local
  3. Send continuous requests to the application:

    while true; do curl -s "http://rollouts-demo.local/" | grep -o "<title>.*</title>"; sleep 0.2; done
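For a quick client-side sanity check alongside the Prometheus metrics, a bounded variant of the loop can tally response codes. This is a sketch under assumptions: probe_success_rate is a hypothetical helper, it counts 1xx and 2xx codes as successes to match the query above, and it measures all responses seen by the client rather than canary traffic only.

```shell
# probe_success_rate URL COUNT: send COUNT requests and print successes/total.
# Counts 1xx and 2xx status codes as successes; connection errors count as failures.
probe_success_rate() (
  url=$1 count=$2 ok=0 i=0
  while [ "$i" -lt "$count" ]; do
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    case $code in
      1??|2??) ok=$((ok + 1)) ;;
    esac
    i=$((i + 1))
    sleep 0.2
  done
  echo "$ok/$count"
)
```

For example, probe_success_rate http://rollouts-demo.local/ 50 sends 50 requests about 0.2 seconds apart and prints a fraction such as 50/50.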

Step 4e: Watch the automated rollout

kubectl argo rollouts get rollout rollouts-demo --watch

Expected output:

Automated rollout in progress

To view the success rate metrics in the Alibaba Cloud console:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. Click the cluster name. In the left-side pane, choose Operations > Prometheus Monitoring.

  3. On the Prometheus Monitoring page, click the Network Monitoring tab, then click Ingresses.

    Prometheus monitoring — Ingress metrics

After the canary release completes successfully:

Automated canary release complete

Step 5 (optional): Roll back a canary release

If the new version causes issues during the canary release, revert the image tag in rollout.yaml to a known-stable version and push the change to Git. Argo CD syncs the change, and Argo Rollouts shifts all traffic back to the stable version.

What a failing canary looks like

When automated promotion is enabled, a failing canary produces output similar to the following before rollback completes:

Name:            rollouts-demo
Namespace:       default
Status:          ✖ Degraded
Message:         RolloutAborted: Rollout aborted update to revision 2: Metric "success-rate" assessed Failed due to failed (11) > failureLimit (10)
Strategy:        Canary
  Step:          4/8
  SetWeight:     40
  ActualWeight:  40
Images:          argoproj/rollouts-demo:yellow (stable)
                 argoproj/rollouts-demo:blue (canary, error)

The rollout is aborted once more than 10 measurements fall below the 95% threshold (failureLimit: 10); the failures do not need to be consecutive. Argo Rollouts then shifts all traffic back to the stable version automatically.
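Argo Rollouts marks a metric as Failed when the number of failed measurements exceeds failureLimit, which is the comparison printed in the RolloutAborted message. The sketch below mirrors that comparison for the limit used in this tutorial; analysis_failed is a hypothetical helper, not part of Argo Rollouts.

```shell
# analysis_failed FAILED_COUNT FAILURE_LIMIT: exit 0 once the failed count exceeds the limit.
# Mirrors the "failed (N) > failureLimit (M)" comparison; illustrative only.
analysis_failed() {
  [ "$1" -gt "$2" ]
}

for failed in 10 11; do
  if analysis_failed "$failed" 10; then
    echo "failed=$failed: analysis fails, rollout is aborted"
  else
    echo "failed=$failed: still within failureLimit"
  fi
done
```

With failureLimit: 10, ten failed measurements are tolerated and the eleventh aborts the rollout. A lower limit makes the canary abort sooner.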

How to roll back manually

Update rollout.yaml with the stable image tag, then commit and push:

spec:
  containers:
  - name: rollouts-demo
    image: argoproj/rollouts-demo:yellow  # Revert to the stable image tag

Expected output after rollback:

Rollback complete
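If the faulty image tag was introduced by the most recent commit, reverting that commit is one way to produce the rollback change without editing rollout.yaml by hand. The sketch below assumes a local clone of your own copy of the repository with push access, a remote named origin, and the rollouts branch that Argo CD tracks in this tutorial; revert_last_release is a hypothetical helper.

```shell
# revert_last_release [BRANCH]: undo the most recent commit and push the revert.
# Assumes the bad image tag update is HEAD and the remote is named origin.
revert_last_release() {
  git revert --no-edit HEAD && git push origin "${1:-rollouts}"
}
```

Run revert_last_release from the repository root. Argo CD then syncs the revert commit like any other change, which keeps Git as the record of what is deployed.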
