Canary releases let you roll out application updates gradually by shifting a small percentage of production traffic to the new version and increasing it only while metrics stay healthy. Alibaba Cloud Service Mesh (ASM) uses Mixerless Telemetry to collect request metrics directly from Envoy sidecars, without the legacy Mixer component, which reduces latency and resource overhead compared to Mixer-based telemetry.
This tutorial sets up an automated canary release pipeline that combines three components:

- **Prometheus** collects request success rates and latency metrics from Envoy sidecars.
- **Flagger** watches those metrics and progressively shifts traffic to the new version.
- **Horizontal Pod Autoscaler (HPA)** scales pods based on load during the rollout.
## How it works
Flagger automates the canary release lifecycle in five stages:

1. **Detect** -- A new revision is detected (for example, an image tag change).
2. **Scale up** -- The canary deployment scales up and pre-rollout checks run.
3. **Analyze** -- Flagger queries Prometheus at each interval for the request success rate and P99 latency. If the metrics meet the thresholds, canary traffic increases by a fixed step.
4. **Promote** -- Once canary traffic reaches the configured maximum, Flagger copies the canary spec to the primary deployment and routes all traffic to it.
5. **Scale down** -- The canary deployment scales to zero and the rollout is marked as succeeded.
If metrics fail threshold checks more than the configured number of times, Flagger routes all traffic back to the primary and marks the rollout as failed. See Automated rollback for details.
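The stages above boil down to a weight-stepping loop. The following is a simplified, illustrative sketch only: the real Flagger controller also runs webhooks, issues Prometheus queries, and waits one interval between iterations, and `metrics_healthy` here is a stand-in that always passes.

```bash
# Simplified sketch of Flagger's advance/rollback decision loop.
# stepWeight, maxWeight, and threshold mirror this tutorial's configuration.
step_weight=10
max_weight=50
threshold=5
failed=0
weight=0

# Stand-in for the Prometheus success-rate and P99-latency checks.
metrics_healthy() { return 0; }

while [ "$weight" -lt "$max_weight" ]; do
  if metrics_healthy; then
    # Metrics pass: advance canary traffic by one step.
    weight=$((weight + step_weight))
    echo "Advance canary weight $weight"
  else
    # Metrics fail: count the failure; roll back at the threshold.
    failed=$((failed + 1))
    if [ "$failed" -ge "$threshold" ]; then
      echo "Rolling back: failed checks threshold reached $threshold"
      break
    fi
  fi
done

[ "$weight" -ge "$max_weight" ] && echo "Promotion completed"
```

In the real rollout, each iteration corresponds to one analysis `interval` (1 minute in this tutorial), which is why promotion takes about 5 minutes.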
Time estimates based on this tutorial's configuration:
| Scenario | Formula | Duration |
|---|---|---|
| Successful promotion | interval * (maxWeight / stepWeight) = 1 min * (50 / 10) | ~5 minutes |
| Rollback on failure | interval * threshold = 1 min * 5 | ~5 minutes |
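The durations in the table follow directly from integer arithmetic on the configuration values; as a quick sanity check:

```bash
# Timing math for this tutorial's canary configuration.
interval_min=1   # analysis interval, in minutes
step_weight=10
max_weight=50
threshold=5

# Successful promotion: one interval per traffic step.
promotion_min=$((interval_min * max_weight / step_weight))
echo "promotion: ~${promotion_min} minutes"

# Rollback: one interval per failed check until the threshold is hit.
rollback_min=$((interval_min * threshold))
echo "rollback: ~${rollback_min} minutes"
```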
## Prerequisites
Before you begin, ensure that you have:

- An ASM instance with Mixerless Telemetry enabled and connected to Prometheus. For setup instructions, see Use Mixerless Telemetry to observe ASM instances.
- A Container Service for Kubernetes (ACK) cluster connected to the ASM instance.
- kubectl configured for both the ACK cluster and the ASM control plane.
- Helm 3 installed.
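The Step 1 commands reference two environment variables that this tutorial assumes you have already set: `$USER_CONFIG` (path to the ACK kubeconfig) and `$MESH_CONFIG` (path to the ASM kubeconfig). The paths below are placeholders only; point them at your own files.

```bash
# Hypothetical paths -- replace with the kubeconfig files for your clusters
export USER_CONFIG="$HOME/.kube/ack-kubeconfig"
export MESH_CONFIG="$HOME/.kube/asm-kubeconfig"
```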
## Step 1: Deploy Flagger
Connect to your ACK cluster and install Flagger with Helm.
Connect to the ACK cluster with kubectl. For instructions, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Create a Kubernetes secret that stores the ASM kubeconfig so Flagger can manage Istio resources on the ASM control plane:
```bash
# Set aliases for convenience
alias k="kubectl --kubeconfig $USER_CONFIG"
alias h="helm --kubeconfig $USER_CONFIG"

# Create a secret from the ASM kubeconfig
cp $MESH_CONFIG kubeconfig
k -n istio-system create secret generic istio-kubeconfig --from-file kubeconfig
k -n istio-system label secret istio-kubeconfig istio/multiCluster=true
```

Install Flagger from the official Helm chart:
```bash
h repo add flagger https://flagger.app
h repo update
k apply -f $FLAGGER_SRC/artifacts/flagger/crd.yaml
h upgrade -i flagger flagger/flagger --namespace=istio-system \
  --set crd.create=false \
  --set meshProvider=istio \
  --set metricsServer=http://prometheus:9090 \
  --set istio.kubeconfig.secretName=istio-kubeconfig \
  --set istio.kubeconfig.key=kubeconfig
```
## Step 2: Deploy an Istio gateway
Create a gateway on the ASM control plane to expose the application to external traffic.
Connect to the ASM instance with kubectl. For instructions, see Use kubectl on the control plane to access Istio resources.
Save the following YAML as `public-gateway.yaml`:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
```

Apply the gateway:
```bash
kubectl --kubeconfig <asm-kubeconfig-path> apply -f public-gateway.yaml
```

Replace `<asm-kubeconfig-path>` with the path to the kubeconfig file of the ASM instance.
## Step 3: Deploy the sample application
Deploy the podinfo application, an HPA, and a Flagger load tester in the ACK cluster.
Deploy the Flagger load tester, which generates synthetic traffic against the canary during analysis:
```bash
kubectl --kubeconfig <ack-kubeconfig-path> apply -k \
  "https://github.com/fluxcd/flagger//kustomize/tester?ref=main"
```

Deploy the podinfo application (v3.1.0) and an HPA:

```bash
kubectl --kubeconfig <ack-kubeconfig-path> apply -k \
  "https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main"
```

The HPA scales out pods when CPU utilization reaches 99%.
Replace <ack-kubeconfig-path> with the path to the kubeconfig file of the ACK cluster.
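For reference, the HPA bundled with the podinfo kustomization is roughly equivalent to the following sketch (field names follow the `autoscaling/v2` API; the exact upstream manifest and replica counts may differ):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 99  # scale out when average CPU reaches 99%
```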
## Step 4: Configure the canary resource
The canary resource tells Flagger how to manage the release: which deployment to watch, what metrics to check, and how much traffic to shift at each step.
Note: For a full reference on canary resource fields, see How it works in the Flagger documentation.
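The full manifest referenced below is not reproduced in this section. As a sketch based on the upstream Flagger podinfo example and the analysis values used in this tutorial (the service port, webhook URLs, and load-test command are assumptions from that example), `podinfo-canary.yaml` might look like:

```yaml
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # Deployment and HPA that Flagger manages during the rollout
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  autoscalerRef:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    port: 9898
    gateways:
      - istio-system/public-gateway
    hosts:
      - "*"
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 1m
    webhooks:
      # Pre-rollout acceptance test run by the load tester
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.test/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://podinfo-canary:9898/token | grep token"
      # Synthetic traffic generated during analysis
      - name: load-test
        url: http://flagger-loadtester.test/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
```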
Save the following YAML as `podinfo-canary.yaml`. Key fields in the `analysis` section:

| Field | Value | Effect |
|---|---|---|
| `stepWeight` | 10 | Increase canary traffic by 10% at each step |
| `maxWeight` | 50 | Stop increasing at 50% and promote if metrics pass |
| `threshold` | 5 | Roll back after 5 consecutive failed checks |
| `request-duration` max | 500 | P99 latency must stay below 500 ms |
| `request-success-rate` min | 99 | At least 99% of requests must return non-5xx |

Apply the canary resource:
```bash
kubectl --kubeconfig <ack-kubeconfig-path> apply -f podinfo-canary.yaml
```

After you apply the canary resource, Flagger creates a `podinfo-primary` deployment as the stable production version and scales the original `podinfo` deployment to zero. Flagger scales it up again only when it detects a new revision.
## Step 5: Trigger the canary release
Update the podinfo container image to trigger a new canary rollout:

```bash
kubectl --kubeconfig <ack-kubeconfig-path> -n test \
  set image deployment/podinfo podinfod=stefanprodan/podinfo:3.1.1
```

Flagger detects the image change, scales up the canary, and runs the pre-rollout acceptance test. It then begins progressive traffic shifting: 10% -> 20% -> 30% -> 40% -> promotion.
## Verify the canary release
### Watch the rollout progress
Run a polling loop to watch Flagger events in real time:
```bash
while true; do
  kubectl --kubeconfig <ack-kubeconfig-path> -n test describe canary/podinfo
  sleep 10s
done
```

A successful rollout produces events similar to:
```
Events:
  Type     Reason  Age                From     Message
  ----     ------  ----               ----     -------
  Warning  Synced  39m                flagger  podinfo-primary.test not ready: waiting for rollout to finish: observed deployment generation less then desired generation
  Normal   Synced  38m (x2 over 39m)  flagger  all the metrics providers are available!
  Normal   Synced  38m                flagger  Initialization done! podinfo.test
  Normal   Synced  37m                flagger  New revision detected! Scaling up podinfo.test
  Normal   Synced  36m                flagger  Starting canary analysis for podinfo.test
  Normal   Synced  36m                flagger  Pre-rollout check acceptance-test passed
  Normal   Synced  36m                flagger  Advance podinfo.test canary weight 10
  Normal   Synced  35m                flagger  Advance podinfo.test canary weight 20
  Normal   Synced  34m                flagger  Advance podinfo.test canary weight 30
  Normal   Synced  33m                flagger  Advance podinfo.test canary weight 40
  Normal   Synced  29m (x4 over 32m)  flagger  (combined from similar events): Promotion completed! Scaling down podinfo.test
```

### Check canary status
Get a summary of all canary resources:
```bash
kubectl --kubeconfig <ack-kubeconfig-path> get canaries --all-namespaces
```

Example output:

```
NAMESPACE   NAME      STATUS      WEIGHT   LASTTRANSITIONTIME
test        podinfo   Succeeded   0        2026-03-11T08:15:07Z
```

Wait for a canary to complete in a CI/CD pipeline:
```bash
kubectl --kubeconfig <ack-kubeconfig-path> -n test wait canary/podinfo --for=condition=promoted
```

## Automated rollback
If metrics fail threshold checks during the analysis phase, Flagger automatically rolls back. It routes all traffic to the primary deployment, scales the canary to zero, and marks the rollout as failed.
To test rollback behavior, generate HTTP 500 errors while a canary analysis is active:
```bash
# In a separate terminal, send requests that return HTTP 500
watch curl http://podinfo-canary.test:9898/status/500
```

When the number of failed checks reaches the threshold value (5 in this configuration), the canary events show:
```
Warning  Synced  flagger  Halt podinfo.test advancement success rate 88.76% < 99%
Warning  Synced  flagger  Rolling back podinfo.test failed checks threshold reached 5
Warning  Synced  flagger  Canary failed! Scaling down podinfo.test
```