The Mixerless Telemetry technology of Alibaba Cloud Service Mesh (ASM) allows you
to obtain telemetry data on containers in a non-intrusive manner. Telemetry data is
collected by Prometheus as monitoring metrics. Flagger is a tool that automates the
release process of applications. You can use Flagger to monitor the metrics that are
collected by Prometheus to manage traffic in canary releases. This topic describes
how to use Mixerless Telemetry to implement a canary release.
Procedure for implementing a canary release
- Connect ASM to Prometheus to collect application monitoring metrics.
- Deploy Flagger and an Istio gateway.
- Deploy a Flagger load tester to generate test traffic to the pods of the application during the canary analysis.
- Deploy an application. In this example, the podinfo application V3.1.0 is deployed.
- Deploy a Horizontal Pod Autoscaler (HPA) to scale out the pods of the podinfo application
if the CPU utilization of the podinfo application reaches 99%.
- Deploy a canary resource to specify that the traffic routed to the podinfo application is progressively increased in steps of 10%, provided that the P99 latency stays below 500 ms over each 30s check interval.
- Flagger copies the podinfo application and generates the podinfo-primary application.
The podinfo application is used as the deployment of the canary release version. The
podinfo-primary application is used as the deployment of the production version.
- Update the podinfo application to V3.1.1.
- Flagger monitors the metrics that are collected by Prometheus to manage traffic in the canary release. Flagger progressively increases the traffic routed to the podinfo application V3.1.1 in steps of 10%, provided that the P99 latency stays below 500 ms over each 30s check interval. In addition, the HPA scales out the pods of the podinfo application and scales in the pods of the podinfo-primary application based on the status of the canary release.
Procedure
- Use kubectl to connect to a Container Service for Kubernetes (ACK) cluster. For more
information, see Connect to Kubernetes clusters by using kubectl.
- Run the following commands to deploy Flagger:
alias k="kubectl --kubeconfig $USER_CONFIG"
alias h="helm --kubeconfig $USER_CONFIG"
cp $MESH_CONFIG kubeconfig
k -n istio-system create secret generic istio-kubeconfig --from-file kubeconfig
k -n istio-system label secret istio-kubeconfig istio/multiCluster=true
h repo add flagger https://flagger.app
h repo update
k apply -f $FLAGGER_SRC/artifacts/flagger/crd.yaml
h upgrade -i flagger flagger/flagger --namespace=istio-system \
--set crd.create=false \
--set meshProvider=istio \
--set metricsServer=http://prometheus:9090 \
--set istio.kubeconfig.secretName=istio-kubeconfig \
--set istio.kubeconfig.key=kubeconfig
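After the installation is complete, you can optionally verify that the Flagger controller is running in the istio-system namespace of the ACK cluster. The following check is a minimal sketch that reuses the k alias defined above and assumes that $USER_CONFIG points to the kubeconfig file of the ACK cluster:
# Check that the Flagger deployment becomes available.
k -n istio-system get deploy flagger
# Inspect the controller logs if the deployment does not become ready.
k -n istio-system logs deploy/flagger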
- Use kubectl to connect to an ASM instance. For more information, see Use kubectl to connect to an ASM instance.
- Deploy an Istio gateway.
- Use the following content to create the public-gateway.yaml file:
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "*"
- Run the following command to deploy the Istio gateway:
kubectl --kubeconfig <Path of the kubeconfig file of the ASM instance> apply -f resources_canary/public-gateway.yaml
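Optionally, you can confirm that the gateway was created on the ASM instance before you proceed. This is a minimal sketch; replace the placeholder with the actual path of the kubeconfig file of the ASM instance:
kubectl --kubeconfig <Path of the kubeconfig file of the ASM instance> -n istio-system get gateway public-gateway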
- Run the following command to deploy a Flagger load tester in the ACK cluster:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> apply -k "https://github.com/fluxcd/flagger//kustomize/tester?ref=main"
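As a quick sanity check, you can verify that the load tester is up before it is used by the canary webhooks. The following sketch assumes that the tester manifests deploy a flagger-loadtester deployment and service into the test namespace, which is the Flagger default:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test get deploy/flagger-loadtester svc/flagger-loadtester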
- Run the following command to deploy the podinfo application and an HPA in the ACK
cluster:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> apply -k "https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main"
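At this point, the podinfo deployment and its HPA should both exist in the test namespace. The following sketch lists them; the HPA output should show the 99% CPU utilization target described above:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test get deploy/podinfo hpa/podinfo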
- Deploy a canary resource in the ACK cluster.
Note For more information about a canary resource, see
How it works.
- Use the following content to create the podinfo-canary.yaml file:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo
  namespace: test
spec:
  # deployment reference
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # the maximum time in seconds for the canary deployment
  # to make progress before it is rolled back (default 600s)
  progressDeadlineSeconds: 60
  # HPA reference (optional)
  autoscalerRef:
    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    name: podinfo
  service:
    # service port number
    port: 9898
    # container port number or name (optional)
    targetPort: 9898
    # Istio gateways (optional)
    gateways:
    - public-gateway.istio-system.svc.cluster.local
    # Istio virtual service host names (optional)
    hosts:
    - '*'
    # Istio traffic policy (optional)
    trafficPolicy:
      tls:
        # use ISTIO_MUTUAL when mTLS is enabled
        mode: DISABLE
    # Istio retry policy (optional)
    retries:
      attempts: 3
      perTryTimeout: 1s
      retryOn: "gateway-error,connect-failure,refused-stream"
  analysis:
    # schedule interval (default 60s)
    interval: 1m
    # max number of failed metric checks before rollback
    threshold: 5
    # max traffic percentage routed to canary
    # percentage (0-100)
    maxWeight: 50
    # canary increment step
    # percentage (0-100)
    stepWeight: 10
    metrics:
    - name: request-success-rate
      # minimum req success rate (non 5xx responses)
      # percentage (0-100)
      thresholdRange:
        min: 99
      interval: 1m
    - name: request-duration
      # maximum req duration P99
      # milliseconds
      thresholdRange:
        max: 500
      interval: 30s
    # testing (optional)
    webhooks:
    - name: acceptance-test
      type: pre-rollout
      url: http://flagger-loadtester.test/
      timeout: 30s
      metadata:
        type: bash
        cmd: "curl -sd 'test' http://podinfo-canary:9898/token | grep token"
    - name: load-test
      url: http://flagger-loadtester.test/
      timeout: 5s
      metadata:
        cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
stepWeight: the percentage by which the traffic routed to the canary is progressively increased at each step. In this example, the value is 10.
max: the maximum P99 latency, in milliseconds, that is allowed for the canary analysis to continue. In this example, the value is 500.
interval: the time window over which the P99 latency is measured for each metric check. In this example, the value is 30s.
- Run the following command to deploy the canary resource:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> apply -f resources_canary/podinfo-canary.yaml
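After the canary resource is deployed, Flagger initializes the canary: it creates the podinfo-primary deployment described in the overview and routes all traffic to it. You can optionally watch the initialization with the following minimal sketch:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test get canary podinfo
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test get deploy podinfo podinfo-primary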
- Run the following command to update the podinfo application from V3.1.0 to V3.1.1:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test set image deployment/podinfo podinfod=stefanprodan/podinfo:3.1.1
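To confirm that the image update was applied to the podinfo deployment, you can query the image of its first container, as in the following sketch:
kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test get deploy podinfo -o jsonpath='{.spec.template.spec.containers[0].image}'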
Verify whether the canary release is implemented as expected
Run the following command to view the process of progressive traffic routing:
while true; do kubectl --kubeconfig <Path of the kubeconfig file of the ACK cluster> -n test describe canary/podinfo; sleep 10s;done
Expected output:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Synced 39m flagger podinfo-primary.test not ready: waiting for rollout to finish: observed deployment generation less then desired generation
Normal Synced 38m (x2 over 39m) flagger all the metrics providers are available!
Normal Synced 38m flagger Initialization done! podinfo.test
Normal Synced 37m flagger New revision detected! Scaling up podinfo.test
Normal Synced 36m flagger Starting canary analysis for podinfo.test
Normal Synced 36m flagger Pre-rollout check acceptance-test passed
Normal Synced 36m flagger Advance podinfo.test canary weight 10
Normal Synced 35m flagger Advance podinfo.test canary weight 20
Normal Synced 34m flagger Advance podinfo.test canary weight 30
Normal Synced 33m flagger Advance podinfo.test canary weight 40
Normal Synced 29m (x4 over 32m) flagger (combined from similar events): Promotion completed! Scaling down podinfo.test
The output shows that the traffic routed to the podinfo application V3.1.1 is progressively increased in steps of 10%, from 10% to 40%, after which Flagger completes the promotion and scales down the canary deployment.
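In addition to the canary events, you can observe the actual traffic split by inspecting the VirtualService that Flagger manages for the podinfo application. The following sketch assumes that Flagger writes Istio resources to the ASM instance through the istio-kubeconfig secret configured earlier, so the command uses the kubeconfig file of the ASM instance; the weights of the primary and canary routes in the output shift as the canary advances:
kubectl --kubeconfig <Path of the kubeconfig file of the ASM instance> -n test get virtualservice podinfo -o yaml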