
Flagger on ASM: Progressive Canary Release Based on Mixerless Telemetry (Part 1) – Telemetry Data

Part 1 of this 3-part series discusses telemetry data and monitoring metrics.

Preface

Alibaba Cloud Service Mesh (ASM) provides non-intrusive telemetry data for business containers with its Mixerless Telemetry capability. On the one hand, the telemetry data is collected by ARMS or Prometheus as monitoring metrics for service mesh observability. On the other hand, it is used by HPA and Flagger as the basis for application-level scaling and progressive canary release.

This series covers the use of telemetry data for application-level scaling and progressive canary release in three articles, focusing in turn on telemetry data (monitoring metrics), application-level scaling, and progressive canary release.

Overall Architecture

The overall architecture of the implementation is described in the following steps and shown in the figure below:

  1. ASM distributes the EnvoyFilter configurations related to Mixerless Telemetry to each ASM sidecar (envoy) to enable the collection of application-level monitoring metrics.
  2. Business traffic flows in through the Ingress Gateway. The ASM sidecar starts to collect relevant monitoring metrics.
  3. Prometheus collects metrics from each POD.
  4. HPA queries the POD-related monitoring metrics from Prometheus through the adapter and scales based on the configuration.
  5. Flagger queries the POD-related monitoring metrics from Prometheus and initiates VirtualService configuration updates to ASM based on the configuration.
  6. ASM distributes the VirtualService configurations to each ASM sidecar to implement the progressive canary release.

(Figure: overall architecture of the Mixerless Telemetry implementation)

The Process of Flagger Progressive Canary Release

The Flagger official documentation describes the process of the progressive canary release; the original text is quoted below:

Flagger will run a canary release with the configuration above using the following steps:

  1. Detect new revision (deployment spec, secrets, or configmaps changes)
  2. Scale the canary deployment from zero
  3. Wait for the HPA to set the canary minimum replicas
  4. Check canary pods health
  5. Run the acceptance tests
  6. Abort the canary release if tests fail
  7. Start the load tests
  8. Mirror 100% of the traffic from primary to canary
  9. Check request success rate and request latency every minute
  10. Abort the canary release if the metrics check failure threshold is reached
  11. Stop traffic mirroring after the number of iterations is reached
  12. Route live traffic to the canary pods
  13. Promote the canary (update the primary secrets, configmaps, and deployment spec)
  14. Wait for the primary deployment rollout to finish
  15. Wait for the HPA to set the primary minimum replicas
  16. Check primary pods health
  17. Switch live traffic back to primary
  18. Scale the canary to zero
  19. Send a notification with the canary analysis result
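Step 9 above is where the telemetry data covered in this article comes into play: the success-rate and latency checks are evaluated against the Istio metrics that Prometheus scrapes from the sidecars. As a rough, illustrative sketch only (not Flagger's exact internal query), the request success rate for the podinfo workload used later in this article could be expressed as the following PromQL, assuming Prometheus is reachable on localhost:9090 (for example, via kubectl port-forward):

# Illustrative only: request success rate over the last minute for the podinfo
# workload, derived from istio_requests_total (not Flagger's exact internal query).
SUCCESS_RATE='sum(rate(istio_requests_total{destination_workload="podinfo",response_code!~"5.*"}[1m])) / sum(rate(istio_requests_total{destination_workload="podinfo"}[1m]))'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode "query=${SUCCESS_RATE}"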

Prerequisites

Setup Mixerless Telemetry

This section describes how to configure and collect application-level monitoring metrics (such as the request total istio_requests_total and the request latency istio_request_duration) on ASM. The main steps include creating the EnvoyFilter, verifying the Envoy telemetry data, and verifying the telemetry data collected by Prometheus.

1. EnvoyFilter

Log on to the ASM console, select Service Mesh > Mesh Management on the left navigation bar, and open the feature configuration page of the ASM instance.

1. Check Enable Prometheus monitoring metrics collection.

2. Click Enable self-built Prometheus and enter the Prometheus service address: prometheus:9090. (The community version of Prometheus is used in this series, and this configuration will be used again in later articles.) If you use Alibaba Cloud ARMS, please refer to Integrate ARMS Prometheus to implement mesh monitoring.

3. Check Enable Kiali (optional).

(Figure: Prometheus monitoring settings on the ASM feature configuration page)

4. Click OK. Then, the related EnvoyFilter list generated by ASM is available in the control plane:

(Figure: EnvoyFilter list generated by ASM in the control plane)
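If you prefer the command line, the generated EnvoyFilter resources can also be listed directly from the ASM control plane. This is a sketch and assumes a hypothetical variable $MESH_CONFIG pointing to the kubeconfig of the ASM instance (it is not defined elsewhere in this article):

# List the Mixerless Telemetry EnvoyFilters generated by ASM in the control plane.
# $MESH_CONFIG is a hypothetical variable holding the ASM control-plane kubeconfig path.
kubectl --kubeconfig "$MESH_CONFIG" get envoyfilter -n istio-system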

2. Prometheus

2.1 Install

Run the following command to install Prometheus. For the complete script, please see: demo_mixerless.sh.

kubectl --kubeconfig "$USER_CONFIG" apply -f $ISTIO_SRC/samples/addons/prometheus.yaml
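As a quick sanity check (assuming the addon installs into istio-system with the default resource names), confirm that the Prometheus Deployment and Service are up before continuing:

# Verify that the Prometheus addon is running in the istio-system namespace.
kubectl --kubeconfig "$USER_CONFIG" -n istio-system get deploy,svc prometheus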

2.2 Config Scrape

After installing Prometheus, add the Istio-related monitoring metrics to its configuration. Log on to the ACK console, select Configuration Management > Configuration Items on the left navigation bar, find the prometheus ConfigMap under istio-system, and click Edit.

(Figure: the prometheus ConfigMap under istio-system in the ACK console)

In the prometheus.yml configuration, add the configuration from scrape_configs.yaml to the scrape_configs section.

(Figure: scrape_configs section after the update)

After the configuration is saved, select Workload > Container Group on the left navigation bar, find the Prometheus POD under istio-system, and delete it so that the configuration takes effect in the new POD.
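The same restart can be done from the command line instead of the console. A minimal sketch, assuming there is a single Prometheus POD in istio-system:

# Recreate the Prometheus POD so it reloads the updated scrape configuration.
prom_pod=$(kubectl --kubeconfig "$USER_CONFIG" -n istio-system get po -o name | grep prometheus)
kubectl --kubeconfig "$USER_CONFIG" -n istio-system delete "$prom_pod"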

Run the following command to view the job_name entries in the Prometheus configuration:

kubectl --kubeconfig "$USER_CONFIG" get cm prometheus -n istio-system -o jsonpath={.data.prometheus\\.yml} | grep job_name
- job_name: 'istio-mesh'
- job_name: 'envoy-stats'
- job_name: 'istio-policy'
- job_name: 'istio-telemetry'
- job_name: 'pilot'
- job_name: 'sidecar-injector'
- job_name: prometheus
- job_name: kubernetes-apiservers
- job_name: kubernetes-nodes
- job_name: kubernetes-nodes-cadvisor
- job_name: kubernetes-service-endpoints
- job_name: kubernetes-service-endpoints-slow
- job_name: prometheus-pushgateway
- job_name: kubernetes-services
- job_name: kubernetes-pods
- job_name: kubernetes-pods-slow

Mixerless Test

1. Podinfo

1.1 Deployment

Run the following commands to deploy podinfo, the example application used in this series:

kubectl --kubeconfig "$USER_CONFIG" apply -f $PODINFO_SRC/kustomize/deployment.yaml -n test
kubectl --kubeconfig "$USER_CONFIG" apply -f $PODINFO_SRC/kustomize/service.yaml -n test
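Note that the Envoy metrics checked in the following sections only exist if the istio-proxy sidecar has been injected into the podinfo PODs. A quick check, assuming automatic sidecar injection via the standard istio-injection namespace label:

# The test namespace should be labeled for automatic sidecar injection
# *before* the podinfo PODs are created; otherwise delete and recreate them.
kubectl --kubeconfig "$USER_CONFIG" label ns test istio-injection=enabled --overwrite
# Each podinfo POD should report 2/2 containers (podinfod + istio-proxy).
kubectl --kubeconfig "$USER_CONFIG" -n test get po -l app=podinfo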

1.2 Generate Load

Run the following commands to send requests to podinfo so that monitoring metrics are generated:

podinfo_pod=$(kubectl --kubeconfig "$USER_CONFIG" get po -n test -l app=podinfo -o jsonpath={.items..metadata.name})
for i in {1..10}; do
  kubectl --kubeconfig "$USER_CONFIG" exec $podinfo_pod -c podinfod -n test -- curl -s podinfo:9898/version
  echo
done

2. Confirm the Metrics Generated in Envoy Containers

The key metrics in this series are istio_requests_total and istio_request_duration. Confirm that both have been generated in the envoy container.

2.1 istio_requests_total

Run the following command to request stats-related metric data from envoy and confirm that istio_requests_total is included.

kubectl --kubeconfig "$USER_CONFIG" exec $podinfo_pod -n test -c istio-proxy -- curl -s localhost:15090/stats/prometheus | grep istio_requests_total

The returned results are listed below:

:::: istio_requests_total ::::
# TYPE istio_requests_total counter
istio_requests_total{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest"} 10

istio_requests_total{response_code="200",reporter="source",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="unknown",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest"} 10

2.2 istio_request_duration

Run the following command to request stats-related metric data from envoy and confirm that istio_request_duration is included.

kubectl --kubeconfig "$USER_CONFIG" exec $podinfo_pod -n test -c istio-proxy -- curl -s localhost:15090/stats/prometheus | grep istio_request_duration

The returned results are listed below:

:::: istio_request_duration ::::
# TYPE istio_request_duration_milliseconds histogram
istio_request_duration_milliseconds_bucket{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest",le="0.5"} 10

istio_request_duration_milliseconds_bucket{response_code="200",reporter="destination",source_workload="podinfo",source_workload_namespace="test",source_principal="spiffe://cluster.local/ns/test/sa/default",source_app="podinfo",source_version="unknown",source_cluster="c199d81d4e3104a5d90254b2a210914c8",destination_workload="podinfo",destination_workload_namespace="test",destination_principal="spiffe://cluster.local/ns/test/sa/default",destination_app="podinfo",destination_version="unknown",destination_service="podinfo.test.svc.cluster.local",destination_service_name="podinfo",destination_service_namespace="test",destination_cluster="c199d81d4e3104a5d90254b2a210914c8",request_protocol="http",response_flags="-",grpc_response_status="",connection_security_policy="mutual_tls",source_canonical_service="podinfo",destination_canonical_service="podinfo",source_canonical_revision="latest",destination_canonical_revision="latest",le="1"} 10
...
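Once Prometheus scrapes these histogram buckets (verified in the next section), they can be turned into latency percentiles. A minimal sketch, assuming Prometheus is reachable on localhost:9090 via the port-forward shown in the next section:

# Illustrative only: p95 request latency (ms) for podinfo over the last minute,
# computed from the istio_request_duration_milliseconds histogram.
P95_LATENCY='histogram_quantile(0.95, sum(rate(istio_request_duration_milliseconds_bucket{destination_workload="podinfo"}[1m])) by (le))'
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode "query=${P95_LATENCY}"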

3. Confirm the Collection from Prometheus

The final step is to check whether the metric data generated by Envoy is collected by Prometheus in real time. Expose the Prometheus service externally, open it in a browser, and enter istio_requests_total in the query box. The results are shown below:

(Figure: istio_requests_total query results in the Prometheus UI)
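If exposing Prometheus externally is not convenient, the same check can be done from the command line. A minimal sketch, assuming local port 9090 is free:

# Forward the Prometheus service to localhost and query the metric via the HTTP API.
kubectl --kubeconfig "$USER_CONFIG" -n istio-system port-forward svc/prometheus 9090:9090 &
sleep 3
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=istio_requests_total{destination_workload="podinfo"}'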
