Configuration push optimization after sidecar recommendation is enabled - Alibaba Cloud Service Mesh

If you have deployed a large number of services in a single namespace, you can use the sidecar recommendation feature to minimize the size of sidecar configurations. This topic uses a large-scale cluster that has 420 pods to measure how enabling sidecar recommendation improves the configuration push of Service Mesh (ASM).

Prerequisites

The cluster is added to the ASM instance. For more information, see Add a cluster to an ASM instance.
You have connected to the cluster by using kubectl. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Access logs of the data plane are collected by using Simple Log Service. For more information, see Use Simple Log Service to collect access logs on the data plane.

Deploy applications in the test cluster

In this test, multiple sleep and HTTPBin applications are created in the ns-in-mesh namespace to simulate a large number of services that have few call dependencies in the cluster. For more information about how to create a namespace, see Manage global namespaces.

After an HTTPBin application is started, the HTTPBin application exposes an HTTP service on port 8000. These HTTP services simulate a large number of services that are called in the cluster.
Each sleep application contains a curl container. You can modify the args field in the configuration file of a sleep application so that the application calls the services provided by multiple HTTPBin containers before it enters the sleeping state. The sleep applications simulate the services that depend on other services in the cluster.

Deploy HTTPBin applications in the cluster.

Create a YAML file named httpbin-{i}.yaml based on the following template.

Note

Replace {i} in httpbin-{i} with a specific number. This way, you can create multiple HTTPBin applications that have different IDs. You can use this template to generate as many HTTPBin applications as required. The maximum number of applications that you can generate depends on the size of your cluster. In this example, 200 HTTPBin applications are generated by using this template, and 400 pods are deployed in the cluster to run the HTTPBin applications.

YAML template for HTTPBin applications:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app: httpbin-{i}
    service: httpbin-{i}
  name: httpbin-{i}
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin-{i}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: httpbin-{i}
  name: httpbin-{i}
spec:
  replicas: 2
  selector:
    matchLabels:
      app: httpbin-{i}
      version: v1
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: httpbin-{i}
        version: v1
    spec:
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
      serviceAccountName: httpbin

Run the following command to create the httpbin-{i} application:
```
kubectl apply -f httpbin-{i}.yaml
```
Expected output:
```
deployment.apps/httpbin-{i} created
```
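
Writing 200 manifest files by hand is tedious, so the `{i}` substitution described in the note can be scripted. The following is a minimal sketch, not part of the ASM tooling: the helper names `render_httpbin` and `render_all` are hypothetical, and the sketch assumes that the template text above has already been loaded into a string.

```python
def render_httpbin(template: str, i: int) -> str:
    """Return the template with every literal "{i}" replaced by the ID."""
    return template.replace("{i}", str(i))


def render_all(template: str, count: int) -> dict[str, str]:
    """Map each file name (httpbin-0.yaml, httpbin-1.yaml, ...) to its manifest."""
    return {
        f"httpbin-{i}.yaml": render_httpbin(template, i)
        for i in range(count)
    }
```

Calling `render_all(template, 200)` yields the 200 manifests used in this test. Each entry can then be written to disk and applied with the kubectl command shown above.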

Deploy sleep applications in the cluster.

Create a YAML file named sleep-{i}.yaml based on the following template.

Note

Replace {i} in sleep-{i} with a specific number. This way, you can create multiple sleep applications that have different IDs. In this template, the curl httpbin-{i*10}:8000 command parameters that are added to the args field simulate call dependencies on different HTTPBin applications. The IDs of the HTTPBin applications that you specify in the command parameters must not exceed the maximum ID of the HTTPBin applications that are deployed. Otherwise, valid calls cannot be made. In this test, each sleep application depends on 10 HTTPBin applications. Therefore, 20 sleep applications are created, and 20 pods are deployed to run the sleep applications.

YAML template for sleep applications:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: sleep
---
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    app: sleep-{i}
    service: sleep-{i}
  name: sleep-{i}
spec:
  ports:
  - name: http
    port: 80
    targetPort: 0
  selector:
    app: sleep-{i}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: sleep-{i}
  name: sleep-{i}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep-{i}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: sleep-{i}
    spec:
      containers:
      - args:
        - curl httpbin-{i*10}:8000; curl httpbin-{i*10+1}:8000; curl httpbin-{i*10+2}:8000; curl httpbin-{i*10+3}:8000;
          curl httpbin-{i*10+4}:8000; curl httpbin-{i*10+5}:8000; curl httpbin-{i*10+6}:8000; curl httpbin-{i*10+7}:8000;
          curl httpbin-{i*10+8}:8000; curl httpbin-{i*10+9}:8000; sleep 3650d
        command:
        - /bin/sh
        - -c
        image: curlimages/curl
        imagePullPolicy: IfNotPresent
        name: sleep
        volumeMounts:
        - mountPath: /etc/sleep/tls
          name: secret-volume
      serviceAccountName: sleep
      terminationGracePeriodSeconds: 0
      volumes:
      - name: secret-volume
        secret:
          optional: true
          secretName: sleep-secret

Run the following command to create the sleep-{i} application:
```
kubectl apply -f sleep-{i}.yaml
```
Expected output:
```
deployment.apps/sleep-{i} created
```
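
The sleep template contains arithmetic placeholders such as `{i*10+3}` in addition to the plain `{i}`, so a simple string replacement is not enough. The following sketch shows one way to evaluate them. The helper names `render_sleep` and `httpbin_dependencies` are hypothetical, and the regular expression assumes that placeholders always have the form `{i}`, `{i*N}`, or `{i*N+K}`, as in the template above.

```python
import re

# Matches "{i}", "{i*10}" and "{i*10+3}" as they appear in the sleep template.
_PLACEHOLDER = re.compile(r"\{i(?:\*(\d+))?(?:\+(\d+))?\}")


def render_sleep(template: str, i: int) -> str:
    """Evaluate every {i...} placeholder for the given sleep ID."""
    def substitute(match: re.Match) -> str:
        factor = int(match.group(1)) if match.group(1) else 1
        offset = int(match.group(2)) if match.group(2) else 0
        return str(i * factor + offset)

    return _PLACEHOLDER.sub(substitute, template)


def httpbin_dependencies(i: int, per_sleep: int = 10) -> list[str]:
    """Names of the HTTPBin services that sleep-{i} calls."""
    return [f"httpbin-{i * per_sleep + k}" for k in range(per_sleep)]
```

With 200 HTTPBin applications and 10 dependencies per sleep application, the highest valid sleep ID is 19, whose last dependency is httpbin-199.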

Test the configuration push of the control plane before sidecar recommendation is enabled

Test 1: Test the configuration size of each sidecar before sidecar recommendation is enabled

Run the following command to obtain the names of the pods that run the httpbin-0 application:

kubectl get pod -n ns-in-mesh | grep httpbin-0

Expected output:

NAME                         READY   STATUS    RESTARTS   AGE
httpbin-0-756995d867-jljgp   2/2     Running   0          9m15s
httpbin-0-756995d867-whstr   2/2     Running   0          9m15s

Run the following command to dump the configuration of the sidecar in one of the pods that run the httpbin-0 application to a local file:
```
kubectl exec -it httpbin-0-756995d867-jljgp -c istio-proxy -n ns-in-mesh -- curl -s localhost:15000/config_dump > config_dump.json
```
Run the following command to view the size of the configuration file:
```
du -sh config_dump.json
```
Expected output:
```
1.2M    config_dump.json
```
The preceding output shows that the configuration size of a sidecar is about 1.2 MB when 420 pods are deployed in the cluster. Because every pod in the mesh runs its own sidecar, the control plane must generate and push a configuration of this size for each of them, which increases the configuration push load of the control plane.
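
To see which part of the dump accounts for most of the size, the file can be broken down by section. The following sketch assumes the usual layout of an Envoy config dump, a top-level `configs` array whose entries each carry an `@type` field. The function name `section_sizes` is hypothetical, and the reported sizes are for compact JSON, so they do not add up exactly to the value that du reports for the pretty-printed file.

```python
import json
from collections import Counter


def section_sizes(config_dump: dict) -> Counter:
    """Map each section's short "@type" name to its compact JSON size in bytes."""
    sizes = Counter()
    for section in config_dump.get("configs", []):
        # "type.googleapis.com/envoy.admin.v3.ClustersConfigDump" -> "ClustersConfigDump"
        kind = section.get("@type", "unknown").rsplit(".", 1)[-1]
        sizes[kind] += len(json.dumps(section, separators=(",", ":")).encode())
    return sizes
```

After loading config_dump.json with `json.load`, `section_sizes(dump).most_common()` lists the sections from largest to smallest.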

Test 2: Test the configuration push efficiency of the control plane before sidecar recommendation is enabled

Create a virtual service for the httpbin-0 application in the ASM console. This triggers the control plane to push configurations to the sidecars on the data plane. You can view the logs of the control plane to determine the efficiency of the control plane in one configuration push. For more information about how to enable control-plane log collection, see Enable control-plane log collection and log-based alerting in an ASM instance of a version earlier than 1.17.2.35.

Use a YAML file that contains the following content to create a virtual service for the httpbin-0 application in the Service Mesh instance to handle the timeout of requests. For more information about how to create a virtual service, see Manage virtual services.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin-0-timeout
  namespace: default
spec:
  hosts:
    - httpbin-0.default.svc.cluster.local
  http:
    - route:
        - destination:
            host: httpbin-0.default.svc.cluster.local
      timeout: 5s

View the newly generated control-plane logs.

For an ASM instance whose version is 1.17.2.35 or later

Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Observability Management Center > Log Center.
On the Log Center page, click the Control-Plane Logs tab to view logs.

For an ASM instance whose version is earlier than 1.17.2.35

Log on to the ASM console.
In the left-side navigation pane, choose Service Mesh > Mesh Management.
On the Mesh Management page, click the name of the ASM instance.
On the details page of the ASM instance, choose ASM Instance > Base Information.
On the page that appears, click View log on the right of Control-plane log collection.

Sample logs:

2021-12-01T10:20:09.708673Z info  ads CDS: PUSH for node:httpbin-27-7dd8578b46-nkmvg.default resources:227 size:169.3kB
2021-12-01T10:20:09.710469Z info  ads CDS: PUSH for node:httpbin-184-65d97797db-njst5.default resources:227 size:169.3kB
2021-12-01T10:20:09.713567Z info  ads CDS: PUSH for node:httpbin-86-5b64586bbf-jv92w.default resources:227 size:169.3kB
2021-12-01T10:20:09.714514Z info  ads LDS: PUSH for node:httpbin-86-5b64586bbf-jv92w.default resources:16 size:70.7kB
2021-12-01T10:20:09.792732Z info  ads LDS: PUSH for node:httpbin-27-7dd8578b46-nkmvg.default resources:16 size:70.7kB
2021-12-01T10:20:09.792982Z info  ads LDS: PUSH for node:httpbin-184-65d97797db-njst5.default resources:16 size:70.7kB
2021-12-01T10:20:09.796430Z info  ads RDS: PUSH for node:httpbin-86-5b64586bbf-jv92w.default resources:8 size:137.4kB
...
2021-12-01T10:20:13.405850Z info  ads RDS: PUSH for node:httpbin-156-68b85b4f79-2znmp.default resources:8 size:137.4kB
2021-12-01T10:20:13.406154Z info  ads RDS: PUSH for node:httpbin-121-7c4cff97b9-sn5g4.default resources:8 size:137.4kB
2021-12-01T10:20:13.406420Z info  ads CDS: PUSH for node:httpbin-161-7bc74c5fb5-ldgn4.default resources:227 size:169.3kB
2021-12-01T10:20:13.407230Z info  ads LDS: PUSH for node:httpbin-161-7bc74c5fb5-ldgn4.default resources:16 size:70.7kB
2021-12-01T10:20:13.410147Z info  ads RDS: PUSH for node:httpbin-161-7bc74c5fb5-ldgn4.default resources:8 size:137.4kB
2021-12-01T10:20:13.494840Z info  ads RDS: PUSH for node:httpbin-57-69b756f779-db7vv.default resources:8 size:137.4kB

The preceding content shows that the control plane pushes configuration changes to all sidecars on the data plane after a virtual service is added. In the cluster that has 420 pods deployed, this generates a large number of push logs, and each sidecar receives a large amount of data: 169.3 KB for CDS, 70.7 KB for LDS, and 137.4 KB for RDS. The push starts at 10:20:09.7 and ends at 10:20:13.5, so the control plane takes about 4 seconds to apply a single virtual service to the data plane.
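
The push duration can also be derived from the logs programmatically instead of by reading timestamps. The following sketch parses PUSH lines in the format of the sample logs and reports the elapsed time, the number of distinct nodes, and the total amount of pushed data. The function name `summarize_pushes` is hypothetical, and the pattern assumes that sizes are always logged in kB, as they are in the sample.

```python
import re
from datetime import datetime

# Matches lines such as:
# <timestamp> info  ads CDS: PUSH for node:<pod>.<namespace> resources:227 size:169.3kB
PUSH_LINE = re.compile(
    r"^(?P<ts>\S+)\s+info\s+ads\s+(?P<xds>[A-Z]DS): PUSH for node:(?P<node>\S+)"
    r"\s+resources:(?P<resources>\d+)\s+size:(?P<size>[\d.]+)kB"
)


def summarize_pushes(lines):
    """Return (elapsed_seconds, node_count, total_kb) for the PUSH lines."""
    stamps, nodes, total_kb = [], set(), 0.0
    for line in lines:
        match = PUSH_LINE.match(line.strip())
        if not match:
            continue  # skip debounce lines, "..." and anything else
        stamps.append(datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S.%fZ"))
        nodes.add(match["node"])
        total_kb += float(match["size"])
    if not stamps:
        return 0.0, 0, 0.0
    elapsed = (max(stamps) - min(stamps)).total_seconds()
    return elapsed, len(nodes), total_kb
```

Feeding the exported control-plane log lines to `summarize_pushes` gives the figures needed for a before-and-after comparison of the same change.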

Test the configuration push of the control plane after sidecar recommendation is enabled

Use the sidecar recommendation feature to recommend a sidecar for each workload in the test cluster based on access log analysis. This helps improve the configuration push efficiency of the control plane. For more information, see Use the sidecars that are automatically recommended based on access log analysis.
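
Conceptually, a recommended sidecar restricts a workload's outbound configuration to the services that it actually calls. The following sketch builds an Istio Sidecar resource of that shape for a sleep application and prints it as JSON, which kubectl also accepts. It is an illustration only: the resources that ASM generates may use different names and include additional hosts, and the function name `sidecar_for_sleep` and the `default` namespace (taken from the virtual service example in this topic) are assumptions.

```python
import json


def sidecar_for_sleep(i: int, namespace: str = "default", per_sleep: int = 10) -> dict:
    """Build an Istio Sidecar that limits sleep-{i} to the services it calls."""
    # Egress hosts use the "<namespace>/<dnsName>" format of the Sidecar API.
    hosts = [
        f"{namespace}/httpbin-{i * per_sleep + k}.{namespace}.svc.cluster.local"
        for k in range(per_sleep)
    ]
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "Sidecar",
        "metadata": {"name": f"sleep-{i}", "namespace": namespace},
        "spec": {
            "workloadSelector": {"labels": {"app": f"sleep-{i}"}},
            "egress": [{"hosts": hosts}],
        },
    }


print(json.dumps(sidecar_for_sleep(0), indent=2))
```

With such a resource in place, the sidecar of sleep-0 only needs configuration for httpbin-0 through httpbin-9 instead of all 200 HTTPBin services.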

Test 1: Test the configuration size of each sidecar after sidecar recommendation is enabled

Run the following command to dump the configuration of the sidecar in one of the pods that run the httpbin-0 application to a local file:
```
kubectl exec -it httpbin-0-756995d867-jljgp -c istio-proxy -n ns-in-mesh -- curl -s localhost:15000/config_dump > config_dump.json
```
Run the following command to view the size of the configuration file:
```
du -sh config_dump.json
```
Expected output:
```
105k    config_dump.json
```
The preceding output shows that, in the cluster that has 420 pods deployed, the configuration size of a sidecar drops from about 1.2 MB to about 105 KB after sidecar recommendation is enabled, which is less than one tenth of the original size. The control plane therefore has much less data to generate and push for each sidecar on the data plane.
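
The reduction factor can be checked from the two du outputs. The following sketch converts the human-readable sizes to bytes, assuming the binary multipliers that `du -h` uses (1K = 1024 bytes). The helper name `du_to_bytes` is hypothetical, and because du rounds its output, the result is only approximate.

```python
# Binary multipliers, as used by `du -h` (1K = 1024 bytes).
_UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def du_to_bytes(text: str) -> float:
    """Convert a `du -h` size such as "1.2M" or "105k" to bytes."""
    text = text.strip()
    suffix = text[-1].upper()
    if suffix in _UNITS:
        return float(text[:-1]) * _UNITS[suffix]
    return float(text)  # plain byte count without a suffix


before = du_to_bytes("1.2M")
after = du_to_bytes("105k")
print(f"reduction factor: {before / after:.1f}x")
```

For the values in this test, the factor is roughly 11.7.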

Test 2: Test the configuration push efficiency of the control plane after sidecar recommendation is enabled

Create a virtual service for the httpbin-0 application in the ASM console again. This triggers the control plane to push configurations to the sidecars on the data plane.

Delete the virtual service that you created earlier in the Service Mesh instance. Use a YAML file that contains the following content to create a virtual service for the httpbin-0 application in the Service Mesh instance to handle the timeout of requests. For more information about how to create a virtual service, see Manage virtual services.

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: httpbin-0-timeout
  namespace: default
spec:
  hosts:
    - httpbin-0.default.svc.cluster.local
  http:
    - route:
        - destination:
            host: httpbin-0.default.svc.cluster.local
      timeout: 5s

View the newly generated control-plane logs.
For an ASM instance whose version is 1.17.2.35 or later
1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.
2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Observability Management Center > Log Center.
3. On the Log Center page, click the Control-Plane Logs tab to view logs.
For an ASM instance whose version is earlier than 1.17.2.35
1. Log on to the ASM console.
2. In the left-side navigation pane, choose Service Mesh > Mesh Management.
3. On the Mesh Management page, click the name of the ASM instance.
4. On the details page of the ASM instance, choose ASM Instance > Base Information.
5. On the page that appears, click View log on the right of Control-plane log collection.
Sample logs:
```
2021-12-01T12:12:43.498048Z info  ads Push debounce stable[750] 1: 100.03379ms since last change, 100.033692ms since last push, full=true
2021-12-01T12:12:43.504270Z info  ads XDS: Pushing:2021-12-01T12:12:43Z/493 Services:230 ConnectedEndpoints:421  Version:2021-12-01T12:12:43Z/493
2021-12-01T12:12:43.507451Z info  ads CDS: PUSH for node:sleep-0-b68c8c5d9-5kww5.default resources:14 size:7.8kB
2021-12-01T12:12:43.507739Z info  ads LDS: PUSH for node:sleep-0-b68c8c5d9-5kww5.default resources:3 size:15.5kB
2021-12-01T12:12:43.508029Z info  ads RDS: PUSH for node:sleep-0-b68c8c5d9-5kww5.default resources:1 size:6.3kB
```
The preceding content shows that, after sidecar recommendation is enabled, a workload on the data plane no longer receives changes related to services that it does not depend on. After you create the virtual service for the httpbin-0 application, the control plane pushes configuration changes only to the sidecar in the pod that runs the sleep-0 application, because only the sleep-0 application depends on the httpbin-0 application. The push starts at 12:12:43.498 and ends at 12:12:43.508, so it takes about 0.01 seconds to complete. Compared with the 4 seconds taken before optimization, the configuration push is about 400 times faster. In addition, the amount of data pushed to a sidecar is reduced to about one tenth. This indicates that the sidecar recommendation feature can significantly improve the efficiency of the control plane in pushing configurations to the data plane.
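
The ratios quoted above follow from the figures in the two sets of sample logs. The following sketch spells out the arithmetic. The per-sidecar sizes are copied from the CDS, LDS, and RDS lines of the logs, and the durations are the rounded values stated in the text, so the results are approximations rather than precise measurements.

```python
# Per-sidecar push sizes in kB, copied from the sample logs (CDS, LDS, RDS).
before_kb = {"CDS": 169.3, "LDS": 70.7, "RDS": 137.4}
after_kb = {"CDS": 7.8, "LDS": 15.5, "RDS": 6.3}

before_total = sum(before_kb.values())  # about 377.4 kB per sidecar
after_total = sum(after_kb.values())    # about 29.6 kB per sidecar
payload_ratio = before_total / after_total

# Push durations in seconds, as stated in the text.
before_seconds, after_seconds = 4.0, 0.01
speedup = before_seconds / after_seconds

print(f"payload per sidecar: {before_total:.1f} kB -> {after_total:.1f} kB")
print(f"payload ratio: {payload_ratio:.2f}, push speedup: {speedup:.0f}")
```

The payload ratio works out to slightly more than 12, which the text rounds to "about 10 times", and the speedup to about 400.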

Test summary

This test uses multiple sleep and HTTPBin applications to simulate a large number of services that have few call dependencies in a cluster. A total of 200 HTTPBin applications, 400 pods that run the HTTPBin applications, 20 sleep applications, and 20 pods that run the sleep applications are deployed. The following table describes the comparison before and after sidecar recommendation is enabled.

| Item | Before sidecar recommendation is enabled | After sidecar recommendation is enabled |
| --- | --- | --- |
| Sidecar configuration size | 1.2 MB | 105 KB |
| Whether the information about unrelated services is pushed | Yes | No |
| Time taken by the control plane to push configurations | About 4 seconds | About 0.01 seconds |