Serverless and service mesh are two popular cloud-native technologies. This topic describes how to use the autoscaling and traffic splitting features of Knative Serving in a managed service mesh. This method allows you to easily build a serverless platform that does not require complex maintenance of underlying infrastructure.

Prerequisites

Step 1: Install Knative Serving

  1. Prepare Knative Serving installation files.
    To install Knative Serving, you need the two YAML files listed below. Separately, the Knative CLI (kn) provides a quick and easy interface for creating Knative resources such as Knative Services and event sources, without the need to create or modify YAML files directly.
    File name           Description                                                 Dependency
    serving-crds.yaml   Core custom resource definition (CRD) of Knative Serving    N/A
    serving-core.yaml   Core component of Knative Serving                           serving-crds.yaml

  2. Install Knative Serving components.
    1. Run the following command to install the CRD of Knative Serving:
      kubectl apply -f https://alibabacloudservicemesh.oss-cn-beijing.aliyuncs.com/knative/v0.26/serving-crds.yaml
    2. Run the following command to install the core component of Knative Serving:
      kubectl apply -f https://alibabacloudservicemesh.oss-cn-beijing.aliyuncs.com/knative/v0.26/serving-core.yaml

Step 2: Enable access to Istio resources and integrate Knative

The ACK managed cluster and the ASM instance are not in the same cluster environment. You must enable the Kubernetes API on the data plane to access the Istio resources of the ASM instance.

  1. Run the following command to install the Knative Istio controller:
    kubectl apply -f https://alibabacloudservicemesh.oss-cn-beijing.aliyuncs.com/knative/v0.26/net-istio.yaml
  2. Run the following command to query the IP address of the ingress gateway for external access:
    kubectl --namespace istio-system get service istio-ingressgateway
  3. Run the following command to check whether Knative components are installed:
    kubectl get pods -n knative-serving
    Expected output:
    NAME                                    READY   STATUS    RESTARTS   AGE
    activator-558bf66d75-svbgz              1/1     Running   0          94m
    autoscaler-6fcd9d475-5kgmw              1/1     Running   0          94m
    controller-5f98898db-d5zh4              1/1     Running   0          94m
    domain-mapping-67d655f47d-wrzz7         1/1     Running   0          94m
    domainmapping-webhook-9f59bb774-z792g   1/1     Running   0          94m
    net-istio-controller-846c69dbb-qnchg    1/1     Running   0          94m
    net-istio-webhook-86cf98b497-vmdqn      1/1     Running   0          94m
    webhook-777c5d4548-6tzj6                1/1     Running   0          94m

    If the STATUS of every component is Running or Completed, the Knative components are installed.
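
The status check above can also be scripted. The following is a minimal sketch, not part of the product: the helper name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods, in which STATUS is the third column.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the name of every pod whose STATUS is neither Running nor Completed.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
not_ready_pods() {
  awk 'NR > 1 && NF && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (not run here; requires kubectl access to the cluster):
#   kubectl get pods -n knative-serving | not_ready_pods
```

If the helper prints nothing, every listed pod is in the Running or Completed state.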

Step 3: Deploy a Knative Service

  1. Create a hello.yaml file that contains the following information:
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
    spec:
      template:
        spec:
          containers:
            - image: registry.cn-hangzhou.aliyuncs.com/acs/helloworld-go:160e4dc8
              ports:
                - containerPort: 8080
              env:
                - name: TARGET
                  value: "World"
  2. Run the following command to deploy the Knative Service:
    kubectl apply -f hello.yaml
  3. Run the following command to view the Knative Service list:
    kubectl get ksvc
    Expected output:
    NAME            URL                                        LATESTCREATED         LATESTREADY           READY   REASON
    helloworld-go   http://helloworld-go.default.example.com   helloworld-go-00001   helloworld-go-00001   True
  4. Run the following command to access the sample application:
    In this example, http://39.97.XX.XX:80 is used to access the helloworld-go application.
    curl -H "Host: helloworld-go.default.example.com" http://39.97.XX.XX:80
    Expected output:
    Hello World!
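
Instead of copying the gateway IP address and host name by hand, both values can be read from the cluster. The following is a sketch under stated assumptions: the function names are hypothetical, and it assumes that the istio-ingressgateway Service exposes a load balancer IP address in its status and that kubectl is configured for the cluster that hosts the gateway.

```shell
# Hypothetical helper: print the external IP address of the ingress gateway.
# Assumes the Service reports a load balancer IP in .status.
gateway_ip() {
  kubectl --namespace istio-system get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Hypothetical helper: print the host part of a Knative Service URL,
# for example helloworld-go.default.example.com.
ksvc_host() {
  url=$(kubectl get ksvc "$1" -o jsonpath='{.status.url}')
  printf '%s\n' "${url#*://}"   # strip the scheme (http:// or https://)
}

# Send one request through the gateway with the Host header set.
call_service() {
  curl -s -H "Host: $(ksvc_host "$1")" "http://$(gateway_ip):80"
}

# Usage (not run here): call_service helloworld-go
```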

Step 4: Scale pods to zero

Knative Serving provides automatic scaling, also known as autoscaling. If an application receives no traffic and scale to zero is enabled, Knative Serving scales the application down to zero pods. Pods are scaled up to meet demand if traffic to the application increases. In this example, the helloworld-go application is used.

  1. Run the following command to access the sample application:
    curl -H "Host: helloworld-go.default.example.com" http://39.97.XX.XX:80
  2. Run the following command to query the pod status:
    kubectl get pod -l serving.knative.dev/service=helloworld-go -w
    Expected output:
    NAME                                             READY   STATUS    RESTARTS   AGE
    helloworld-go-00001-deployment-6f8dfb548-nwfqc   3/3     Running   0          39s
    helloworld-go-00001-deployment-6f8dfb548-nwfqc   3/3     Terminating   0          65s
    helloworld-go-00001-deployment-6f8dfb548-nwfqc   0/3     Terminating   0          102s
    helloworld-go-00001-deployment-6f8dfb548-nwfqc   0/3     Terminating   0          102s
    helloworld-go-00001-deployment-6f8dfb548-nwfqc   0/3     Terminating   0          102s
    You can observe how pods are scaled to zero after traffic stops flowing to the URL.
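
To measure roughly how long the scale-down takes, the pod list can be polled instead of watched. This is a sketch only: the function names pod_count and wait_for_zero are hypothetical, and the label selector is the one used in the command above. Pods in the Terminating state still count until they are removed.

```shell
# Hypothetical helper: count the pods that belong to a Knative Service.
# Terminating pods are still listed, so they are counted as well.
pod_count() {
  kubectl get pod -l "serving.knative.dev/service=$1" --no-headers 2>/dev/null |
    awk 'END { print NR }'
}

# Poll once per second until no pods remain, then report the elapsed time.
wait_for_zero() {
  start=$(date +%s)
  while [ "$(pod_count "$1")" -gt 0 ]; do
    sleep 1
  done
  echo "scaled to zero after $(( $(date +%s) - start ))s"
}

# Usage (not run here): wait_for_zero helloworld-go
```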

Step 5: Split traffic

Each time you change the configuration of a Knative Service, a new revision is created. You can split traffic between different revisions of your Knative Service.

  1. Edit the hello.yaml file created in Step 3.
    Sample YAML file:
    apiVersion: serving.knative.dev/v1
    kind: Service
    metadata:
      name: helloworld-go
    spec:
      template:
        spec:
          containers:
            - image: registry.cn-hangzhou.aliyuncs.com/acs/helloworld-go:160e4dc8
              ports:
                - containerPort: 8080
              env:
                - name: TARGET
                  value: "Knative"
  2. Run the following command to deploy the updated Knative Service:
    kubectl apply -f hello.yaml
    Note: After you modify an existing Knative Service, the URL does not change, but the new revision has a new name. The new name is helloworld-go-00002 in this example.
  3. Run the following command to access the new revision:
    curl -H "Host: helloworld-go.default.example.com" http://39.97.XX.XX:80
    Expected output:
    Hello Knative!
  4. Run the following command to view all revisions:
    kubectl get revisions
    Expected output:
    NAME                  CONFIG NAME     K8S SERVICE NAME   GENERATION   READY   REASON   ACTUAL REPLICAS   DESIRED REPLICAS
    helloworld-go-00001   helloworld-go                      1            True             0                 0
    helloworld-go-00002   helloworld-go                      2            True             0                 0
  5. Split traffic between the revisions.
    In this example, traffic is split between two revisions.
    1. Add the traffic section to the bottom of the existing hello.yaml file.
      apiVersion: serving.knative.dev/v1
      kind: Service
      metadata:
        name: helloworld-go
      spec:
        template:
          spec:
            containers:
              - image: registry.cn-hangzhou.aliyuncs.com/acs/helloworld-go:160e4dc8
                ports:
                  - containerPort: 8080
                env:
                  - name: TARGET
                    value: "Knative"
        traffic:
        - latestRevision: true
          percent: 50
        - latestRevision: false
          percent: 50
          revisionName: helloworld-go-00001
    2. Run the following command to deploy the helloworld-go application:
      kubectl apply -f hello.yaml
  6. Verify that traffic can be split.
    1. Run the following command to view the revisions:
      kn revisions list
      Expected output:
      NAME                  SERVICE         TRAFFIC   TAGS   GENERATION   AGE     CONDITIONS   READY   REASON
      helloworld-go-00002   helloworld-go   50%              2            59s     4 OK / 4     True
      helloworld-go-00001   helloworld-go   50%              1            2m58s   4 OK / 4     True
      The output shows that half of the traffic goes to each revision.
    2. Access the Knative Service multiple times in your browser to view the output served by each revision.
      Alternatively, you can access the URL of the Knative Service from your terminal multiple times to see that traffic is split between the revisions.
      Expected output:
      Hello Knative!
      Hello World!
      Hello Knative!
      Hello World!
      The responses come from both revisions, which shows that requests are distributed between them.
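
To check the split over more than a handful of requests, the responses can be tallied. The following is a minimal sketch: the helper name tally_responses is hypothetical, and the usage line reuses the example gateway address and host name from this topic. With a 50/50 split, the two counts should be close to each other but are not guaranteed to be equal.

```shell
# Hypothetical helper: reads one response per line on stdin and prints
# "<count> <response>" for each distinct response, sorted.
tally_responses() {
  awk '{ n[$0]++ } END { for (k in n) print n[k], k }' | sort
}

# Usage (not run here): send 20 requests and tally the replies.
#   for i in $(seq 1 20); do
#     curl -s -H "Host: helloworld-go.default.example.com" http://39.97.XX.XX:80
#   done | tally_responses
```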