Practice of Non-Intrusive A/B Testing in the Comprehensive Procedure Based on WASM

This article describes how to implement and verify a WASM that passes through customized headers to support the non-intrusive A/B testing in the comprehensive procedure.

1. Background

Alibaba Cloud Service Mesh (ASM) provides non-intrusive traffic governance for the microservices running on it. By configuring VirtualService and DestinationRule, functions such as traffic management, timeout and retry, traffic replication, traffic throttling, and circuit breaking can be implemented without modifying the microservice code.

The premise of traffic management is that a service has multiple versions. The following section describes the different types of traffic management for multi-version deployment:

  • Traffic Routing: Request traffic is routed to the endpoints (Pod[]) of different versions (Deployment[]) of the specified service based on the request information (Header/Cookie/Query Params). This is the A/B testing.
  • Traffic Shifting: Request traffic is routed proportionally (by weight) to the endpoints (Pod[]) of different versions (Deployment[]) of the specified service, regardless of the request content. This is the grey or canary release.
  • Traffic Switching/Mirroring: Traffic is switched or replicated between versions, proportionally or based on the request information. This is the blue/green release.
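
The difference between the first two types can be sketched in a few lines of Rust. This is only an illustrative model, not how Envoy or Istio implement routing: the function names, the `user` header, and the modulo-based weighting are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Traffic routing (A/B testing): the version is chosen from request information.
/// The header name `user` and the value `en` are illustrative assumptions.
fn route_by_header(headers: &HashMap<&str, &str>) -> &'static str {
    if headers.get("user").copied() == Some("en") {
        "v1"
    } else {
        "v2"
    }
}

/// Traffic shifting (canary): the version is chosen by weight only,
/// regardless of what the request contains. A request sequence number
/// stands in for the proxy's weighted choice (an assumption of this sketch).
fn route_by_weight(request_seq: u32, v1_weight_percent: u32) -> &'static str {
    if request_seq % 100 < v1_weight_percent {
        "v1"
    } else {
        "v2"
    }
}

fn main() {
    let mut headers = HashMap::new();
    headers.insert("user", "en");
    println!("header-based: {}", route_by_header(&headers));

    // With a 90/10 split, 90 of every 100 sequence numbers map to v1.
    let v1_hits = (0..1000u32)
        .filter(|&n| route_by_weight(n, 90) == "v1")
        .count();
    println!("weight-based: {} of 1000 requests went to v1", v1_hits);
}
```

Header-based routing is deterministic for a given request, which is why it suits A/B testing, while weight-based routing only controls the overall share of traffic per version.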

This article introduces the practice of implementing A/B testing in the whole procedure based on the request header.

1.1 Function Introduction

In the Istio community, it is easy to find documentation and examples of how to route traffic to a specific version of a service based on the request header. These examples only take effect for the first service in the comprehensive procedure.

For example, a request needs to access three services: A, B, and C. All three services have two versions, en and fr. The demands are:

  • For requests with user:en as the header, the routing path in the comprehensive procedure is A1-B1-C1.
  • For requests with user:fr as the header, the routing path in the comprehensive procedure is A2-B2-C2.

The corresponding VirtualService configuration is listed below:

http:
- name: A|B|C-route
  match:
  - headers:
      user:
        exact: en
  route:
  - destination:
      host: A|B|C-svc
      subset: v1
- route:
  - destination:
      host: A|B|C-svc
      subset: v2

After testing, only the request routing of service A meets demands. B and C cannot route requests to the specified versions based on the header value.

Why? For the microservices on ASM, this header appears out of thin air, that is, the microservice code does not perceive the header. Therefore, when service A requests service B, this header is not passed through. In other words, when service A requests service B, this header has been lost. In this state, the VirtualService configuration that routes requests based on the header is meaningless.
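
The loss of the header can be reproduced with a small model. The sketch below is not taken from the sample project: `pick_subset` mirrors the VirtualService shown earlier, and `outgoing_request_from_a` is a hypothetical stand-in for business code that builds a new downstream request without copying any incoming headers.

```rust
use std::collections::HashMap;

type Headers = HashMap<String, String>;

/// Mirrors the VirtualService above: `user: en` selects subset v1,
/// every other request falls through to the default route (v2).
fn pick_subset(headers: &Headers) -> &'static str {
    match headers.get("user").map(String::as_str) {
        Some("en") => "v1",
        _ => "v2",
    }
}

/// Hypothetical business code of service A: it builds a fresh outgoing
/// request and copies nothing from the incoming one, so `user` is dropped.
fn outgoing_request_from_a(_incoming: &Headers) -> Headers {
    Headers::new()
}

fn main() {
    let mut request = Headers::new();
    request.insert("user".to_string(), "en".to_string());

    // The header is present at the first hop, so A is routed as requested.
    let subset_a = pick_subset(&request);
    // The header is lost inside A, so B takes the default route.
    let to_b = outgoing_request_from_a(&request);
    let subset_b = pick_subset(&to_b);

    println!("A -> {}, B -> {}", subset_a, subset_b);
}
```

The routing rule for B is identical to the rule for A. Only the missing header makes the second hop fall back to the default subset.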

The only way to solve this problem on the microservice side is to modify the code to enumerate and pass through all headers concerned by the business. However, this modification is intrusive and does not support new headers flexibly.

From the infrastructure perspective of ASM, headers should be KV pairs for pass-through and have no business meaning. Thus, ASM can pass through the user-defined header without discrimination and support the non-intrusive A/B testing in the comprehensive procedure. How can it be implemented?

1.2 Status Quo in the Community

As mentioned earlier, if the header cannot be passed through, this function cannot be implemented simply by configuring header matching for VirtualService. Are there other configurations in VirtualService that can achieve header pass-through? If the answer is yes, VirtualService is the most cost-effective.

After various attempts, such as carefully configuring the header-related set and add statements, we found that this idea does not work. VirtualService modifies headers in the inbound phase, while pass-through requires modifying headers in the outbound phase. A microservice workload cannot pass through a header value that appears out of thin air. Therefore, when a request is routed to the next service, the header is lost.

Therefore, it can be concluded that the non-intrusive A/B testing in the comprehensive procedure cannot be implemented using VirtualService alone. In other words, none of the existing configurations provided by the community directly supports this function. The only solution is the more advanced configuration, EnvoyFilter, which we did not want to use at first. The reasons are:

  1. The configuration of EnvoyFilter is too complex, making it difficult for common users to learn and use ASM quickly. Even if examples are provided, they are of little reference value for modifying EnvoyFilter once the requirements change slightly.
  2. EnvoyFilter does not directly support this function even if it is used. Therefore, Lua or WebAssembly (WASM) is required for development.

1.3 Implementation Plan

Now, it is time for technology selection. The advantages and disadvantages of Lua and WASM are listed below:

  • Lua is small but has poor performance.
  • WASM has good performance but is difficult for development and distribution.
  • WASM is mainly developed in C++ and Rust, while toolchains for other languages are immature or perform poorly. This article uses Rust to implement the WASM filter.

We use Rust to develop a WASM filter that obtains the headers defined by users in EnvoyFilter and propagates them downstream in the outbound phase.

WASM packages are distributed using the configmap storage of Kubernetes. The pod obtains and loads the WASM configurations through definition in the annotation. The reasons for adopting this distribution mode will be described later.

2. Technical Implementation

For the related code in this section, please see this link.

2.1 WASM Implementation Based on Rust

2.1.1 Define Dependency

The WASM project only has one core dependency crate called proxy-wasm, which is the basic package for WASM development using Rust. Other packages, such as serde_json for deserialization and log for log printing are also needed. The definition in Cargo.toml is listed below:

[dependencies]
proxy-wasm = "0.1.3"
serde_json = "1.0.62"
log = "0.4.14"

2.1.2 Define Building

The final build artifact of the WASM project is a C-compatible dynamic library (cdylib). The build definition in Cargo.toml is listed below:

[lib]
name = "propaganda_filter"
path = "src/propagate_headers.rs"
crate-type = ["cdylib"]

2.1.3 Header Pass-Through

First, define the structures. head_tag_name is the header key defined by the user, and head_tag_value is the corresponding value.

struct PropagandaHeaderFilter {
    config: FilterConfig,
}

struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}

trait HttpContext in {proxy-wasm}/src/traits.rs defines the method on_http_request_headers. We implement this method to pass through the header.

impl HttpContext for PropagandaHeaderFilter {
    fn on_http_request_headers(&mut self, _: usize) -> Action {
        let head_tag_key = self.config.head_tag_name.as_str();
        info!("::::head_tag_key={}", head_tag_key);
        if !head_tag_key.is_empty() {
            self.set_http_request_header(head_tag_key, Some(self.config.head_tag_value.as_str()));
            self.clear_http_route_cache();
        }
        for (name, value) in &self.get_http_request_headers() {
            info!("::::H[{}] -> {}: {}", self.context_id, name, value);
        }
        Action::Continue
    }
}

Lines 3 to 6 read the user-defined header key-value pair from the configuration. If the key is not empty, the method set_http_request_header is called to write the pair into the headers of the current request.

Line 7 is a workaround for the current proxy-wasm: clear_http_route_cache makes Envoy recompute the route after the header has been changed. For more information, please see the related links.
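
The decision logic of the filter can also be exercised outside Envoy. The following is a minimal sketch that uses only the standard library: `apply_config` is a hypothetical stand-in for the body of on_http_request_headers, and a HashMap replaces the real request headers. It is not part of the sample project and does not call proxy-wasm.

```rust
use std::collections::HashMap;

/// Same two fields as FilterConfig in the article.
struct FilterConfig {
    head_tag_name: String,
    head_tag_value: String,
}

/// Stand-in for the body of on_http_request_headers: if a key is configured,
/// overwrite (or create) that header; otherwise leave the request untouched.
/// Returns true when the headers changed, i.e. when the real filter would
/// also clear the route cache.
fn apply_config(config: &FilterConfig, headers: &mut HashMap<String, String>) -> bool {
    if config.head_tag_name.is_empty() {
        return false;
    }
    headers.insert(config.head_tag_name.clone(), config.head_tag_value.clone());
    true
}

fn main() {
    let config = FilterConfig {
        head_tag_name: "custom-version".to_string(),
        head_tag_value: "hello1-v1".to_string(),
    };
    let mut headers = HashMap::new();
    let changed = apply_config(&config, &mut headers);
    println!("changed={} headers={:?}", changed, headers);
}
```

Note that the value comes from the filter configuration rather than from the incoming request, so each workload writes the header that its own downstream route should match.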

2.2 Local Verification Based on Envoy

2.2.1 WASM Building

Run the following commands to build the WASM project. Note: in the environment used for this article, the target wasm32-unknown-unknown was only available on nightly, so the build environment is switched temporarily before building. On current toolchains, the target can also be installed on stable with rustup target add wasm32-unknown-unknown.

rustup override set nightly
cargo build --target=wasm32-unknown-unknown --release

After the building, use Docker Compose to start the Envoy locally to verify the WASM functions.

2.2.2 Envoy Configuration

In this example, Envoy needs two files to start. One is propaganda_filter.wasm, and the other is the Envoy configuration file envoy-local-wasm.yaml. Both are mounted as volumes in the Docker Compose file:

volumes:
  - ./config/envoy/envoy-local-wasm.yaml:/etc/envoy-local-wasm.yaml
  - ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm:/etc/propaganda_filter.wasm

Envoy supports dynamic configuration. In local testing, it uses static configuration.

static_resources:
  listeners:
    - address:
        socket_address:
          address: 0.0.0.0
          port_value: 80
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
...
                http_filters:
                  - name: envoy.filters.http.wasm
                    typed_config:
                      "@type": type.googleapis.com/udpa.type.v1.TypedStruct
                      type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
                      value:
                        config:
                          name: "header_filter"
                          root_id: "propaganda_filter"
                          configuration:
                            "@type": "type.googleapis.com/google.protobuf.StringValue"
                            value: |
                              {
                                "head_tag_name": "custom-version",
                                "head_tag_value": "hello1-v1"
                              }
                          vm_config:
                            runtime: "envoy.wasm.runtime.v8"
                            vm_id: "header_filter_vm"
                            code:
                              local:
                                filename: "/etc/propaganda_filter.wasm"
                            allow_precompiled: true
...

Envoy configuration focuses on the following three places:

  • Line 15: A filter of type type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm, named header_filter, is defined in http_filters.
  • Line 32: The local file path is /etc/propaganda_filter.wasm.
  • Line 20 to Line 26: The type of the related configuration is type.googleapis.com/google.protobuf.StringValue, and the value is {"head_tag_name": "custom-version", "head_tag_value": "hello1-v1"}. The custom header key is named custom-version and its value is hello1-v1.

2.2.3 Local Verification

Run the following command to start Docker Compose:

docker-compose up --build

Request local service:

curl -H "version-tag: v1" "localhost:18000"

The Envoy log output should look like this:

proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::create_http_context head_tag_name=custom-version,head_tag_value=hello1-v1
proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::head_tag_key=custom-version
...
proxy_1        | [2021-02-25 06:30:09.217][33][info][wasm] [external/envoy/source/extensions/common/wasm/context.cc:1152] wasm log: ::::H[2] -> custom-version: hello1-v1

2.3 Distribution Mode of WASM

WASM distribution refers to the process of storing WASM packages in a distribution warehouse for specified pods to pull.

2.3.1 Local Mode of Configmap and Envoy

Although this mode is not the final solution for WASM distribution, it is easy to understand and suitable for simple scenarios, so it is used as the example in this article. ConfigMap is not designed for storing WASM packages, but both ConfigMap and the local mode of Envoy are very mature, and the combination of the two meets the current requirements.

ASM has provided a similar mode. For more details, please see: Write WASM Filter for Envoy and deploy in ASM. The first concern when putting the WASM package into a ConfigMap is the package size. We use wasm-gc to shrink the package, as shown below:

ls -hl target/wasm32-unknown-unknown/release/propaganda_filter.wasm
wasm-gc ./target/wasm32-unknown-unknown/release/propaganda_filter.wasm ./target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
ls -hl target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm

The execution results are listed below. They show the package sizes before and after shrinking:

-rwxr-xr-x  2 han  staff   1.7M Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda_filter.wasm
-rw-r--r--  1 han  staff   136K Feb 25 15:38 target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
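
Package size matters here because Kubernetes limits the data stored in a single ConfigMap to 1 MiB. The following sketch checks both packages against that limit. The byte counts are approximations reconstructed from the rounded ls -hl output, and the function names are made up for this example.

```rust
/// Kubernetes limits the data stored in a single ConfigMap to 1 MiB.
const CONFIGMAP_LIMIT_BYTES: u64 = 1024 * 1024;

/// True when a payload of this size stays within the ConfigMap limit.
fn fits_in_configmap(size_bytes: u64) -> bool {
    size_bytes <= CONFIGMAP_LIMIT_BYTES
}

/// Percentage by which the package shrank.
fn reduction_percent(before_bytes: u64, after_bytes: u64) -> f64 {
    (1.0 - after_bytes as f64 / before_bytes as f64) * 100.0
}

fn main() {
    // Approximate byte counts derived from the rounded `ls -hl` sizes.
    let before = (1.7 * 1024.0 * 1024.0) as u64; // 1.7M
    let after: u64 = 136 * 1024; // 136K

    println!("original fits: {}", fits_in_configmap(before));
    println!("shrunk fits:   {}", fits_in_configmap(after));
    println!("reduction:     {:.1}%", reduction_percent(before, after));
}
```

Under these assumptions, the original 1.7M build is too large for a ConfigMap, while the 136K package leaves ample room even after base64 encoding.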

Create configmap:

wasm_image=target/wasm32-unknown-unknown/release/propaganda-header-filter.wasm
kubectl -n $NS create configmap propaganda-header --from-file=$wasm_image

Patch the specified deployment:

patch_annotations=$(cat config/annotations/patch-annotations.yaml)
kubectl -n $NS patch deployment "hello$i-deploy-v$j" -p "$patch_annotations"

The code of patch-annotations.yaml is listed below:

spec:
  template:
    metadata:
      annotations:
        sidecar.istio.io/userVolume: '[{"name":"wasmfilters-dir","configMap": {"name":"propaganda-header"}}]'
        sidecar.istio.io/userVolumeMount: '[{"mountPath":"/var/local/lib/wasm-filters","name":"wasmfilters-dir"}]'

2.3.2 Remote Mode of Envoy

Envoy supports the resource definition in local and remote modes. The comparison is listed below:

vm_config:
  runtime: "envoy.wasm.runtime.v8"
  vm_id: "header_filter_vm"
  code:
    local:
      filename: "/etc/propaganda_filter.wasm"
vm_config:
  runtime: "envoy.wasm.runtime.v8"
  code:
    remote:
      http_uri:
        uri: "http://*.*.*.216:8000/propaganda_filter.wasm"
        cluster: web_service
        timeout:
          seconds: 60
      sha256: "da2e22*"

Remote mode is closest to native Envoy usage, so it was originally the first choice for this article. However, hash verification errors occurred for the package in practice. For more information about the errors, please check the links below. Zhou Lizan, an expert in the Envoy community, told me that remote mode is not the future direction of WASM distribution in Envoy. Therefore, we gave up this mode.

2.3.3 ORAS-Based Local Mode

ORAS is the reference implementation of the OCI Artifacts project, which can significantly simplify the storage of any content in the OCI registry.

The WASM modules with the permissible media types are pushed to the registry (an OCI-compatible registry) using the ORAS client or API/SDK. Then, the WASM Filter is deployed to the pod corresponding to the specified workload through the controller and mounted in local mode.

ASM provides support for WASM technology. ASM users can deploy the extended WASM Filter to the corresponding Envoy proxy in the data plane cluster through ASM. Features, such as dynamic plug-in loading and hot update, are supported through the ASMFilterDeployment Controller component. ASM provides a new CRD ASMFilterDeployment and a related controller component. This controller component monitors the ASMFilterDeployment resource objects and does two things:

  • It creates an Istio EnvoyFilter Custom Resource for the control plane and pushes it to the corresponding Istiod in the ASM control plane.
  • It pulls the corresponding WASM Filter image from the OCI registry and mounts it to the corresponding workload pod.

For more details, please see Simplify and extend functions of ASM based on WASM and ORAS (Article in Chinese). This mode will be used for WASM distribution in future practice sharing.

Similarly, other companies in the industry are also promoting this mode. Solo.io provides wasme, a complete WASM development framework. With this framework, WASM packages (OCI images) can be developed, built, distributed, and deployed to WebAssembly Hub. This solution covers the lifecycle of WASM from development to release. However, it also has shortcomings: wasme is self-contained, which makes it difficult to split and extend outside the Solo.io ecosystem.

The ASM Team is communicating with other teams in the industry, including Solo.io, about how to jointly promote the OCI specification and lifecycle management of the WASM Filter. By doing so, they aim to help customers expand Envoy capabilities and promote the application of the WASM Filter in ASM.

2.4 Cluster Verification Based on Istio

2.4.1 Sample Practice

After WASM is distributed to the configmap of Kubernetes, the cluster verification can be conducted. The sample practice (source code) contains 3 services: hello1, hello2, and hello3. Each service has 2 versions: v1/en and v2/fr.

Each service is configured with a VirtualService and a DestinationRule that match headers and route requests to the specified versions.

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: hello2-vs
spec:
  hosts:
    - hello2-svc
  http:
  - name: hello2-v2-route
    match:
    - headers:
        route-v:
          exact: hello2v2
    route:
    - destination:
        host: hello2-svc
        subset: hello2v2
  - route:
    - destination:
        host: hello2-svc
        subset: hello2v1
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: hello2-dr
spec:
  host: hello2-svc
  subsets:
    - name: hello2v1
      labels:
        version: v1
    - name: hello2v2
      labels:
        version: v2

The code of EnvoyFilter is listed below:

apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: hello1v2-propaganda-filter
spec:
  workloadSelector:
    labels:
      app: hello1-deploy-v2
      version: v2
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_OUTBOUND
        proxy:
          proxyVersion: "^1\\.8\\.*"
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
              subFilter:
                name: envoy.filters.http.router
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.wasm
          typed_config:
            "@type": type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
            value:
              config:
                name: propaganda_filter
                root_id: propaganda_filter_root
                configuration:
                  '@type': type.googleapis.com/google.protobuf.StringValue
                  value: |
                    {
                      "head_tag_name": "route-v",
                      "head_tag_value": "hello2v2"
                    }
                vm_config:
                  runtime: envoy.wasm.runtime.v8
                  vm_id: propaganda_filter_vm
                  code:
                    local:
                      filename: /var/local/lib/wasm-filters/propaganda-header-filter.wasm
                  allow_precompiled: true

2.4.2 Verification Method

The request curl -H "route-v:v2" "http://$ingressGatewayIp:8001/hello/xxx" carries a header and accesses the services through istio-ingressgateway. It is then routed to the specified version of each service in the comprehensive procedure based on the header value. In this case, the specified version is v2, so the routing path in the comprehensive procedure is hello1 v2 - hello2 v2 - hello3 v2. The process is shown in the following figure:

[Figure 4: routing path hello1 v2 - hello2 v2 - hello3 v2]

The verification process and result are listed below:

for i in {1..5}; do
    curl -s -H "route-v:v2" "http://$ingressGatewayIp:$PORT/hello/eric" >>result
    echo >>result
done
check=$(grep -o "Bonjour eric" result | wc -l)
if [[ "$check" -eq "15" ]]; then
    echo "pass"
else
    echo "fail"
    exit 1
fi

Result:

Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182
Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182

We can see that the output information Bonjour eric comes from the fr version of each service, which indicates that the function verification has succeeded.
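
The pass condition of the script is simple arithmetic: 5 requests times 3 services gives 15 greetings. A minimal Rust sketch of the same check is shown below. The function name is made up for this example, and the response line is copied from the result above.

```rust
/// Counts non-overlapping occurrences of `needle`, like `grep -o ... | wc -l`.
fn count_occurrences(haystack: &str, needle: &str) -> usize {
    haystack.matches(needle).count()
}

fn main() {
    // One response line in the format shown in the result above.
    let line = "Bonjour eric@hello1:172.17.68.205<Bonjour eric@hello2:172.17.68.206<Bonjour eric@hello3:172.17.68.182";
    // Five requests, as in the loop of the verification script.
    let result = vec![line; 5].join("\n");

    let hits = count_occurrences(&result, "Bonjour eric");
    println!("{}", if hits == 15 { "pass" } else { "fail" });
}
```

If any hop falls back to the en version, that hop's greeting is no longer "Bonjour eric", so the count drops below 15 and the check fails.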

3. Performance Analysis

After the verification passes, the next question is how much latency EnvoyFilter and WASM add. Both ASM providers and users care about this. This section addresses the following two concerns:

  • The incremental latency costs after adding EnvoyFilter and WASM
  • The cost comparison between WASM and Lua implementations

3.1 Lua Implementation

Lua implementation can be written directly into the EnvoyFilter without requiring an independent project, as shown in the following code:

patch:
  operation: INSERT_BEFORE
  value:
    name: envoy.lua
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.lua.v3.Lua
      inlineCode: |
        function envoy_on_request(handle)
          handle:logInfo("[propagate header] route-v:hello3v2")
          handle:headers():add("route-v", "hello3v2")
        end

3.2 Stress Testing Method

3.2.1 Deployment

  • Deploy the same Deployment, Service, VirtualService, and DestinationRule in three namespaces.
  • Deploy the Lua-based EnvoyFilter in hello-abtest-lua.
  • Deploy the WASM-based EnvoyFilter in hello-abtest-wasm.

Namespace           Environment
hello-abtest        Baseline
hello-abtest-lua    Baseline plus EnvoyFilter with Lua
hello-abtest-wasm   Baseline plus EnvoyFilter with WASM

3.2.2 Tools

In this practice, hey is used as the stress testing tool, whose predecessor is boom. Hey is used to replace Apache Bench (AB). Now, perform stress testing on the three environments with the same testing parameters, as shown in the following statements:

# Number of concurrent workers
export NUM=2000
# Rate limit in queries per second (QPS), applied per worker
export QPS=2000
# Stress testing duration
export Duration=10s

hey -c $NUM -q $QPS -z $Duration -H "route-v:v2" http://$ingressGatewayIp:$PORT/hello/eric > $SIDECAR_WASM_RESULT

The hey result files must not contain the error socket: too many open files. Otherwise, the results are distorted. Raising the limit with ulimit -n $MAX_OPENFILE_NUM and adjusting the stress testing parameters keeps the results accurate.

3.3 Report

We selected four key metrics from the three result reports, as shown in the following figures:

[Figures 5 and 6: key metrics from the hey reports for the baseline, Lua, and WASM environments]

3.4 Conclusion

1.  With either EnvoyFilter added, the average latency is several to hundreds of milliseconds longer than in the baseline environment. The relative increase is:

  • WASM: 1.2% (0.6395-0.6317)/0.6317 and 10.0% (1.3290-1.2078)/1.2078
  • Lua: 11.0% (0.7012-0.6317)/0.6317 and 20.8% (1.4593-1.2078)/1.2078

2.  The performance of the WASM version is better than the Lua version.

Note: Unlike the Lua version, where the header value is hard-coded in the script, the WASM version uses one build of the code with a separate configuration per workload. The WASM version therefore has an extra step of reading configuration variables that the Lua version does not have.
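
The ratios in the conclusion can be recomputed with a few lines of Rust. The sketch below only uses the latency figures quoted in this section. The function name is made up for this example.

```rust
/// Relative increase of `measured` over `baseline`, in percent.
fn overhead_percent(baseline: f64, measured: f64) -> f64 {
    (measured - baseline) / baseline * 100.0
}

fn main() {
    // (baseline, WASM, Lua) latency figures quoted in the conclusion.
    let rows = [(0.6317, 0.6395, 0.7012), (1.2078, 1.3290, 1.4593)];
    for &(baseline, wasm, lua) in rows.iter() {
        println!(
            "baseline={:.4} wasm=+{:.1}% lua=+{:.1}%",
            baseline,
            overhead_percent(baseline, wasm),
            overhead_percent(baseline, lua)
        );
    }
}
```

Comparing relative rather than absolute increases keeps the two metric rows comparable even though their baselines differ by almost a factor of two.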

4. Prospect

4.1 About Use

This article describes how to implement and verify a WASM that passes through customized headers to support the non-intrusive A/B testing in the comprehensive procedure. However, it is tedious and error-prone for ASM users to implement the WASM in the steps described in this article.

The ASM Team is launching a mechanism for the ASM plug-in directory. Users only need to select plug-ins in the plug-in directory and provide a small number of KV configurations, such as custom headers, for the plug-ins. By doing so, EnvoyFilter, WASM, VirtualService, and DestinationRule can be generated and deployed automatically.

4.2 About Extension

This article only demonstrates the header-based routing and matching functions. How can we extend the functions if we want to match and route based on Query Params? This is where the ASM plug-in directory will focus. In the future, the ASM plug-in directory will provide best practices.
