Alibaba Cloud Service Mesh: Configure observability settings

Last Updated: Apr 09, 2024

Service Mesh (ASM) allows you to configure observability settings, including log settings, metric settings, and trace analysis settings. In the ASM console, you can customize global, namespace-level, or workload-level configurations, such as the format of access logs, the dimensions of metrics, whether to enable specific metrics, and the sampling percentage of trace analysis. This topic describes how to configure observability settings.

Prerequisites

An ASM instance of version 1.17.2.35 or later is created. For more information, see Create an ASM instance or Update an ASM instance.

Applicable scope

Configuration type | Description
--- | ---
Global | A global configuration includes Log Settings, Metric Settings, and Tracing Analysis Settings. Global configuration is mandatory, and only one global configuration is allowed. Tracing Analysis Settings are available only on the Global tab.
Namespace-level | The namespace-specific observability configuration. Each namespace has only one namespace-level observability configuration.
Custom | You can use a workload selector to select the applicable scope of a custom configuration. Each workload can be selected by only one custom configuration.
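
The three scopes can be pictured as a lookup that goes from the most specific configuration to the least specific one. The following Python sketch is illustrative only. The precedence it uses (custom, then namespace-level, then global) is an assumption modeled on how Istio resolves overlapping Telemetry resources; this topic does not state it. The names ObservabilityConfig and resolve_config are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ObservabilityConfig:
    """Hypothetical stand-in for one observability configuration."""
    name: str
    namespace: Optional[str] = None  # None marks the global configuration
    selector: Dict[str, str] = field(default_factory=dict)  # labels of a custom configuration


def resolve_config(configs: List[ObservabilityConfig],
                   namespace: str,
                   workload_labels: Dict[str, str]) -> ObservabilityConfig:
    """Pick a configuration, assuming custom > namespace-level > global."""
    # 1. A custom configuration whose selector labels all match the workload.
    for cfg in configs:
        if cfg.namespace == namespace and cfg.selector and all(
                workload_labels.get(key) == value for key, value in cfg.selector.items()):
            return cfg
    # 2. The namespace-level configuration, if the namespace has one.
    for cfg in configs:
        if cfg.namespace == namespace and not cfg.selector:
            return cfg
    # 3. The mandatory global configuration.
    return next(cfg for cfg in configs if cfg.namespace is None)


configs = [
    ObservabilityConfig("global"),
    ObservabilityConfig("default-ns", namespace="default"),
    ObservabilityConfig("httpbin-custom", namespace="default", selector={"app": "httpbin"}),
]
print(resolve_config(configs, "default", {"app": "httpbin"}).name)
print(resolve_config(configs, "default", {"app": "sleep"}).name)
print(resolve_config(configs, "foo", {"app": "httpbin"}).name)
```

In this sketch, a workload falls back to the global configuration only when neither a matching custom configuration nor a namespace-level configuration exists.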

Procedure

Global configuration

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Observability Management Center > Observability Settings.

  3. On the Observability Settings page, click the Global tab, configure Log Settings, Metric Settings, and Tracing Analysis Settings based on your business requirements, and then click Submit.

    For details about each section, see Description of Log Settings, Description of Metric Settings, and Description of Tracing Analysis Settings in this topic.

Namespace-level configuration

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Observability Management Center > Observability Settings.

  3. On the Observability Settings page, click the Namespace tab and then click Create. On the page that appears, select the desired namespace from the Namespace drop-down list, configure Log Settings and Metric Settings based on your business requirements, and then click Create.

    For details about each section, see Description of Log Settings and Description of Metric Settings in this topic.

Custom configuration

  1. Log on to the ASM console. In the left-side navigation pane, choose Service Mesh > Mesh Management.

  2. On the Mesh Management page, click the name of the ASM instance. In the left-side navigation pane, choose Observability Management Center > Observability Settings.

  3. On the Observability Settings page, click the Custom tab, select the desired namespace from the Namespace drop-down list, and then click Create. On the page that appears, set Name and Matching Label, configure Log Settings and Metric Settings based on your business requirements, and then click Create.

    For details about each section, see Description of Log Settings and Description of Metric Settings in this topic.

Description of Log Settings

In the Log Settings section, you can enable or disable access log output, choose the output format of access logs (JSON or TEXT), customize the fields that access logs contain, and filter logs.

Enable or disable access log output

  1. In the Log Settings section, turn on or turn off the Enable Log Output switch based on your business requirements.

    • If you turn on the switch, the sidecar proxies and gateways on the data plane of the ASM instance write access logs to the standard output (stdout) streams of their containers.

    • If you turn off the switch, the sidecar proxies and gateways on the data plane of the ASM instance stop writing access logs to the stdout streams of their containers.

  2. View logs in the standard output streams of sidecar containers on the data plane.

    The following example shows how to use kubectl to view access logs.

    1. Run the following command to view the logs of a sidecar proxy:

      kubectl logs httpbin-5c5944c58c-w**** -c istio-proxy --tail 1

      Sample output:

      {
          "authority_for":"47.110.XX.XXX",
          "bytes_received":"0",
          "bytes_sent":"22382",
          "downstream_local_address":"192.168.0.29:80",
          "downstream_remote_address":"221.220.XXX.XXX:0",
          "duration":"80",
          "istio_policy_status":"-",
          "method":"GET",
          "path":"/static/favicon.ico",
          "protocol":"HTTP/1.1",
          "request_id":"0f2cf829-3da5-4810-a618-08d9745d****",
          "requested_server_name":"outbound_.8000_._.httpbin.default.svc.cluster.local",
          "response_code":"200",
          "response_flags":"-",
          "route_name":"default",
          "start_time":"2023-06-30T04:00:36.841Z",
          "trace_id":"-",
          "upstream_cluster":"inbound|80||",
          "upstream_host":"192.168.0.29:80",
          "upstream_local_address":"127.0.X.X:55879",
          "upstream_response_time":"79",
          "upstream_service_time":"79",
          "upstream_transport_failure_reason":"-",
          "user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.X.X Safari/537.36",
          "x_forwarded_for":"221.220.XXX.XXX"
      }
    2. Run the following command to view the logs of an ingress gateway:

      kubectl -n istio-system logs istio-ingressgateway-6cff9b6b58-r**** --tail 1

      Sample output:

      {
          "authority_for":"47.110.XX.XXX",
          "bytes_received":"0",
          "bytes_sent":"22382",
          "downstream_local_address":"192.168.0.63:80",
          "downstream_remote_address":"221.220.XXX.XXX:64284",
          "duration":"81",
          "istio_policy_status":"-",
          "method":"GET",
          "path":"/static/favicon.ico",
          "protocol":"HTTP/1.1",
          "request_id":"0f2cf829-3da5-4810-a618-08d9745d****",
          "requested_server_name":"-",
          "response_code":"200",
          "response_flags":"-",
          "route_name":"httpbin",
          "start_time":"2023-06-30T04:00:36.841Z",
          "trace_id":"-",
          "upstream_cluster":"outbound|8000||httpbin.default.svc.cluster.local",
          "upstream_host":"192.168.0.29:80",
          "upstream_local_address":"192.168.0.63:36140",
          "upstream_response_time":"81",
          "upstream_service_time":"81",
          "upstream_transport_failure_reason":"-",
          "user_agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.X.X Safari/537.36",
          "x_forwarded_for":"221.220.XXX.XXX"
      }
  3. (Optional) View access logs in the Container Service for Kubernetes (ACK) console.

    If you use an ACK cluster, you can also view access logs in the ACK console.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, click the name of the cluster that you want to manage and choose Workloads > Pods in the left-side navigation pane.

    3. On the Pods page, click the name of the desired pod and click the Logs tab to view access logs.
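
Because the access logs are JSON objects, they are easy to post-process after you collect them. The following Python sketch assumes that each log entry arrives as a single-line JSON object (the sample output in step 2 is pretty-printed for readability) and that the lines were captured from kubectl logs. The helper name summarize and the second sample line are made up for illustration.

```python
import json
from typing import Dict, Iterable


def summarize(lines: Iterable[str]) -> Dict[str, float]:
    """Count requests and errors and average the duration of JSON access log lines."""
    total = errors = 0
    duration_sum = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip anything that is not a JSON log entry
        entry = json.loads(line)
        total += 1
        # Field values are strings in the sample output, so convert before comparing.
        if int(entry["response_code"]) >= 400:
            errors += 1
        duration_sum += int(entry["duration"])
    avg = duration_sum / total if total else 0.0
    return {"requests": total, "errors": errors, "avg_duration": avg}


sample = [
    '{"method":"GET","path":"/static/favicon.ico","response_code":"200","duration":"80"}',
    '{"method":"GET","path":"/status/503","response_code":"503","duration":"4"}',  # invented entry
]
print(summarize(sample))
```

The conversion with int() matters because the proxy emits response_code and duration as quoted strings, as the sample output shows.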

Customize the output format of access logs

Note

This feature is available only for ASM instances of v1.20.6.36 or later. For more information about how to update an ASM instance, see Update an ASM instance.

In the Log Settings section, set the Log Output Format parameter to JSON or TEXT based on your business requirements.

  • If you set the parameter to JSON, access logs are output to the corresponding container as JSON strings.

  • If you set the parameter to TEXT, access logs are output to the corresponding container as plain text strings.

Customize the format of access logs

  1. In the Log Settings section, select the desired custom fields or modify the information of custom fields. To add a log field, click the add icon to the right of a log field in the lower part of the section.

    You can customize the format of logs only if you turn on Enable Log Output. In the Log Format section, the log fields that are selected by default are mandatory and cannot be modified. You can obtain the value of a log field from a request header, a response header, or an Envoy built-in value.

    The following example sets accessLogFormat key to accept-encoding, Type to Request Properties, and accessLogFormat value to Accept-Encoding so that the Accept-Encoding request header is printed in access logs.

  2. Run the following command to view the logs of the components on the data plane of the ASM instance:

    kubectl logs httpbin-5c5944c58c-w**** -c istio-proxy --tail 1 | grep accept-encoding --color=auto

    Sample output:

    {
        "bytes_received":"0",
        "bytes_sent":"9593",
        "downstream_local_address":"192.168.0.29:80",
        "downstream_remote_address":"69.164.XXX.XX:0",
        "duration":"2",
        "istio_policy_status":"-",
        "method":"GET",
        "path":"/",
        "protocol":"HTTP/1.1",
        "request_id":"29939dc9-62be-4ddf-acf6-32cb098d****",
        "requested_server_name":"outbound_.8000_._.httpbin.default.svc.cluster.local",
        "response_code":"200",
        "response_flags":"-",
        "route_name":"default",
        "start_time":"2023-06-30T04:18:19.734Z",
        "trace_id":"-",
        "upstream_cluster":"inbound|80||",
        "upstream_host":"192.168.0.29:80",
        "upstream_local_address":"127.0.X.X:34723",
        "upstream_service_time":"2",
        "upstream_transport_failure_reason":"-",
        "user_agent":"Mozilla/5.0 zgrab/0.x",
        "x_forwarded_for":"69.164.XXX.XX",
        "authority_for":"47.110.XX.XXX",
        "upstream_response_time":"2",
        "accept-encoding":"gzip"
    }

    The value of the accept-encoding header that is added in Step 1 is printed in the access log.
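
The three value sources for a log field correspond naturally to Envoy access log command operators: %REQ(...)% for request headers, %RESP(...)% for response headers, and names such as %RESPONSE_CODE% for built-in values. The following Python sketch shows that mapping. Whether ASM generates exactly these strings internally is an assumption, and the function name to_envoy_operator and the source labels are hypothetical.

```python
from typing import Dict


def to_envoy_operator(source: str, value: str) -> str:
    """Translate a (source, value) pair into an Envoy access log command operator."""
    if source == "request_header":
        return f"%REQ({value})%"   # value of a request header
    if source == "response_header":
        return f"%RESP({value})%"  # value of a response header
    if source == "builtin":
        return f"%{value}%"        # Envoy built-in value such as RESPONSE_CODE
    raise ValueError(f"unknown source: {source}")


# The custom field from the example in step 1 plus one built-in field.
fields = {
    "accept-encoding": ("request_header", "Accept-Encoding"),
    "response_code": ("builtin", "RESPONSE_CODE"),
}
log_format: Dict[str, str] = {key: to_envoy_operator(*spec) for key, spec in fields.items()}
print(log_format)
```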

Filter logs

In the lower part of the Log Settings section, select Enable Log Filter and enter a log filter expression in the text box based on your business requirements. The access logs of requests that do not match the expression are not printed.

For example, to print the logs of requests whose HTTP status code in the response is greater than or equal to 400, set the expression to response.code >= 400. For more information, see CEL expressions and frequently used attributes.
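
To make the filter semantics concrete, the following Python sketch emulates the effect of the expression response.code >= 400 on a few already-parsed log entries. It only illustrates which entries survive. In ASM, the proxy evaluates the CEL expression before the log is written. The entries and the function name keep_entry are made up for illustration.

```python
from typing import Dict, List


def keep_entry(entry: Dict[str, str]) -> bool:
    """Python equivalent of the CEL expression `response.code >= 400`."""
    return int(entry["response_code"]) >= 400


entries: List[Dict[str, str]] = [
    {"path": "/", "response_code": "200"},
    {"path": "/status/404", "response_code": "404"},
    {"path": "/status/500", "response_code": "500"},
]
# Only entries that match the expression are printed; the 200 response is dropped.
printed = [entry["path"] for entry in entries if keep_entry(entry)]
print(printed)
```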

CEL expressions and frequently used attributes

Log filter expressions are written in the Common Expression Language (CEL). The following table describes frequently used attributes in CEL expressions. For more information, see CEL and Envoy.

Attribute | Type | Description
--- | --- | ---
request.path | string | The request path.
request.url_path | string | The request path without the query string.
request.host | string | The host name portion of the URL.
request.method | string | The request method.
request.headers | map<string, string> | All request headers indexed by the lowercase header name.
request.useragent | string | The value of the user agent header.
request.time | timestamp | The time when the first byte of the request arrives.
request.id | string | The request ID.
request.protocol | string | The request protocol. Valid values: HTTP/1.0, HTTP/1.1, HTTP/2, and HTTP/3.
request.query | string | The query portion of the request URL.
response.code | int | The HTTP status code in the response.
response.code_details | string | The details of the response code.
response.grpc_status | int | The Google Remote Procedure Call (gRPC) status code in the response.
response.headers | map<string, string> | All response headers indexed by the lowercase header name.
response.size | int | The size of the response body. Unit: byte.
response.total_size | int | The total size of the response. Unit: byte.
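
The relationship between several request attributes is easier to see on a concrete request. The following Python sketch derives a few of them for a hypothetical GET /get?foo=bar request. It assumes that request.path carries the query string, as the description of request.url_path implies. The function name request_attributes and the sample header values are illustrative.

```python
from typing import Any, Dict, Mapping
from urllib.parse import urlsplit


def request_attributes(method: str, raw_path: str, headers: Mapping[str, str]) -> Dict[str, Any]:
    """Build a small subset of the request attributes for one request."""
    parts = urlsplit(raw_path)
    # Headers are indexed by the lowercase header name.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "request.method": method,
        "request.path": raw_path,
        "request.url_path": parts.path,  # path without the query string
        "request.query": parts.query,    # query portion only
        "request.headers": lowered,
        "request.useragent": lowered.get("user-agent", ""),
    }


attrs = request_attributes("GET", "/get?foo=bar",
                           {"User-Agent": "curl/8.0", "Accept-Encoding": "gzip"})
# Python equivalent of the CEL expression:
#   request.method == "GET" && request.url_path == "/get"
print(attrs["request.method"] == "GET" and attrs["request.url_path"] == "/get")
```

A filter that compares against request.path would have to account for the query string, which is why request.url_path is usually the more convenient attribute for path matching.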

Description of Metric Settings

In the Metric Settings section, you can enable or disable metrics and edit metric dimensions.

Enable or disable metrics

Metrics are divided into client-side metrics and server-side metrics.

  • Client-side metrics are generated when a sidecar proxy acts as a client and initiates requests. Gateway metrics are also classified as client-side metrics.

  • Server-side metrics are generated when a sidecar proxy acts as a server and receives requests.

  1. In the Metric Settings section, select or clear the Enabled check box of the corresponding metric in the CLIENT side Indicator or SERVER side index column.

    • If a metric is enabled, the sidecar proxy or gateway on the data plane of the ASM instance exposes the metric at the /stats/prometheus path on port 15020.

    • If a metric is disabled, it is not exposed on the port.

  2. Run the following command to view the metrics exposed by the sidecar proxy or gateway:

    You can use kubectl to connect to the container of the sidecar proxy or gateway and run the curl command to access the /stats/prometheus path over port 15020 and view the exported metrics.

    kubectl exec httpbin-5c5944c58c-w**** -c istio-proxy -- curl 127.0.0.1:15020/stats/prometheus | head -n 10

    Sample output:

    # TYPE istio_agent_cert_expiry_seconds gauge
    istio_agent_cert_expiry_seconds{resource_name="default"} 46725.287654548
    # HELP istio_agent_endpoint_no_pod Endpoints without an associated pod.
    # TYPE istio_agent_endpoint_no_pod gauge
    istio_agent_endpoint_no_pod 0
    # HELP istio_agent_go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
    # TYPE istio_agent_go_gc_duration_seconds summary
    istio_agent_go_gc_duration_seconds{quantile="0"} 5.0149e-05
    istio_agent_go_gc_duration_seconds{quantile="0.25"} 9.8807e-05
    ......
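
The output uses the Prometheus text exposition format: comment lines start with #, and each sample line has a metric name, optional labels in braces, and a value. The following Python sketch parses such text into tuples. It is a simplified reader and not a full implementation of the format (for example, it skips lines that carry a trailing timestamp). The function name parse_samples is hypothetical.

```python
import re
from typing import Dict, Iterator, Tuple

SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text: str) -> Iterator[Tuple[str, Dict[str, str], float]]:
    """Yield (metric name, labels, value) for each sample line in the text."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if not match:
            continue  # skip anything else, such as a "......" truncation marker
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        yield name, labels, float(value)


text = """# TYPE istio_agent_cert_expiry_seconds gauge
istio_agent_cert_expiry_seconds{resource_name="default"} 46725.287654548
istio_agent_endpoint_no_pod 0
istio_agent_go_gc_duration_seconds{quantile="0.25"} 9.8807e-05
......"""
for name, labels, value in parse_samples(text):
    print(name, labels, value)
```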

Edit metric dimensions

The dimensions of a metric are labels that carry additional information about each recorded value. You can use these dimensions to filter metrics in Prometheus. For example, you can use the source_app dimension to select the metrics of requests that are sent by a specific client application.

  1. In the Metric Settings section, click Edit dimension in the CLIENT side Indicator or SERVER side index column.

  2. In the Customize CLIENT dimension configuration dialog box, select or clear the dimensions that you want to use to export metrics, and then click Submit.

    You cannot add metric dimensions. You can remove unnecessary metric dimensions to save the storage space of Prometheus. Typically, most dimensions are retained. Therefore, only removed dimensions are displayed in the Metric Settings section.
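
The storage saving comes from the way Prometheus stores one time series per distinct combination of label values. The following Python sketch counts the distinct combinations before and after a dimension is removed. The sample label values and the function name series_count are made up for illustration.

```python
from typing import Dict, Iterable, Set


def series_count(samples: Iterable[Dict[str, str]], removed: Set[str]) -> int:
    """Count distinct label sets after dropping the removed dimensions."""
    distinct = set()
    for labels in samples:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k not in removed))
        distinct.add(kept)
    return len(distinct)


samples = [
    {"source_app": "sleep", "response_code": "200", "request_protocol": "http"},
    {"source_app": "sleep", "response_code": "404", "request_protocol": "http"},
    {"source_app": "curl", "response_code": "200", "request_protocol": "http"},
    {"source_app": "curl", "response_code": "404", "request_protocol": "http"},
]
print(series_count(samples, removed=set()))              # 4 series with all dimensions
print(series_count(samples, removed={"response_code"}))  # 2 series without response_code
```

The trade-off is that a removed dimension can no longer be used to filter or group the metric in Prometheus.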

Description of Tracing Analysis Settings

In the Tracing Analysis Settings section, you can configure Sampling Percentage and Custom Tags. The settings take effect globally.

Sampling Percentage

Sampling Percentage indicates the percentage of requests that trigger Tracing Analysis. If you set the value to 0, Tracing Analysis is disabled, and no request triggers Tracing Analysis.
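
As a rough mental model, percentage-based sampling is a random draw for each request. The following Python sketch illustrates that idea only. It does not reproduce how the proxy actually makes the decision, and the function name should_trace is hypothetical.

```python
import random


def should_trace(sampling_percentage: float, rng: random.Random) -> bool:
    """Return True for roughly `sampling_percentage` percent of calls."""
    # rng.random() is in [0.0, 1.0), so 0 never samples and 100 always samples.
    return rng.random() * 100 < sampling_percentage


rng = random.Random(42)  # fixed seed so that the run is repeatable
traced = sum(should_trace(10, rng) for _ in range(10_000))
print(f"traced {traced} of 10000 requests")
```

With a value of 10, about one request in ten produces a trace. A value of 0 produces none, which matches the behavior described above.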

Custom Tags

You can customize the tags carried by spans of Tracing Analysis. In the Tracing Analysis Settings section, click Add Custom Tags and set Name, Type, and Value.

The following table describes the values of Type and provides tag configuration examples.

Type | Description | Tag configuration example
--- | --- | ---
Fixed Value | The value of a tag of this type is fixed to the value you set. | Name: env; Type: Fixed Value; Value: prod
Request Header | The value of a tag of this type is the value of the specified request header. If the header does not exist in a request, the default value is used as the tag value. In the configuration example, the tag value is obtained from the User-Agent header. If the header does not exist, the default value unknown is used as the tag value. | Name: useragent; Type: Request Header; Value: Header name: User-Agent, Default value: unknown
Environment Variable | The value of a tag of this type is obtained from the specified environment variable of the workload. If the environment variable does not exist in a workload, the default value is used as the tag value. In the configuration example, the tag value is obtained from the ENV environment variable. If the environment variable does not exist, the default value unknown is used as the tag value. | Name: env; Type: Environment Variable; Value: Environment Variable name: ENV, Default value: unknown
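
The three tag types differ only in where the tag value comes from. The following Python sketch resolves a tag value for each type, including the fallback to the default value. It is a conceptual model and not the proxy's implementation. The CustomTag class, the type labels, and the sample header and environment values are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class CustomTag:
    """Hypothetical model of one custom tag entry."""
    name: str          # name of the span tag
    type: str          # "fixed", "header", or "environment"
    value: str         # fixed value, header name, or environment variable name
    default: str = ""  # fallback for the header and environment types


def resolve_tag(tag: CustomTag,
                request_headers: Mapping[str, str],
                workload_env: Mapping[str, str]) -> str:
    """Return the value that the span tag would carry for one request."""
    if tag.type == "fixed":
        return tag.value
    if tag.type == "header":
        # HTTP header names are case-insensitive, so compare in lowercase.
        headers = {k.lower(): v for k, v in request_headers.items()}
        return headers.get(tag.value.lower(), tag.default)
    if tag.type == "environment":
        return workload_env.get(tag.value, tag.default)
    raise ValueError(f"unknown tag type: {tag.type}")


headers = {"User-Agent": "curl/8.0"}
env = {}  # the ENV variable is not set in this workload
print(resolve_tag(CustomTag("env", "fixed", "prod"), headers, env))
print(resolve_tag(CustomTag("useragent", "header", "User-Agent", "unknown"), headers, env))
print(resolve_tag(CustomTag("env", "environment", "ENV", "unknown"), headers, env))
```

In this run, the header tag takes the value from the request, and the environment variable tag falls back to its default because ENV is not set.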