
Container Service for Kubernetes:Horizontal pod scaling based on Managed Service for Prometheus metrics

Last Updated:Feb 07, 2024

This topic describes how to convert Managed Service for Prometheus metrics to metrics that are supported by the Horizontal Pod Autoscaler (HPA). This enables the HPA to scale applications based on the metrics.


Features

By default, the HPA scales applications based only on CPU and memory usage, which may not meet all O&M requirements. Managed Service for Prometheus is a fully managed monitoring service that is compatible with the open source Prometheus ecosystem. It monitors a wide variety of components and provides multiple ready-to-use dashboards. To enable horizontal pod autoscaling based on Managed Service for Prometheus metrics, perform the following steps:

  1. Use Managed Service for Prometheus in the ACK cluster to expose the metrics.

  2. Use the alibaba-cloud-metrics-adapter component to convert Managed Service for Prometheus metrics to Kubernetes metrics supported by the HPA. For more information, see Autoscaling on multiple metrics and custom metrics.

  3. Configure and deploy the HPA to perform auto scaling based on the preceding metrics.

    The metrics are classified into custom metrics and external metrics based on the scenario. For more information, see the table in Step 3.

The following section describes how to configure alibaba-cloud-metrics-adapter to convert Managed Service for Prometheus metrics to metrics supported by the HPA for auto scaling.

Step 1: Collect Managed Service for Prometheus metrics

Example 1: Use the predefined metrics

You can perform auto scaling based on the predefined metrics available in the Managed Service for Prometheus component that is installed in your ACK cluster. The predefined metrics include cadvisor metrics for container monitoring, Node-Exporter and GPU-Exporter metrics for node monitoring, and all metrics provided by Managed Service for Prometheus. To view the predefined metrics in the Managed Service for Prometheus component, perform the following steps:

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Prometheus Monitoring in the left-side navigation pane.

  3. Click Go to ARMS Prometheus in the upper-right corner.

  4. In the left-side navigation pane of the Managed Service for Prometheus console, click Settings to view all metrics supported by Managed Service for Prometheus.

Example 2: Use the Managed Service for Prometheus metrics reported by pods

Deploy a test application and expose its metrics based on the metric standards of open source Prometheus. For more information, see metric_type. The following section describes how to deploy an application named sample-app and expose the http_requests_total metric, which indicates the number of requests sent to the application.
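The application must serve its metrics in the Prometheus text exposition format at a path such as /metrics. The following Python sketch (a hypothetical helper, not part of the sample application) shows what a counter line in that format looks like:

```python
# Hypothetical sketch: render one counter sample in the Prometheus text exposition format.
def render_counter(name, labels, value):
    label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
    return f"{name}{{{label_str}}} {value}"

# A line like this is what Managed Service for Prometheus scrapes from the pod:
print(render_counter("http_requests_total", {"method": "GET"}, 42))
# http_requests_total{method="GET"} 42
```

Each scrape reads the current counter value; rate-based queries later convert the monotonically increasing counter into a per-second value.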

  1. Deploy the workload of the application.

    1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

    2. On the Clusters page, click the name of the cluster. In the left-side navigation pane, choose Workloads > Deployments.

    3. On the Deployments page, click Create from YAML. On the Create page, set Sample Template to Custom, copy the following content to the YAML editor, and then click Create.


      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: sample-app
        labels:
          app: sample-app
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: sample-app
        template:
          metadata:
            labels:
              app: sample-app
          spec:
            containers:
            - image: luxas/autoscale-demo:v0.1.2
              name: metrics-provider
              ports:
              - name: http
                containerPort: 8080
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: sample-app
        namespace: default
        labels:
          app: sample-app
      spec:
        ports:
          - port: 8080
            name: http
            protocol: TCP
            targetPort: 8080
        selector:
          app: sample-app
        type: ClusterIP
      Note

      The application pod is used to expose the http_requests_total metric, which indicates the number of requests.

  2. Create a ServiceMonitor.

    1. Log on to the Application Real-Time Monitoring Service (ARMS) console.

    2. In the left-side navigation pane, choose Managed Service for Prometheus > Instances.

    3. In the upper-left corner of the Managed Service for Prometheus page, select the region in which your ACK cluster is deployed and click the Prometheus instance that you want to manage. Then, you are redirected to the instance details page.

    4. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.

    5. On the Configure tab, click the ServiceMonitor tab.

    6. On the ServiceMonitor tab, click Add ServiceMonitor to create a ServiceMonitor and click OK.

      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        name: sample-app
        namespace: default
      spec:
        endpoints:
        - interval: 30s
          port: http
          path: /metrics
        namespaceSelector:
          any: true
        selector:
          matchLabels:
            app: sample-app
  3. Confirm the status of Managed Service for Prometheus.

    On the Service Discovery page, click the Targets tab. If default/sample-app/0(1/1 up) is displayed, Managed Service for Prometheus is monitoring your application.

  4. In the dashboard provided by Managed Service for Prometheus, query the values of the http_requests_total metric over a period of time to confirm that monitoring data is collected as expected.

Step 2: Modify the configuration of the alibaba-cloud-metrics-adapter component

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster. In the left-side navigation pane, choose Applications > Helm.

  3. On the Helm page, click Update in the Actions column of ack-alibaba-cloud-metrics-adapter.

  4. In the Update Release panel, copy the following content to the YAML editor and click OK.


      AlibabaCloudMetricsAdapter:
      ......
        prometheus:
          enabled: true    # Set the parameter to true to enable the Prometheus adapter. 
          # Specify the endpoint of Managed Service for Prometheus. 
          url: https://cn-beijing.arms.aliyuncs.com:9443/api/v1/prometheus/xxxx/xxxx/xxxx/cn-beijing
          # If token-based authentication is enabled for Managed Service for Prometheus, configure the Authorization field of the prometheusHeader parameter. 
          prometheusHeader:
          - Authorization: xxxxxxx

          metricsRelistInterval: 1m # Specify the metric collection interval. We recommend that you use the default value 1m. 
          logLevel: 5               # Specify the level of component debugging logs. We recommend that you use the default value. 

          adapter:
            rules:
              default: false        # The default metric collection setting. We recommend that you use the default value false. 
              custom:
              
              ## Example 1: This is an example of custom metric config.
              # this config will convert prometheus metric: container_memory_working_set_bytes to a custom metric container_memory_working_set_bytes_per_second
              # and cpu metric container_cpu_usage_seconds_total convert to container_cpu_usage_core_per_second
              # you can run command to check the memory/cpu value:
              # kubectl get --raw  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/container_memory_working_set_bytes_per_second"
              # kubectl get --raw  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/container_cpu_usage_core_per_second"
              # refer to doc: https://help.aliyun.com/document_detail/184519.html
      
              - seriesQuery: 'container_memory_working_set_bytes{namespace!="",pod!=""}'
                resources:
                  overrides:
                    namespace: { resource: "namespace" }
                    pod: { resource: "pod" }
                name:
                  matches: "^(.*)_bytes"
                  as: "${1}_bytes_per_second"
                metricsQuery: 'sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
              - seriesQuery: 'container_cpu_usage_seconds_total{namespace!="",pod!=""}'
                resources:
                  overrides:
                    namespace: { resource: "namespace" }
                    pod: { resource: "pod" }
                name:
                  matches: "^(.*)_seconds_total"
                  as: "${1}_core_per_second"
                metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>)'
      
              ## Example 2: This is an example of external metric config.
              
              # refer to doc: https://help.aliyun.com/document_detail/608768.html
      
              ## Add a conversion rule. Make sure that the metric label is the same as the label of the metric in Managed Service for Prometheus. If the labels are different, specify the label of the metric in Managed Service for Prometheus. 
              
              #- seriesQuery: http_requests_total{namespace!="",pod!=""}
              #  resources:
              #    overrides:
              #      # The resource field specifies a Kubernetes API resource. You can run the kubectl api-resources -o wide command to query resources. 
              #      # The key field specifies the LabelName of the Managed Service for Prometheus metric. Make sure that the Managed Service for Prometheus metric uses the specified LabelName. 
              #      namespace: {resource: "namespace"}
              #      pod: {resource: "pod"}
              #  name:
              #    matches: ^(.*)_total
              #    as: ${1}_per_second
              #  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)
      
      
              # this is an example of external metric config.
              
              # refer to doc: https://help.aliyun.com/document_detail/608768.html
      
              #- seriesQuery: arms_app_requests_count
              #  metricsQuery: sum by (rpc) (sum_over_time(<<.Series>>{rpc="/demo/queryUser/{id}",service="arms-demo:arms-k8s-demo",prpc="__all__",ppid="__all__",endpoint="__all__",destId="__all__",<<.LabelMatchers>>}[1m]))
              #  name:
              #    as: ${1}_per_second_queryuser
              #    matches: ^(.*)_count
              #  resources:
              #    namespaced: false
      
      
              # this is an example of custom metric from user define prometheus metric: http_requests_total
              # refer to doc: https://help.aliyun.com/document_detail/184519.html
      
              #- seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
              #  resources:
              #    overrides:
              #      namespace: {resource: "namespace"}
              #      pod: {resource: "pod"}
              #  name:
              #    matches: "^(.*)_total"
              #    as: "${1}_per_second"
              #  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
              # - seriesQuery: '{__name__=~"^some_metric_count$"}'
              #   resources:
              #     template: <<.Resource>>
              #   name:
              #     matches: ""
              #     as: "my_custom_metric"
              #   metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
      
        ......

    The following table describes some of the fields. For more information about the configuration file of ack-alibaba-cloud-adapter, see Configuration file of ack-alibaba-cloud-adapter.

    • AlibabaCloudMetricsAdapter.prometheus.adapter.rules.custom: Set this field to the value in the preceding YAML content.

    • AlibabaCloudMetricsAdapter.prometheus.url: The endpoint of Managed Service for Prometheus. For more information about how to obtain the endpoint, see Obtain the endpoint of the Prometheus API.

    • AlibabaCloudMetricsAdapter.prometheus.prometheusHeader[].Authorization: The token. For more information about how to obtain the token, see Obtain the endpoint of the Prometheus API.

    • AlibabaCloudMetricsAdapter.prometheus.adapter.rules.default: Specifies whether to create the predefined metrics. We recommend that you use the default value false.

After you update the alibaba-cloud-metrics-adapter component, run the following commands to check whether the metrics are served through the Kubernetes aggregated API.

  1. Scale pods based on custom metrics.

    1. Run the following command to query the details of custom metrics supported by the HPA:

      kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/" | jq .
    2. Run the following commands to query the current values of the custom metrics in the kube-system namespace:

      # Query the container_memory_working_set_bytes_per_second metric to view the size of the working memory of the pods in the kube-system namespace per second. 
      kubectl get --raw  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/container_memory_working_set_bytes_per_second"
      
      # Query the container_cpu_usage_core_per_second metric to view the number of vCores of the pods in the kube-system namespace per second. 
      kubectl get --raw  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/*/container_cpu_usage_core_per_second"

      Sample output:

      {
        "kind": "MetricValueList",
        "apiVersion": "custom.metrics.k8s.io/v1beta1",
        "metadata": {
          "selfLink": "/apis/custom.metrics.k8s.io/v1beta1/namespaces/kube-system/pods/%2A/container_memory_working_set_bytes_per_second"
        },
        "items": [
          {
            "describedObject": {
              "kind": "Pod",
              "namespace": "kube-system",
              "name": "ack-alibaba-cloud-metrics-adapter-7cf8dcb845-h****",
              "apiVersion": "/v1"
            },
            "metricName": "container_memory_working_set_bytes_per_second",
            "timestamp": "2023-08-09T06:30:19Z",
            "value": "24576k",
            "selector": null
          }
        ]
      }
  2. Scale pods based on external metrics.

    1. Run the following command to query the details of external metrics supported by the HPA:

      kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/" | jq .
    2. Run the following command to query the current value of the http_requests_per_second metric in the default namespace:

      kubectl get --raw "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/http_requests_per_second"

      Sample output:

      {
        "kind": "ExternalMetricValueList",
        "apiVersion": "external.metrics.k8s.io/v1beta1",
        "metadata": {},
        "items": [
          {
            "metricName": "http_requests_per_second",
            "metricLabels": {},
            "timestamp": "2022-01-28T08:40:20Z",
            "value": "33m"
          }
        ]
      }
      

Step 3: Configure and deploy the HPA to perform auto scaling based on the collected metrics

  1. Deploy the HPA.

    You can use Managed Service for Prometheus metrics to expose custom metrics and external metrics at the same time. The following table describes the two types of metrics.

    • Custom Metric: Scales Kubernetes objects, such as pods, based on metrics that are related to the objects. For example, you can scale pods based on pod metrics. For more information, see autoscaling-on-multiple-metrics-and-custom-metrics.

    • External Metric: Scales Kubernetes objects, such as pods, based on metrics that are not related to the objects. For example, you can scale the pods of a workload based on the business QPS. For more information, see autoscaling-on-metrics-not-related-to-kubernetes-objects.

    • Method 1: Scale pods based on custom metrics

      1. Create a file named hpa.yaml and add the following content to the file:

        kind: HorizontalPodAutoscaler
        apiVersion: autoscaling/v2
        metadata:
          name: sample-app-memory-high
        spec:
        #Describe the object that you want HPA to scale. HPA can dynamically change the number of pods that are deployed for the object. 
          scaleTargetRef:
            apiVersion: apps/v1
            kind: Deployment
            name: sample-app
        # Specify the upper limit and lower limit of pods. 
          minReplicas: 1
          maxReplicas: 10
        # Specify the metrics based on which HPA performs auto scaling. You can specify different types of metrics at the same time. 
          metrics:
          - type: Pods
            pods:
              # Use the pods/container_memory_working_set_bytes_per_second metric. 
              metric:
                name: container_memory_working_set_bytes_per_second
              # Specify an AverageValue type threshold. You can specify only AverageValue type thresholds for Pods metrics. 
              target:
                type: AverageValue
                averageValue: 1024000m       # 1024000m indicates a memory threshold of 1 KB. The current unit of the metric is bytes per second. m is a precision unit used by Kubernetes: if a value contains decimal places and high precision is required, the m or k unit is used. For example, 1001m is equal to 1.001 and 1k is equal to 1000.
      2. Run the following command to create the HPA:

        kubectl apply -f hpa.yaml
      3. Run the following command to check whether the HPA runs as expected:

        kubectl get hpa sample-app-memory-high

        Expected output:

        NAME                     REFERENCE               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
        sample-app-memory-high   Deployment/sample-app   24576k/1024000m   1         10        1          7m
    • Method 2: Scale pods based on external metrics

      1. Create a file named hpa.yaml and add the following content to the file:

        apiVersion: autoscaling/v2
        kind: HorizontalPodAutoscaler
        metadata:
          name: sample-app
        spec:
          scaleTargetRef:
            apiVersion: apps/v1
            kind: Deployment
            name: sample-app
          minReplicas: 1
          maxReplicas: 10
          metrics:
            - type: External
              external:
                metric:
                  name: http_requests_per_second
                  selector:
                    matchLabels:
                      job: "sample-app"
        # You can specify only thresholds of the Value or AverageValue type for external metrics. 
                target:
                  type: AverageValue
                  averageValue: 500m
      2. Run the following command to create the HPA:

        kubectl apply -f hpa.yaml
      3. Create a LoadBalancer Service for the sample-app application. Then, run the following command to perform a stress test. Replace <IP address of the Service> with the external IP address of the LoadBalancer Service:

        ab -c 50 -n 2000 http://<IP address of the Service>:8080/
      4. Run the following command to query the details of the HPA:

        kubectl get hpa sample-app

        Expected output:

        NAME         REFERENCE               TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
        sample-app   Deployment/sample-app   33m/500m   1         10        1          7m
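The thresholds and TARGETS values above use Kubernetes decimal quantity suffixes: m is one thousandth and k is one thousand. A short Python sketch (a hypothetical helper, not a Kubernetes API; binary suffixes such as Ki are not handled) shows how the values in this example resolve:

```python
# Hypothetical, simplified parser for the decimal quantity suffixes used in this topic.
def parse_quantity(q):
    if q.endswith("m"):          # m = 1/1000 (milli)
        return float(q[:-1]) / 1000.0
    if q.endswith("k"):          # k = 1000 (kilo)
        return float(q[:-1]) * 1000.0
    return float(q)

print(parse_quantity("1024000m"))  # 1024.0  (the 1 KB/s memory threshold)
print(parse_quantity("500m"))      # 0.5     (the external metric threshold)
print(parse_quantity("24576k"))    # 24576000.0 (the observed memory value)
```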

Configuration file of ack-alibaba-cloud-adapter

ack-alibaba-cloud-adapter performs the following steps to convert Managed Service for Prometheus metrics to metrics that are supported by the HPA:

  1. Discovery: discovers Managed Service for Prometheus metrics that can be used by the HPA.

  2. Association: associates the metrics with Kubernetes resources, such as pods, nodes, and namespaces.

  3. Naming: defines the names of the metrics that can be used by the HPA after conversion.

  4. Querying: defines the template of the requests that are sent to the Managed Service for Prometheus API.

In the preceding example, the http_requests_total metric that is exposed by the sample-app pod is converted to the http_requests_per_second metric for the HPA. The following code block shows the configurations of ack-alibaba-cloud-adapter used in the example:

- seriesQuery: http_requests_total{namespace!="",pod!=""}
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  name:
    matches: ^(.*)_total
    as: ${1}_per_second
  metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)
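In the name section, matches captures the original metric name with a regular expression and as rewrites it. The rewrite can be sketched in Python (the adapter itself evaluates Go RE2 expressions, where the capture group is written as ${1} instead of \1):

```python
import re

# Approximate the adapter's name rule with Python's re module.
renamed = re.sub(r"^(.*)_total", r"\1_per_second", "http_requests_total")
print(renamed)  # http_requests_per_second
```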

• seriesQuery: The PromQL query that selects the Prometheus series to convert.

• metricsQuery: The aggregation that is performed on the series selected by seriesQuery to produce the value that is returned to the HPA.

• resources: The labels in the PromQL query results that are matched against resource objects. The resource objects refer to API resources in the cluster, such as pods, namespaces, and nodes. You can run the kubectl api-resources -o wide command to query API resources. The key field specifies the LabelName of the Managed Service for Prometheus metric. Make sure that the Managed Service for Prometheus metric uses the specified LabelName.

• name: Converts the names of Managed Service for Prometheus metrics to easy-to-read metric names by using a regular expression. In this example, http_requests_total is converted to http_requests_per_second.
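The resources mapping can be illustrated with a short Python sketch (the label and resource values below are hypothetical):

```python
# overrides maps a Prometheus label name (key) to a Kubernetes resource (value).
overrides = {"namespace": "namespace", "pod": "pod"}

# Labels on one http_requests_total series returned by the query (hypothetical values).
series_labels = {"namespace": "default", "pod": "sample-app-0", "method": "GET"}

# The adapter associates the series with the resources named by the mapped labels.
resources = {resource: series_labels[label] for label, resource in overrides.items()}
print(resources)  # {'namespace': 'default', 'pod': 'sample-app-0'}
```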

  1. Discovery

    Specify the Managed Service for Prometheus metric that you want to convert. You can use seriesFilters to filter metrics. The seriesQuery matches data based on the specified labels. The following code block is an example:

    seriesQuery: http_requests_total{namespace!="",pod!=""}
    seriesFilters:
    - isNot: "^container_.*_seconds_total"

    seriesFilters is optional. This field is used to filter metrics:

    • is:<regex>: keeps only the metrics whose names match this regular expression.

    • isNot:<regex>: excludes the metrics whose names match this regular expression.

  2. Association

    Map the labels of Managed Service for Prometheus metrics to Kubernetes resources. The http_requests_total series are selected only if their namespace and pod labels are not empty, and these two labels are mapped to the namespace and pod resources in the cluster.

    - seriesQuery: http_requests_total{namespace!="",pod!=""}
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
  3. Naming

    Name the HPA metrics that are converted from Managed Service for Prometheus metrics. The original names of the Managed Service for Prometheus metrics remain unchanged. If you directly use the original metric names, you do not need to configure the naming settings.

    You can run the kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" command to query the metrics that are supported by the HPA.

    - seriesQuery: http_requests_total{namespace!="",pod!=""}
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: "^(.*)_total"
        as: "${1}_per_second"
  4. Querying

    The template of the requests that are sent to the Managed Service for Prometheus API. ack-alibaba-cloud-adapter passes the parameters in the HPA to the request template, queries the Managed Service for Prometheus API based on the template, and then returns the query results to the HPA for auto scaling.

    - seriesQuery: http_requests_total{namespace!="",pod!=""}
      resources:
        overrides:
          namespace: {resource: "namespace"}
          pod: {resource: "pod"}
      name:
        matches: ^(.*)_total
        as: ${1}_per_second
      metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)
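The placeholders in metricsQuery are filled in by the adapter before the query is sent. The expansion can be approximated with plain string substitution in Python (the adapter actually uses Go templates; the label values below are hypothetical):

```python
# Approximate the adapter's metricsQuery template expansion with string substitution.
template = 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
query = (template
         .replace("<<.Series>>", "http_requests_total")
         .replace("<<.LabelMatchers>>", 'namespace="default",pod=~"sample-app-.*"')
         .replace("<<.GroupBy>>", "pod"))
print(query)
# sum(rate(http_requests_total{namespace="default",pod=~"sample-app-.*"}[2m])) by (pod)
```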

Obtain the endpoint of the Prometheus API

Scenario 1: Managed Service for Prometheus

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Prometheus Monitoring in the left-side navigation pane.

  3. Click Go to ARMS Prometheus in the upper-right corner.

  4. In the left-side navigation pane of the Managed Service for Prometheus console, click Settings. Then, click the Configure tab and view HTTP API URL (Grafana Read URL).

    We recommend that you call the Prometheus API over an internal network. You can call the API over the Internet if no internal network is available.


Scenario 2: Open source Prometheus

  1. Deploy Prometheus.

    1. Log on to the ACK console. In the left-side navigation pane, choose Marketplace > Marketplace.

    2. On the App Catalog page, find and click ack-prometheus-operator. On the page that appears, click Deploy.

    3. In the panel that appears, specify Cluster and Namespace, modify Release Name, and click Next. Modify Parameters based on your business requirements and click OK.

    4. View the deployment result.

      1. Run the following command to map Prometheus in the cluster to local port 9090:

        kubectl port-forward svc/ack-prometheus-operator-prometheus 9090:9090 -n monitoring
      2. Enter localhost:9090 in the address bar of a browser to visit the Prometheus page.

      3. In the top navigation bar, choose Status > Targets to view all collection tasks.

        Tasks in the UP state are running as expected.

    5. Check service and namespace in the Labels column.

      The following code block shows the endpoint. In this example, ServiceName is ack-prometheus-operator-prometheus and ServiceNamespace is monitoring.

      http://ack-prometheus-operator-prometheus.monitoring.svc.cluster.local:9090
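This endpoint follows the standard Kubernetes Service DNS naming scheme and can be assembled as follows (a hypothetical helper, not part of any SDK):

```python
# Build the in-cluster endpoint from the Service name and namespace, using the
# standard Kubernetes Service DNS form: <service>.<namespace>.svc.cluster.local
def in_cluster_prometheus_url(service, namespace, port=9090):
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"

print(in_cluster_prometheus_url("ack-prometheus-operator-prometheus", "monitoring"))
# http://ack-prometheus-operator-prometheus.monitoring.svc.cluster.local:9090
```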
