The ack-slo-manager component provides resource profiles for Kubernetes workloads. Based on these profiles, you can obtain recommended resource specifications for individual containers in pods, which simplifies configuring resource requests and limits. This topic describes how to obtain these recommendations.

Prerequisites

  • A Container Service for Kubernetes (ACK) Pro cluster is created. For more information, see Create an ACK Pro cluster.
  • ack-slo-manager is installed in the cluster. For more information, see Usage notes.

Background information

Kubernetes allows you to specify resource requests for containers in a pod. The scheduler schedules the pod to a node whose capacity meets the resource requests that you specify. When you specify the resource request for a container, you can refer to the historical resource utilization and stress test results. You can also adjust the resource request after the container is created based on the performance of the container.

However, you may encounter the following issues:
  • To ensure application stability, you need to reserve a buffer of resources to handle fluctuations in upstream and downstream workloads. As a result, the resource requests that you specify for containers are far greater than the resources that the containers actually use. This causes low resource utilization and resource waste in the cluster.
  • If your cluster hosts a large number of pods, you can reduce the resource request for individual containers to increase resource utilization in the cluster. This allows you to deploy more containers on a node. However, application stability is adversely affected when traffic spikes.

To resolve these issues, ack-slo-manager provides resource profiles for your workloads. Based on the profiles, you can obtain recommended resource specifications for individual containers in pods, which simplifies configuring resource requests and limits for containers.

Limits

The following table describes the versions that are required for system components.

Component | Required version
Kubernetes | 1.18 and later
metrics-server | 0.3.8 and later
ack-slo-manager | 0.3.0 and later
Helm | 3.0 and later
OS | Alibaba Cloud Linux 2, CentOS 7.6, and CentOS 7.7
Notice: If your cluster uses containerd as the container runtime and the cluster nodes were added before 14:00 (UTC+8) on January 19, 2022, you must remove the cluster nodes and add them to the cluster again, or update the Kubernetes version of your cluster to the latest version. For more information about how to update the Kubernetes version of an ACK cluster, see Update the Kubernetes version of an ACK cluster.
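
The minimum versions in the table above can be checked with a short script. The following is a minimal Python sketch, assuming that the installed versions have already been collected into a dictionary. The helper names and the sample version strings are hypothetical and are not part of ack-slo-manager.

```python
# Minimum versions taken from the table in the Limits section.
MINIMUM_VERSIONS = {
    "Kubernetes": "1.18",
    "metrics-server": "0.3.8",
    "ack-slo-manager": "0.3.0",
    "Helm": "3.0",
}


def parse_version(text):
    """Turn a string such as 'v1.20.11-aliyun.1' into (1, 20, 11).

    A leading 'v' and any suffix after '-' or '+' are ignored, and
    missing components are padded with zeros.
    """
    core = text.lstrip("v").split("-")[0].split("+")[0]
    parts = tuple(int(part) for part in core.split("."))
    return parts + (0,) * (3 - len(parts))


def unmet_requirements(installed):
    """Return the components that are missing or below the minimum version."""
    unmet = []
    for component, minimum in MINIMUM_VERSIONS.items():
        version = installed.get(component)
        if version is None or parse_version(version) < parse_version(minimum):
            unmet.append(component)
    return unmet


# Hypothetical sample input: metrics-server is one patch version too old.
print(unmet_requirements({
    "Kubernetes": "1.22.3",
    "metrics-server": "0.3.7",
    "ack-slo-manager": "0.3.0",
    "Helm": "v3.8.0",
}))
```

The sketch does not check the operating system row, because that requirement is a list of supported distributions rather than a minimum version.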

Procedure

  1. Use the following YAML template to enable recommendations on resource specifications for your workloads.
    You can use the RecommendationProfile CustomResourceDefinition (CRD) to generate resource profiles for your workloads and provide recommendations on resource specifications for containers in your workloads. You can specify the namespaces and workload types to which a RecommendationProfile CRD is applied.
    apiVersion: autoscaling.alibabacloud.com/v1alpha1
    kind: RecommendationProfile
    metadata:
      # The name of the RecommendationProfile CRD. If you want to create a non-namespaced RecommendationProfile CRD, do not specify a namespace. 
      name: profile-demo
    spec:
      # The workload types for which you want to enable resource profiling. 
      controllerKind:
      - Deployment
      # The namespaces for which you want to enable resource profiling. 
      enabledNamespaces:
      - recommender-demo

    The following table describes the parameters in the YAML template.

    Parameter | Data type | Description
    metadata.name | String | The name of the resource object. If you want to create a non-namespaced RecommendationProfile CRD, do not specify a namespace.
    spec.controllerKind | String | The workload types for which you want to enable resource profiling. The following workload types are supported: Deployment, StatefulSet, DaemonSet, and ReplicaSet.
    spec.enabledNamespaces | String | The namespaces for which you want to enable resource profiling. An asterisk (*) indicates all namespaces.

    Notice: To generate accurate recommendations on resource specifications, we recommend that you wait at least one day after you enable resource profiling for your workloads. This way, you can obtain a sufficient amount of historical data.
  2. Run the kubectl get recommendations -o yaml command to obtain recommendations on resource specifications for your workloads.
    After you enable resource profiling for your workloads, ack-slo-manager provides recommendations on resource specifications for each container in your workloads. The recommendations are stored in the Recommendation CRD. The following code block shows the content of a Recommendation CRD that stores the recommendation on resource specifications for a workload named cpu-load-gen:
    apiVersion: autoscaling.alibabacloud.com/v1alpha1
    kind: Recommendation
    metadata:
      labels:
        alpha.alibabacloud.com/recommendation-workload-apiVersion: apps-v1
        alpha.alibabacloud.com/recommendation-workload-kind: Deployment
        alpha.alibabacloud.com/recommendation-workload-name: cpu-load-gen
      name: f20ac0b3-dc7f-4f47-b3d9-bd91f906****
      namespace: recommender-demo
    spec:
      workloadRef:
        apiVersion: apps/v1
        kind: Deployment
        name: cpu-load-gen
    status:
      recommendResources:
        containerRecommendations:
        - containerName: cpu-load-gen
          target:
            cpu: 4742m
            memory: 262144k
          originalTarget: # The intermediate results provided by the algorithm that is used to generate the recommendation on resource specifications. We recommend that you do not use the intermediate results. 
           # ...

    To facilitate data retrieval, the Recommendation CRD is generated in the same namespace as the workload. In addition, the Recommendation CRD has specific labels that record the API version, type, and name of the workload. The following table describes the labels.

    Label key | Description | Example
    alpha.alibabacloud.com/recommendation-workload-apiVersion | The API version of the workload. The value of the label conforms to the Kubernetes specifications. Forward slashes (/) are replaced by hyphens (-). | apps-v1 (original form: apps/v1)
    alpha.alibabacloud.com/recommendation-workload-kind | The type of the workload, for example, Deployment or StatefulSet. | Deployment
    alpha.alibabacloud.com/recommendation-workload-name | The name of the workload. The value of the label conforms to the Kubernetes specifications and cannot exceed 63 characters in length. | cpu-load-gen

    The recommendations on resource specifications for each container are displayed in the status.recommendResources.containerRecommendations section. The following table describes the fields.

    Field | Description | Format | Example
    containerName | The name of the container. | string | cpu-load-gen
    target | The recommendations on CPU and memory resources. | map[ResourceName]resource.Quantity | cpu: 4742m, memory: 262144k
    originalTarget | The intermediate results provided by the algorithm that is used to generate the recommendation on resource specifications. We recommend that you do not use the intermediate results. If you want to use them, submit a ticket. | - | -

    Note: The recommended minimum CPU resources are 0.025 vCPUs. The recommended minimum memory resources are 250 MB.
  3. Verify the results in Application Real-Time Monitoring Service (ARMS) Prometheus.
    ack-slo-manager allows you to check resource profiles in ARMS Prometheus.
    1. Create a file named configmap.yaml with the following content, which enables the collection of resource profiles to ARMS Prometheus.
      By default, the resource profiles generated by ack-slo-manager are not collected to ARMS Prometheus. To collect them, you must modify the ConfigMap of ack-slo-manager.
      apiVersion: v1
      data:
        recommender-config: |
          {
            "enableRecommendationTargetMetric": true
          }
      kind: ConfigMap
      metadata:
        namespace: kube-system
        name: ack-slo-manager-config
    2. Run the following command to update the ack-slo-manager-config ConfigMap.
      To avoid changing other settings in the ack-slo-manager-config ConfigMap, we recommend that you run the kubectl patch command to update the ConfigMap.
      kubectl patch cm -n kube-system ack-slo-manager-config --patch "$(cat configmap.yaml)"
    3. View details about the collected resource profiles.
      • If this is the first time you use Prometheus dashboards, reset the dashboards and install the Resource Profile dashboard. For more information about how to reset Prometheus dashboards, see Reset dashboards.

        To view details about the collected resource profiles in ARMS Prometheus, perform the following steps:

        1. Log on to the ACK console.
        2. In the left-side navigation pane, click Clusters.
        3. On the Clusters page, find the cluster that you want to manage and click its name or click Details in the Actions column.
        4. In the left-side navigation pane of the cluster details page, choose Operations > Prometheus Monitoring.
        5. On the Prometheus Monitoring page, choose Cost Analysis/Resource Optimization > Resource Profile.

          On the Resource Profile tab, you can view details about the collected resource profiles. The details include the resource requests, resource usage, and recommended resource specifications of containers. For more information, see Enable ARMS Prometheus.

      • If you use a self-managed Prometheus monitoring system, you can use the following metrics to configure dashboards:
        # The recommended CPU resources for the specified container. 
        slo_manager_recommender_recommendation_workload_target{exported_namespace="$namespace", workload_name="$workload", container_name="$container", resource="cpu"}
        # The recommended memory resources for the specified container. 
        slo_manager_recommender_recommendation_workload_target{exported_namespace="$namespace", workload_name="$workload", container_name="$container", resource="memory"}
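
The two queries for a self-managed Prometheus system differ only in the value of the resource label. The following minimal Python sketch fills in the dashboard variables to produce concrete query strings. The function name recommendation_query and the sample values are hypothetical. Only the metric name and label names come from the queries above.

```python
METRIC = "slo_manager_recommender_recommendation_workload_target"


def recommendation_query(namespace, workload, container, resource):
    """Build a PromQL selector for the recommended CPU or memory of one container."""
    if resource not in ("cpu", "memory"):
        raise ValueError(f"resource must be 'cpu' or 'memory', got {resource!r}")
    labels = {
        "exported_namespace": namespace,
        "workload_name": workload,
        "container_name": container,
        "resource": resource,
    }
    # Label values are inserted as-is; this sketch does not escape quotes.
    matchers = ", ".join(f'{key}="{value}"' for key, value in labels.items())
    return f"{METRIC}{{{matchers}}}"


# Sample values taken from the cpu-load-gen workload used in this topic.
for resource in ("cpu", "memory"):
    print(recommendation_query("recommender-demo", "cpu-load-gen", "cpu-load-gen", resource))
```

The printed strings can be pasted into a Prometheus query editor or used as panel queries in a custom dashboard.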

Examples

  1. Create a file named cpu-load-gen.yaml with the following YAML template:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cpu-load-gen
      labels:
        app: cpu-load-gen
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: cpu-load-gen-selector
      template:
        metadata:
          labels:
            app: cpu-load-gen-selector
        spec:
          containers:
          - name: cpu-load-gen
            image: registry.cn-zhangjiakou.aliyuncs.com/acs/slo-test-cpu-load-gen:v0.1
            command: ["cpu_load_gen.sh"]
            imagePullPolicy: Always
            resources:
              requests:
                cpu: 8 # Request eight vCPUs for the application. 
                memory: "1G"
              limits:
                cpu: 12
                memory: "2G"
  2. Run the following command to deploy the cpu-load-gen application:
    kubectl apply -f cpu-load-gen.yaml
  3. Create a file named recommender-profile.yaml with the following YAML template:
    apiVersion: autoscaling.alibabacloud.com/v1alpha1
    kind: RecommendationProfile
    metadata:
      name: profile-demo
    spec:
      controllerKind:
      - Deployment
      enabledNamespaces: # Enable recommendations on resource specifications for all Deployments in the default namespace. 
      - default
  4. Run the following command to enable resource profiling for the application that you created:
    kubectl apply -f recommender-profile.yaml
  5. Run the following command to obtain recommendations on resource specifications for the application that you created.
    To generate accurate recommendations on resource specifications, we recommend that you wait at least one day after you enable resource profiling for your workloads. This way, you can obtain a sufficient amount of historical data. For more information about the labels, see Labels.
    kubectl get recommendations -l \
      "alpha.alibabacloud.com/recommendation-workload-apiVersion=apps-v1, \
      alpha.alibabacloud.com/recommendation-workload-kind=Deployment, \
      alpha.alibabacloud.com/recommendation-workload-name=cpu-load-gen" -o yaml

    Expected output:

    apiVersion: autoscaling.alibabacloud.com/v1alpha1
    kind: Recommendation
    metadata:
      creationTimestamp: "2022-02-09T08:56:51Z"
      labels:
        alpha.alibabacloud.com/recommendation-workload-apiVersion: apps-v1
        alpha.alibabacloud.com/recommendation-workload-kind: Deployment
        alpha.alibabacloud.com/recommendation-workload-name: cpu-load-gen
      name: f20ac0b3-dc7f-4f47-b3d9-bd91f906****
      namespace: default
    spec:
      workloadRef:
        apiVersion: apps/v1
        kind: Deployment
        name: cpu-load-gen
    status:
      conditions:
      - lastTransitionTime: "2022-02-09T08:56:52Z"
        status: "True"
        type: RecommendationProvided
      recommendResources:
        containerRecommendations:
        - containerName: cpu-load-gen
          target:
            cpu: 4742m # The recommended CPU resources are 4.742 vCPUs. 
            memory: 262144k
          originalTarget: # The intermediate results provided by the algorithm that is used to generate the recommendation on resource specifications. We recommend that you do not use the intermediate results. 
            #...              
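
The label selector used in Step 5 follows directly from the workload reference: the forward slash in the API version is replaced by a hyphen, and the workload kind and name are used unchanged. The following minimal Python sketch shows that mapping. The function name recommendation_selector is hypothetical. Only the label keys and the 63-character limit come from this topic.

```python
LABEL_PREFIX = "alpha.alibabacloud.com/recommendation-workload-"


def recommendation_selector(api_version, kind, name):
    """Build the label selector that matches the Recommendation of one workload."""
    if len(name) > 63:
        raise ValueError("label values cannot exceed 63 characters")
    labels = {
        "apiVersion": api_version.replace("/", "-"),  # apps/v1 -> apps-v1
        "kind": kind,
        "name": name,
    }
    return ",".join(f"{LABEL_PREFIX}{key}={value}" for key, value in labels.items())


selector = recommendation_selector("apps/v1", "Deployment", "cpu-load-gen")
print(f'kubectl get recommendations -l "{selector}" -o yaml')
```

The printed command is equivalent to the one in Step 5, written on a single line.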

Analyze the results

Compare the resource specifications in Step 1 and Step 5. The application requests 8 vCPUs, whereas the recommendation is 4.742 vCPUs. Because the requested amount of CPU resources is greater than the recommended amount, you can reduce the CPU request of the application to save resources in the cluster.

Item | Requested amount | Recommended amount
CPU | 8 vCPUs | 4.742 vCPUs
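
The same comparison can be done in a script. The following minimal Python sketch converts the quantity strings used in this topic to plain numbers and computes how far the CPU request exceeds the recommendation. The parse_quantity helper is a simplified, hypothetical parser that handles only the suffixes listed in its table, not the full Kubernetes quantity grammar.

```python
from decimal import Decimal

# Multipliers for a subset of Kubernetes quantity suffixes.
SUFFIXES = {
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9,
    "Ki": Decimal(2) ** 10,
    "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30,
}


def parse_quantity(text):
    """Convert a quantity such as '4742m', '262144k', or '8' to a Decimal."""
    text = str(text).strip()
    # Try two-letter suffixes first so that 'Mi' is never mistaken for 'M'.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * SUFFIXES[suffix]
    return Decimal(text)


requested_cpu = parse_quantity("8")        # CPU request from Step 1
recommended_cpu = parse_quantity("4742m")  # target.cpu from Step 5
savings = requested_cpu - recommended_cpu

print(f"CPU request exceeds the recommendation by {savings} vCPUs "
      f"({savings / requested_cpu:.0%} of the request)")
print(f"Recommended memory: {parse_quantity('262144k') / 2**20} MiB")
```

In this example, the recommended memory value of 262144k works out to exactly 250 × 2^20 bytes, that is, 250 MiB.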