Container Service for Kubernetes: Managed Service for Prometheus

Last Updated: Mar 22, 2024

After you enable Managed Service for Prometheus, you can view dashboards and performance metrics that are predefined for Container Service for Kubernetes (ACK). This topic describes how to enable Managed Service for Prometheus in ACK, how to configure alert rules, and how to create custom metrics and use Grafana to display the metrics.

Introduction to Managed Service for Prometheus

Managed Service for Prometheus is a fully managed monitoring service that is integrated with the open source Prometheus ecosystem. It monitors a wide array of components and provides multiple predefined dashboards. It also saves you the effort of managing underlying services, such as data storage, data display, and system maintenance.

For more information about Managed Service for Prometheus, see What is Managed Service for Prometheus?

Enable Managed Service for Prometheus

Method 1: Enable Managed Service for Prometheus when you create a cluster

On the Component Configurations wizard page, select Enable Managed Service for Prometheus. For more information, see Create an ACK managed cluster.

Note
  • By default, Enable Managed Service for Prometheus is selected when you create a cluster.

  • After the cluster is created, the system automatically configures Managed Service for Prometheus.

Method 2: Enable Managed Service for Prometheus in an existing cluster

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Prometheus Monitoring in the left-side navigation pane.

  3. On the Prometheus Monitoring page, click Install.

    The system automatically installs the component and checks the dashboards. After the installation is complete, you can click each tab to view metrics.

View Grafana dashboards provided by Managed Service for Prometheus

On the Prometheus Monitoring page, click the name of a Grafana dashboard to view the monitoring data.

Configure alert rules in Managed Service for Prometheus

Managed Service for Prometheus allows you to create alert rules for monitoring jobs. When an alert rule is triggered, notifications are sent in real time to the contact group that you specified, by email, text message, or DingTalk, so that you can detect errors early. Before you can create a contact group, you must create a contact. When you create a contact, you can specify the mobile phone number and email address at which the contact receives notifications. You can also provide a DingTalk chatbot webhook URL to which alert notifications are automatically sent.

Step 1: Create a contact

  1. Log on to the Managed Service for Prometheus console. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed.

  2. In the left-side navigation pane, choose Alert Management > Notification Objects.

  3. On the Contacts tab, click Create Contact in the upper-right corner.

  4. In the Create Contact dialog box, set the parameters and click OK. The following list describes the parameters.

    • Name: The name of the contact.

    • Mobile Phone Number: After you specify the mobile phone number of a contact, the contact can be notified by phone call and text message.

      Note You can specify only verified mobile phone numbers in a notification policy. For more information about how to verify mobile phone numbers, see Verify mobile phone numbers.

    • Email: After you specify the email address of a contact, the contact can be notified by email.

    • Method to Resend Notifications If Phone Notifications Fail: Select the method to resend notifications if phone notifications fail. You can specify a global default setting for this parameter on the Contacts tab. For more information, see Specify a default method to resend notifications.

    Important You must specify at least one of the Mobile Phone Number and Email parameters. Each mobile phone number or email address can be used for only one contact.

Step 2: Configure alert rules

  1. Log on to the Managed Service for Prometheus console. In the left-side navigation pane, click Monitoring List.

  2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.

  3. In the left-side navigation pane, click Alert Rules. On the Prometheus Alert Rules page, find the alert rule that you want to modify and click Edit in the Actions column. Modify the alert rule and click Save.

    For more information, see Create an alert rule for a Prometheus instance (for the new console version) or Create an alert rule (for the old console version).

Create custom metrics and use Grafana to display the metrics

Method 1: Create custom metrics by adding annotations

You can add annotations to Deployments to define custom metrics. Managed Service for Prometheus uses the default service discovery feature to automatically collect custom metrics from pods.
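
As a rough illustration of what the annotation-based approach looks like in a manifest, the following Python sketch builds a Deployment whose pod template carries the scrape annotations and prints it as JSON, which kubectl apply -f also accepts. The Deployment name pindex and the path /access are placeholders chosen for this sketch, and the prometheus.io/scrape annotation is an assumption (many annotation-based discovery setups use it as an opt-in flag); only the image, port 5000, and the prometheus.io/port and prometheus.io/path annotations come from this topic.

```python
import json


def build_deployment(name, image, port, metrics_path):
    """Return a Deployment manifest whose pod template carries scrape annotations."""
    annotations = {
        # Annotation values must be strings in Kubernetes.
        "prometheus.io/port": str(port),
        "prometheus.io/path": metrics_path,
        # Assumption: an opt-in flag that is not mentioned in this topic.
        "prometheus.io/scrape": "true",
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                # The annotations go on the pod template, not on the Deployment itself.
                "metadata": {"labels": {"app": name}, "annotations": annotations},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


# "pindex" and "/access" are hypothetical values used only for illustration.
manifest = build_deployment("pindex", "yejianhonghong/pindex", 5000, "/access")
print(json.dumps(manifest, indent=2))
```

The console procedure below produces an equivalent result through the Pod Annotations section of the Advanced wizard page.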

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. Create an application.

    1. On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Workloads > Deployments.

    2. On the Deployments page, click Create from Image.

    3. On the Basic Information wizard page, specify the basic information of the application and click Next.

    4. On the Container wizard page, specify a container image and the required resources, create a web application, expose port 5000, and then click Next.

      In this example, the container image yejianhonghong/pindex is used.

      Figure: Container configuration

    5. In the Pod Annotations section of the Advanced wizard page, add pod annotations.

      The prometheus.io/port annotation specifies the endpoint port that Managed Service for Prometheus scrapes. The prometheus.io/path annotation specifies the endpoint path that Managed Service for Prometheus scrapes.

      Figure: Labels and annotations

    6. Click Create to create the application.

      For more information about how to create an application, see Create a stateless application by using a Deployment.

  3. Create a Service.

    1. On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.

    2. Click Create in the upper-right part of the Services page. In the Create Service dialog box, configure the following parameters.

      • Name: Enter a Service name.

      • Type: Select Server Load Balancer and enable Public Access.

      • Backend: Select the application that you created.

      • Port Mapping: Specify Service Port and Container Port.

    3. Click Create to create the Service.

      For more information about how to create a Service, see Create a Service.

  4. Create custom metrics.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.

    3. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab and add ServiceMonitor and PodMonitor settings to define Prometheus metric collection rules.

      For more information about how to configure custom metrics, see Manage service discoveries.

    4. Click the Targets tab to view the custom metrics that you configured.

      Figure: Custom metrics

  5. In the ACK console, access the external endpoint of the Service that you created to increase the value of the custom metric.

    Figure: Increase the metric value

    For more information about metrics, see Data model.

  6. View custom metrics in Grafana.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.

    3. In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the Add panel icon in the upper-right part of the page and click Add a new panel to add a panel.

    4. Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.

  7. Save the configurations to view custom metrics in the Grafana chart.

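
Outside Grafana, the same PromQL statement can be checked with an instant query. The sketch below assumes that your Prometheus instance exposes the standard Prometheus HTTP API (GET /api/v1/query); the base URL is a hypothetical placeholder, authentication is not covered, and the sample response is made-up data in the standard response shape. The code only builds the request URL and parses a response body, so it runs without network access.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL of a Prometheus instant query for the given PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(response_text):
    """Return (labels, value) pairs from an instant-vector query response."""
    body = json.loads(response_text)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    # Each sample is {"metric": {...labels...}, "value": [timestamp, "value-as-string"]}.
    return [(s["metric"], float(s["value"][1])) for s in body["data"]["result"]]


# Hypothetical endpoint; replace it with the HTTP API address of your own instance.
url = instant_query_url("https://prometheus.example.invalid/", "current_person_counts")

# Made-up sample body in the standard Prometheus response format.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "current_person_counts"}, "value": [1711065600, "3"]}
        ],
    },
})

print(url)
print(extract_values(SAMPLE_RESPONSE))
```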

Method 2: Create custom metrics by using ServiceMonitors

To use ServiceMonitors to create custom metrics, you need to add labels instead of annotations to your Services.
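
For orientation, the following Python sketch shows roughly what a Service with such a label looks like as a manifest. The label app: custom-metrics-pindex and the port name web match the ServiceMonitor template later in this procedure; the Service name, the pod selector app: pindex, and the service port 80 are hypothetical values chosen for this sketch.

```python
import json


def build_service(name, labels, pod_selector, port_name, service_port, container_port):
    """Return a LoadBalancer Service manifest with labels and a named port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        # ServiceMonitors select Services by these labels, not by annotations.
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "type": "LoadBalancer",
            # Hypothetical pod labels; they must match the labels of your application pods.
            "selector": pod_selector,
            "ports": [
                {
                    # A ServiceMonitor endpoint refers to the port by this name.
                    "name": port_name,
                    "port": service_port,
                    "targetPort": container_port,
                }
            ],
        },
    }


service = build_service(
    name="pindex-svc",                         # hypothetical Service name
    labels={"app": "custom-metrics-pindex"},   # label used by the ServiceMonitor selector
    pod_selector={"app": "pindex"},            # hypothetical pod labels
    port_name="web",
    service_port=80,                           # hypothetical service port
    container_port=5000,
)
print(json.dumps(service, indent=2))
```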

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. Create an application.

    1. On the Clusters page, click the name of your cluster. In the left-side navigation pane, choose Workloads > Deployments.

    2. On the Deployments page, click Create from Image.

    3. On the Basic Information wizard page, specify the basic information of the application and click Next.

    4. On the Container wizard page, specify a container image and the required resources, create a web application, expose port 5000, and then click Next.

      In this example, the container image yejianhonghong/pindex is used.

      Figure: Container configuration

    5. On the Advanced wizard page, click Create to create the application.

  3. Create a Service.

    1. On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.

    2. On the Services page, click Create in the upper-right part. In the Create Service dialog box, configure the following parameters.

      • Name: Enter a Service name.

      • Type: Select Server Load Balancer and enable Public Access.

      • Backend: Select the application that you created.

      • Port Mapping: Specify Service Port and Container Port.

      • Label: Add a label. This label is used by the selector of ServiceMonitors.

    3. Click Create to create the Service.

      For more information about how to create a Service, see Create a Service.

  4. Configure custom metrics by defining the endpoints that Managed Service for Prometheus scrapes.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.

    3. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.

    4. On the Configure tab, click the ServiceMonitor tab.

    5. On the ServiceMonitor tab, click Add ServiceMonitor to create a ServiceMonitor.

      The following code block shows the YAML template:

      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        # Enter a unique name. 
        name: custom-metrics-pindex
        # Specify a namespace. 
        namespace: default
      spec:
        endpoints:
        - interval: 30s
          # Enter the name of the port that you specified in the Port Mapping section when you created the Service.
          port: web
          # Enter the HTTP path from which metrics are scraped.
          path: /access
        namespaceSelector:
          # Match Services in all namespaces.
          any: true
        selector:
          matchLabels:
            # Enter the label that you added to the Service. 
            app: custom-metrics-pindex

      Click OK to create the ServiceMonitor.

      For more information about how to configure custom metrics, see Manage service discovery.

    6. On the Targets tab, the endpoints that Managed Service for Prometheus scrapes are displayed.

      Figure: Scrape endpoint

      Note

      A ServiceMonitor definition carries more information than an annotation, including the namespace and name of the Service.

  5. In the ACK console, access the external endpoint of the Service to increase the value of the custom metric.

    Figure: Increase the metric value

    For more information about metrics, see Data model.

  6. View custom metrics in Grafana.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.

    3. In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the Add panel icon in the upper-right part of the page and click Add a new panel to add a panel.

    4. Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.

  7. Save the configurations to view custom metrics in the Grafana chart.

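
If no endpoint appears on the Targets tab, a common cause is a mismatch between the ServiceMonitor and the Service. The following Python sketch approximates the matching that the ServiceMonitor template in this procedure relies on: the namespace selector, selector.matchLabels, and the named endpoint port. It is a simplified model under stated assumptions (matchExpressions is ignored, and the Service dictionary is a hypothetical example), not the actual discovery implementation.

```python
def labels_match(match_labels, labels):
    """True if every key/value pair in match_labels is present in labels."""
    return all(labels.get(key) == value for key, value in match_labels.items())


def namespace_selected(service_monitor, service_namespace):
    """Apply namespaceSelector: any, then matchNames, else the monitor's own namespace."""
    selector = service_monitor["spec"].get("namespaceSelector", {})
    if selector.get("any"):
        return True
    names = selector.get("matchNames")
    if names:
        return service_namespace in names
    return service_namespace == service_monitor["metadata"].get("namespace", "default")


def matched_endpoints(service_monitor, service):
    """Return (port name, path) pairs that would be scraped on the given Service."""
    meta = service["metadata"]
    if not namespace_selected(service_monitor, meta.get("namespace", "default")):
        return []
    match_labels = service_monitor["spec"]["selector"].get("matchLabels", {})
    if not labels_match(match_labels, meta.get("labels", {})):
        return []
    port_names = {port.get("name") for port in service["spec"]["ports"]}
    return [
        (endpoint["port"], endpoint.get("path", "/metrics"))
        for endpoint in service_monitor["spec"]["endpoints"]
        if endpoint.get("port") in port_names
    ]


# The ServiceMonitor from the template in this procedure, as a dictionary.
SERVICE_MONITOR = {
    "metadata": {"name": "custom-metrics-pindex", "namespace": "default"},
    "spec": {
        "endpoints": [{"interval": "30s", "port": "web", "path": "/access"}],
        "namespaceSelector": {"any": True},
        "selector": {"matchLabels": {"app": "custom-metrics-pindex"}},
    },
}

# Hypothetical Service; the name and port number are illustrative only.
SERVICE = {
    "metadata": {
        "name": "pindex-svc",
        "namespace": "default",
        "labels": {"app": "custom-metrics-pindex"},
    },
    "spec": {"ports": [{"name": "web", "port": 80, "targetPort": 5000}]},
}

print(matched_endpoints(SERVICE_MONITOR, SERVICE))
```

An empty result from this check usually points to a wrong label on the Service or to a port name that differs from the one in the ServiceMonitor.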

FAQ

How do I check the version of the ack-arms-prometheus component?

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.

  3. On the Add-ons page, click the Logs and Monitoring tab and find the ack-arms-prometheus component.

    The version number is displayed in the lower part of the component. If a new version is available, click Upgrade on the right side to update the component.

    Note

    The Upgrade button is displayed only if the component is not updated to the latest version.

Why is Managed Service for Prometheus unable to monitor GPU-accelerated nodes?

Managed Service for Prometheus may be unable to monitor GPU-accelerated nodes that are configured with taints. You can perform the following steps to view the taints of a GPU-accelerated node.

  1. Run the following command to view the taints of a GPU-accelerated node:

    kubectl describe node cn-beijing.47.100.***.***

    Expected output:

    Taints:             test-key=test-value:NoSchedule

    If you added custom taints to the GPU-accelerated node, the output lists them. In this example, the node has a taint whose key is test-key, value is test-value, and effect is NoSchedule.

  2. Use one of the following methods to handle the taint:

    • Run the following command to delete the taint from the GPU-accelerated node:

      kubectl taint node cn-beijing.47.100.***.*** test-key=test-value:NoSchedule-
    • Add a toleration rule that allows pods to be scheduled to the GPU-accelerated node with the taint.

      # 1. Run the following command to modify ack-prometheus-gpu-exporter: 
      kubectl edit daemonset -n arms-prom ack-prometheus-gpu-exporter
      
      # 2. Add the following fields to the YAML file to tolerate the taint: 
      # Irrelevant fields are not shown. 
      # The tolerations field must be added above the containers field and both fields must be of the same level. 
      tolerations:
      - key: "test-key"
        operator: "Equal"
        value: "test-value"
        effect: "NoSchedule"
      containers:
       # Irrelevant fields are not shown.
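
To reason about whether a toleration covers a given taint before you edit the DaemonSet, the matching rule can be sketched in a few lines of Python. This is a simplified model of the standard Kubernetes rule (an empty effect matches every effect, Exists ignores the value, Equal compares the value); the function names are hypothetical and the sketch does not call the cluster.

```python
def parse_taint(text):
    """Parse 'key=value:Effect' (or 'key:Effect') as printed by kubectl describe node."""
    key_value, _, effect = text.strip().rpartition(":")
    key, _, value = key_value.partition("=")
    return {"key": key, "value": value, "effect": effect}


def tolerates(toleration, taint):
    """Return True if the toleration matches the taint."""
    effect = toleration.get("effect", "")
    if effect and effect != taint["effect"]:
        return False  # A non-empty effect must match exactly.
    key = toleration.get("key", "")
    operator = toleration.get("operator", "Equal")  # Equal is the default operator.
    if not key:
        return operator == "Exists"  # An empty key with Exists matches every taint key.
    if key != taint["key"]:
        return False
    if operator == "Exists":
        return True  # Exists ignores the taint value.
    return operator == "Equal" and toleration.get("value", "") == taint["value"]


taint = parse_taint("test-key=test-value:NoSchedule")
toleration = {
    "key": "test-key",
    "operator": "Equal",
    "value": "test-value",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))
```

With the taint and toleration from this example the check succeeds; changing the value or the effect on either side makes it fail, which is the situation in which the exporter pod is not scheduled to the node.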

What do I do if I fail to reinstall ack-arms-prometheus due to residual resource configurations of ack-arms-prometheus?

If you delete only the namespace of Managed Service for Prometheus, resources outside that namespace, such as ClusterRoles, ClusterRoleBindings, and Roles and RoleBindings in other namespaces, are retained. In this case, you may fail to reinstall ack-arms-prometheus. You can perform the following operations to delete the residual resource configurations:

  • Run the following command to delete the arms-prom namespace:

    kubectl delete namespace arms-prom
  • Run the following commands to delete the related ClusterRoles:

    kubectl delete ClusterRole arms-kube-state-metrics
    kubectl delete ClusterRole arms-node-exporter
    kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-prometheus-oper3
    kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-pilot-prom-k8s
    kubectl delete ClusterRole gpu-prometheus-exporter
  • Run the following commands to delete the related ClusterRoleBindings:

    kubectl delete ClusterRoleBinding arms-node-exporter
    kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
    kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
    kubectl delete ClusterRoleBinding arms-kube-state-metrics
    kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
    kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
    kubectl delete ClusterRoleBinding gpu-prometheus-exporter
  • Run the following commands to delete the related Roles and RoleBindings:

    kubectl delete Role arms-pilot-prom-spec-ns-k8s
    kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system

After you delete the residual resource configurations, go to the ACK console, choose Operations > Add-ons, and reinstall the ack-arms-prometheus component.
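
The cleanup commands above can also be generated from a single list, which makes it easier to review them before running anything. The following Python sketch only prints the commands and does not execute them. The resource names are copied from this topic; the --ignore-not-found flag is an addition made for this sketch so that resources that are already gone do not cause errors.

```python
import shlex

NAMESPACE = "arms-prom"
CLUSTER_ROLES = [
    "arms-kube-state-metrics",
    "arms-node-exporter",
    "arms-prom-ack-arms-prometheus-role",
    "arms-prometheus-oper3",
    "arms-prometheus-ack-arms-prometheus-role",
    "arms-pilot-prom-k8s",
    "gpu-prometheus-exporter",
]
CLUSTER_ROLE_BINDINGS = [
    "arms-node-exporter",
    "arms-prom-ack-arms-prometheus-role-binding",
    "arms-prometheus-oper-bind2",
    "arms-kube-state-metrics",
    "arms-pilot-prom-k8s",
    "arms-prometheus-ack-arms-prometheus-role-binding",
    "gpu-prometheus-exporter",
]
NAMESPACED_NAME = "arms-pilot-prom-spec-ns-k8s"


def cleanup_commands():
    """Return the kubectl delete commands as argument lists, in the order used above."""
    # Addition for this sketch: skip resources that no longer exist.
    suffix = ["--ignore-not-found"]
    commands = [["kubectl", "delete", "namespace", NAMESPACE] + suffix]
    commands += [["kubectl", "delete", "ClusterRole", n] + suffix for n in CLUSTER_ROLES]
    commands += [
        ["kubectl", "delete", "ClusterRoleBinding", n] + suffix
        for n in CLUSTER_ROLE_BINDINGS
    ]
    for kind in ("Role", "RoleBinding"):
        # Once in the current namespace and once in kube-system, as in the commands above.
        commands.append(["kubectl", "delete", kind, NAMESPACED_NAME] + suffix)
        commands.append(
            ["kubectl", "delete", kind, NAMESPACED_NAME, "-n", "kube-system"] + suffix
        )
    return commands


commands = cleanup_commands()
for command in commands:
    print(shlex.join(command))
```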