Container Service for Kubernetes: Enable Managed Service for Prometheus

Last Updated: Aug 08, 2023

You can view metrics for Container Service for Kubernetes (ACK) Serverless clusters on predefined dashboards that are provided by Managed Service for Prometheus. This topic describes how to enable Managed Service for Prometheus for ACK Serverless clusters, how to configure alert rules in Managed Service for Prometheus, how to create custom metrics in Managed Service for Prometheus, and how to use Grafana to display custom monitoring metrics.

Introduction to Managed Service for Prometheus

Managed Service for Prometheus is a managed monitoring service that is fully compatible with the open source Prometheus ecosystem. It monitors a wide array of components and provides multiple ready-to-use dashboards. It also saves you the effort of managing underlying services, such as data storage, data presentation, and system maintenance.

For more information about Managed Service for Prometheus, see What is Managed Service for Prometheus?.

Prometheus monitoring agents

You can install managed or unmanaged Prometheus monitoring agents in ACK Serverless Pro clusters. By default, managed Prometheus monitoring agents are installed in ACK Serverless Pro clusters.

  • Managed Prometheus monitoring agents: Managed Prometheus monitoring agents allow Managed Service for Prometheus to directly collect monitoring data from containers in ACK Serverless Pro clusters and allow you to use out-of-the-box features provided by Managed Service for Prometheus.

  • Unmanaged Prometheus monitoring agents: You need to deploy a set of components, including metric collection components and Kube-State-Metrics. In addition, you must launch at least two elastic container instances that provide 1.5 vCores and 1.5 GB of memory in total. The actual resource specifications of the elastic container instances dynamically scale based on the amount of data in your cluster. For more information about the pricing of elastic container instances, see Overview of elastic container instances.

Enable Managed Service for Prometheus

Method 1: Enable Managed Service for Prometheus when you create a cluster

On the Component Configurations wizard page, select Enable Prometheus Monitoring. For more information, see Create an ACK Serverless cluster.

    Note

    By default, Enable Prometheus Monitoring is selected when you create a cluster in the ACK console.

    After the cluster is created, the system automatically configures Managed Service for Prometheus.

Method 2: Enable Managed Service for Prometheus in an existing cluster

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Prometheus Monitoring in the left-side navigation pane.

  3. On the Prometheus Monitoring page, click Install.

    The system automatically installs the component and checks the dashboards. After the installation is complete, you can click each tab to view metrics.

To install an unmanaged Prometheus monitoring agent in an existing cluster, you must first uninstall the managed monitoring component ack-arms-prometheus. Go to the cluster details page in the ACK console and choose Operations > Add-ons in the left-side navigation pane. You can uninstall ack-arms-prometheus from the Add-ons page. After ack-arms-prometheus is uninstalled, install the unmanaged version of ack-arms-prometheus from the Add-ons page.

Note

If ack-arms-prometheus is not displayed on the Add-ons page, the region where the ACK Serverless cluster resides does not support Managed Service for Prometheus.

View Grafana dashboards provided by Managed Service for Prometheus

On the Prometheus Monitoring page, click the name of a Grafana dashboard to view the monitoring data.

Configure alert rules in Managed Service for Prometheus

Managed Service for Prometheus allows you to create alert rules for monitoring jobs. When the conditions of an alert rule are met, notifications are sent in real time to the contact group that you specified by email, Short Message Service (SMS) message, or DingTalk. This helps you detect errors proactively. Before you can create a contact group, you must create a contact. When you create a contact, you can specify the mobile phone number and email address at which the contact receives notifications. You can also provide a DingTalk chatbot webhook URL that is used to automatically send alert notifications.

Step 1: Create a contact

  1. Log on to the Managed Service for Prometheus console. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed.

  2. In the left-side navigation pane, choose Alert Management > Notification Objects.

  3. On the Contacts tab, click Create Contact in the upper-right corner.
  4. In the Create Contact dialog box, set the parameters and click OK. The following table describes the parameters.
    Parameter | Description
    Name | The name of the contact.
    Mobile Phone Number | After you specify the mobile phone number of a contact, the contact can be notified by phone call and text message. Note: You can specify only verified mobile phone numbers in a notification policy. For more information about how to verify mobile phone numbers, see Verify mobile phone numbers.
    Email | After you specify the email address of a contact, the contact can be notified by email.
    Method to Resend Notifications If Phone Notifications Fail | Select the method to resend notifications if phone notifications fail. You can specify a global default setting for this parameter on the Contacts tab. For more information, see Specify a default method to resend notifications.

    Important: You must specify at least one of the Mobile Phone Number and Email parameters. Each mobile phone number or email address can be used for only one contact.

Step 2: Configure alert rules

  1. Log on to the Managed Service for Prometheus console. In the left-side navigation pane, click Monitoring List.

  2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.

  3. In the left-side navigation pane, click Alert Rules. On the Prometheus Alert Rules page, click Edit in the Actions column of the alert rule that you want to modify, modify the alert rule, and then click Save to quickly configure an alert rule for a metric.

    For more information, see Create an alert rule for a Prometheus instance (for the new console version) or Create an alert rule (for the old console version).

Create custom metrics and use Grafana to display the monitoring metrics

Method 1: Create custom metrics by adding annotations

You can add annotations to pod configuration templates to define custom metrics. Managed Service for Prometheus automatically obtains the custom metrics by performing service discovery. For more information, see Manage service discovery in Kubernetes clusters.
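
As a rough illustration of what annotation-based service discovery does, the following Python sketch decides whether a pod should be scraped and builds the scrape URL. It assumes the widely used prometheus.io/scrape, prometheus.io/port, and prometheus.io/path annotation keys. These keys, the default path, the pod IP address, and the function name are assumptions for illustration only; check the service discovery topic for the annotations that your Prometheus instance actually honors.

```python
from typing import Optional

# Assumed annotation keys (the common prometheus.io/* convention); confirm the
# exact keys in the service discovery documentation before relying on them.
SCRAPE_KEY = "prometheus.io/scrape"
PORT_KEY = "prometheus.io/port"
PATH_KEY = "prometheus.io/path"


def scrape_url(pod_ip: str, annotations: dict) -> Optional[str]:
    """Return the URL to scrape for a pod, or None if the pod opts out."""
    if annotations.get(SCRAPE_KEY, "").lower() != "true":
        return None  # The pod is not annotated for scraping.
    port = annotations.get(PORT_KEY)
    if port is None:
        return None  # Without a port, this sketch cannot build a target.
    path = annotations.get(PATH_KEY, "/metrics")  # Assumed default path.
    return f"http://{pod_ip}:{port}{path}"


# Hypothetical pod that serves metrics on port 5000 at /access.
annotations = {SCRAPE_KEY: "true", PORT_KEY: "5000", PATH_KEY: "/access"}
print(scrape_url("10.0.0.12", annotations))
```

A pod without the scrape annotation yields None in this sketch, which mirrors the idea that only annotated pods are picked up.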

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. Create an application.

    1. On the Clusters page, click the name of a cluster and choose Workloads > Deployments in the left-side navigation pane.

    2. On the Deployments tab, click Create from Image.

    3. On the Basic Information wizard page, configure the basic settings and click Next.

    4. On the Container page, specify an image that is used to create a web application, specify resource specifications for the web application, and open port 5000. Then, click Next.

  3. Create a Service.

    1. On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.

    2. Click Create in the upper-right part of the Services page. In the Create Service dialog box, configure the following parameters.

      Parameter | Description
      Name | Enter a Service name.
      Type | Select Server Load Balancer and enable Public Access.
      Backend | Select the application that you created.
      Port Mapping | Specify Service Port and Container Port.

    3. Click Create to create the Service.

      For more information about how to create a Service, see Create a Service.

  4. Create custom metrics.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.

    3. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab and add ServiceMonitor and PodMonitor settings to define Prometheus metric collection rules.

      For more information about how to configure custom metrics, see Configure service discoveries.

    4. Click the Targets tab to view the custom metrics that you configured.

  5. In the ACK console, access the external endpoint of the Service that you created to increase the value of the custom metric. In this example, the metric is current_person_counts.

    For more information about metrics, see Data model.

  6. View custom metrics in Grafana.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.

    3. In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the icon for adding panels in the upper-right part of the page and click Add a new panel.

    4. Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.

  7. Save the configurations to view custom metrics in the Grafana chart.

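To make the data flow in the preceding steps more concrete, the following Python sketch imitates what such a demo application might do: it counts visits and renders the count in the Prometheus text exposition format, which is what a scrape of the metrics path returns. The metric name current_person_counts comes from this topic. The gauge type, the HELP text, and the class and method names are assumptions, and the actual image may be implemented differently.

```python
import threading


class VisitCounter:
    """Thread-safe counter that a web handler would bump on each request."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def visit(self) -> int:
        """Record one visit and return the new total."""
        with self._lock:
            self._value += 1
            return self._value

    def render(self) -> str:
        """Render the current value in the Prometheus text exposition format."""
        with self._lock:
            value = self._value
        return (
            "# HELP current_person_counts Number of visits (assumed meaning).\n"
            "# TYPE current_person_counts gauge\n"
            f"current_person_counts {value}\n"
        )


counter = VisitCounter()
for _ in range(3):  # Simulate three requests to the Service endpoint.
    counter.visit()
print(counter.render(), end="")
```

Each simulated request raises the sample value by one, which is the change that becomes visible in the Grafana panel after the next scrape.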

Method 2: Use ServiceMonitors to create custom metrics

To use ServiceMonitors to create custom metrics, you need to add labels instead of annotations to your Services.
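
The role of the label can be sketched in a few lines of Python. A ServiceMonitor with a matchLabels selector picks a Service only if every key-value pair in the selector is present on the Service. The Service names and the extra tier label below are hypothetical; the app label value is the one used in the YAML template later in this topic.

```python
def matches(match_labels: dict, service_labels: dict) -> bool:
    """True if every key-value pair in match_labels is present on the Service."""
    return all(service_labels.get(k) == v for k, v in match_labels.items())


selector = {"app": "custom-metrics-pindex"}

# Hypothetical Services: only the first carries the label the selector asks for.
services = {
    "pindex-svc": {"app": "custom-metrics-pindex", "tier": "web"},
    "other-svc": {"app": "something-else"},
}

selected = [name for name, labels in services.items() if matches(selector, labels)]
print(selected)  # ['pindex-svc']
```

Extra labels on a Service do not prevent a match; a missing or different value for any selector key does.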

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. Create an application.

    1. On the Clusters page, click the name of a cluster and choose Workloads > Deployments in the left-side navigation pane.

    2. On the Deployments tab, click Create from Image.

    3. On the Basic Information wizard page, configure the basic settings and click Next.

    4. On the Container page, specify an image that is used to create a web application, specify resource specifications for the web application, and open port 5000. Then, click Next.

      In this example, the yejianhonghong/pindex image is selected.

    5. On the Advanced wizard page, click Create.

  3. Create a Service.

    1. On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.

    2. On the Services page, click Create in the upper-right part. In the Create Service dialog box, configure the following parameters.

      Parameter | Description
      Name | Enter a Service name.
      Type | Select Server Load Balancer and enable Public Access.
      Backend | Select the application that you created.
      Port Mapping | Specify Service Port and Container Port.
      Labels | Add a label. This label is used by the selector of ServiceMonitors.

    3. Click Create to create the Service.

      For more information about how to create a Service, see Create a Service.

  4. Configure custom metrics by defining the endpoints that Managed Service for Prometheus scrapes.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.

    3. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.

    4. On the Configure tab, click the ServiceMonitor tab.

    5. On the ServiceMonitor tab, click Add ServiceMonitor to create a ServiceMonitor.

      The following code block shows the YAML template:

      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        # Enter a unique name. 
        name: custom-metrics-pindex
        # Specify a namespace. 
        namespace: default
      spec:
        endpoints:
        - interval: 30s
          # Enter the name of the port that you specified in the Port Mapping section when you created the Service.
          port: web
          # Enter the HTTP path from which metrics are scraped.
          path: /access
        namespaceSelector:
          # Select Services in all namespaces.
          any: true
        selector:
          matchLabels:
            # Enter the label that you added to the Service. 
            app: custom-metrics-pindex

      Click OK to create the ServiceMonitor.

      For more information about how to configure custom metrics, see Manage service discovery.

    6. On the Targets tab, the endpoints that Managed Service for Prometheus scrapes are displayed.

      Note

      A ServiceMonitor definition provides more information than an annotation, including the namespace and name of the Service.

  5. In the ACK console, access the external endpoint of the Service to increase the value of the custom metric. In this example, the metric is current_person_counts.

    For more information about metrics, see Data model.

  6. View custom metrics in Grafana.

    1. Log on to the Managed Service for Prometheus console.

    2. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.

    3. In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the icon for adding panels in the upper-right part of the page and click Add a new panel.

    4. Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.

  7. Save the configurations to view custom metrics in the Grafana chart.

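Outside Grafana, a PromQL statement such as current_person_counts can also be sent to a Prometheus-compatible HTTP API. The following Python sketch builds an instant-query URL for the /api/v1/query endpoint defined by the open source Prometheus HTTP API and extracts values from a response of the documented shape. The base URL and the sample response are placeholders written by hand for illustration. The actual HTTP API address and authentication of your Prometheus instance are not covered in this topic and must be taken from the console.

```python
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a URL for the Prometheus HTTP API instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def vector_values(response: dict) -> dict:
    """Map the label set of each series to its float value for a vector result."""
    if response.get("status") != "success":
        raise ValueError("query failed")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector")
    return {
        tuple(sorted(series["metric"].items())): float(series["value"][1])
        for series in data["result"]
    }


# Placeholder base URL; replace it with the HTTP API address of your instance.
url = instant_query_url("https://prometheus.example.com", "current_person_counts")
print(url)

# Hand-written sample response in the shape that the Prometheus HTTP API returns.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "current_person_counts", "namespace": "default"},
                "value": [1691452800, "3"],
            }
        ],
    },
}
print(vector_values(sample))
```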

FAQ

How do I check the version of the ack-arms-prometheus component?

  1. Log on to the ACK console. In the left-side navigation pane, click Clusters.

  2. On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.

  3. On the Add-ons page, click the Logs and Monitoring tab and find the ack-arms-prometheus component.

    The version number is displayed in the lower part of the component. If a new version is available, click Upgrade on the right side to update the component.

    Note

    The Upgrade button is displayed only if the component is not updated to the latest version.

Why is Managed Service for Prometheus unable to monitor GPU-accelerated nodes?

Note

This issue is related only to unmanaged Prometheus monitoring agents.

Managed Service for Prometheus may be unable to monitor GPU-accelerated nodes that are configured with taints. You can perform the following steps to view the taints of a GPU-accelerated node.

  1. Run the following command to view the taints of a GPU-accelerated node:

    kubectl describe node cn-beijing.47.100.***.***

    If you added custom taints to the GPU-accelerated node, the output includes information about the custom taints. In this example, a taint whose key is test-key, value is test-value, and effect is NoSchedule is added to the node.

    Expected output:

    Taints:test-key=test-value:NoSchedule
  2. Use one of the following methods to handle the taint:

    • Run the following command to delete the taint from the GPU-accelerated node:

      kubectl taint node cn-beijing.47.100.***.*** test-key=test-value:NoSchedule-
    • Add a toleration rule that allows pods to be scheduled to the GPU-accelerated node with the taint.

      # 1. Run the following command to modify ack-prometheus-gpu-exporter: 
      kubectl edit daemonset -n arms-prom ack-prometheus-gpu-exporter
      
      # 2. Add the following fields to the YAML file to tolerate the taint: 
      # Irrelevant fields are not shown. 
      # The tolerations field must be added above the containers field and both fields must be of the same level. 
      tolerations:
      - key: "test-key"
        operator: "Equal"
        value: "test-value"
        effect: "NoSchedule"
      containers:
       # Irrelevant fields are not shown.
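
To check whether a toleration actually covers the taint reported by kubectl, the matching rules can be sketched in Python. The sketch follows the general Kubernetes rules: the keys must match, the effects must match unless the toleration leaves the effect empty, and the operator is either Equal (values must match) or Exists (the value is ignored). The function and class names are illustrative and are not part of any Kubernetes client library.

```python
from typing import NamedTuple


class Taint(NamedTuple):
    key: str
    value: str
    effect: str


def parse_taint(text: str) -> Taint:
    """Parse 'key=value:effect' or 'key:effect' as shown in the Taints field."""
    key_value, _, effect = text.rpartition(":")
    key, _, value = key_value.partition("=")
    return Taint(key, value, effect)


def tolerates(toleration: dict, taint: Taint) -> bool:
    """Apply the Kubernetes matching rules for one toleration and one taint."""
    # An empty effect in the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint.effect:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return key == "" or key == taint.key
    return key == taint.key and toleration.get("value", "") == taint.value


taint = parse_taint("test-key=test-value:NoSchedule")
toleration = {
    "key": "test-key",
    "operator": "Equal",
    "value": "test-value",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))  # True
print(tolerates({"key": "other", "operator": "Exists"}, taint))  # False
```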

What do I do if I fail to reinstall ack-arms-prometheus due to residual resource configurations of ack-arms-prometheus?

Note

This issue is related only to unmanaged Prometheus monitoring agents.

If you delete only the namespace of Managed Service for Prometheus, resource configurations outside the namespace, such as ClusterRoles, ClusterRoleBindings, and Roles in other namespaces, are retained. In this case, you may fail to reinstall ack-arms-prometheus. You can perform the following operations to delete the residual resource configurations:

  • Run the following command to delete the arms-prom namespace:

    kubectl delete namespace arms-prom
  • Run the following commands to delete the related ClusterRoles:

    kubectl delete ClusterRole arms-kube-state-metrics
    kubectl delete ClusterRole arms-node-exporter
    kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-prometheus-oper3
    kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-pilot-prom-k8s
    kubectl delete ClusterRole gpu-prometheus-exporter
  • Run the following commands to delete the related ClusterRoleBindings:

    kubectl delete ClusterRoleBinding arms-node-exporter
    kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
    kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
    kubectl delete ClusterRoleBinding arms-kube-state-metrics
    kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
    kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
    kubectl delete ClusterRoleBinding gpu-prometheus-exporter
  • Run the following commands to delete the related Roles and RoleBindings:

    kubectl delete Role arms-pilot-prom-spec-ns-k8s
    kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system
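
If you need to repeat this cleanup, the commands above can be generated from a list instead of being typed one by one. The following Python sketch only prints the commands and does not run them. The resource names are copied from the commands above. The --ignore-not-found flag of kubectl delete is an addition that is not part of the original commands; it is assumed here so that a rerun does not fail on resources that are already gone.

```python
import shlex

# Resource names copied from the cleanup commands in this section.
CLUSTER_ROLES = [
    "arms-kube-state-metrics",
    "arms-node-exporter",
    "arms-prom-ack-arms-prometheus-role",
    "arms-prometheus-oper3",
    "arms-prometheus-ack-arms-prometheus-role",
    "arms-pilot-prom-k8s",
    "gpu-prometheus-exporter",
]
CLUSTER_ROLE_BINDINGS = [
    "arms-node-exporter",
    "arms-prom-ack-arms-prometheus-role-binding",
    "arms-prometheus-oper-bind2",
    "arms-kube-state-metrics",
    "arms-pilot-prom-k8s",
    "arms-prometheus-ack-arms-prometheus-role-binding",
    "gpu-prometheus-exporter",
]
NAMESPACED_NAME = "arms-pilot-prom-spec-ns-k8s"


def delete_command(kind: str, name: str, namespace: str = "") -> list:
    """Build one 'kubectl delete' argument list; the flag is an added assumption."""
    command = ["kubectl", "delete", kind, name]
    if namespace:
        command += ["-n", namespace]
    return command + ["--ignore-not-found"]


def cleanup_commands() -> list:
    """Return the argument lists for all residual resources, in document order."""
    commands = [delete_command("namespace", "arms-prom")]
    commands += [delete_command("ClusterRole", n) for n in CLUSTER_ROLES]
    commands += [delete_command("ClusterRoleBinding", n) for n in CLUSTER_ROLE_BINDINGS]
    for kind in ("Role", "RoleBinding"):
        # Once in the namespace of the current context, once in kube-system.
        commands.append(delete_command(kind, NAMESPACED_NAME))
        commands.append(delete_command(kind, NAMESPACED_NAME, "kube-system"))
    return commands


for command in cleanup_commands():
    print(shlex.join(command))
```

Review the printed commands before you run them against a cluster.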