After you enable Application Real-Time Monitoring Service (ARMS) Prometheus, you can view dashboards and performance metrics that are preset for Container Service for Kubernetes (ACK). This topic describes how to enable ARMS Prometheus in ACK, how to configure alert rules in ARMS Prometheus, and how to customize monitoring metrics and use Grafana to display monitoring metrics.

Background information

ARMS Prometheus is a managed monitoring service of ARMS that monitors a wide variety of components and provides out-of-the-box dashboards. It is compatible with the open source Prometheus ecosystem. Because the service is fully managed, you do not need to manage the underlying services, such as data storage, data presentation, and system O&M.

For more information about ARMS Prometheus, see What is Prometheus Service?

Enable ARMS Prometheus

You can use one of the following methods to enable ARMS Prometheus:

Enable ARMS Prometheus by setting cluster parameters in the ACK console

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, click Clusters.
  3. In the upper-right corner of the Clusters page, click Create Kubernetes Cluster.
  4. Select the cluster template that you want to use and set the cluster parameters. On the Component Configurations wizard page, select Enable Prometheus Monitoring.
    For more information about how to create an ACK cluster, see Create an ACK managed cluster.
    Note By default, Enable Prometheus Monitoring is selected when you create a cluster in the ACK console.
    After the cluster is created, the system automatically configures ARMS Prometheus for the cluster.

Enable ARMS Prometheus on the Prometheus Monitoring page in the ACK console

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, click Clusters.
  3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
  4. In the left-side navigation pane of the cluster details page, choose Operations > Prometheus Monitoring.
  5. On the Prometheus Monitoring page, the system automatically installs the component and checks the dashboards. After ARMS Prometheus is installed, you can click each tab to view monitoring metrics.

Enable ARMS Prometheus on the App Catalog tab in the ACK console

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, choose Marketplace > App Catalog.
  3. On the Marketplace page, click the App Catalog tab. Then, find and click ack-arms-prometheus.
  4. On the ack-arms-prometheus page, click Deploy.
  5. In the Deploy wizard, select a cluster and a namespace, and then click Next.
    Note By default, Namespace and Release Name are set to arms-prom.
  6. On the Parameters wizard page, set the parameters and click OK.

Execution results

After the installation is completed, the arms-prom page appears. You can view application information on this page.

View the taints that are added to GPU-accelerated nodes

ARMS Prometheus may not be able to monitor GPU-accelerated nodes that are configured with taints. You can perform the following steps to view the taints of a GPU-accelerated node. For more information, see Taints and toleration rules.

  1. Run the following command to view the taints of a specific GPU-accelerated node:
    kubectl describe node cn-beijing.47.100.XX.XX
    If you added custom taints to the GPU-accelerated node, you can view information about the taints in the node description. In this example, a taint whose key is set to test-key, value is set to test-value, and effect is set to NoSchedule is added to the node:
    Taints: test-key=test-value:NoSchedule
  2. Delete the taint from the GPU-accelerated node, or add a toleration for the taint to the monitoring pods. Use either of the following methods:
    • Run the following command to delete the taint of the GPU-accelerated node:
      kubectl taint node cn-beijing.47.100.XX.XX test-key=test-value:NoSchedule-
    • Add a toleration rule that allows pods to be scheduled to the node with the matching taint.
      1. Run the following command to modify ack-prometheus-gpu-exporter:
        kubectl edit daemonset -n arms-prom ack-prometheus-gpu-exporter
      2. Add the following fields in the YAML file to tolerate the taint:
        # Other fields are omitted.
        # Add the tolerations field at the same level as the containers field, under spec.template.spec.
        tolerations:
        - key: "test-key"
          operator: "Equal"
          value: "test-value"
          effect: "NoSchedule"
        containers:
        # Other fields are omitted.
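
The mapping from a taint to its matching toleration can be sketched in Python. This is a minimal illustration, not part of ARMS or kubectl: the helper name taint_to_toleration is hypothetical, and it assumes that the taint is written in the key=value:effect form shown in the node description above.

```python
from typing import Dict


def taint_to_toleration(taint: str) -> Dict[str, str]:
    """Convert a taint string such as 'key=value:Effect' into a toleration dict."""
    # The effect follows the last colon; everything before it is key[=value].
    spec, _, effect = taint.rpartition(":")
    if not spec or not effect:
        raise ValueError(f"expected 'key[=value]:effect', got {taint!r}")
    key, has_value, value = spec.partition("=")
    if has_value:
        # A taint that carries a value is tolerated with the Equal operator.
        return {"key": key, "operator": "Equal", "value": value, "effect": effect}
    # A taint without a value is tolerated with the Exists operator.
    return {"key": key, "operator": "Exists", "effect": effect}


toleration = taint_to_toleration("test-key=test-value:NoSchedule")
print(toleration)
```

The printed dictionary has the same four fields as the tolerations entry in the YAML file above, so it can be used to check that the key, value, and effect of the toleration match the taint exactly.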

View ARMS Prometheus Grafana dashboards

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, click Clusters.
  3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
  4. In the left-side navigation pane of the details page, choose Operations > Prometheus Monitoring.
  5. On the Prometheus Monitoring page, click the name of a Grafana dashboard to view the monitoring data.

Configure alert rules in ARMS Prometheus

ARMS Prometheus allows you to create alert rules for monitoring jobs. When an alert rule is triggered, notifications are sent in real time to the contact group that you specified through emails, Short Message Service (SMS) messages, or DingTalk messages, which helps you detect errors early. Before you can create a contact group, you must create a contact. When you create a contact, you can specify the mobile phone number and email address at which the contact receives notifications. You can also provide the webhook URL of a DingTalk chatbot to which alert notifications are automatically sent.
Note To add a DingTalk chatbot as a contact, you must first obtain the webhook URL of the chatbot. For more information, see Configure a DingTalk chatbot to send alert notifications.
  1. Log on to the ARMS console.
  2. In the left-side navigation pane, choose Alert Management > Contact.
  3. On the Contact tab, click Create a contact in the upper-right corner of the tab. Configure the contact and click OK.
  4. Configure an alert rule.
    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Prometheus Monitoring > Prometheus Instances.
    3. In the upper-left corner of the Prometheus Monitoring page, select the region where your ACK cluster is deployed and click the Prometheus instance that you want to manage. Then, you are redirected to the instance details page.
    4. In the left-side navigation pane, click Alarm Configuration.
    5. Select the alert rule that you want to manage and click Edit in the Actions column. Modify the PromQL statement and click OK.
      For more information about how to configure PromQL statements, see Create ARMS alerts.
    Note You can also choose Alarms > Alarm Policies in the ARMS console to manage alert rules.

    Verify the result

    Perform a manual test to trigger a DingTalk alert notification. The following figure shows a sample alert notification.
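
Before you rely on the alert rule, it can help to confirm that the chatbot webhook itself accepts messages. The following Python sketch only builds the request and does not send it. The function name build_dingtalk_request and the URL are placeholders, and the payload shape (msgtype plus text.content) is assumed to follow the text-message format of DingTalk custom chatbots, so verify it against the DingTalk documentation. If the chatbot enforces keyword or signature checks, the message must also satisfy them.

```python
import json
import urllib.request


def build_dingtalk_request(webhook_url: str, content: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries a plain-text message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: replace it with the webhook URL of your own chatbot.
request = build_dingtalk_request(
    "https://example.com/robot/send?access_token=PLACEHOLDER",
    "Test alert from ARMS Prometheus",
)
print(request.get_method(), json.loads(request.data))
# To send the message, pass the request to urllib.request.urlopen(request).
```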

Customize monitoring metrics and use Grafana to display monitoring metrics

Method 1: Use annotations to customize monitoring metrics

You can add annotations to pod configuration templates to define custom monitoring metrics. The application monitoring component of ARMS uses ARMS Prometheus to automatically obtain these custom monitoring metrics.
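
As a rough illustration of how the two annotations used in this method describe a scrape target, the following Python sketch derives the URL that a scraper would request. The function name scrape_url and the pod IP address are hypothetical. Port 5000 is the port opened in this example, and the /access path is assumed to match the path in the ServiceMonitor template of Method 2. The fallback to /metrics is the open source Prometheus convention and is an assumption about ARMS behavior.

```python
from typing import Dict


def scrape_url(pod_ip: str, annotations: Dict[str, str]) -> str:
    """Derive the URL that a scraper would request from the pod annotations."""
    port = annotations["prometheus.io/port"]
    # Assumption: fall back to /metrics, the conventional Prometheus path.
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


# Annotation values are always strings in Kubernetes, so the port is quoted.
annotations = {"prometheus.io/port": "5000", "prometheus.io/path": "/access"}
print(scrape_url("10.0.0.12", annotations))
```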

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, click Clusters.
  3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
  4. In the left-side navigation pane of the details page, choose Workloads > Deployments.
  5. On the Deployments page, create an application.
    1. Click Create from Image.
    2. On the Basic Information wizard page, set basic parameters and click Next.
    3. Create a web application and open port 5000 for the application.
      In this example, the yejianhonghong/pindex image is used.
    4. Click Next.
    5. Add annotations that are related to ARMS to the pod.
      The prometheus.io/port annotation specifies the endpoint port that ARMS Prometheus scrapes. The prometheus.io/path annotation specifies the endpoint path that ARMS Prometheus scrapes.
    6. Click Create to create the application.
  6. On the Services page, create a Service.
    1. In the left-side navigation pane of the details page, choose Network > Services.
    2. In the upper-right corner of the Services page, click Create.
    3. Select Server Load Balancer and Public Access for Type.
    4. Select the application that you created in Step 5 for Backend.
    5. Click Create to create the Service.
    For more information, see Create Services.
  7. Configure custom monitoring metrics.
    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Prometheus Monitoring > Prometheus Monitoring.
    3. In the upper-left corner of the Prometheus Monitoring page, select the region where your ACK cluster is deployed and click the Prometheus instance that you want to manage. Then, you are redirected to the instance details page.
    4. In the left-side navigation pane, click Service Discovery. Then, click the Targets tab and verify that the custom metrics are configured.
  8. Access the public IP address of the Service that you created in Step 6. This increases the value of a custom metric.
    For more information about how to configure metrics, see Data model.
  9. Go to the Dashboards page in the ARMS console and click a dashboard to go to the Grafana page. Click Add panel in the upper-right corner, select the Graph type, and then enter current_person_counts in the Metrics field.
  10. Save the configurations to view the Grafana chart of the custom metric.
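
If the chart stays empty, it can be useful to check what the application endpoint actually returns. The following Python sketch parses a response body in the Prometheus text exposition format. The function name parse_metrics is hypothetical, the sample body is illustrative (the HELP text, metric type, and value are made up), and the parser handles only samples without labels.

```python
from typing import Dict


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse unlabeled samples from the Prometheus text exposition format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the HELP/TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line is "<name> <value> [timestamp]".
        parts = line.split()
        samples[parts[0]] = float(parts[1])
    return samples


# Illustrative response body, not captured from a real endpoint.
body = """\
# HELP current_person_counts Number of visits so far
# TYPE current_person_counts gauge
current_person_counts 3
"""
print(parse_metrics(body)["current_person_counts"])
```

If the metric name in the parsed result differs from the name entered in the Metrics field, the Grafana panel shows no data.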

Method 2: Use ServiceMonitors to customize monitoring metrics

To use ServiceMonitors to customize monitoring metrics, you must add labels to Services. You do not need to add annotations.
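
The way a ServiceMonitor picks Services by label can be sketched as a subset check: every key-value pair under matchLabels must be present on the Service. The following Python snippet is a simplified model of that rule, not the Prometheus Operator implementation. The function name matches and the tier label are hypothetical, and the app label is the one used in the template later in this method.

```python
from typing import Dict


def matches(match_labels: Dict[str, str], service_labels: Dict[str, str]) -> bool:
    """Return True if every matchLabels pair is present on the Service."""
    return all(service_labels.get(k) == v for k, v in match_labels.items())


selector = {"app": "custom-metrics-pindex"}
# Extra labels on the Service do not prevent a match.
print(matches(selector, {"app": "custom-metrics-pindex", "tier": "web"}))
# A different value for the same key does.
print(matches(selector, {"app": "other"}))
```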

  1. Log on to the ACK console.
  2. In the left-side navigation pane of the ACK console, click Clusters.
  3. On the Clusters page, find the cluster that you want to manage and click the name of the cluster or click Details in the Actions column. The details page of the cluster appears.
  4. On the Deployments page, create an application.
    1. In the left-side navigation pane of the details page, choose Workloads > Deployments.
    2. Click Create from Image.
    3. On the Basic Information wizard page, set basic parameters and click Next.
    4. Create a web application and open port 5000 for the application.
      In this example, the yejianhonghong/pindex image is used.

      Click Next.

    5. Click Create to create the application.
  5. On the Services page, create a Service.
    1. In the left-side navigation pane of the details page, choose Network > Services.
    2. In the upper-right corner of the Services page, click Create.
    3. Select Server Load Balancer and Public Access for Type.
    4. Select the application that you created in Step 4 for Backend.
    5. Add labels.
      ServiceMonitors use this label as a selector.
    6. Click Create to create the Service.
    For more information, see Manage Services.
  6. Specify the endpoint that ARMS Prometheus scrapes.
    1. Log on to the ARMS console.
    2. In the left-side navigation pane, choose Prometheus Monitoring > Prometheus Monitoring.
    3. In the upper-left corner of the Prometheus Monitoring page, select the region where your ACK cluster is deployed and click the Prometheus instance that you want to manage. Then, you are redirected to the instance details page.
    4. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.
    5. On the Configure tab, click ServiceMonitor.
    6. On the ServiceMonitor tab, click Add ServiceMonitor.
      In this example, the following template is used to create a ServiceMonitor.
      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        # Enter a unique name. 
        name: custom-metrics-pindex
        # Specify a namespace. 
        namespace: default
      spec:
        endpoints:
        - interval: 30s
          # Enter the name of the port specified in the Port Mapping section when you created the Service in Step 5. 
          port: web
          # Enter the path of the Service. 
          path: /access
        namespaceSelector:
          # Match Services in all namespaces.
          any: true
        selector:
          matchLabels:
            # Enter the label that you added to the Service in Step 5. 
            app: custom-metrics-pindex

      Click OK to create the ServiceMonitor.

    7. On the Targets tab, verify that the endpoints that ARMS Prometheus scrapes are displayed.
      Note The definition of a ServiceMonitor provides more information than an annotation, and includes the namespace and name of the Service.
  7. Connect to the public IP address of the Service that you created in Step 5. This increases the value of a custom metric.
    For more information about how to configure metrics, see Data model.
  8. Go to the Dashboards page of the ARMS console and click a dashboard to go to the Grafana page. Click Add panel in the upper-right corner, select the Graph type, and then enter current_person_counts in the Metrics field.
  9. Save the configurations to view the Grafana chart of the custom metric.
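
The metric name entered in the Metrics field is a PromQL expression, so the same expression can also be checked outside Grafana. The following Python sketch builds an instant-query URL in the form used by the open source Prometheus HTTP API (/api/v1/query). The function name instant_query_url and the base URL are placeholders, and whether your instance exposes this API at a given address is an assumption to verify in the ARMS console.

```python
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for a Prometheus-compatible HTTP API."""
    query = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


# Placeholder base URL: replace it with the HTTP API address of your instance.
url = instant_query_url("https://prometheus.example.com", "current_person_counts")
print(url)
```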

FAQ

What do I do if I fail to reinstall ARMS Prometheus after I delete the arms-prom namespace?

If you delete only the arms-prom namespace, resource configurations may be retained. In this case, you may fail to reinstall ARMS Prometheus. You can perform the following operations to delete the residual resource configurations:
  1. Run the following commands to delete the related ClusterRoles:
    kubectl delete ClusterRole arms-kube-state-metrics
    kubectl delete ClusterRole arms-node-exporter
    kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-prometheus-oper3
    kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
    kubectl delete ClusterRole arms-pilot-prom-k8s
  2. Run the following commands to delete the related ClusterRoleBindings:
    kubectl delete ClusterRoleBinding arms-node-exporter
    kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
    kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
    kubectl delete ClusterRoleBinding kube-state-metrics
    kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
    kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
  3. Run the following commands to delete the related Roles and RoleBindings:
    kubectl delete Role arms-pilot-prom-spec-ns-k8s
    kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s 
    kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system
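
The cleanup can also be scripted. The following Python sketch only generates the command lines from the resource names listed above and does not run them. The function name cleanup_commands is hypothetical. The --ignore-not-found flag is added so that a rerun does not stop at a resource that is already deleted, which is a choice made for this sketch and not part of the original commands.

```python
from typing import Dict, List, Optional

# Cluster-scoped resources, copied from steps 1 and 2.
CLUSTER_SCOPED: Dict[str, List[str]] = {
    "ClusterRole": [
        "arms-kube-state-metrics",
        "arms-node-exporter",
        "arms-prom-ack-arms-prometheus-role",
        "arms-prometheus-oper3",
        "arms-prometheus-ack-arms-prometheus-role",
        "arms-pilot-prom-k8s",
    ],
    "ClusterRoleBinding": [
        "arms-node-exporter",
        "arms-prom-ack-arms-prometheus-role-binding",
        "arms-prometheus-oper-bind2",
        "kube-state-metrics",
        "arms-pilot-prom-k8s",
        "arms-prometheus-ack-arms-prometheus-role-binding",
    ],
}
# Namespaced resources from step 3; None means the current namespace.
NAMESPACED_KINDS = ["Role", "RoleBinding"]
NAMESPACED_NAME = "arms-pilot-prom-spec-ns-k8s"
NAMESPACES: List[Optional[str]] = [None, "kube-system"]


def cleanup_commands() -> List[List[str]]:
    """Return the kubectl delete commands as argument lists."""
    commands = []
    for kind, names in CLUSTER_SCOPED.items():
        for name in names:
            commands.append(["kubectl", "delete", kind, name, "--ignore-not-found"])
    for kind in NAMESPACED_KINDS:
        for namespace in NAMESPACES:
            cmd = ["kubectl", "delete", kind, NAMESPACED_NAME, "--ignore-not-found"]
            if namespace:
                cmd += ["-n", namespace]
            commands.append(cmd)
    return commands


for cmd in cleanup_commands():
    print(" ".join(cmd))
```

To execute the commands instead of printing them, each argument list can be passed to subprocess.run(cmd, check=True) on a machine where kubectl is configured for the cluster.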