After you enable Prometheus Service, you can view the preset dashboards and performance metrics for Container Service for Kubernetes (ACK). This topic describes how to enable Prometheus Service in ACK, configure alert rules in Prometheus Service, customize monitoring metrics, and use Grafana to display them.
Background information
Prometheus Service is a managed monitoring service provided by Alibaba Cloud. It is compatible with the open source Prometheus ecosystem and provides out-of-the-box dashboards for monitoring a wide variety of components. Prometheus Service saves you the effort of managing underlying services, such as data storage, data presentation, and system maintenance.
For information about Prometheus Service, see What is Prometheus Service?.
Enable Prometheus Service
Method 1: Enable Prometheus Service when you create a cluster

- By default, Enable Prometheus Monitoring is selected when you create a cluster in the ACK console.
- After the ACK cluster is created, the system automatically configures Prometheus Service.
Method 2: Enable Prometheus Service in an existing cluster
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of the cluster that you want to manage. In the left-side navigation pane, choose .
- In the middle part of the Prometheus Monitoring page, click Install. The system automatically installs the component and checks the dashboards. After Prometheus Service is installed, you can click each tab to view monitoring metrics.
View the Grafana dashboards in Prometheus Service
On the Prometheus Monitoring page, click the name of a Grafana dashboard to view the monitoring data.
Configure alert rules in Prometheus Service
- Log on to the ARMS console.
- In the left-side navigation pane, choose .
- On the Contacts tab, click Create Contact in the upper-right corner. Configure the contact and click OK.
- Configure an alert rule. Note: You can also manage alert rules in the ARMS console.
Verify the result
Perform a manual test to trigger a DingTalk alert notification. The following figure shows a sample alert notification.
Customize monitoring metrics and use Grafana to display monitoring metrics
Method 1: Use annotations to customize monitoring metrics
You can add annotations to pod configuration templates to define custom monitoring metrics. The application monitoring component of ARMS uses Prometheus Service to automatically obtain these custom monitoring metrics.
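As a minimal sketch of what an annotated pod template might look like, the following Python snippet builds a Deployment manifest and prints it as JSON, which `kubectl apply -f` accepts. The `prometheus.io/*` annotation keys follow a widely used community convention and are an assumption here, as are the application name, image, port, and metrics path; check which keys your version of the monitoring component actually reads.

```python
import json


def build_annotated_deployment(name, image, port, metrics_path="/metrics"):
    """Return a Deployment manifest whose pod template carries scrape annotations."""
    # Assumed annotation keys (community convention). Annotation values must be strings.
    annotations = {
        "prometheus.io/scrape": "true",
        "prometheus.io/port": str(port),
        "prometheus.io/path": metrics_path,
    }
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                # The annotations belong to the pod template, not to the Deployment itself.
                "metadata": {"labels": labels, "annotations": annotations},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical application name and image, for illustration only.
    manifest = build_annotated_deployment(
        "demo-app", "example.com/demo-app:latest", 8080
    )
    print(json.dumps(manifest, indent=2))
```

The same annotations can be entered directly in the pod configuration template in the console; the script only makes the structure explicit.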
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of a cluster and choose in the left-side navigation pane.
- On the Deployments page, create an application.
- On the Services page, create a Service.
- In the left-side navigation pane of the details page, choose
- In the upper-right corner of the Services page, click Create.
- Select Server Load Balancer and Public Access for the Type parameter.
- Select the application that you created in Step 4 for the Backend parameter.
- Click Create to create the Service.
For more information, see Create Services.
- Configure custom monitoring metrics.
- Access the public IP address of the Service that you created in Step 5. This increases the value of a custom metric. For more information about how to configure metrics, see Data model.
- Go to the Dashboards page in the ARMS console and click a dashboard to go to the Grafana page. Click Add panel in the upper-right corner, select the Graph type, and then enter current_person_counts in the Metrics field.
- Save the settings to view the Grafana chart of the custom metric.
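To illustrate how accessing the Service can increase a custom metric, the following sketch keeps a counter and renders it in the Prometheus text exposition format. It is not the sample application used in this topic: the `PersonCounter` class is hypothetical, and treating `current_person_counts` as a counter is an assumption.

```python
import threading


class PersonCounter:
    """Thread-safe counter rendered in the Prometheus text exposition format."""

    def __init__(self, name="current_person_counts"):
        self.name = name
        self._value = 0
        self._lock = threading.Lock()

    def visit(self):
        """Record one access to the application and return the new value."""
        with self._lock:
            self._value += 1
            return self._value

    def render(self):
        """Return the payload that a scrape of the metrics path would receive."""
        with self._lock:
            value = self._value
        return (
            f"# HELP {self.name} Number of accesses recorded so far.\n"
            f"# TYPE {self.name} counter\n"
            f"{self.name} {value}\n"
        )


if __name__ == "__main__":
    counter = PersonCounter()
    for _ in range(3):
        counter.visit()  # each simulated request increments the metric
    print(counter.render(), end="")
```

In a real application, `visit()` would be called from the request handler and `render()` would back the metrics endpoint that Prometheus Service scrapes.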
Method 2: Use ServiceMonitors to customize monitoring metrics
To use ServiceMonitors to customize monitoring metrics, you must add labels to Services. You do not need to add annotations.
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of a cluster and choose in the left-side navigation pane.
- On the Deployments page, create an application.
- On the Services page, create a Service. For more information, see Create Services.
- Specify the endpoint that Prometheus Service scrapes.
- Connect to the public IP address of the Service that you created in Step 5. This increases the value of a custom metric. For more information about how to configure metrics, see Data model.
- Go to the Dashboards page of the ARMS console and click a dashboard to go to the Grafana page. Click Add panel in the upper-right corner, select the Graph type, and then enter current_person_counts in the Metrics field.
- Save the settings to view the Grafana chart of the custom metric.
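As a sketch of Method 2, the following snippet builds a labeled Service and a ServiceMonitor that selects it by label. The `monitoring.coreos.com/v1` ServiceMonitor schema comes from the open source Prometheus Operator; the resource names, the `app` label, the port name, the path, and the scrape interval are hypothetical, and whether your cluster requires additional fields is not covered here.

```python
import json


def build_service(name, app_label, port, port_name="metrics"):
    """Service that carries the label the ServiceMonitor selects."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "labels": {"app": app_label}},
        "spec": {
            "selector": {"app": app_label},
            # The port is named so that the ServiceMonitor can reference it.
            "ports": [{"name": port_name, "port": port, "targetPort": port}],
        },
    }


def build_service_monitor(name, namespace, app_label,
                          port_name="metrics", path="/metrics"):
    """ServiceMonitor that scrapes Services labeled app=<app_label>."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": {"app": app_label}},
            "namespaceSelector": {"matchNames": [namespace]},
            # "port" refers to the Service port name, not to a port number.
            "endpoints": [{"port": port_name, "path": path, "interval": "30s"}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    service = build_service("demo-svc", "demo-app", 8080)
    monitor = build_service_monitor("demo-monitor", "default", "demo-app")
    print(json.dumps({"apiVersion": "v1", "kind": "List",
                      "items": [service, monitor]}, indent=2))
```

Matching happens through labels on the Service, which is why this method needs no pod annotations.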
FAQ
How do I check the version of the ack-arms-prometheus component?
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of a cluster and choose in the left-side navigation pane.
- On the Add-ons page, click the Logs and Monitoring tab and find ack-arms-prometheus. The version number is displayed in the lower part of the component. If a new version is available, click Upgrade on the right side to update the component. Note: The Upgrade button is displayed only if the component is not updated to the latest version.
Why can't ARMS Prometheus monitor GPU-accelerated nodes?
ARMS Prometheus may not be able to monitor GPU-accelerated nodes that are configured with taints. You can perform the following steps to view the taints of a GPU-accelerated node.
- Run the following command to view the taints of a GPU-accelerated node. If you added custom taints to the node, they appear in the output. In this example, a taint whose key is set to test-key, value is set to test-value, and effect is set to NoSchedule is added to the node.
kubectl describe node cn-beijing.47.100.***.***
Expected output:
Taints:test-key=test-value:NoSchedule
- Use one of the following methods to handle the taint:
- Run the following command to delete the taint from the GPU-accelerated node:
kubectl taint node cn-beijing.47.100.***.*** test-key=test-value:NoSchedule-
- Add a toleration rule that allows pods to be scheduled to the GPU-accelerated node with the taint.
# 1. Run the following command to modify ack-prometheus-gpu-exporter:
kubectl edit daemonset -n arms-prom ack-prometheus-gpu-exporter
# 2. Add the following fields to the YAML file to tolerate the taint:
# Irrelevant fields are not shown.
# The tolerations field must be added above the containers field, and both fields must be at the same level.
tolerations:
- key: "test-key"
  operator: "Equal"
  value: "test-value"
  effect: "NoSchedule"
containers:
# Irrelevant fields are not shown.
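The mapping from a taint to its matching toleration is mechanical, so it can be sketched as a small helper. The function below is hypothetical and only converts the `key=value:effect` string reported by `kubectl describe node` into the fields of a toleration; it does not modify the DaemonSet.

```python
def toleration_for_taint(taint):
    """Convert 'key=value:effect' (or 'key:effect') into a toleration dict."""
    key_value, sep, effect = taint.strip().rpartition(":")
    if not sep or not key_value or not effect:
        raise ValueError(f"expected 'key[=value]:effect', got {taint!r}")
    key, has_value, value = key_value.partition("=")
    if not key:
        raise ValueError(f"taint has an empty key: {taint!r}")
    if has_value:
        # The taint carries a value: match it exactly with operator Equal.
        return {"key": key, "operator": "Equal", "value": value, "effect": effect}
    # The taint has no value: operator Exists matches on the key alone.
    return {"key": key, "operator": "Exists", "effect": effect}


if __name__ == "__main__":
    print(toleration_for_taint("test-key=test-value:NoSchedule"))
```

For the taint in this example, the result contains the same four fields as the tolerations entry shown in the YAML above.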
What do I do if I fail to reinstall ARMS Prometheus after I delete the arms-prom namespace?
If you delete only the arms-prom namespace, resource configurations may be retained. In this case, you may fail to reinstall ARMS Prometheus. You can perform the following operations to delete the residual resource configurations:
- Run the following commands to delete the related ClusterRoles:
kubectl delete ClusterRole arms-kube-state-metrics
kubectl delete ClusterRole arms-node-exporter
kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
kubectl delete ClusterRole arms-prometheus-oper3
kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
kubectl delete ClusterRole arms-pilot-prom-k8s
- Run the following commands to delete the related ClusterRoleBindings:
kubectl delete ClusterRoleBinding arms-node-exporter
kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
kubectl delete ClusterRoleBinding kube-state-metrics
kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
- Run the following commands to delete the related Roles and RoleBindings:
kubectl delete Role arms-pilot-prom-spec-ns-k8s
kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system
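The cleanup can also be scripted. The following sketch lists the same resources as the commands above and, by default, only prints the commands it would run. The `--ignore-not-found` flag is an addition that is not part of the steps above; it is a standard `kubectl delete` option that lets the script be rerun when some resources are already gone. The helper names are hypothetical.

```python
import shlex
import subprocess

# (kind, name, namespace); None means no -n flag is passed.
RESIDUAL_RESOURCES = [
    ("ClusterRole", "arms-kube-state-metrics", None),
    ("ClusterRole", "arms-node-exporter", None),
    ("ClusterRole", "arms-prom-ack-arms-prometheus-role", None),
    ("ClusterRole", "arms-prometheus-oper3", None),
    ("ClusterRole", "arms-prometheus-ack-arms-prometheus-role", None),
    ("ClusterRole", "arms-pilot-prom-k8s", None),
    ("ClusterRoleBinding", "arms-node-exporter", None),
    ("ClusterRoleBinding", "arms-prom-ack-arms-prometheus-role-binding", None),
    ("ClusterRoleBinding", "arms-prometheus-oper-bind2", None),
    ("ClusterRoleBinding", "kube-state-metrics", None),
    ("ClusterRoleBinding", "arms-pilot-prom-k8s", None),
    ("ClusterRoleBinding", "arms-prometheus-ack-arms-prometheus-role-binding", None),
    ("Role", "arms-pilot-prom-spec-ns-k8s", None),
    ("Role", "arms-pilot-prom-spec-ns-k8s", "kube-system"),
    ("RoleBinding", "arms-pilot-prom-spec-ns-k8s", None),
    ("RoleBinding", "arms-pilot-prom-spec-ns-k8s", "kube-system"),
]


def build_delete_commands(resources=RESIDUAL_RESOURCES):
    """Return one kubectl argument list per residual resource."""
    commands = []
    for kind, name, namespace in resources:
        cmd = ["kubectl", "delete", kind, name]
        if namespace:
            cmd += ["-n", namespace]
        cmd.append("--ignore-not-found")  # assumption: tolerate missing resources
        commands.append(cmd)
    return commands


def cleanup(dry_run=True):
    """Print every command; execute it only when dry_run is False."""
    for cmd in build_delete_commands():
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cleanup(dry_run=True)
```

Review the printed commands against your cluster before you call `cleanup(dry_run=False)`.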