You can view metrics for Container Service for Kubernetes (ACK) Serverless clusters on predefined dashboards that are provided by Managed Service for Prometheus. This topic describes how to enable Managed Service for Prometheus for ACK Serverless clusters, how to configure alert rules in Managed Service for Prometheus, how to create custom metrics in Managed Service for Prometheus, and how to use Grafana to display custom monitoring metrics.
Introduction to Managed Service for Prometheus
Managed Service for Prometheus is a managed monitoring service that is fully compatible with the open source Prometheus ecosystem. Managed Service for Prometheus monitors a wide array of components and provides multiple ready-to-use dashboards. It spares you the effort of managing underlying services, such as data storage, data presentation, and system maintenance.
For more information about Managed Service for Prometheus, see What is Managed Service for Prometheus?.
Prometheus monitoring agents
You can install managed or unmanaged Prometheus monitoring agents in ACK Serverless Pro clusters. By default, managed Prometheus monitoring agents are installed in ACK Serverless Pro clusters.
Managed Prometheus monitoring agents: Managed Prometheus monitoring agents allow Managed Service for Prometheus to directly collect monitoring data from containers in ACK Serverless Pro clusters and allow you to use out-of-the-box features provided by Managed Service for Prometheus.
Unmanaged Prometheus monitoring agents: You need to deploy a set of components, including metric collection components and Kube-State-Metrics. In addition, you must launch at least two elastic container instances that provide 1.5 vCores and 1.5 GB of memory in total. The actual resource specifications of the elastic container instances dynamically scale based on the amount of data in your cluster. For more information about the pricing of elastic container instances, see Overview of elastic container instances.
Enable Managed Service for Prometheus
Method 1: Enable Managed Service for Prometheus when you create a cluster
On the Component Configurations wizard page, select Enable Prometheus Monitoring. For more information, see Create an ACK Serverless cluster.
By default, Enable Prometheus Monitoring is selected when you create a cluster in the ACK console.
After the cluster is created, the system automatically configures Managed Service for Prometheus.
Method 2: Enable Managed Service for Prometheus in an existing cluster
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Prometheus Monitoring in the left-side navigation pane.
On the Prometheus Monitoring page, click Install.
The system automatically installs the component and checks the dashboards. After the installation is complete, you can click each tab to view metrics.
To install an unmanaged Prometheus monitoring agent in an existing cluster, you must first uninstall the managed monitoring component ack-arms-prometheus. Go to the cluster details page in the ACK console and choose Operations > Add-ons in the left-side navigation pane. Uninstall ack-arms-prometheus on the Add-ons page. After ack-arms-prometheus is uninstalled, install the unmanaged version of ack-arms-prometheus from the same page.
If ack-arms-prometheus is not displayed on the Add-ons page, the region where the ACK Serverless cluster resides does not support Managed Service for Prometheus.
View Grafana dashboards provided by Managed Service for Prometheus
On the Prometheus Monitoring page, click the name of a Grafana dashboard to view the monitoring data.
Configure alert rules in Managed Service for Prometheus
Managed Service for Prometheus allows you to create alert rules for monitoring jobs. When the conditions of an alert rule are met, you receive alerts through emails, Short Message Service (SMS) messages, or DingTalk notifications in real time. This helps you detect errors proactively. Notifications are sent to the contact group that you specify in the alert rule. Before you can create a contact group, you must create a contact. When you create a contact, you can specify the contact's mobile phone number and email address to receive notifications. You can also provide a DingTalk chatbot webhook URL that is used to automatically send alert notifications.
Step 1: Create a contact
Log on to the Managed Service for Prometheus console. In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed.
In the left-side navigation pane, choose Alert Management > Notification Objects.
- On the Contacts tab, click Create Contact in the upper-right corner.
- In the Create Contact dialog box, set the parameters and click OK. The following table describes the parameters.
- Name: The name of the contact.
- Mobile Phone Number: After you specify the mobile phone number of a contact, the contact can be notified by phone call and text message. Note: You can specify only verified mobile phone numbers in a notification policy. For more information about how to verify mobile phone numbers, see Verify mobile phone numbers.
- Email: After you specify the email address of a contact, the contact can be notified by email.
- Method to Resend Notifications If Phone Notifications Fail: Select the method to resend notifications if phone notifications fail. You can specify a global default setting for this parameter on the Contacts tab. For more information, see Specify a default method to resend notifications.
Important You must specify at least one of the Mobile Phone Number and Email parameters. Each mobile phone number or email address can be used for only one contact.
Step 2: Configure alert rules
Log on to the Managed Service for Prometheus console. In the left-side navigation pane, click Monitoring List.
In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.
In the left-side navigation pane, click Alert Rules. On the Prometheus Alert Rules page, find the alert rule that you want to modify, click Edit in the Actions column, modify the alert rule, and then click Save. This way, you can quickly configure an alert rule for a metric.
For more information, see Create an alert rule for a Prometheus instance (for the new console version) or Create an alert rule (for the old console version).
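The console forms above ultimately produce PromQL-based alert rules. For orientation, a conceptually equivalent rule in the open source Prometheus rule format might look as follows. The group name, alert name, and threshold are illustrative, not values from this topic:

```yaml
groups:
- name: pod-alerts                # hypothetical rule group name
  rules:
  - alert: PodHighCpu             # hypothetical alert name
    # Fires when a container uses more than 80% of one vCPU
    # sustained over a 5-minute window.
    expr: rate(container_cpu_usage_seconds_total[5m]) > 0.8
    for: 5m
    labels:
      severity: warning
    annotations:
      description: "Container {{ $labels.container }} CPU usage is high."
```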
Create custom metrics and use Grafana to display the monitoring metrics
Method 1: Create custom metrics by adding annotations
You can add annotations to pod configuration templates to define custom metrics. Managed Service for Prometheus automatically obtains the custom metrics by performing service discovery. For more information, see Manage service discovery in Kubernetes clusters.
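As a sketch of this method, the following Deployment fragment adds the conventional prometheus.io/* annotations that annotation-based service discovery typically evaluates. The port and path match the values used later in this topic; the Deployment name is hypothetical, and you should confirm the exact annotation keys in Manage service discovery in Kubernetes clusters:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-demo                # hypothetical application name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: metrics-demo
  template:
    metadata:
      labels:
        app: metrics-demo
      annotations:
        prometheus.io/scrape: "true"   # enable annotation-based discovery
        prometheus.io/port: "5000"     # port on which metrics are served
        prometheus.io/path: "/access"  # metrics path used in this topic
    spec:
      containers:
      - name: app
        image: yejianhonghong/pindex   # demo image used in this topic
        ports:
        - containerPort: 5000
```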
Log on to the ACK console. In the left-side navigation pane, click Clusters.
Create an application.
On the Clusters page, click the name of a cluster and choose Workloads > Deployments in the left-side navigation pane.
On the Deployments tab, click Create from Image.
On the Basic Information wizard page, configure the basic settings and click Next.
On the Container page, specify an image that is used to create a web application, specify resource specifications for the web application, and open port 5000. Then, click Next.
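The exact behavior of the demo image is not documented here. The following stdlib-only Python sketch shows one plausible shape of such a web application: it serves the custom gauge current_person_counts in the Prometheus text exposition format on port 5000 at the /access path used later in this topic, and counts every other request as a visit. Everything beyond the metric name, port, and path is an illustrative assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

PERSON_COUNT = 0  # incremented on each page visit


def render_metrics(count: int) -> str:
    # Prometheus text exposition format: HELP, TYPE, then the sample.
    return (
        "# HELP current_person_counts Number of page visits.\n"
        "# TYPE current_person_counts gauge\n"
        f"current_person_counts {count}\n"
    )


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global PERSON_COUNT
        if self.path == "/access":
            # Scrape path configured for service discovery in this topic.
            body = render_metrics(PERSON_COUNT).encode()
        else:
            # Any other request counts as a visit and raises the metric.
            PERSON_COUNT += 1
            body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet


def serve(port: int = 5000) -> HTTPServer:
    # In the container, call serve(5000).serve_forever() to block and serve.
    return HTTPServer(("0.0.0.0", port), AppHandler)
```

Accessing the Service's external endpoint then increases the gauge, and each scrape of /access reports the current value.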
Create a Service.
On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.
Click Create in the upper-right part of the Services page. In the Create Service dialog box, configure the following parameters.
Parameter
Description
Name
Enter a Service name.
Type
Select Server Load Balancer and enable Public Access.
Backend
Select the application that you created.
Port Mapping
Specify Service Port and Container Port.
Click Create to create the Service.
For more information about how to create a Service, see Create a Service.
Create custom metrics.
Log on to the Managed Service for Prometheus console.
In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.
In the left-side navigation pane, click Service Discovery. Then, click the Configure tab and add ServiceMonitor and PodMonitor settings to define Prometheus metric collection rules.
For more information about how to configure custom metrics, see Configure service discoveries.
Click the Targets tab to view the custom metrics that you configured.
In the ACK console, access the external endpoint of the Service that you created to increase the value of the custom metric.
For more information about metrics, see Data model.
View custom metrics in Grafana.
Log on to the Managed Service for Prometheus console.
In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.
In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the icon in the upper-right part of the page and click Add a new panel to add a panel.
Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.
Save the configurations to view custom metrics in the Grafana chart.
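In the panel, the custom metric can be queried with PromQL. A few example queries, assuming current_person_counts is exposed as a gauge as in this topic:

```
current_person_counts               # current value reported by each pod
sum(current_person_counts)          # total across all pods
delta(current_person_counts[10m])   # increase over the last 10 minutes
```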
Method 2: Use ServiceMonitors to create custom metrics
To use ServiceMonitors to create custom metrics, you need to add labels instead of annotations to your Services.
Log on to the ACK console. In the left-side navigation pane, click Clusters.
Create an application.
On the Clusters page, click the name of a cluster and choose Workloads > Deployments in the left-side navigation pane.
On the Deployments tab, click Create from Image.
On the Basic Information wizard page, configure the basic settings and click Next.
On the Container page, specify an image that is used to create a web application, specify resource specifications for the web application, and open port 5000. Then, click Next.
In this example, the yejianhonghong/pindex image is used. On the Advanced wizard page, click Create.
Create a Service.
On the Clusters page, click the name of the cluster that you want to manage and choose Network > Services in the left-side navigation pane.
On the Services page, click Create in the upper-right part. In the Create Service dialog box, configure the following parameters.
Parameter
Description
Name
Enter a Service name.
Type
Select Server Load Balancer and enable Public Access.
Backend
Select the application that you created.
Port Mapping
Specify Service Port and Container Port.
Labels
Add a label. This label is used by the selector of ServiceMonitors.
Click Create to create the Service.
For more information about how to create a Service, see Create a Service.
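The console steps above correspond to a Service manifest like the following. The label and port name match the ServiceMonitor shown later in this topic; the pod selector label is a hypothetical placeholder for the label on your application's pods:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: custom-metrics-pindex
  labels:
    app: custom-metrics-pindex   # label matched by the ServiceMonitor selector
spec:
  type: LoadBalancer             # Server Load Balancer with public access
  selector:
    app: metrics-demo            # hypothetical label on the application pods
  ports:
  - name: web                    # port name referenced by the ServiceMonitor
    port: 5000
    targetPort: 5000
```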
Configure custom metrics and view the endpoints that Managed Service for Prometheus scrapes.
Log on to the Managed Service for Prometheus console.
In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance to go to the instance page.
In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.
On the Configure tab, click the ServiceMonitor tab.
On the ServiceMonitor tab, click Add ServiceMonitor to create a ServiceMonitor.
The following code block shows the YAML template:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # Enter a unique name.
  name: custom-metrics-pindex
  # Specify a namespace.
  namespace: default
spec:
  endpoints:
  - interval: 30s
    # Enter the name of the port specified in the Port Mapping section when you created the Service.
    port: web
    # Enter the path of the Service.
    path: /access
  # The namespace of the demo application.
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      # Enter the label that you added to the Service.
      app: custom-metrics-pindex
Click OK to create the ServiceMonitor.
For more information about how to configure custom metrics, see Manage service discovery.
On the Targets tab, the endpoints that Managed Service for Prometheus scrapes are displayed.
Note: The definition of a ServiceMonitor provides more information than an annotation, including the namespace and name of the Service.
In the ACK console, access the external endpoint of the Service to increase the value of the metric.
For more information about metrics, see Data model.
View custom metrics in Grafana.
Log on to the Managed Service for Prometheus console.
In the upper-left part of the Managed Service for Prometheus page, select the region where your cluster is deployed and click the name of the corresponding Prometheus instance.
In the left-side navigation pane, click Dashboards and click a predefined dashboard to log on to Grafana. Then, click the icon in the upper-right part of the page and click Add a new panel to add a panel.
Select your ACK cluster as the data source and enter a PromQL statement. For example, set Metrics to current_person_counts.
Save the configurations to view custom metrics in the Grafana chart.
FAQ
How do I check the version of the ack-arms-prometheus component?
Log on to the ACK console. In the left-side navigation pane, click Clusters.
On the Clusters page, click the name of the cluster that you want to manage and choose Operations > Add-ons in the left-side navigation pane.
On the Add-ons page, click the Logs and Monitoring tab and find the ack-arms-prometheus component.
The version number is displayed in the lower part of the component. If a new version is available, click Upgrade on the right side to update the component.
Note: The Upgrade button is displayed only if the component has not been updated to the latest version.
Why is Managed Service for Prometheus unable to monitor GPU-accelerated nodes?
This issue is related only to unmanaged Prometheus monitoring agents.
Managed Service for Prometheus may be unable to monitor GPU-accelerated nodes that are configured with taints. You can perform the following steps to view the taints of a GPU-accelerated node.
Run the following command to view the taints of a GPU-accelerated node:
kubectl describe node cn-beijing.47.100.***.***

Expected output:

Taints: test-key=test-value:NoSchedule

If you added custom taints to the GPU-accelerated node, the output shows information about the custom taints. In this example, a taint whose key is set to test-key, value is set to test-value, and effect is set to NoSchedule is added to the node.
Use one of the following methods to handle the taint:
Run the following command to delete the taint from the GPU-accelerated node:
kubectl taint node cn-beijing.47.100.***.*** test-key=test-value:NoSchedule-
Add a toleration rule that allows pods to be scheduled to the GPU-accelerated node with the taint.
# 1. Run the following command to modify ack-prometheus-gpu-exporter:
kubectl edit daemonset -n arms-prom ack-prometheus-gpu-exporter

# 2. Add the following fields to the YAML file to tolerate the taint:
# Irrelevant fields are not shown.
# The tolerations field must be added above the containers field. Both fields must be at the same level.
tolerations:
- key: "test-key"
  operator: "Equal"
  value: "test-value"
  effect: "NoSchedule"
containers:
# Irrelevant fields are not shown.
What do I do if I fail to reinstall ack-arms-prometheus due to residual resource configurations of ack-arms-prometheus?
This issue is related only to unmanaged Prometheus monitoring agents.
If you delete only the namespace of Managed Service for Prometheus, resource configurations are retained. In this case, you may fail to reinstall ack-arms-prometheus. You can perform the following operations to delete the residual resource configurations:
Run the following command to delete the arms-prom namespace:
kubectl delete namespace arms-prom
Run the following commands to delete the related ClusterRoles:
kubectl delete ClusterRole arms-kube-state-metrics
kubectl delete ClusterRole arms-node-exporter
kubectl delete ClusterRole arms-prom-ack-arms-prometheus-role
kubectl delete ClusterRole arms-prometheus-oper3
kubectl delete ClusterRole arms-prometheus-ack-arms-prometheus-role
kubectl delete ClusterRole arms-pilot-prom-k8s
kubectl delete ClusterRole gpu-prometheus-exporter
Run the following commands to delete the related ClusterRoleBindings:
kubectl delete ClusterRoleBinding arms-node-exporter
kubectl delete ClusterRoleBinding arms-prom-ack-arms-prometheus-role-binding
kubectl delete ClusterRoleBinding arms-prometheus-oper-bind2
kubectl delete ClusterRoleBinding arms-kube-state-metrics
kubectl delete ClusterRoleBinding arms-pilot-prom-k8s
kubectl delete ClusterRoleBinding arms-prometheus-ack-arms-prometheus-role-binding
kubectl delete ClusterRoleBinding gpu-prometheus-exporter
Run the following commands to delete the related Roles and RoleBindings:
kubectl delete Role arms-pilot-prom-spec-ns-k8s
kubectl delete Role arms-pilot-prom-spec-ns-k8s -n kube-system
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s
kubectl delete RoleBinding arms-pilot-prom-spec-ns-k8s -n kube-system