Prometheus is an open source project that is used to monitor cloud-native applications. This topic describes how to deploy Prometheus in a Container Service for Kubernetes (ACK) cluster.
Prerequisites
- An ACK cluster is created. For more information, see Create an ACK managed cluster.
- You are connected to the cluster. Make sure that you can view node information such as node labels. For more information, see Obtain the kubeconfig file of a cluster and use kubectl to connect to the cluster.
Background information
- Resource: resource utilization of nodes and applications. In a Kubernetes cluster, the monitoring system monitors the resource usage of nodes, pods, and the cluster.
- Application: internal metrics of applications. For example, the monitoring system dynamically counts the number of online users who are using an application, collects monitoring metrics from application ports, and enables alerting based on the collected metrics.
- Cluster components: the components of the Kubernetes cluster, such as kube-apiserver and kube-controller-manager.
- Static resource entities: the status of resources on nodes and kernel events.
- Dynamic resource entities: entities of abstract workloads in Kubernetes, such as Deployments, DaemonSets, and pods.
- Custom objects in applications: custom data and metrics that are used to monitor applications.
To monitor cluster components and static resource entities, specify the monitoring methods in configuration files.
To monitor dynamic resource entities, you can deploy Prometheus in the Kubernetes cluster.
Procedure
- Deploy Prometheus.
- View the aggregated data.
- View alert rules and set silent alerts.
- View alert rules
  To view alert rules, enter localhost:9090 in the address bar of a browser and click Alerts in the top navigation bar.
  - Red: Alerts are being triggered based on the alert rules that are displayed in red.
  - Green: No alerts are being triggered based on the alert rules that are displayed in green.
- Set silent alerts
  Run the following command. Then, enter localhost:9093 in the address bar of a browser and click Silence to set silent alerts.
  kubectl --namespace monitoring port-forward svc/alertmanager-operated 9093
You can follow the preceding steps to deploy Prometheus in a cluster. The following examples describe how to configure Prometheus in different scenarios.
Alert configurations
To configure alert notification methods or notification templates, perform the following steps to configure the config field in the alertmanager section:
- Configure alert notification methods
  You can configure prometheus-operator to send alert notifications through DingTalk messages or emails. Perform the following steps to configure the alert notification method:
- Configure DingTalk notifications
On the ack-prometheus-operator page, click Deploy. On the Parameters wizard page, set enabled to true in the dingtalk section, set the token field to the webhook URL of your DingTalk chatbot, and set the receiver field of the config parameter in the alertmanager section to the alert name that is specified in the receivers field. The default value of the receivers field is webhook.
If you have two DingTalk chatbots, perform the following steps (see the example configuration after these steps):
- Replace the parameter values in the token field with the webhook URLs of your DingTalk chatbots.
  Copy the webhook URLs of your DingTalk chatbots and replace the values of dingtalk1 and dingtalk2 in the token field with the copied URLs. In this example, https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxx is replaced by the webhook URLs.
- Modify the value of the receiver parameter.
  In the alertmanager section, set the receiver fields in the config parameter to the alert names that are specified in the receivers field. In this example, webhook1 and webhook2 are used.
- Modify the value of the url parameter.
  Replace the chatbot names in the url fields with the names of your DingTalk chatbots. In this example, dingtalk1 and dingtalk2 are used.
Note: To add more DingTalk chatbots, add more webhook URLs.
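The following snippet is a minimal sketch of what these settings can look like on the Parameters wizard page. The field layout may differ slightly depending on the chart version, and the webhook service address in the url fields is a placeholder that you must replace with the address used in your deployment.
dingtalk:
  enabled: true
  token:
    dingtalk1: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxx   # webhook URL of the first chatbot
    dingtalk2: https://oapi.dingtalk.com/robot/send?access_token=yyyyyyyyyy   # webhook URL of the second chatbot
alertmanager:
  config:
    route:
      receiver: webhook1                # default receiver
      routes:
        - receiver: webhook2
          match:
            severity: warning           # example routing condition; adjust to your needs
    receivers:
      - name: webhook1
        webhook_configs:
          - url: http://<dingtalk-webhook-service>/dingtalk/dingtalk1/send    # placeholder address
            send_resolved: true
      - name: webhook2
        webhook_configs:
          - url: http://<dingtalk-webhook-service>/dingtalk/dingtalk2/send    # placeholder address
            send_resolved: true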
- Configure email notifications
  On the ack-prometheus-operator page, click Deploy. On the Parameters wizard page, specify the details about your email address as shown in the red box of the following figure, and set the receiver field of the config parameter in the alertmanager section to the alert name that is specified in the receivers field. The default value of the receivers field is mail. See the example configuration below.
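The following snippet is a minimal sketch of an email configuration. The SMTP endpoint and addresses are placeholders; replace them with your own values, and note that smtp_auth_password must be the SMTP authorization code, not the logon password of the email account.
alertmanager:
  config:
    global:
      smtp_smarthost: <SMTP server address>:<port number>    # the endpoint must include the port number
      smtp_from: sender@example.com
      smtp_auth_username: sender@example.com
      smtp_auth_password: <SMTP authorization code>
    route:
      receiver: mail                                          # must match a name in receivers
    receivers:
      - name: mail
        email_configs:
          - to: receiver@example.com
            send_resolved: true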
- Configure alert notification templates
  You can customize the alert notification template in the templateFiles field of the alertmanager section on the Parameters wizard page, as shown in the following figure.
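The following snippet is a minimal sketch of a custom template. The templateFiles field name comes from the preceding description; the file name and template content are illustrative assumptions that follow the standard Alertmanager template syntax.
alertmanager:
  templateFiles:
    custom-subject.tmpl: |-
      {{ define "custom.subject" }}[{{ .Status | toUpper }}] {{ .CommonLabels.alertname }}{{ end }}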
Storage configuration
- Store data in TSDB
  On the ack-prometheus-operator page, click Deploy. On the Parameters wizard page, set enabled to true in the tsdb section, and set the url fields of the remoteRead and remoteWrite parameters.
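A minimal sketch of the tsdb settings follows. The URLs are placeholders for the remote read and remote write endpoints of your TSDB instance, and the exact nesting of remoteRead and remoteWrite may differ slightly depending on the chart version.
tsdb:
  enabled: true
  remoteRead:
    url: <remote read URL of your TSDB instance>      # placeholder
  remoteWrite:
    url: <remote write URL of your TSDB instance>     # placeholder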
- Store data on disks
  By default, ack-prometheus-operator allows you to store data on Alibaba Cloud disks. On the ack-prometheus-operator page, click Deploy. On the Parameters wizard page, set the storage parameter in the alertmanager section or the storageSpec parameter in the prometheus section. You can specify the disk type in the storageClassName field, the access mode in the accessModes field, and the disk capacity in the storage field.
  Note: For example, to store Prometheus data on a standard SSD, set storageClassName to alicloud-disk-ssd, accessModes to ReadWriteOnce, and storage to 50Gi in the storageSpec parameter, as shown in the following figure.
  To check the configuration, go to the Elastic Compute Service (ECS) console and open the Disks page, where you can view the standard SSD that is used. For information about how to reuse a disk, see Disk volume overview.
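The following snippet is a minimal sketch of this example. It assumes that the storageSpec parameter follows the standard prometheus-operator volumeClaimTemplate layout; adjust the values to your environment.
prometheus:
  storageSpec:
    volumeClaimTemplate:
      spec:
        storageClassName: alicloud-disk-ssd   # disk type
        accessModes:
          - ReadWriteOnce                      # access mode
        resources:
          requests:
            storage: 50Gi                      # disk capacity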
Use prometheus-adapter to enable auto scaling based on custom metrics
prometheus-adapter allows you to specify custom metrics for pod auto scaling. To enable prometheus-adapter, set enabled to true in the prometheusAdapter section and specify custom metrics. This way, the cluster can automatically scale the number of pods based on the specified metrics, which improves resource utilization.
After prometheus-adapter is deployed, run the following command to check whether the custom metrics are exposed through the custom metrics API:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
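The following snippet is a minimal sketch of a custom metric rule. It assumes that the application exposes a Prometheus counter named http_requests_total and that the rules field of the prometheusAdapter section follows the standard prometheus-adapter rule format; adapt both assumptions to your own metrics.
prometheusAdapter:
  enabled: true
  rules:
    custom:
      - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'            # assumed application metric
        resources:
          overrides:
            namespace: {resource: "namespace"}
            pod: {resource: "pod"}
        name:
          matches: "^(.*)_total$"
          as: "${1}_per_second"                                               # exposed as http_requests_per_second
        metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
A HorizontalPodAutoscaler can then reference the http_requests_per_second metric of type Pods to scale the workload.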
Mount a ConfigMap to Prometheus
This section describes how to mount a ConfigMap to the /etc/prometheus/configmaps/ path of a pod.

If prometheus-operator has been deployed in your cluster, perform the following steps:
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of the cluster and open the Helm page from the left-side navigation pane.
- Find the ack-prometheus-operator release and click Update in the Actions column.
- In the Update Release panel, set the configMaps fields in the prometheus and alertmanager sections to the name of the ConfigMap that you want to mount. Then, click OK.
For example, you want to mount a ConfigMap named special-config, which contains the configuration of Prometheus. To use the special-config ConfigMap as a configuration file of the Prometheus pod, add it to the configMaps field in the prometheus section so that it is mounted to the /etc/prometheus/configmaps/ path. The following figures show an example of the special-config ConfigMap and how to set the configMaps field in the prometheus section.
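The following snippet is a minimal sketch of the configMaps setting, assuming that the special-config ConfigMap already exists in the namespace of the Prometheus pod. Each ConfigMap listed here is mounted under the /etc/prometheus/configmaps/ path.
prometheus:
  configMaps:
    - special-config          # name of the ConfigMap to mount
alertmanager:
  configMaps: []              # list ConfigMaps here instead if they are intended for Alertmanager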
Configure Grafana
- Mount the dashboard configuration to Grafana
  You can mount a ConfigMap that contains the dashboard configuration to the Grafana pod. On the ack-prometheus-operator page, click Deploy. On the Parameters wizard page, add the following configurations to the extraConfigmapMounts section, as shown in the following figure and in the example after this list.
  Note:
  - Make sure that a ConfigMap that contains the dashboard configuration exists in your cluster. The labels that are added to this ConfigMap must be the same as those added to other ConfigMaps.
  - In the extraConfigmapMounts section of the Grafana configuration, specify the name of the ConfigMap and how to mount it:
    - Set mountPath to /tmp/dashboards/.
    - Set configMap to the name of the ConfigMap.
    - Set name to the name of the JSON file that stores the dashboard configuration.
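The following snippet is a minimal sketch of the extraConfigmapMounts setting. The ConfigMap name my-dashboard and the file name my-dashboard.json are placeholders for illustration.
grafana:
  extraConfigmapMounts:
    - name: my-dashboard.json        # name of the JSON file that stores the dashboard configuration
      mountPath: /tmp/dashboards/
      configMap: my-dashboard        # name of the ConfigMap in your cluster
      readOnly: true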
- Enable data persistence for dashboards
You can perform the following steps to enable data persistence for Grafana dashboards:
- Log on to the ACK console and click Clusters in the left-side navigation pane.
- On the Clusters page, click the name of the cluster and open the Helm page from the left-side navigation pane.
- Find ack-prometheus-operator and click Update in the Actions column.
- In the Update Release panel, configure the persistence field in the grafana section as shown in the following figure.
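The following snippet is a minimal sketch of the persistence setting. The storage class and size are assumptions; adjust them to your environment.
grafana:
  persistence:
    enabled: true
    storageClassName: alicloud-disk-ssd   # assumed disk type
    accessModes:
      - ReadWriteOnce
    size: 20Gi                             # assumed capacity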
You can export data on Grafana dashboards in JSON format to your on-premises machine. For more information, see Export a Grafana dashboard.
FAQ
- What do I do if I fail to receive DingTalk alert notifications?
- Obtain the webhook URL of your DingTalk chatbot. For more information, see Scenario 3: Use DingTalk to raise alerts upon Kubernetes events.
- On the Parameters wizard page, find the dingtalk section, set enabled to true, and then specify the webhook URL of your DingTalk chatbot in the token field. For more information, see Configure DingTalk notifications in Alert configurations.
- What do I do if an error message appears when I deploy prometheus-operator in a cluster? The following error message appears:
Can't install release with errors: rpc error: code = Unknown desc = object is being deleted: customresourcedefinitions.apiextensions.k8s.io "xxxxxxxx.monitoring.coreos.com" already exists
The error message indicates that the cluster failed to clear the custom resource definition (CRD) objects of the previous deployment. Run the following commands to delete the CRD objects. Then, deploy prometheus-operator again:
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
- What do I do if I fail to receive email alert notifications?
Make sure that the value of smtp_auth_password is the SMTP authorization code instead of the logon password of the email account. Also make sure that the SMTP server endpoint includes a port number.
- What do I do if the console prompts the following error message after I click Update to update the YAML template: The current cluster is temporarily unavailable. Try again later or submit a ticket?
If the configuration file of Tiller is too large, the cluster cannot be accessed. To resolve this issue, delete some annotations from the configuration file and mount the file to a pod as a ConfigMap. Then, specify the name of the ConfigMap in the configMaps fields of the prometheus and alertmanager sections. For more information, see Mount a ConfigMap to Prometheus.
- How do I enable the features of prometheus-operator after I deploy it in a cluster?
After prometheus-operator is deployed, perform the following steps to enable its features: Go to the cluster details page and open the Helm page from the left-side navigation pane. On the Helm page, find ack-prometheus-operator and click Update in the Actions column. In the Update Release panel, modify the configuration to enable the features that you need. Then, click OK.
- How do I select a data store: TSDB or disks?
TSDB storage is available only in some regions, whereas disk storage is supported in all regions. The following figure shows how to configure the data retention policy.
- What do I do if a Grafana dashboard fails to display data properly?
Go to the cluster details page and open the Helm page from the left-side navigation pane. On the Helm page, find ack-prometheus-operator and click Update in the Actions column. In the Update Release panel, check whether the value of the clusterVersion field is correct. If the Kubernetes version of your cluster is earlier than 1.16, set clusterVersion to 1.14.8-aliyun.1. If the Kubernetes version is 1.16 or later, set clusterVersion to 1.16.6-aliyun.1.
- What do I do if I fail to install ack-prometheus after I delete the ack-prometheus namespace?
After you delete the ack-prometheus namespace, the related resource configurations may be retained. In this case, you may fail to install ack-prometheus again. Perform the following operations to delete these resource configurations:
- Delete role-based access control (RBAC)-related resource configurations.
- Run the following commands to delete the related ClusterRoles:
kubectl delete ClusterRole ack-prometheus-operator-grafana-clusterrole
kubectl delete ClusterRole ack-prometheus-operator-kube-state-metrics
kubectl delete ClusterRole psp-ack-prometheus-operator-kube-state-metrics
kubectl delete ClusterRole psp-ack-prometheus-operator-prometheus-node-exporter
kubectl delete ClusterRole ack-prometheus-operator-operator
kubectl delete ClusterRole ack-prometheus-operator-operator-psp
kubectl delete ClusterRole ack-prometheus-operator-prometheus
kubectl delete ClusterRole ack-prometheus-operator-prometheus-psp
- Run the following commands to delete the related ClusterRoleBindings:
kubectl delete ClusterRoleBinding ack-prometheus-operator-grafana-clusterrolebinding
kubectl delete ClusterRoleBinding ack-prometheus-operator-kube-state-metrics
kubectl delete ClusterRoleBinding psp-ack-prometheus-operator-kube-state-metrics
kubectl delete ClusterRoleBinding psp-ack-prometheus-operator-prometheus-node-exporter
kubectl delete ClusterRoleBinding ack-prometheus-operator-operator
kubectl delete ClusterRoleBinding ack-prometheus-operator-operator-psp
kubectl delete ClusterRoleBinding ack-prometheus-operator-prometheus
kubectl delete ClusterRoleBinding ack-prometheus-operator-prometheus-psp
- Run the following commands to delete the related CRD objects:
kubectl delete crd alertmanagerconfigs.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd probes.monitoring.coreos.com
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com