Prometheus Service (Prometheus) allows you to use a Grafana dashboard to display monitoring data. You can customize a Grafana dashboard or import a Grafana dashboard from the official website of Grafana. This topic describes how to customize a Grafana dashboard to display monitoring data. In this example, Container Service for Kubernetes (ACK) and Container Registry are monitored.

Prerequisites

Procedure

The following figure shows the process of using Prometheus to customize a Grafana dashboard.

Custom Grafana dashboard

Step 1: Upload an application

To build an image for an application and upload the image to an image repository of Container Registry, perform the following steps:

  1. Run the following command to recompile a module:
    mvn clean install -DskipTests
  2. Run the following command to build an image:
    docker build -t <Name of the local temporary Docker image>:<Version number of the local temporary Docker image> . --no-cache
    Example:
    docker build -t prometheus-demo:v0 . --no-cache
  3. Run the following command to tag the image:
    sudo docker tag <Name of the local temporary Docker image>:<Version number of the local temporary Docker image> <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
    Example:
    sudo docker tag prometheus-demo:v0 registry.cn-hangzhou.aliyuncs.com/testnamespace/prometheus-demo:v0
  4. Run the following command to upload the image to the image repository:
    sudo docker push <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
    Example:
    sudo docker push registry.cn-hangzhou.aliyuncs.com/testnamespace/prometheus-demo:v0
    You can view the information about the uploaded application image on the Tags page of the Container Registry console.
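
The four commands above can also be chained in a small script. The following is a minimal sketch rather than an official tool: the function names run and publish_image and the DRY_RUN variable are hypothetical, and sudo is omitted, so add it if your Docker daemon requires it.

```shell
# Sketch only: chains the recompile, build, tag, and push commands from Step 1.
# run, publish_image, and DRY_RUN are hypothetical names used for illustration.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: publish_image <registry-domain> <namespace> <image-name> <version>
# The body runs in a subshell so that its variables do not leak to the caller.
publish_image() (
  registry="$1"
  namespace="$2"
  name="$3"
  version="$4"
  local_ref="${name}:${version}"
  remote_ref="${registry}/${namespace}/${name}:${version}"

  # Stop at the first command that fails.
  run mvn clean install -DskipTests &&
    run docker build -t "$local_ref" . --no-cache &&
    run docker tag "$local_ref" "$remote_ref" &&
    run docker push "$remote_ref"
)
```

Setting DRY_RUN=1 before the call prints the four commands without running them, which is a quick way to check the resulting image address before anything is pushed.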

Step 2: Deploy the application

To deploy the application to an ACK cluster, perform the following steps:

  1. Log on to the Alibaba Cloud Container Service for Kubernetes console.
  2. In the left-side navigation pane, click Clusters.
  3. On the Clusters page, find the cluster to which you want to deploy the application and click Applications in the Actions column.
  4. Create a container group.
    1. In the left-side navigation pane, choose Workloads > Deployments.
    2. On the Deployments page, click Create from YAML in the upper-right corner.
    3. On the Create page, enter the following code in the Template code editor and click Create:
      apiVersion: apps/v1 # apps/v1 is available in Kubernetes 1.9 and later
      kind: Deployment
      metadata:
        name: prometheus-demo
      spec:
        replicas: 2
        # apps/v1 requires a selector that matches the pod template labels.
        selector:
          matchLabels:
            app: tomcat
        template:
          metadata:
            annotations:
              prometheus.io/scrape: 'true'
              prometheus.io/path: '/prometheus-metrics'
              prometheus.io/port: '8081'
            labels:
              app: tomcat
          spec:
            containers:
            - name: tomcat
              imagePullPolicy: Always
              image: <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
              ports:
              - containerPort: 8080
                name: tomcat-normal
              - containerPort: 8081
                name: tomcat-monitor
      Sample code:
      apiVersion: apps/v1 # apps/v1 is available in Kubernetes 1.9 and later
      kind: Deployment
      metadata:
        name: prometheus-demo
        labels:
          app: tomcat
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: tomcat
        template:
          metadata:
            annotations:
              prometheus.io/scrape: 'true'
              prometheus.io/path: '/prometheus-metrics'
              prometheus.io/port: '8081'
            labels:
              app: tomcat
          spec:
            containers:
            - name: tomcat
              imagePullPolicy: Always
              image: registry.cn-hangzhou.aliyuncs.com/peiyu-test/prometheus-demo:v0
              ports:
              - containerPort: 8080
                name: tomcat-normal
              - containerPort: 8081
                name: tomcat-monitor
    The Deployments page displays the created container group.
  5. Create a Service.
    1. In the left-side navigation pane, choose Networks > Services.
    2. On the Services page, click Create Resources in YAML.
    3. On the Create page, enter the following code in the Template code editor and click Create:
      apiVersion: v1
      kind: Service
      metadata:
        labels:
          app: tomcat
        name: tomcat
        namespace: default
      spec:
        ports:
        - name: tomcat-normal
          port: 8080
          protocol: TCP
          targetPort: 8080
        - name: tomcat-monitor
          port: 8081
          protocol: TCP
          targetPort: 8081
        type: NodePort
        selector:
          app: tomcat
    The Services page displays the created Service.

Step 3: Configure data collection rules

By default, Prometheus collects CPU, memory, and network metrics. To monitor other data, such as order information, you must configure data collection rules so that Prometheus collects metrics from the application.

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, click Prometheus Monitoring.
  3. In the top navigation bar of the Prometheus Monitoring page, select the region where the ACK cluster resides. Find the cluster and click Settings in the Actions column.
  4. Configure data collection rules for Prometheus to monitor the application in the following scenarios:
    • To monitor the business data of the application that is deployed to the ACK cluster, such as order information, perform the following steps:
      1. On the Settings page, click the Service Discovery tab. On the Service Discovery tab, click the ServiceMonitor tab.
      2. On the ServiceMonitor tab, click Add ServiceMonitor.
      3. In the Add ServiceMonitor dialog box, enter the following code and click OK:
        apiVersion: monitoring.coreos.com/v1
        kind: ServiceMonitor
        metadata:
          # Enter a unique name.
          name: tomcat-demo
          # Enter a namespace.
          namespace: default
        spec:
          endpoints:
          - interval: 30s
            # Enter the name of the Service port that exposes metrics, as defined in the Service YAML file.
            port: tomcat-monitor
            # Enter the path on which the application exposes metrics.
            path: /metrics
          namespaceSelector:
            # Match Services in all namespaces.
            any: true
          selector:
            matchLabels:
              # Enter the label of the Service, as defined in the Service YAML file, to select the Service.
              app: tomcat

        The ServiceMonitor tab displays the configured service discovery task.

        Service discovery
    • To monitor business data outside the ACK cluster, such as the number of Redis connections, perform the following steps:
      1. On the Settings page, click the Agent Settings tab.
      2. On the Agent Settings tab, enter the following code in the Prometheus.yaml code editor and click Save:
        global:
          scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
          evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
        scrape_configs:
          - job_name: 'prometheus'
            static_configs:
            - targets: ['localhost:9090']
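
Before you add a collection rule, it can help to confirm that the application endpoint returns metrics in the Prometheus text format. The following is a minimal sketch and not part of ARMS: list_metric_names is a hypothetical helper that reads the text exposition output on standard input and prints each metric name once.

```shell
# Sketch only: list_metric_names is a hypothetical helper.
# It reads Prometheus text exposition format on standard input,
# drops comment lines (# HELP, # TYPE) and blank lines,
# keeps only the metric name at the start of each sample line,
# and prints the unique names in sorted order.
list_metric_names() {
  grep -v -e '^#' -e '^[[:space:]]*$' |
    sed -E 's/^([a-zA-Z_:][a-zA-Z0-9_:]*).*/\1/' |
    sort -u
}
```

For example, assuming the Service is reachable through a node IP address and node port (both placeholders), you could run curl -s http://<Node IP address>:<Node port>/<Metrics path> | list_metric_names and check that the business metrics you expect are listed.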

Step 4: Configure a Grafana dashboard

To configure a Grafana dashboard to display data, perform the following steps:

  1. Go to the homepage of Grafana dashboards.
  2. In the left-side navigation pane, choose + > Dashboard.
  3. On the New dashboard page, click Add new panel.
    Create a Grafana dashboard
  4. On the New dashboard / Edit Panel page, select your ACK cluster from the drop-down list on the Query tab. In the A collapse panel, select a metric from the Metrics drop-down list. Example: go_gc_duration_seconds.
    Query tab
  5. On the right-side Panel tab, enter a panel name, select the visualization type of the dashboard, such as a chart, table, or heatmap, and then set other parameters as required.
    Visualization
  6. Click Save in the upper-right corner. In the Save dashboard as... dialog box, enter a dashboard name, select your ACK cluster, and then click Save.

    You can create multiple dashboards and charts as required.

    Save a Grafana dashboard
    The following figure shows the configured Grafana dashboard.

Step 5: Monitor complex metrics

To monitor metrics that involve complex operations, you must debug the corresponding PromQL statement in Prometheus.

  1. Go to the homepage of Grafana dashboards.
  2. In the left-side navigation pane, click the Explore icon.
  3. In the upper part of the Explore page, select your ACK cluster from the drop-down list. Enter a PromQL statement in the Metrics field and click Run Query in the upper-right corner for debugging.
    Prometheus data debugging
  4. After the debugging is successful, you can repeat the preceding steps to create more dashboards or charts. For more information, see Step 4: Configure a Grafana dashboard.

Step 6: Create an alert

To create an alert to monitor metrics of Prometheus, perform the following steps:

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, click Prometheus Monitoring.
  3. In the top navigation bar of the Prometheus Monitoring page, select the region where the monitored Kubernetes cluster resides. Then, click the name of the Kubernetes cluster.
  4. In the left-side navigation pane, click Alarm configuration.
  5. On the Alarm configuration page, click Create Alert in the upper-right corner.
  6. In the Create Alert panel, set the parameters.
    1. Optional: Select a template from the Alarm template drop-down list.
    2. Enter a rule name in the Rule Name field. Example: Alert for inbound traffic.
    3. Enter a PromQL statement as the expression in the Alarm expression field. Example: (sum(rate(kube_state_metrics_list_total{job="kube-state-metrics",result="error"}[5m])) / sum(rate(kube_state_metrics_list_total{job="kube-state-metrics"}[5m]))) > 0.01.
      Notice If a PromQL statement contains a dollar sign ($), an error is returned. You must delete each label matcher that contains a dollar sign ($) from the statement, that is, the operator such as the equal sign (=) together with the label name and the value on either side of it. For example, change sum (rate (container_network_receive_bytes_total{instance=~"^$HostIp.*"}[1m])) to sum (rate (container_network_receive_bytes_total[1m])).
    4. Enter a number N in the duration field. After the alert rule is created, an alert notification is sent only when the alert condition is met for N consecutive minutes. For example, you can enter 1. In this case, an alert notification is sent only when the alert condition is met for one consecutive minute.
      Note The alert condition is the condition specified by the PromQL statement. By default, Prometheus collects data at intervals of 15 seconds. If fewer than 4N consecutive data records meet the alert condition, no alert notification is sent. The threshold 4N is calculated by using the following formula: N × 60/15 = 4N. In this example, N is set to 1. Therefore, an alert notification is sent only when four consecutive data records meet the alert condition. To ensure that an alert notification is sent each time a collected data record meets the alert condition, set N to 0.
    5. Enter the notification content in the Alarm message field.
    6. Optional: In the Labels section of Advanced Configuration, click Create Tag to add one or more tags to the alert rule. The specified tags can be used as options for a notification rule.
    7. Optional: In the Annotations section of Advanced Configuration, click Create Annotation. Then, enter message in the Key field and {{variable name}} alert message in the Value field. The specified annotation is in the format of message:{{variable name}} alert message. Example: message:{{$labels.pod_name}} restart.

      You can customize a variable name or select an existing tag as the variable name. The following content describes the existing tags:

      • The tags that are carried in the metrics of an alert rule expression.
      • The tags that are created when you create an alert rule. For more information, see Create an alert rule.
      • The default tags provided by ARMS. The following table describes the default tags.
        Tag                               Description
        alertname                         The name of the alert. The format is <Alert name>_<Cluster name>.
        _aliyun_arms_alert_level          The level of the alert.
        _aliyun_arms_alert_type           The type of the alert.
        _aliyun_arms_alert_rule_id        The ID of the alert rule.
        _aliyun_arms_region_id            The ID of the region.
        _aliyun_arms_userid               The ID of the user.
        _aliyun_arms_involvedObject_type  The subtype of the associated object, for example, ManagedKubernetes or ServerlessKubernetes.
        _aliyun_arms_involvedObject_kind  The type of the associated object, for example, app or cluster.
        _aliyun_arms_involvedObject_id    The ID of the associated object.
        _aliyun_arms_involvedObject_name  The name of the associated object.
    8. Click OK.
    After you create the alert rule, the Alarm configuration page displays the rule, as shown in the following figure.
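
Two of the rules described in this step can be checked locally before you fill in the Create Alert panel. The following is a minimal sketch, not an ARMS feature. The function names required_samples and strip_dollar_matchers are hypothetical. The first applies the N × 60/15 formula from the note on the duration field, with the 15-second collection interval as an overridable default. The second is a heuristic that removes label matchers whose value contains a dollar sign ($) from a PromQL statement; it assumes double-quoted values without escaped quotation marks.

```shell
# Sketch only: both helpers are hypothetical and meant for local checks.

# Usage: required_samples <N minutes> [collection interval in seconds, default 15]
# Prints how many consecutive data records must meet the alert condition.
required_samples() (
  minutes="$1"
  interval="${2:-15}"
  echo $(( minutes * 60 / interval ))
)

# Usage: strip_dollar_matchers '<PromQL statement>'
# Removes every label matcher whose quoted value contains "$",
# then cleans up a trailing comma and empty braces left behind.
strip_dollar_matchers() {
  printf '%s\n' "$1" | sed -E \
    -e 's/[a-zA-Z_][a-zA-Z0-9_]*[[:space:]]*(=~|!~|!=|=)[[:space:]]*"[^"]*\$[^"]*"[[:space:]]*,?[[:space:]]*//g' \
    -e 's/,[[:space:]]*\}/}/g' \
    -e 's/\{[[:space:]]*\}//g'
}
```

With these definitions, required_samples 1 corresponds to the 4N example above, and passing the container_network_receive_bytes_total statement from the notice to strip_dollar_matchers yields the statement without the instance matcher. Review the result before you use it, because removing a matcher widens the set of time series that the expression selects.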