Application Real-Time Monitoring Service (ARMS) Managed Service for Prometheus allows you to use a Grafana dashboard to display monitoring data. You can create a custom Grafana dashboard or import a Grafana dashboard from the official website of Grafana. This topic describes how to create a custom Grafana dashboard to display monitoring data. In this example, Container Service for Kubernetes (ACK) and Container Registry are used.

Prerequisites

Procedure

The following figure shows how to create a Grafana dashboard by using Managed Service for Prometheus.

Custom Grafana dashboard

Step 1: Upload an application

To build an image for an application and upload the image to an image repository of Container Registry, perform the following steps:

  1. Run the following command to recompile a module:
    mvn clean install -DskipTests
  2. Run the following command to build an image:
    docker build -t <Name of the local temporary Docker image>:<Version number of the local temporary Docker image> . --no-cache
    Example:
    docker build -t prometheus-demo:v0 . --no-cache
  3. Run the following command to tag the image:
    sudo docker tag <Name of the local temporary Docker image>:<Version number of the local temporary Docker image> <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
    Example:
    sudo docker tag prometheus-demo:v0 registry.cn-hangzhou.aliyuncs.com/testnamespace/prometheus-demo:v0
  4. Run the following command to upload the image to the image repository:
    sudo docker push <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
    Example:
    sudo docker push registry.cn-hangzhou.aliyuncs.com/testnamespace/prometheus-demo:v0
    You can view the information about the uploaded application image on the Tags page of the Container Registry console.
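
The build, tag, and push commands in this step share the same two image references, so it can help to derive them once from the individual parts. The following is a minimal Python sketch of that idea. The function name image_commands and the example values are illustrative assumptions; the sketch only composes the command lines and does not run Docker.

```python
import shlex


def image_commands(local_name, local_version, registry, namespace, image, version):
    """Return the build, tag, and push commands from Step 1 as argument lists."""
    # Local temporary image reference: <name>:<version>
    local_ref = f"{local_name}:{local_version}"
    # Remote reference: <registry domain>/<namespace>/<image name>:<image version>
    remote_ref = f"{registry}/{namespace}/{image}:{version}"
    return [
        ["docker", "build", "-t", local_ref, ".", "--no-cache"],
        ["sudo", "docker", "tag", local_ref, remote_ref],
        ["sudo", "docker", "push", remote_ref],
    ]


if __name__ == "__main__":
    # Example values only; replace them with your own repository details.
    commands = image_commands(
        "prometheus-demo", "v0",
        "registry.cn-hangzhou.aliyuncs.com", "testnamespace",
        "prometheus-demo", "v0",
    )
    for command in commands:
        print(shlex.join(command))
```

Printing the commands before you run them makes it easy to confirm that the tag and push steps point at the same repository path.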

Step 2: Deploy the application

To deploy the application to an ACK cluster, perform the following steps:

  1. Log on to the ACK console.
  2. In the left-side navigation pane, click Clusters.
  3. On the Clusters page, find the cluster to which you want to deploy the application and click Applications in the Actions column.
  4. Create a container group.
    1. In the left-side navigation pane, choose Workloads > Deployments.
    2. On the Deployments page, click Create from Template.
    3. On the Create page, enter the following code in the Template code editor and click Create:
      apiVersion: apps/v1 # for versions before 1.8.0 use apps/v1beta1
      kind: Deployment
      metadata:
        name: prometheus-demo
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: tomcat
        template:
          metadata:
            annotations:
              prometheus.io/scrape: 'true'
              prometheus.io/path: '/prometheus-metrics'
              prometheus.io/port: '8081'
            labels:
              app: tomcat
          spec:
            containers:
            - name: tomcat
              imagePullPolicy: Always
              image: <Domain name of the image repository>/<Namespace>/<Image name>:<Image version number>
              ports:
              - containerPort: 8080
                name: tomcat-normal
              - containerPort: 8081
                name: tomcat-monitor
      Sample code:
      apiVersion: apps/v1 # for versions before 1.8.0 use apps/v1beta1
      kind: Deployment
      metadata:
        name: prometheus-demo
        labels:
          app: tomcat
      spec:
        replicas: 2
        selector:
          matchLabels:
            app: tomcat
        template:
          metadata:
            annotations:
              prometheus.io/scrape: 'true'
              prometheus.io/path: '/prometheus-metrics'
              prometheus.io/port: '8081'
            labels:
              app: tomcat
          spec:
            containers:
            - name: tomcat
              imagePullPolicy: Always
              image: registry.cn-hangzhou.aliyuncs.com/peiyu-test/prometheus-demo:v0
              ports:
              - containerPort: 8080
                name: tomcat-normal
              - containerPort: 8081
                name: tomcat-monitor
    The newly created container group is displayed on the Deployments page.
  5. Create a Service.
    1. In the left-side navigation pane, choose Network > Services.
    2. On the Services page, click Create Resources in YAML.
    3. On the Create page, enter the following code in the Template code editor and click Create:
      apiVersion: v1
      kind: Service
      metadata:
        labels:
          app: tomcat
        name: tomcat
        namespace: default
      spec:
        ports:
        - name: tomcat-normal
          port: 8080
          protocol: TCP
          targetPort: 8080
        - name: tomcat-monitor
          port: 8081
          protocol: TCP
          targetPort: 8081
        type: NodePort
        selector:
          app: tomcat
    The newly created Service is displayed on the Services page.

Step 3: Configure data collection rules

By default, Managed Service for Prometheus collects CPU, memory, and network metrics. To monitor other data, such as order information, you must configure data collection rules so that Managed Service for Prometheus can monitor the application.

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, choose Prometheus Service > Prometheus Instances.
  3. Click the name of the Prometheus instance that you want to manage.
  4. Configure data collection rules for Managed Service for Prometheus to monitor the application in the following scenarios:
    • To monitor the business data of the application that is deployed to the ACK cluster, such as order information, perform the following steps:
      1. In the left-side navigation pane, click Service Discovery. Then, click the Configure tab.
      2. On the Configure tab, click the ServiceMonitor tab. Then, click Add ServiceMonitor.
      3. In the Add ServiceMonitor dialog box, enter the following code and click OK:
        apiVersion: monitoring.coreos.com/v1
        kind: ServiceMonitor
        metadata:
          # Enter a unique name.
          name: tomcat-demo
          # Enter a namespace.
          namespace: default
        spec:
          endpoints:
          - interval: 30s
            # Enter the name of the Prometheus exporter port as defined in the ports section of the service.yaml file.
            port: tomcat-monitor
            # Enter the path of the Prometheus exporter.
            path: /metrics
          # Select the namespaces to search. In this demo, all namespaces are searched.
          namespaceSelector:
            any: true
          selector:
            matchLabels:
              # Enter a label defined in the service.yaml file so that the Service can be matched.
              app: tomcat

        The configured service discovery task is displayed on the ServiceMonitor tab.

    • To monitor business data outside the ACK cluster, such as the number of Redis connections, perform the following steps:
      1. In the left-side navigation pane, click Settings. On the Settings page, click Edit Prometheus.yaml.
      2. In the dialog box that appears, enter the following code and click Save:
        global:
          scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
          evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
        scrape_configs:
          - job_name: 'prometheus'
            static_configs:
            - targets: ['localhost:9090']
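
For the ServiceMonitor scenario above, the rule only collects data if its selector labels match the labels of the Service and its endpoint port matches a port name of the Service. The following Python sketch illustrates that relationship with plain dictionaries trimmed from the YAML in this topic. The function name service_monitor_matches is a hypothetical helper, and namespace selection is left out because the example uses any: true.

```python
def service_monitor_matches(service, monitor):
    """Return True if the ServiceMonitor selects the Service and can find its port."""
    service_labels = service["metadata"].get("labels", {})
    wanted_labels = monitor["spec"]["selector"].get("matchLabels", {})
    # Every label in matchLabels must be present on the Service with the same value.
    labels_ok = all(service_labels.get(k) == v for k, v in wanted_labels.items())
    # Each endpoint refers to a Service port by its name, not by its number.
    port_names = {port.get("name") for port in service["spec"]["ports"]}
    ports_ok = all(ep["port"] in port_names for ep in monitor["spec"]["endpoints"])
    return labels_ok and ports_ok


# Trimmed copies of the Service and ServiceMonitor used in this topic.
SERVICE = {
    "metadata": {"name": "tomcat", "namespace": "default", "labels": {"app": "tomcat"}},
    "spec": {
        "ports": [
            {"name": "tomcat-normal", "port": 8080},
            {"name": "tomcat-monitor", "port": 8081},
        ]
    },
}
SERVICE_MONITOR = {
    "metadata": {"name": "tomcat-demo", "namespace": "default"},
    "spec": {
        "endpoints": [{"interval": "30s", "port": "tomcat-monitor", "path": "/metrics"}],
        "selector": {"matchLabels": {"app": "tomcat"}},
    },
}

if __name__ == "__main__":
    print(service_monitor_matches(SERVICE, SERVICE_MONITOR))
```

If the check fails for your own configuration, compare the port name and labels in the ServiceMonitor against the Service definition before you look elsewhere.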

Step 4: Configure a Grafana dashboard

To configure a Grafana dashboard to display data, perform the following steps:

  1. Go to the Grafana dashboard homepage.
  2. In the left-side navigation pane, choose + > Dashboard.
  3. On the New dashboard page, click Add new panel.
  4. On the New dashboard / Edit Panel page, select your ACK cluster from the drop-down list on the Query tab. In the A collapsible section, select a metric from the Metrics drop-down list. Example: go_gc_duration_seconds.
  5. On the right-side Panel tab, enter a panel name, select the visualization type of the dashboard, such as a chart, table, or heatmap, and then set other parameters as required.
  6. Click Save in the upper-right corner. In the Save dashboard as... dialog box, enter a dashboard name, select your ACK cluster, and then click Save.

    You can create multiple dashboards and charts as required.

    The following figure shows the configured Grafana dashboard.

Step 5: Monitor complex metrics

To monitor metrics that involve complex operations, you must debug the corresponding PromQL statement in Managed Service for Prometheus.

  1. Go to the homepage of Grafana dashboards.
  2. In the left-side navigation pane, click Explore.
  3. In the upper part of the Explore page, select your ACK cluster from the drop-down list. Enter a PromQL statement in the Metrics field and click Run Query in the upper-right corner for debugging.
  4. After the debugging is successful, you can repeat the preceding steps to create more dashboards or charts. For more information, see Step 4: Configure a Grafana dashboard.
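
A PromQL statement can also be checked outside Grafana through the standard Prometheus HTTP API, which serves instant queries at /api/v1/query. The following Python sketch builds such a request by using only the standard library. The base URL is an assumption: whether your Prometheus instance exposes this endpoint to you, and which authentication it requires, is not covered in this topic.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    # urlencode escapes characters such as (, ), [ and ] in the PromQL statement.
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def run_query(base_url, promql, timeout=10):
    """Send the query and return the decoded JSON response."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, run_query("http://localhost:9090", "go_gc_duration_seconds") queries a hypothetical Prometheus-compatible endpoint on the local machine. A successful response carries "status": "success" and the matching series under data.result.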

Step 6: Create an alert rule

To create an alert rule to monitor a metric, perform the following steps:

  1. Log on to the ARMS console.
  2. In the left-side navigation pane, click Prometheus Monitoring.
  3. In the top navigation bar of the Prometheus Monitoring page, select the region where the monitored Kubernetes cluster resides. Then, click the name of the Kubernetes cluster.
  4. In the left-side navigation pane, click Alarm configuration.
  5. On the Alarm configuration page, click Create Alert in the upper-right corner.
  6. In the Create Alert panel, set the parameters.
    1. Optional: Select a template from the Alarm template drop-down list.
    2. Enter a rule name in the Rule Name field. Example: Alert for inbound traffic.
    3. Enter a PromQL statement as the expression in the Alarm expression field. Example: (sum(rate(kube_state_metrics_list_total{job="kube-state-metrics",result="error"}[5m])) / sum(rate(kube_state_metrics_list_total{job="kube-state-metrics"}[5m]))) > 0.01.
      Important: If a PromQL statement contains a dollar sign ($), an error is returned. You must delete the label matcher that contains the dollar sign ($), that is, the operator (such as = or =~) together with the label name and value on either side of it. For example, change sum (rate (container_network_receive_bytes_total{instance=~"^$HostIp.*"}[1m])) to sum (rate (container_network_receive_bytes_total[1m])).
    4. Enter a number N in the duration field. After the alert rule is created, an alert notification is sent only when the alert condition is met for N consecutive minutes. For example, you can enter 1. In this case, an alert notification is sent only when the alert condition is met for one consecutive minute.
      Note: The alert condition refers to the condition specified by the PromQL statement. By default, Prometheus collects data at intervals of 15 seconds. If fewer than 4N consecutively collected data records meet the alert condition, no alert notification is sent. The threshold 4N is calculated by using the following formula: N × 60/15 = 4N. In this example, N is set to 1. Therefore, an alert notification is sent only when four consecutively collected data records meet the alert condition. To ensure that an alert notification can be sent each time a collected data record meets the alert condition, set N to 0.
    5. Enter the notification content in the Alarm message field.
    6. Optional: In the Labels section of Advanced Configuration, click Create Tag to add one or more tags to the alert rule. The specified tags can be used as options for a notification rule.
    7. Optional: In the Annotations section of Advanced Configuration, click Create Annotation. Then, enter message in the Key field and {{variable name}} alert message in the Value field. The specified annotation is in the format of message:{{variable name}} alert message. Example: message:{{$labels.pod_name}} restart.

      You can customize a variable name or select an existing tag as the variable name. The following content describes the existing tags:

      • The tags that are carried in the metrics of an alert rule expression.
      • The tags that are created when you create an alert rule. For more information, see Create an alert rule.
      • The default tags provided by ARMS. The following table describes the default tags.
        Tag                               Description
        alertname                         The name of the alert. The format is <Alert name>_<Cluster name>.
        _aliyun_arms_alert_level          The level of the alert.
        _aliyun_arms_alert_type           The type of the alert.
        _aliyun_arms_alert_rule_id        The ID of the alert rule.
        _aliyun_arms_region_id            The ID of the region.
        _aliyun_arms_userid               The ID of the user.
        _aliyun_arms_involvedObject_type  The subtype of the associated object, for example, ManagedKubernetes or ServerlessKubernetes.
        _aliyun_arms_involvedObject_kind  The type of the associated object, for example, app or cluster.
        _aliyun_arms_involvedObject_id    The ID of the associated object.
        _aliyun_arms_involvedObject_name  The name of the associated object.
    8. Click OK.
    After you create the alert rule, the Alarm configuration page displays the rule, as shown in the following figure.
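
Two of the rules in this step lend themselves to a short worked example: the N × 60/15 threshold for the duration field, and the removal of label matchers that contain a dollar sign ($). The following Python sketch covers both. The function names are hypothetical, and the matcher removal is a simplified regular-expression approach that assumes label values do not contain braces. It is not a full PromQL parser.

```python
import re


def samples_required(duration_minutes, scrape_interval_seconds=15):
    """Number of consecutive records that must meet the alert condition (N x 60 / interval)."""
    return duration_minutes * 60 // scrape_interval_seconds


# A label selector such as {job="a",instance=~"$ip"} (no nested braces assumed).
_SELECTOR = re.compile(r"\{([^{}]*)\}")
# One label matcher: name, operator (=, !=, =~, !~), and a double-quoted value.
_MATCHER = re.compile(r'\s*[A-Za-z_][A-Za-z0-9_]*\s*(?:=~|!~|!=|=)\s*"(?:[^"\\]|\\.)*"\s*')


def drop_dollar_matchers(promql):
    """Remove every label matcher that contains a dollar sign from a PromQL statement."""

    def rewrite(match):
        matchers = [m.strip() for m in _MATCHER.findall(match.group(1))]
        kept = [m for m in matchers if "$" not in m]
        # Drop the braces entirely when no matcher is left.
        return "{" + ",".join(kept) + "}" if kept else ""

    return _SELECTOR.sub(rewrite, promql)


if __name__ == "__main__":
    print(samples_required(1))
    print(drop_dollar_matchers(
        'sum (rate (container_network_receive_bytes_total{instance=~"^$HostIp.*"}[1m]))'
    ))
```

With N set to 1 and the default 15-second interval, samples_required returns 4, which matches the note in substep 4. Review the rewritten statement before you paste it into the Alarm expression field, because removing a matcher widens the set of series that the expression covers.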