Elastic Container Instance: Use Managed Service for Prometheus to monitor a GPU-accelerated elastic container instance

Last Updated: Mar 22, 2024

After you enable Managed Service for Prometheus for a Kubernetes cluster, you can use the predefined dashboards to monitor the performance metrics of GPU-accelerated elastic container instances in the Kubernetes cluster. This topic describes how to use Managed Service for Prometheus to monitor a GPU-accelerated elastic container instance.

Prerequisites

A Container Service for Kubernetes (ACK) cluster is created and Managed Service for Prometheus is enabled for the cluster. For more information, see Enable Managed Service for Prometheus for an ACK Serverless cluster.

Procedure

  1. Log on to the ACK console.

  2. Create a GPU-accelerated elastic container instance.

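    The following sample YAML file creates a Deployment whose pod runs on a GPU-accelerated elastic container instance. The annotation selects the GPU-accelerated instance type, and the resource limit sets the number of GPUs allocated to the container.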

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gpu-monitor
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: test
      template:
        metadata:
          labels:
            app: test
            alibabacloud.com/eci: "true"
          annotations:
            k8s.aliyun.com/eci-use-specs: "ecs.gn6i-c4g1.xlarge"     # Specify a GPU-accelerated instance type.
        spec:
          containers:
          - name: bert-container
            image: registry.cn-beijing.aliyuncs.com/eci_open/nginx:1.14.2
            ports:
            - containerPort: 80
            resources:
              limits:
                nvidia.com/gpu: 1   # Specify the number of GPUs allocated to a container.
    
  3. View GPU metrics.

    1. On the Overview tab of the Cluster Information page, click Prometheus Monitoring in the upper-right corner.

    2. Click the GPU Monitoring tab to view monitoring data.

      After Managed Service for Prometheus is enabled for the ACK Serverless cluster, you can monitor GPU-accelerated elastic container instances in the cluster without deploying additional plug-ins. By default, Managed Service for Prometheus provides ready-to-use monitoring dashboards.

      For more information, see Panels and Introduction to metrics.
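
If the GPU Monitoring tab shows no data, one possible cause is that the Deployment from step 2 does not actually request a GPU. The following is a minimal sketch of a pre-deployment check, assuming the manifest has already been loaded into a Python dictionary. The helper name find_gpu_config_problems is hypothetical and is not part of any Alibaba Cloud SDK or of kubectl. It only looks for the two GPU-related settings used in the sample: the k8s.aliyun.com/eci-use-specs annotation and the nvidia.com/gpu resource limit.

```python
from typing import Any, Dict, List

# Keys taken from the sample Deployment in step 2.
GPU_SPEC_ANNOTATION = "k8s.aliyun.com/eci-use-specs"
GPU_RESOURCE = "nvidia.com/gpu"


def find_gpu_config_problems(deployment: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems; an empty list means none found."""
    problems = []
    template = deployment.get("spec", {}).get("template", {})

    # The pod template should carry the annotation that names a GPU instance type.
    annotations = template.get("metadata", {}).get("annotations") or {}
    if not annotations.get(GPU_SPEC_ANNOTATION):
        problems.append(f"missing annotation {GPU_SPEC_ANNOTATION}")

    # At least one container should set a nonzero nvidia.com/gpu limit.
    containers = template.get("spec", {}).get("containers") or []
    total_gpus = 0
    for container in containers:
        limits = (container.get("resources") or {}).get("limits") or {}
        total_gpus += int(limits.get(GPU_RESOURCE, 0))
    if total_gpus == 0:
        problems.append(f"no container sets a {GPU_RESOURCE} limit")

    return problems


# The pod template of the sample Deployment, reduced to the fields that are checked.
sample = {
    "spec": {
        "template": {
            "metadata": {
                "labels": {"app": "test", "alibabacloud.com/eci": "true"},
                "annotations": {GPU_SPEC_ANNOTATION: "ecs.gn6i-c4g1.xlarge"},
            },
            "spec": {
                "containers": [
                    {
                        "name": "bert-container",
                        "resources": {"limits": {GPU_RESOURCE: 1}},
                    }
                ]
            },
        }
    }
}

print(find_gpu_config_problems(sample))  # prints []
```

The check is intentionally narrow: it does not verify that the instance type exists or that the cluster can schedule the pod, only that the manifest contains the GPU settings shown in the sample.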