To use GPU computing in a Kubernetes cluster, you can schedule applications to nodes equipped with GPUs. Labels on these nodes make the scheduling process simple and efficient.


Background information

When deploying nodes with NVIDIA GPUs, the Kubernetes cluster discovers the GPU attributes and exposes them as node labels. The node labels provide the following benefits:

  • You can quickly filter GPU nodes by labels.
  • You can use labels as scheduling conditions when you deploy applications.
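For example (a sketch; this assumes kubectl access to the cluster and that the GPU labels described in this topic are present), you can surface these labels directly from the command line:

```shell
# Show the GPU name and memory labels as extra columns for every node.
kubectl get nodes -L aliyun.accelerator/nvidia_name -L aliyun.accelerator/nvidia_mem

# List only the nodes that carry the GPU name label, that is, the GPU nodes.
kubectl get nodes -l aliyun.accelerator/nvidia_name
```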


  1. Log on to the Container Service console.
  2. In the left-side navigation pane, choose Clusters > Nodes and select a cluster to view nodes in the cluster.
    Note This example selects a cluster with three worker nodes, among which two are equipped with GPUs. Note the node IP addresses.
    View a node
  3. Select a GPU node and choose More > Details in the Actions column to go to the Kubernetes dashboard. You can view the labels on the nodes.
    Node details

    You can also log on to a master node and run the following command to view the labels on GPU nodes:

    # kubectl get nodes
    NAME                                STATUS    ROLES     AGE       VERSION
    cn-beijing.i-2ze2dy2h9w97v65uuaft   Ready     master    2d        v1.11.2
    cn-beijing.i-2ze8o1a45qdv5q8a7luz   Ready     <none>    2d        v1.11.2             #Compare the nodes here with the nodes displayed in the console to find the GPU nodes.
    cn-beijing.i-2ze8o1a45qdv5q8a7lv0   Ready     <none>    2d        v1.11.2
    cn-beijing.i-2ze9xylyn11vop7g5bwe   Ready     master    2d        v1.11.2
    cn-beijing.i-2zed5sw8snjniq6mf5e5   Ready     master    2d        v1.11.2
    cn-beijing.i-2zej9s0zijykp9pwf7lu   Ready     <none>    2d        v1.11.2

    Select a GPU node and run the following command to view its labels:

    # kubectl describe node cn-beijing.i-2ze8o1a45qdv5q8a7luz
    Name:               cn-beijing.i-2ze8o1a45qdv5q8a7luz
    Roles:              <none>
    Labels:             aliyun.accelerator/nvidia_count=1                          #This field is important.

    In this example, the GPU node has the following three labels:

    Key                               Value
    aliyun.accelerator/nvidia_count   The number of GPU cores.
    aliyun.accelerator/nvidia_mem     The GPU memory in MiB.
    aliyun.accelerator/nvidia_name    The name of the NVIDIA graphics card.

    Nodes with the same type of GPU carry the same graphics card name, so you can use this label to filter nodes:

    # kubectl get no -l aliyun.accelerator/nvidia_name=Tesla-M40
    NAME                                STATUS    ROLES     AGE       VERSION
    cn-beijing.i-2ze8o1a45qdv5q8a7luz   Ready     <none>    2d        v1.11.2
    cn-beijing.i-2ze8o1a45qdv5q8a7lv0   Ready     <none>    2d        v1.11.2
  4. Go to the homepage of the Container Service console. In the left-side navigation pane, choose Applications > Deployments and click Create from Template in the upper-right corner.
    1. Create a TensorFlow application and schedule this application to a GPU node.
      Create an application
      This example uses the following YAML template:
      # Define the tensorflow deployment
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: tf-notebook
        labels:
          app: tf-notebook
      spec:
        replicas: 1
        selector: # define how the deployment finds the pods it manages
          matchLabels:
            app: tf-notebook
        template: # define the pod specifications
          metadata:
            labels:
              app: tf-notebook
          spec:
            nodeSelector:                                              #This field is important.
              aliyun.accelerator/nvidia_name: Tesla-M40
            containers:
            - name: tf-notebook
              image: tensorflow/tensorflow:1.4.1-gpu-py3
              resources:
                limits:
                  nvidia.com/gpu: 1                                    #This field is important.
              ports:
              - containerPort: 8888
                hostPort: 8888
              env:
              - name: PASSWORD
                value: mypassw0rdv
    2. You can also prevent an application from being deployed to GPU nodes. The following example deploys an Nginx Pod and keeps it off GPU nodes by using node affinity. For more information, see the section about node affinity in Create deployments by using images.

      This example uses the following YAML template:

      apiVersion: v1
      kind: Pod
      metadata:
        name: not-in-gpu-node
      spec:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
              - matchExpressions:
                - key: aliyun.accelerator/nvidia_name
                  operator: DoesNotExist
        containers:
        - name: not-in-gpu-node
          image: nginx
  5. In the left-side navigation pane, choose Applications > Pods. Select the cluster and namespace to go to the Pods page.
    View Pods


On the Pods page, you can see that the two Pods from the preceding examples have been scheduled to their target nodes. Labels let you easily schedule Pods to specific GPU nodes.
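To confirm the placement from the command line as well (a sketch; this assumes kubectl access and the Pod names from the preceding examples), check the NODE column:

```shell
# The NODE column shows the node each Pod was scheduled to.
kubectl get pods -o wide

# Print only the node chosen for the Nginx Pod from the node-affinity example.
kubectl get pod not-in-gpu-node -o jsonpath='{.spec.nodeName}'
```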