This topic describes how to create graphics processing unit (GPU) instances in Elastic Container Instance (ECI). In a GPU instance, you can run GPU-enabled Docker images such as tensorflow/tensorflow:1.13.1-gpu and nvidia/cuda directly, without installing supporting software such as TensorFlow or CUDA Toolkit yourself. GPU instances in ECI currently support NVIDIA driver 410.79 and CUDA Toolkit 10.0.

GPU instance types

You can create GPU instances by specifying Elastic Compute Service (ECS) instance types with GPU capabilities.

The following instance families are supported.

  • gn6v: compute-optimized instance family that uses NVIDIA V100 GPUs. Example: ecs.gn6v-c8g1.2xlarge.
  • gn6i: compute-optimized instance family that uses NVIDIA T4 GPUs. Example: ecs.gn6i-c4g1.xlarge.
  • gn5: compute-optimized instance family that uses NVIDIA P100 GPUs. Example: ecs.gn5-c4g1.xlarge.
  • gn5i: compute-optimized instance family that uses NVIDIA P4 GPUs. Example: ecs.gn5i-c2g1.large.

For more information, see Instance families.
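
In the example type names above, the family is the segment between "ecs." and the first hyphen. The following Python sketch maps an instance type to its GPU model on that basis. The names GPU_BY_FAMILY, instance_family, and gpu_model_for are hypothetical, and the table covers only the four families listed in this topic.

```python
# Hypothetical lookup: covers only the families listed in this topic.
GPU_BY_FAMILY = {
    "gn6v": "NVIDIA V100",
    "gn6i": "NVIDIA T4",
    "gn5": "NVIDIA P100",
    "gn5i": "NVIDIA P4",
}


def instance_family(instance_type: str) -> str:
    """Return the family part of a type name such as 'ecs.gn6v-c8g1.2xlarge'."""
    prefix, _, rest = instance_type.partition(".")
    if prefix != "ecs" or not rest:
        raise ValueError(f"not an ECS instance type: {instance_type!r}")
    # The family is everything before the first '-' (or '.' if there is no '-').
    return rest.split("-", 1)[0].split(".", 1)[0]


def gpu_model_for(instance_type: str) -> str:
    """Return the GPU model for a listed GPU family, or raise ValueError."""
    family = instance_family(instance_type)
    try:
        return GPU_BY_FAMILY[family]
    except KeyError:
        raise ValueError(f"{family!r} is not a listed GPU family") from None
```

A check like this can reject a non-GPU instance type before a pod or container group is submitted.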

Kubernetes mode

You can specify ECS instance types with GPU capabilities in the k8s.aliyun.com/eci-instance-type annotation of a pod to be created.

  • Add the annotation that specifies the instance type to the metadata of the pod.
  • Specify the number of GPUs in the resources parameter of each container that uses them.

apiVersion: apps/v1 # Requires Kubernetes 1.9 or later. Use apps/v1beta2 for 1.8 and apps/v1beta1 for earlier versions.
kind: Deployment
metadata:
  name: nginx-gpu-demo-1
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
      annotations:
        k8s.aliyun.com/eci-instance-type: ecs.gn5i-c4g1.xlarge
    spec:
      # nodeSelector:
      #   env: test-team
      containers:
      - name: nginx
        image: registry-vpc.cn-beijing.aliyuncs.com/eci_open/nginx:1.15.10 # Replace with your image, in the format <image_name>:<tag>.
        resources:
          limits:
            nvidia.com/gpu: '1'
        ports:
        - containerPort: 80
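
The two settings from the manifest above (the instance-type annotation and the GPU limit) can also be generated programmatically. The following Python sketch builds a bare Pod manifest as a dictionary and prints it as JSON, which kubectl accepts in place of YAML. The helper name build_gpu_pod and the pod name cuda-demo are hypothetical. The annotation key and the nvidia.com/gpu resource name are taken from the example above.

```python
import json


def build_gpu_pod(name: str, image: str, instance_type: str, gpus: int) -> dict:
    """Build a Pod manifest with the ECI instance-type annotation and a GPU limit."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {"k8s.aliyun.com/eci-instance-type": instance_type},
        },
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs are requested under resources.limits, as in the YAML example.
                    "resources": {"limits": {"nvidia.com/gpu": str(gpus)}},
                }
            ]
        },
    }


if __name__ == "__main__":
    pod = build_gpu_pod("cuda-demo", "nvidia/cuda", "ecs.gn5i-c4g1.xlarge", 1)
    print(json.dumps(pod, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f.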

API mode

CreateContainerGroup

To create GPU instances, specify ECS instance types with GPU capabilities in the InstanceType parameter.

Request parameter

Parameter | Type | Required | Description
InstanceType | String | Yes | The instance type of the ECI to be created.

Container parameter

Parameter | Type | Required | Description
Gpu | Integer | Yes | The number of GPUs to be allocated to the container.

You can specify one or more ECS instance types in the InstanceType parameter. InstanceType is required when you call the CreateContainerGroup operation to create a GPU instance. If a container specifies the Gpu parameter but InstanceType is not specified, an error message is returned.
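
The following Python sketch shows one way to assemble request parameters for CreateContainerGroup while enforcing the rule that Gpu requires InstanceType. Only the InstanceType and Gpu parameter names come from this topic. The flattened Container.N.* keys, the ContainerGroupName, Name, and Image parameters, and the comma-separated encoding of multiple instance types are assumptions that should be checked against the API reference. The helper name build_create_params is hypothetical, and the sketch does not send any request.

```python
from typing import Dict, List, Optional


def build_create_params(
    group_name: str,
    containers: List[dict],
    instance_types: Optional[List[str]] = None,
) -> Dict[str, str]:
    """Flatten a container list into CreateContainerGroup-style request parameters.

    Each container dict has 'name' and 'image' keys and an optional 'gpu' count.
    """
    wants_gpu = any(c.get("gpu", 0) > 0 for c in containers)
    if wants_gpu and not instance_types:
        # Mirrors the documented rule: Gpu without InstanceType is an error.
        raise ValueError("InstanceType is required when a container sets Gpu")

    params = {"ContainerGroupName": group_name}  # assumed parameter name
    if instance_types:
        # Assumption: multiple types are passed as one comma-separated string.
        params["InstanceType"] = ",".join(instance_types)
    for i, c in enumerate(containers, start=1):
        params[f"Container.{i}.Name"] = c["name"]    # assumed parameter name
        params[f"Container.{i}.Image"] = c["image"]  # assumed parameter name
        if c.get("gpu", 0) > 0:
            params[f"Container.{i}.Gpu"] = str(c["gpu"])
    return params
```

Raising the error locally gives faster feedback than waiting for the service to reject the request.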

GPUs are allocated to individual containers in an ECI. The total number of GPUs requested by all containers in the ECI cannot exceed the number of GPUs that the specified instance type provides. If the total exceeds that number, an error message is returned.
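
The limit on the total number of GPUs can be checked before a request is sent. In the following Python sketch, check_gpu_budget is a hypothetical helper, and max_gpus must be supplied by the caller from the specification of the chosen instance type, because this topic does not list GPU counts per type.

```python
from typing import Dict


def check_gpu_budget(gpus_per_container: Dict[str, int], max_gpus: int) -> int:
    """Return the total GPUs requested, or raise if it exceeds max_gpus."""
    for name, count in gpus_per_container.items():
        if count < 0:
            raise ValueError(f"container {name!r} requests a negative GPU count")
    total = sum(gpus_per_container.values())
    if total > max_gpus:
        raise ValueError(
            f"containers request {total} GPUs, but the instance type provides {max_gpus}"
        )
    return total
```

For example, check_gpu_budget({"train": 1, "sidecar": 0}, max_gpus=1) returns 1, whereas requesting one GPU for each of two containers on a single-GPU instance type raises ValueError.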

UpdateContainerGroup

To change the number of GPUs allocated to a container, call the UpdateContainerGroup operation and set the Gpu parameter of that container to the required number.

Container parameter

Parameter | Type | Required | Description
Gpu | Integer | Yes | The number of GPUs to be allocated to the container.
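
The following Python sketch assembles request parameters for an UpdateContainerGroup call that changes per-container GPU counts. Only the Gpu parameter name comes from this topic. The ContainerGroupId and Container.N.Name keys are assumptions to verify against the API reference, and build_update_gpu_params is a hypothetical helper that does not send any request.

```python
from typing import Dict


def build_update_gpu_params(group_id: str, gpus_by_container: Dict[str, int]) -> Dict[str, str]:
    """Build UpdateContainerGroup-style parameters that set Gpu per container."""
    params = {"ContainerGroupId": group_id}  # assumed parameter name
    for i, (name, gpus) in enumerate(gpus_by_container.items(), start=1):
        if gpus < 0:
            raise ValueError(f"container {name!r} requests a negative GPU count")
        params[f"Container.{i}.Name"] = name  # assumed parameter name
        params[f"Container.{i}.Gpu"] = str(gpus)
    return params
```

The total across all containers is still bounded by the number of GPUs that the instance type of the ECI provides.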