cGPU is a GPU virtualization and sharing service developed by Alibaba Cloud for use in container environments. A single GPU can be split and assigned to multiple isolated containers, which not only ensures multi-tenant security, but also helps you save costs by improving GPU resource utilization.

Why choose the cGPU service?

  • High compatibility

    cGPU is compatible with open source container technologies such as Docker, Containerd, and Kubernetes.

  • Ease of use

    You do not need to re-compile AI applications or replace the Compute Unified Device Architecture (CUDA) library.

  • Flexible allocation of computing resources

    cGPU allows you to flexibly allocate physical GPU resources based on your business requirements. For example, cGPU supports dynamic allocation of GPU memory and GPU utilization: GPU memory can be allocated at a granularity of MB, and the minimum allocation of GPU utilization is 2%. See the sketch after this list for an illustration.

  • No limits on instance types

    cGPU applies to various GPU-accelerated instances such as GPU bare metal instances, virtualized instances, and vGPU-accelerated instances.

  • Wide application

    cGPU supports hybrid deployment of online and offline workloads, and can be used in AI and rendering scenarios that use CUDA.

  • Comprehensive features

    cGPU can guarantee resource availability for high-priority tasks, and provides O&M, hot upgrade, and multi-GPU allocation capabilities.
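
The "Flexible allocation of computing resources" item above can be pictured with a short sketch. The following Python snippet is purely illustrative: the container names, GPU size, and the validate_plan helper are hypothetical and are not part of cGPU's interface. It only shows how MB-level memory shares and percent-level compute shares, with a 2% minimum, might be planned for containers that share one physical GPU.

```python
from dataclasses import dataclass

# Hypothetical allocation plan for splitting one physical GPU among
# containers. Memory is requested in MB; compute is requested as a
# percentage of GPU utilization, with 2% as the smallest share.

@dataclass
class ContainerSlice:
    name: str
    memory_mb: int
    compute_percent: int

def validate_plan(slices, total_memory_mb):
    """Check that the requested slices fit on a single physical GPU."""
    if sum(s.memory_mb for s in slices) > total_memory_mb:
        raise ValueError("requested GPU memory exceeds the physical GPU")
    if sum(s.compute_percent for s in slices) > 100:
        raise ValueError("requested GPU utilization exceeds 100%")
    for s in slices:
        if s.compute_percent < 2:
            raise ValueError(f"{s.name}: minimum GPU utilization share is 2%")

# Example: three containers sharing a 16 GB (16384 MB) GPU.
plan = [
    ContainerSlice("inference-a", memory_mb=6144, compute_percent=40),
    ContainerSlice("inference-b", memory_mb=6144, compute_percent=40),
    ContainerSlice("batch-job",   memory_mb=4096, compute_percent=20),
]
validate_plan(plan, total_memory_mb=16384)
```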

cGPU architecture

The following figure shows the architecture of cGPU:

(Figure: cGPU architecture)

cGPU lets multiple containers run on a single physical GPU while keeping GPU resources isolated among the containers, which improves the utilization of the GPU hardware.

cGPU uses a server kernel driver developed by Alibaba Cloud to provide virtual GPU devices for containers. The driver isolates GPU memory and computing power among containers without additional overhead, which allows your containers to obtain bare metal GPU performance. You can run commands to configure the virtual GPU devices in containers.
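
For example, a process inside a container can inspect its virtual GPU device with standard NVIDIA tooling. The sketch below uses the pynvml Python bindings and assumes they are installed in the container image; the values it prints depend on how the virtual GPU device is configured, and under cGPU the reported total memory is expected to be the container's quota rather than the full capacity of the physical GPU.

```python
import pynvml  # NVIDIA Management Library bindings (nvidia-ml-py / pynvml)

# Minimal sketch: query the GPU device visible inside a container.
pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)   # first visible GPU
    name = pynvml.nvmlDeviceGetName(handle)
    if isinstance(name, bytes):                     # older bindings return bytes
        name = name.decode()
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)

    print(f"device: {name}")
    print(f"memory total: {mem.total // (1024 * 1024)} MB")
    print(f"memory used:  {mem.used // (1024 * 1024)} MB")
finally:
    pynvml.nvmlShutdown()
```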