Elastic GPU Service provides GPU-accelerated computing capacity that is available on demand and scales automatically. As an elastic computing service provided by Alibaba Cloud, Elastic GPU Service combines the computing power of GPUs and CPUs to address scenarios such as AI, high-performance computing, and professional graphics and image processing.
Elastic GPU Service platform
As computing chips, GPUs provide real-time, high-speed parallel computing and floating-point computing capacity. Elastic GPU Service combines Elastic Compute Service (ECS) with GPUs as high-speed parallel heterogeneous accelerators, delivering both ECS features and GPU acceleration capabilities.
Based on Elastic GPU Service, Alibaba Cloud offers instances with GPU capabilities. You operate these instances in the same manner as regular ECS instances, and they additionally provide GPU acceleration. To use instances with GPU capabilities, select an enterprise-level heterogeneous computing instance type. For more information, see Instance families.
Elastic GPU Service has the following benefits:
- High elasticity
Provides a range of instance families. Instances with GPU capabilities can be created within minutes and support horizontal scaling as well as instance type changes within the same instance family.
- High performance and high security
Supports GPUDirect point-to-point communication between GPUs. GPUs can communicate with each other directly through NVLink with high bandwidth, low latency, and no CPU intervention. Elastic GPU Service provides elastic security isolation among tenants and uses hypervisors to authorize and manage systems. You can configure high-speed communication between secure, isolated GPUs.
- Easy deployment
Deeply integrated with the Alibaba Cloud ecosystem. You can build applications by combining Elastic GPU Service with other Alibaba Cloud products. For example, you can combine Elastic GPU Service with Object Storage Service (OSS) and Apsara File Storage NAS to meet storage requirements, and with E-MapReduce (EMR) to preprocess deep learning data. Elastic GPU Service also supports cloud-native deployment, such as on Alibaba Cloud Kubernetes, which simplifies application delivery.
- Easy monitoring
Provides comprehensive GPU monitoring data across the GPU, instance, and group dimensions, reducing your O&M workload. For more information, see GPU monitoring.
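The GPU monitoring feature mentioned above is provided by the platform. As a complementary illustration only, the following minimal sketch reads per-GPU metrics from inside an instance. It assumes that an NVIDIA driver is installed and that the `nvidia-smi` command is on the PATH. The function names `parse_gpu_csv` and `query_gpus` are hypothetical helpers for this example and are not part of any Alibaba Cloud API.

```python
import csv
import io
import subprocess

# Standard nvidia-smi --query-gpu properties requested by this sketch.
QUERY_FIELDS = ["index", "utilization.gpu", "memory.used", "memory.total"]


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts.

    Assumes every field is numeric. Rows that do not have exactly four
    fields (for example, blank lines) are skipped.
    """
    rows = []
    for rec in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(rec) != len(QUERY_FIELDS):
            continue
        index, util, used, total = (field.strip() for field in rec)
        rows.append({
            "index": int(index),
            "utilization_pct": int(util),
            "memory_used_mib": int(used),
            "memory_total_mib": int(total),
        })
    return rows


def query_gpus():
    """Run nvidia-smi on the local instance and return per-GPU metrics."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)
```

On an instance with GPU capabilities, calling `query_gpus()` returns one dictionary per GPU. The parsing step is kept separate from the subprocess call so that it can be checked against sample output on a machine without a GPU.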