Elastic GPU Service is an elastic computing service provided by Alibaba Cloud that offers ready-to-use, scalable GPU-accelerated computing resources. The service combines the computing power of GPUs and CPUs to address challenges in scenarios such as AI, high-performance computing, and professional graphics and image processing. For example, you can use Elastic GPU Service in parallel computing scenarios to significantly accelerate computing.
Why Elastic GPU Service?
Elastic GPU Service provides GPU-accelerated instances, which are computing servers based on GPUs and CPUs. GPUs are particularly well suited to mathematical and geometric computing. In specific scenarios such as floating-point and parallel computing, GPUs can provide 100 times the computing power of their CPU counterparts. GPUs have the following characteristics:
A large number of arithmetic logic units (ALUs) that can be used for large-scale parallel computing.
Support for high-throughput scenarios in which multiple threads run in parallel to process computing tasks.
Simple logic control units (LCUs).
The following table compares GPU-accelerated instances that are provided by Elastic GPU Service and self-managed GPU-accelerated servers.
Item | GPU-accelerated instance | Self-managed GPU-accelerated server |
Flexibility | | |
Ease of use | | |
Disaster recovery and backup | | |
Security | | |
Costs | | |
Instance families with GPU capabilities
An instance is the minimum unit that provides computing services for your business. The computing capability of an instance varies based on the instance type. Elastic Compute Service (ECS) provides a variety of instance families for different business scenarios and use cases. GPU-accelerated instances are a type of ECS instance: you can use them in the same manner as regular ECS instances, and they additionally provide GPU acceleration capabilities. To create GPU-accelerated instances, you must select instance types from instance families of the enterprise-level heterogeneous computing, ECS Bare Metal Instance, or Super Computing Cluster category. For more information about instance types, see Overview of instance families.
Benefits
Extensive service coverage
Elastic GPU Service supports large-scale deployment: you can deploy GPU-accelerated instances in 17 regions around the world. Elastic GPU Service also provides flexible delivery methods such as auto provisioning and auto scaling to help you handle sudden spikes in business demand.
Superior computing power
Elastic GPU Service provides GPUs that have superior computing power. When used together with a high-performance CPU platform, a GPU-accelerated instance can provide mixed-precision computing performance of up to 1,000 TFLOPS (1,000 trillion floating-point operations per second).
Superior network performance
GPU-accelerated instances use virtual private clouds (VPCs) that support up to 4.5 million packets per second (4.5 Mpps) and 32 Gbit/s of internal bandwidth. You can use GPU-accelerated instances together with Super Computing Cluster (SCC) to provide a remote direct memory access (RDMA) network that has up to 50 Gbit/s of bandwidth between nodes. This helps you meet low-latency and high-bandwidth requirements when data is transmitted between nodes.
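To put these bandwidth figures in perspective, the following minimal sketch estimates the ideal-case time to move a payload between nodes at 32 Gbit/s and at 50 Gbit/s. The function name `transfer_seconds` and the 10 GB payload are hypothetical choices made for illustration. The calculation ignores protocol overhead, latency, and contention, so actual transfer times will be longer.

```python
def transfer_seconds(payload_bytes: int, bandwidth_gbit_per_s: float) -> float:
    """Return the ideal-case time, in seconds, to send payload_bytes
    over a link with the given bandwidth in Gbit/s (1 Gbit = 10**9 bits)."""
    if bandwidth_gbit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    payload_bits = payload_bytes * 8
    return payload_bits / (bandwidth_gbit_per_s * 1e9)


# Hypothetical payload: 10 GB (10 * 10**9 bytes) exchanged between two nodes.
payload = 10 * 10**9

vpc_time = transfer_seconds(payload, 32)   # VPC internal bandwidth from the text
rdma_time = transfer_seconds(payload, 50)  # RDMA bandwidth from the text

print(f"32 Gbit/s: {vpc_time:.2f} s")
print(f"50 Gbit/s: {rdma_time:.2f} s")
```

Under these assumptions, the same payload takes 2.5 seconds at 32 Gbit/s and 1.6 seconds at 50 Gbit/s, which gives a rough upper bound on the benefit of the higher-bandwidth network for bulk transfers.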
Flexible purchase methods
Elastic GPU Service supports various billing methods, such as the subscription and pay-as-you-go billing methods, preemptible instances, reserved instances, and storage capacity units (SCUs). To prevent inefficient use of resources, select a billing method based on your business requirements.
Note: You cannot purchase reserved instances for specific GPU-accelerated instance families. For more information, see Attributes.
Alibaba Cloud also provides DeepGPU that you can use together with Elastic GPU Service. DeepGPU provides enhanced GPU computing capabilities and can help you manage Alibaba Cloud resources in a more convenient and efficient manner. For more information, see DeepGPU.
Billing
Elastic GPU Service resources are billed in the same manner as ECS resources. Computing resources (vCPUs, GPUs, and memory), images, Elastic Block Storage (EBS) devices, public bandwidth, and snapshots are billable resources in Elastic GPU Service.
The following billing methods are supported:
Subscription: You pay for resources upfront and use the resources over a period of time.
Pay-as-you-go: You pay for resources that you use. Resources can be purchased and released based on your business requirements.
Preemptible instance: You bid for available computing resources to create preemptible instances. Compared with pay-as-you-go instances, preemptible instances provide discounts. However, preemptible instances can be reclaimed.
Reserved instance: Reserved instances are discount coupons that are used together with pay-as-you-go instances. When you purchase a reserved instance, you must use instances that have specific configurations such as instance type, region, and zone to receive discounted billing. Reserved instances are applied to offset the bills of computing resources.
Savings plan: Savings plans are discount plans that are used together with pay-as-you-go instances. When you purchase a savings plan, you must use a consistent amount (measured in USD/hour) of resources to receive discounted billing. Savings plans are applied to offset the bills of computing resources and system disks.
Storage capacity unit (SCU): SCUs are storage resource plans provided for use with pay-as-you-go storage resources. When you purchase an SCU, you must use storage resources of specific capacity to receive discounted billing. SCUs are applied to offset the bills of various storage resources such as EBS devices, Apsara File Storage NAS file systems, and OSS buckets.
For more information about billing, see Billing for Elastic GPU Service.
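As a rough way to choose between the subscription and pay-as-you-go billing methods described above, the following sketch computes the break-even usage for one billing period. All prices are hypothetical placeholders, not actual Elastic GPU Service prices, and the function names are invented for this example. The model assumes a flat hourly pay-as-you-go price and a fixed subscription price for the same instance type, and it does not account for preemptible instances, reserved instances, savings plans, or SCUs.

```python
def break_even_hours(subscription_price: float, hourly_price: float) -> float:
    """Hours of use per billing period at which pay-as-you-go
    costs the same as the subscription."""
    if hourly_price <= 0:
        raise ValueError("hourly_price must be positive")
    return subscription_price / hourly_price


def cheaper_method(hours_used: float, subscription_price: float,
                   hourly_price: float) -> str:
    """Return the billing method with the lower cost for the period."""
    pay_as_you_go_cost = hours_used * hourly_price
    if pay_as_you_go_cost < subscription_price:
        return "pay-as-you-go"
    return "subscription"


# Hypothetical prices in USD, chosen only for illustration.
HOURLY_PRICE = 2.0          # pay-as-you-go price per hour
SUBSCRIPTION_PRICE = 900.0  # subscription price per month

threshold = break_even_hours(SUBSCRIPTION_PRICE, HOURLY_PRICE)
print(f"Break-even: {threshold:.0f} hours per month")
print("200 hours:", cheaper_method(200, SUBSCRIPTION_PRICE, HOURLY_PRICE))
print("600 hours:", cheaper_method(600, SUBSCRIPTION_PRICE, HOURLY_PRICE))
```

With these placeholder prices, the break-even point is 450 hours per month: workloads that run less than that are cheaper on pay-as-you-go, and workloads that run more are cheaper on a subscription.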
Related tools
Alibaba Cloud provides DeepGPU to help you use GPU resources efficiently. For more information about DeepGPU, see What is DeepGPU?
DeepGPU includes the following tools:
Tool | Description |
 | An AI accelerator developed by Alibaba Cloud for distributed training that significantly improves training performance. |
 | An AI accelerator developed by Alibaba Cloud for model inference that significantly improves inference performance. |
 | A communication optimization library released by Alibaba Cloud for the distributed training of AI models. The library provides better compatibility, applicability, and performance. |
 | An optimizing compiler for AI training developed by Alibaba Cloud. The compiler optimizes the computing performance of PyTorch models without requiring changes on your part. |
cGPU | A kernel-based GPU virtualization and sharing service developed by Alibaba Cloud. cGPU can split a single GPU and assign it to multiple isolated containers. |
FastGPU | A tool provided by Alibaba Cloud to build AI computing tasks. FastGPU provides interfaces and command lines for you to build AI computing tasks on Alibaba Cloud IaaS resources. |