Elastic GPU Service - Alibaba Cloud Documentation Center

Elastic GPU Service is an elastic computing service from Alibaba Cloud that provides GPU-accelerated computing capabilities and ready-to-use, scalable GPU computing resources. This service combines the computing power of GPUs and CPUs to address the challenges in scenarios such as AI, high-performance computing, and professional graphics and image processing. For example, you can use Elastic GPU Service in a parallel computing scenario to significantly accelerate computing.

Why Elastic GPU Service?

Elastic GPU Service provides GPU-accelerated instances, which are computing servers that combine GPUs and CPUs. GPUs are particularly well suited to mathematical and geometric computing. In specific scenarios such as floating-point and parallel computing, GPUs can provide up to 100 times the computing power of their CPU counterparts. GPUs have the following characteristics:

A large number of arithmetic logic units (ALUs) that can be used for large-scale parallel computing.
Support for high-throughput scenarios in which multiple threads run in parallel to process computing tasks.
Simple logic control units (LCUs).

The following table compares GPU-accelerated instances that are provided by Elastic GPU Service and self-managed GPU-accelerated servers.

Item	GPU-accelerated instance	Self-managed GPU-accelerated server
Flexibility	Allows you to create one or more GPU-accelerated instances with ease. Supports flexible changes between instance specifications that are configured with different vCPUs, GPUs, and memory, including online upgrades and downgrades. Provides adjustable bandwidths.	Requires an extended subscription period. Provides server specifications that cannot be changed. Requires a one-off purchase of a bandwidth that cannot be adjusted.
Ease of use	Provides online web management tools that are easy and convenient to use. Provides built-in mainstream OSs such as an activated genuine Windows OS and supports online switching between OSs. Allows you to purchase and install GPU drivers when you purchase the instance.	Does not provide online management tools and requires complex maintenance. Requires you to install and replace an OS on your own. Requires you to purchase and install GPU drivers on your own.
Disaster recovery and backup	Uses a triplicate storage mechanism by which three copies of each piece of data are stored. When one copy is corrupted, the data can be restored from another copy within a short period of time. Allows hardware to be automatically recovered when failures occur.	Requires you to build your disaster recovery environment on your own by using high-cost conventional storage devices. Requires you to manually restore corrupted data.
Security	Effectively defends against Media Access Control (MAC) spoofing and Address Resolution Protocol (ARP) attacks. Defends against DDoS attacks by using blackhole filtering and traffic scrubbing. Provides additional services such as scanning for port intrusions, trojans, and vulnerabilities.	Poorly defends against MAC spoofing and ARP attacks. Requires traffic scrubbing and blackhole filtering devices at additional costs. Typically encounters problems such as port scans, trojans, and vulnerabilities.
Costs	Supports the subscription and pay-as-you-go billing methods. You can select an appropriate billing method based on your business requirements. Allows you to purchase on-demand resources without the need to make a large upfront investment.	Requires you to purchase resources by paying upfront to meet configuration requirements during peak hours. Requires a large upfront investment and results in serious resource waste.

Instance families with GPU capabilities

An instance is the minimum unit that provides computing services for your business. The computing capability of an instance varies based on the instance type. Elastic Compute Service (ECS) provides a variety of instance families for different business scenarios and use cases. GPU-accelerated instances are a type of ECS instance. You can use GPU-accelerated instances in the same manner as common ECS instances, with the addition of GPU acceleration capabilities. To create GPU-accelerated instances, you must select instance types from instance families of the enterprise-level heterogeneous computing, ECS Bare Metal Instance, or Super Computing Cluster category. For more information about instance types, see Overview of instance families.

Benefits

Extensive service coverage
Elastic GPU Service supports large-scale deployment to allow you to deploy GPU-accelerated instances in 17 regions around the world. Elastic GPU Service also provides flexible delivery methods such as auto provisioning and auto scaling to help you meet the sudden demands of your business.
Superior computing power
Elastic GPU Service provides GPUs that have superior computing power. When you use Elastic GPU Service together with a high-performance CPU platform, a GPU-accelerated instance can provide mixed-precision computing performance of up to 1,000 TFLOPS (1,000 trillion floating-point operations per second).
Superior network performance
GPU-accelerated instances use virtual private clouds (VPCs) that support up to 4.5 million packets per second (4.5 Mpps) and 32 Gbit/s of internal bandwidth. You can use GPU-accelerated instances together with Super Computing Cluster (SCC) to provide a remote direct memory access (RDMA) network that has up to 50 Gbit/s of bandwidth between nodes. This meets the low-latency and high-bandwidth requirements of data transmission between nodes.
Flexible purchase methods
Elastic GPU Service supports various billing methods, such as the subscription and pay-as-you-go billing methods, preemptible instances, reserved instances, and storage capacity units (SCUs). To prevent inefficient use of resources, select a billing method based on your business requirements.
Note
You cannot purchase reserved instances for specific GPU-accelerated instance families. For more information, see Attributes.
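
The network figures listed under Superior network performance can be put in perspective with a short calculation. The following Python sketch is illustrative only: it assumes ideal line-rate transfer with no protocol overhead, the function names are hypothetical, and the 10 GiB payload is an arbitrary example rather than a figure from this topic.

```python
# Figures quoted in this topic.
VPC_BANDWIDTH_GBITS = 32     # Gbit/s of internal VPC bandwidth
VPC_PACKET_RATE_MPPS = 4.5   # million packets per second in a VPC
RDMA_BANDWIDTH_GBITS = 50    # Gbit/s between nodes when used with SCC


def avg_packet_bytes_to_saturate(bandwidth_gbits, packet_rate_mpps):
    """Average packet size (bytes) at which both limits are reached together."""
    bits_per_second = bandwidth_gbits * 1e9
    packets_per_second = packet_rate_mpps * 1e6
    return bits_per_second / packets_per_second / 8


def transfer_seconds(payload_gib, bandwidth_gbits):
    """Ideal time to move payload_gib GiB at line rate (no overhead assumed)."""
    payload_bits = payload_gib * 2**30 * 8
    return payload_bits / (bandwidth_gbits * 1e9)


if __name__ == "__main__":
    size = avg_packet_bytes_to_saturate(VPC_BANDWIDTH_GBITS, VPC_PACKET_RATE_MPPS)
    print(f"Packet size that saturates both VPC limits: {size:.0f} bytes")

    payload_gib = 10  # hypothetical payload, for example a model checkpoint
    for name, rate in (("VPC", VPC_BANDWIDTH_GBITS), ("RDMA", RDMA_BANDWIDTH_GBITS)):
        seconds = transfer_seconds(payload_gib, rate)
        print(f"{name}: {payload_gib} GiB in about {seconds:.2f} s")
```

Under these assumptions, the VPC packet-rate and bandwidth limits are reached together at an average packet size of roughly 889 bytes. With smaller packets, the packet rate is the more likely constraint. With larger packets, the bandwidth is.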

Alibaba Cloud also provides DeepGPU that you can use together with Elastic GPU Service. DeepGPU provides enhanced GPU computing capabilities and can help you manage Alibaba Cloud resources in a more convenient and efficient manner. For more information, see DeepGPU.

Billing

Elastic GPU Service resources are billed in the same manner as ECS resources. Computing resources (vCPUs, GPUs, and memory), images, Elastic Block Storage (EBS) devices, public bandwidth, and snapshots are billable resources in Elastic GPU Service.

The following billing methods are supported:

Subscription: You pay for resources upfront and use the resources over a period of time.
Pay-as-you-go: You pay for resources that you use. Resources can be purchased and released based on your business requirements.
Preemptible instance: You bid for available computing resources to create preemptible instances. Compared with pay-as-you-go instances, preemptible instances provide discounts. However, preemptible instances can be reclaimed.
Reserved instance: Reserved instances are discount coupons that are used together with pay-as-you-go instances. When you purchase a reserved instance, you must use instances that have specific configurations such as instance type, region, and zone to receive discounted billing. Reserved instances are applied to offset the bills of computing resources.
Savings plan: Savings plans are discount plans that are used together with pay-as-you-go instances. When you purchase a savings plan, you must use a consistent amount (measured in USD/hour) of resources to receive discounted billing. Savings plans are applied to offset the bills of computing resources and system disks.
Storage capacity unit (SCU): SCUs are storage resource plans provided for use with pay-as-you-go storage resources. When you purchase an SCU, you must use storage resources of specific capacity to receive discounted billing. SCUs are applied to offset the bills of various storage resources such as EBS devices, Apsara File Storage NAS file systems, and OSS buckets.
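
The trade-off between the subscription and pay-as-you-go billing methods described above comes down to how many hours an instance actually runs. The following Python sketch shows a break-even calculation. All prices are hypothetical placeholders, not Alibaba Cloud prices, and the function names are illustrative. The sketch also ignores reserved instances, savings plans, and SCUs.

```python
def pay_as_you_go_cost(hourly_price, hours_used):
    """Total pay-as-you-go cost for the hours actually used."""
    return hourly_price * hours_used


def break_even_hours(subscription_price, hourly_price):
    """Hours of use above which the subscription costs less than pay-as-you-go."""
    return subscription_price / hourly_price


def cheaper_method(subscription_price, hourly_price, hours_used):
    """Return the cheaper billing method for the given usage."""
    payg = pay_as_you_go_cost(hourly_price, hours_used)
    return "subscription" if subscription_price < payg else "pay-as-you-go"


if __name__ == "__main__":
    # Hypothetical prices in USD; replace with the prices shown on the buy page.
    monthly_subscription = 1500.0
    hourly_price = 3.0

    print("Break-even hours per month:",
          break_even_hours(monthly_subscription, hourly_price))
    for hours in (200, 720):
        print(hours, "hours ->",
              cheaper_method(monthly_subscription, hourly_price, hours))
```

With these placeholder prices, an instance that runs for only part of the month is cheaper on pay-as-you-go, whereas an instance that runs continuously is cheaper on a subscription.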

For more information about the billing, see Billing for Elastic GPU Service.

Related tools

Alibaba Cloud provides DeepGPU. DeepGPU can help you use GPU resources in an efficient manner. DeepGPU includes the following tools.

Note

For more information about DeepGPU, see What is DeepGPU?

Tool	Description
AIACC-Training	An AI accelerator developed by Alibaba Cloud for distributed training to significantly improve training performance.
AIACC-Inference	An AI accelerator developed by Alibaba Cloud for model inference to significantly improve inference performance.
AIACC 2.0-AIACC Communication Speeding (AIACC-ACSpeed)	A communication optimization library released by Alibaba Cloud for the distributed training of AI models. The library can provide better compatibility, applicability, and performance.
AIACC Graph Speeding (AIACC-AGSpeed)	An optimizing compiler for AI training developed by Alibaba Cloud. The compiler optimizes the computing performance of PyTorch models in a manner that is transparent to users.
cGPU	A kernel-based GPU virtualization and sharing service developed by Alibaba Cloud. cGPU can split and assign a single GPU to multiple isolated containers.
FastGPU	A tool provided by Alibaba Cloud for building AI computing tasks. The tool provides interfaces and command lines for you to build AI computing tasks on Alibaba Cloud IaaS resources.