Elastic GPU Service is an elastic computing service provided by Alibaba Cloud that offers GPU-accelerated computing capabilities and ready-to-use, scalable GPU computing resources. It combines the computing power of GPUs and CPUs to address the challenges of scenarios such as AI, high-performance computing, and professional graphics and image processing.

Why Elastic GPU Service?

GPU-accelerated instances are computing servers based on GPUs and CPUs. GPUs have unique advantages in mathematical and geometric computing, especially floating-point and parallel computing. For highly parallel workloads, GPUs can provide up to 100 times the computing power of their CPU counterparts. GPUs have the following features:

  • Have a large number of arithmetic logic units (ALUs) that can be used for large-scale parallel computing.
  • Support high-throughput scenarios in which multiple threads run in parallel to process computing tasks.
  • Have simple logic control units (LCUs).
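
The data-parallel pattern behind these features can be sketched on a CPU in a few lines. The following is a minimal illustration, not GPU code: the function names and the worker count are hypothetical, and Python threads do not deliver real GPU-style speedups. It only shows the shape of the workload that suits many ALUs: the same simple arithmetic applied independently to every element, with almost no control logic.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_chunk(a, xs, ys):
    """Apply the same arithmetic (a * x + y) to every element of one chunk."""
    return [a * x + y for x, y in zip(xs, ys)]


def parallel_saxpy(a, xs, ys, workers=4):
    """Split the input into chunks and process the chunks concurrently."""
    size = max(1, (len(xs) + workers - 1) // workers)
    chunks = [(xs[i:i + size], ys[i:i + size]) for i in range(0, len(xs), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker runs the identical computation on its own slice of data.
        parts = list(pool.map(lambda chunk: saxpy_chunk(a, *chunk), chunks))
    # Chunks come back in submission order, so the result keeps element order.
    return [value for part in parts for value in part]


xs = [float(i) for i in range(8)]
ys = [1.0] * 8
result = parallel_saxpy(2.0, xs, ys)
```

Because no element depends on any other, the work can be divided among as many execution units as are available, which is the property that GPUs exploit at a much larger scale.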

The following comparison lists the differences between GPU-accelerated instances provided by Elastic GPU Service and self-managed GPU-accelerated servers, item by item.

Flexibility

  GPU-accelerated instance:
    • Allows you to create one or more GPU-accelerated instances with ease.
    • Supports online changes between instance types with different numbers of vCPUs and memory sizes, including instance type upgrades and downgrades.
    • Provides adjustable bandwidths.

  Self-managed GPU-accelerated server:
    • Requires an extended subscription period.
    • Has configurations that cannot be changed.
    • Requires a one-off purchase of a bandwidth that cannot be adjusted.

Ease of use

  GPU-accelerated instance:
    • Can be managed online in a web console.
    • Has a built-in mainstream operating system such as an activated genuine Windows operating system and allows the operating system to be replaced online.
    • Allows you to purchase and install GPU drivers when you purchase the instance.

  Self-managed GPU-accelerated server:
    • Does not provide online management tools and requires manual management and maintenance.
    • Requires you to install and replace an operating system on your own.
    • Requires you to purchase and install GPU drivers on your own.

Disaster recovery and backup

  GPU-accelerated instance:
    • Uses a triplicate storage mechanism by which three copies of each piece of data are stored. When one copy is corrupted, the data can be restored from the other copies within a short period of time.
    • Allows hardware to be automatically recovered from failures.

  Self-managed GPU-accelerated server:
    • Requires you to build your own disaster recovery environment by using costly conventional storage devices.
    • Requires you to manually restore corrupted data.

Security

  GPU-accelerated instance:
    • Effectively defends against Media Access Control (MAC) spoofing and Address Resolution Protocol (ARP) attacks.
    • Effectively defends against DDoS attacks by using blackhole filtering and traffic scrubbing.
    • Provides additional services such as port scanning, trojan scanning, and vulnerability scanning.

  Self-managed GPU-accelerated server:
    • Poorly defends against MAC spoofing and ARP attacks.
    • Requires traffic scrubbing and blackhole filtering devices at additional costs.
    • Typically encounters problems such as port scans, trojans, and vulnerabilities.

Cost

  GPU-accelerated instance:
    • Supports the subscription and pay-as-you-go billing methods. You can select an appropriate billing method based on your business needs.
    • Allows you to purchase resources on demand without the need to make a large upfront investment.

  Self-managed GPU-accelerated server:
    • Requires you to purchase resources by paying upfront to meet configuration requirements of peak hours.
    • Requires a large upfront investment and results in serious resource waste.
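
The idea behind the triplicate storage mechanism mentioned in the comparison can be illustrated with a short sketch. This is a simplified, hypothetical model and not how Alibaba Cloud implements its storage layer: the record layout, the function names, and the use of SHA-256 checksums to detect corruption are all assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Fingerprint used to detect a corrupted copy (assumption: SHA-256)."""
    return hashlib.sha256(data).hexdigest()


def write_triplicate(data: bytes) -> dict:
    """Store three independent copies together with the expected checksum."""
    return {"checksum": checksum(data), "copies": [bytes(data) for _ in range(3)]}


def read_and_repair(record: dict) -> bytes:
    """Return the data from the first healthy copy and overwrite damaged copies."""
    healthy = next(
        (copy for copy in record["copies"] if checksum(copy) == record["checksum"]),
        None,
    )
    if healthy is None:
        raise IOError("all three copies are corrupted")
    # Restore every copy from the healthy one so redundancy is back to three.
    record["copies"] = [healthy for _ in record["copies"]]
    return healthy


record = write_triplicate(b"model-weights")
record["copies"][1] = b"corrupted!"  # simulate damage to one copy
restored = read_and_repair(record)
```

As long as at least one of the three copies still matches its checksum, the read succeeds and the damaged copies are rewritten, which is why a single corrupted copy does not cause data loss.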

Elastic GPU Service platform

A GPU is a computing chip that provides real-time, high-speed parallel computing and floating-point computing capabilities. Elastic GPU Service combines Alibaba Cloud elastic computing services with high-speed parallel heterogeneous accelerators of GPUs to provide GPU-accelerated elastic computing services.

Alibaba Cloud launched GPU-accelerated instances based on the Elastic GPU Service platform. GPU-accelerated instances provide GPU acceleration capabilities and can be managed in the same manner as common Elastic Compute Service (ECS) instances. To create GPU-accelerated instances, select an enterprise-level heterogeneous computing instance type. For more information about instance types, see Instance family.


Elastic GPU Service provides the following benefits:

  • High elasticity

    Provides a series of instance families. GPU-accelerated instances can be created within minutes and scaled horizontally and vertically.

  • High performance and high security

    Supports GPUDirect for GPU-to-GPU communication. GPUs can communicate with each other over high-bandwidth, low-latency NVLink connections without CPU intervention. Provides elastic security isolation of GPUs among tenants. Hypervisors can be used to authorize tenants to use and manage GPUs, and you can securely configure high-speed communication between isolated GPUs.

  • Ease of deployment

    Integrates deeply with the Alibaba Cloud ecosystem. You can combine Elastic GPU Service with other Alibaba Cloud services to build applications. For example, you can combine Elastic GPU Service with Object Storage Service (OSS) or Apsara File Storage NAS to meet storage requirements and with E-MapReduce (EMR) to preprocess deep learning data. You can also combine Elastic GPU Service with Container Service for Kubernetes (ACK) to make services easier to deliver to users.

  • Ease of monitoring

    Provides comprehensive monitoring at the GPU, instance, and group levels to reduce your O&M workload. For more information, see GPU monitoring.


Billing

Elastic GPU Service resources are billed in the same way as ECS resources. Computing resources (vCPUs and memory), images, Elastic Block Storage (EBS) devices, public bandwidth, and snapshots are billable resources in Elastic GPU Service.

The following common billing methods are supported:
  • Subscription: You can pay for resources upfront and use them over a period of time.
  • Pay-as-you-go: You pay for resources after you use them. Resources can be purchased and released as needed.
  • Preemptible instance: You bid for available computing resources to create preemptible instances. Preemptible instances offer discounts compared with pay-as-you-go instances. However, preemptible instances can be reclaimed.
  • Reserved instance: Reserved instances are discount coupons that are used together with pay-as-you-go instances. When you purchase a reserved instance, you make a commitment to use instances that have specified configurations such as instance type, region, and zone to receive discounted billing. Reserved instances are applied to offset the bills of computing resources.
  • Savings plan: Savings plans are discount plans that are used together with pay-as-you-go instances. When you purchase a savings plan, you make a commitment to use a consistent amount (measured in USD/hour) of resources to receive discounted billing. Savings plans are applied to offset the bills of computing resources and system disks.
  • Storage capacity unit (SCU): SCUs are storage resource plans provided for use with pay-as-you-go storage resources. When you purchase an SCU, you make a commitment to use storage resources of specific capacity to receive discounted billing. SCUs are applied to offset the bills of various storage resources such as EBS devices, Apsara File Storage NAS file systems, and OSS buckets.
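
To see how the choice between billing methods plays out, the arithmetic can be sketched as follows. All prices and the preemptible discount below are hypothetical placeholders chosen for easy calculation, not actual Alibaba Cloud prices, and the function names are illustrative only. Check the Pricing tab for real figures.

```python
def monthly_cost_pay_as_you_go(hourly_price, hours_used):
    """Pay-as-you-go: you pay only for the hours actually used."""
    return hourly_price * hours_used


def monthly_cost_preemptible(hourly_price, hours_used, discount):
    """Preemptible: pay-as-you-go price reduced by a (hypothetical) discount fraction."""
    return hourly_price * (1 - discount) * hours_used


def break_even_hours(subscription_price, hourly_price):
    """Hours of use per month above which a subscription becomes cheaper."""
    return subscription_price / hourly_price


hourly = 2.0          # hypothetical pay-as-you-go price, USD/hour
subscription = 900.0  # hypothetical subscription price, USD/month
hours = 300           # expected hours of use per month

payg = monthly_cost_pay_as_you_go(hourly, hours)            # 600.0
preemptible = monthly_cost_preemptible(hourly, hours, 0.5)  # 300.0
threshold = break_even_hours(subscription, hourly)          # 450.0
cheaper = "subscription" if hours > threshold else "pay-as-you-go"
```

With these placeholder numbers, 300 hours of use falls below the 450-hour break-even point, so pay-as-you-go costs less than the subscription. The preemptible estimate is lower still, but it does not account for the possibility that the instance is reclaimed.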

For more information about the billing methods of ECS instances, see Overview and the Pricing tab of the Elastic Compute Service product page.

Related tools

Alibaba Cloud provides the following tools for you to use GPU resources more efficiently:
  • AIACC-Training: an AI accelerator developed by Alibaba Cloud to improve training performance. For more information, see Automatically install AIACC-Training.
  • AIACC-Inference: an AI accelerator developed by Alibaba Cloud to improve inference performance. For more information, see Automatically install AIACC-Inference.
  • cGPU: a technology used to isolate GPU resources so that multiple containers can share a single graphics card. For more information, see What is the cGPU service?.
  • FastGPU: a tool provided by Alibaba Cloud to build AI computing tasks. It provides interfaces and command lines for you to build AI computing tasks on Alibaba Cloud IaaS resources. For more information, see What is FastGPU?.