What are vGPU-accelerated ECS instances - Elastic GPU Service

Instances of vGPU-accelerated instance families provide high-performance graphics processing and GPU-accelerated computing capabilities. vGPU-accelerated instances are suitable for graphics acceleration and rendering scenarios and general-purpose computing scenarios. This topic describes the features of vGPU-accelerated instance families of Elastic Compute Service (ECS) and lists the instance types of each instance family.

sgn7i-vws, vGPU-accelerated instance family with shared CPUs
vgn7i-vws, vGPU-accelerated instance family
vgn6i and vgn6i-vws, vGPU-accelerated instance families

sgn7i-vws, vGPU-accelerated instance family with shared CPUs

Features:

This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.
Instances of this instance family share CPU and network resources to maximize the utilization of underlying resources. Each instance has exclusive access to its memory and GPU memory to provide data isolation and performance assurance.
Note
If you want to use exclusive CPU resources, select the vgn7i-vws instance family.
This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for Computer Aided Design (CAD) software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative Ampere architecture
  - Support for acceleration features (such as vGPU, RTX, and TensorRT) to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
Supported scenarios:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Baseline/burst bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs
ecs.sgn7i-vws-m2.xlarge	4	15.5	NVIDIA A10 * 1/12	24GB * 1/12	1.5/5	500,000	4	2
ecs.sgn7i-vws-m4.2xlarge	8	31	NVIDIA A10 * 1/6	24GB * 1/6	2.5/10	1,000,000	4	4
ecs.sgn7i-vws-m8.4xlarge	16	62	NVIDIA A10 * 1/3	24GB * 1/3	5/20	2,000,000	8	4
ecs.sgn7i-vws-m2s.xlarge	4	8	NVIDIA A10 * 1/12	24GB * 1/12	1.5/5	500,000	4	2
ecs.sgn7i-vws-m4s.2xlarge	8	16	NVIDIA A10 * 1/6	24GB * 1/6	2.5/10	1,000,000	4	4
ecs.sgn7i-vws-m8s.4xlarge	16	32	NVIDIA A10 * 1/3	24GB * 1/3	5/20	2,000,000	8	4

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:
NVIDIA A10 * 1/12. NVIDIA A10 is the GPU model. 1/12 indicates that a GPU is sliced into 12 GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
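
The slicing notation in the GPU and GPU memory columns can be read mechanically. The following Python sketch is illustrative only: the helper names `parse_slice` and `vgpu_memory_gb` are hypothetical and are not part of any Alibaba Cloud SDK. It assumes that each cell has the form "label * fraction", as in the preceding table.

```python
from fractions import Fraction
import re


def parse_slice(cell: str) -> tuple[str, Fraction]:
    """Split a cell such as 'NVIDIA A10 * 1/12' into (label, share of one GPU).

    Hypothetical helper: assumes the cell always looks like '<label> * <fraction>'.
    """
    label, sep, share = cell.rpartition("*")
    if not sep:
        raise ValueError(f"no '*' separator in cell: {cell!r}")
    return label.strip(), Fraction(share.strip())


def vgpu_memory_gb(cell: str) -> Fraction:
    """Return the GPU memory of one vGPU for a cell such as '24GB * 1/12'."""
    total, share = parse_slice(cell)
    match = re.fullmatch(r"(\d+)\s*GB", total)
    if match is None:
        raise ValueError(f"unexpected GPU memory format: {total!r}")
    return int(match.group(1)) * share


if __name__ == "__main__":
    model, share = parse_slice("NVIDIA A10 * 1/12")
    # The reciprocal of the share is the number of partitions per physical GPU.
    print(model, "partitions per GPU:", 1 / share)
    for cell in ("24GB * 1/12", "24GB * 1/6", "24GB * 1/3"):
        print(cell, "->", vgpu_memory_gb(cell), "GB per vGPU")
```

For example, the cell "24GB * 1/12" works out to 2 GB of GPU memory per vGPU, and "24GB * 1/3" works out to 8 GB.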

vgn7i-vws, vGPU-accelerated instance family

Features:

This instance family uses the third-generation SHENLONG architecture to provide predictable and consistent ultra-high performance. This instance family utilizes fast path acceleration on chips to improve storage performance, network performance, and computing stability by an order of magnitude. This way, data storage and model loading can be performed more quickly.
This instance family comes with an NVIDIA GRID vWS license and provides certified graphics acceleration capabilities for CAD software to meet the requirements of professional graphic design. Instances of this instance family can serve as lightweight GPU-accelerated compute-optimized instances to reduce the costs of small-scale AI inference tasks.
Compute:
- Uses NVIDIA A10 GPUs that have the following features:
  - Innovative Ampere architecture
  - Support for acceleration features (such as vGPU, RTX, and TensorRT) to provide diversified business support
- Uses 2.9 GHz Intel® Xeon® Scalable (Ice Lake) processors that deliver an all-core turbo frequency of 3.5 GHz.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only ESSDs and ESSD AutoPL disks.
Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
Supported scenarios:
- Concurrent AI inference tasks that require high-performance CPUs, memory, and GPUs, such as image recognition, speech recognition, and behavior identification
- Compute-intensive graphics processing tasks that require high-performance 3D graphics virtualization capabilities, such as remote graphic design and cloud gaming
- 3D modeling in fields that require the use of Ice Lake processors, such as animation and film production, cloud gaming, and mechanical design

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues	ENIs
ecs.vgn7i-vws-m4.xlarge	4	30	NVIDIA A10 * 1/6	24GB * 1/6	3	1,000,000	4	4
ecs.vgn7i-vws-m8.2xlarge	10	62	NVIDIA A10 * 1/3	24GB * 1/3	5	2,000,000	8	6
ecs.vgn7i-vws-m12.3xlarge	14	93	NVIDIA A10 * 1/2	24GB * 1/2	8	3,000,000	8	6
ecs.vgn7i-vws-m24.7xlarge	30	186	NVIDIA A10 * 1	24GB * 1	16	6,000,000	12	8

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:
NVIDIA A10 * 1/6. NVIDIA A10 is the GPU model. 1/6 indicates that a GPU is sliced into six GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.
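
As a first pass before stress testing, the vgn7i-vws table can be filtered on static specifications. The following Python sketch is a minimal illustration, not an official selection tool: the `InstanceType` class and the `smallest_fit` function are hypothetical names, and the data is copied by hand from the preceding table. It does not account for regional availability, pricing, or packet forwarding behavior.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional

A10_MEMORY_GB = 24  # GPU memory of one NVIDIA A10, per the table above


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gib: int
    gpu_share: Fraction  # fraction of one physical NVIDIA A10

    @property
    def gpu_memory_gb(self) -> Fraction:
        return A10_MEMORY_GB * self.gpu_share


# Rows of the vgn7i-vws table, ordered from smallest to largest.
VGN7I_VWS = [
    InstanceType("ecs.vgn7i-vws-m4.xlarge", 4, 30, Fraction(1, 6)),
    InstanceType("ecs.vgn7i-vws-m8.2xlarge", 10, 62, Fraction(1, 3)),
    InstanceType("ecs.vgn7i-vws-m12.3xlarge", 14, 93, Fraction(1, 2)),
    InstanceType("ecs.vgn7i-vws-m24.7xlarge", 30, 186, Fraction(1)),
]


def smallest_fit(min_vcpus: int, min_memory_gib: int,
                 min_gpu_memory_gb: int) -> Optional[InstanceType]:
    """Return the first (smallest) listed type that meets all three minimums."""
    for itype in VGN7I_VWS:
        if (itype.vcpus >= min_vcpus
                and itype.memory_gib >= min_memory_gib
                and itype.gpu_memory_gb >= min_gpu_memory_gb):
            return itype
    return None  # no instance type in this family is large enough


if __name__ == "__main__":
    choice = smallest_fit(min_vcpus=8, min_memory_gib=32, min_gpu_memory_gb=8)
    print(choice.name if choice else "no match in vgn7i-vws")
```

A result from this kind of filter is only a starting candidate. The recommendation above to run business stress tests still applies.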

vgn6i and vgn6i-vws, vGPU-accelerated instance families

Features:

Following the NVIDIA GRID driver upgrade, Alibaba Cloud has upgraded the vgn6i instance family to the vgn6i-vws instance family. The vgn6i-vws instance family uses the latest NVIDIA GRID driver and provides an NVIDIA GRID vWS license. Submit a ticket to apply for free images that have the NVIDIA GRID driver pre-installed.
To use other public images or custom images that do not contain an NVIDIA GRID driver, submit a ticket to apply for the GRID driver file and install the NVIDIA GRID driver. Alibaba Cloud does not charge additional license fees for the GRID driver.

Compute:
- Uses NVIDIA T4 GPUs.
- Uses vGPUs.
  - Supports the 1/4 and 1/2 compute capacity of NVIDIA Tesla T4 GPUs.
  - Supports 4 GB and 8 GB of GPU memory.
- Offers a CPU-to-memory ratio of 1:5.
- Uses 2.5 GHz Intel® Xeon® Platinum 8163 (Skylake) processors.
Storage:
- Is an instance family in which all instances are I/O optimized.
- Supports only standard SSDs and ultra disks.
Network:
- Supports IPv6.
- Provides high network performance based on large computing capacity.
Supported scenarios:
- Real-time rendering for cloud gaming
- Real-time rendering for Augmented Reality (AR) and Virtual Reality (VR) applications
- AI (deep learning and machine learning) inference for elastic Internet service deployment
- Teaching environments for deep learning
- Modeling and experimentation environments for deep learning

Instance types

Instance type	vCPUs	Memory (GiB)	GPU	GPU memory	Network bandwidth (Gbit/s)	Packet forwarding rate (pps)	NIC queues (primary NIC/secondary NIC)	ENIs	Private IP addresses per ENI
ecs.vgn6i-m4-vws.xlarge	4	23	NVIDIA T4 * 1/4	16GB * 1/4	2	500,000	4/2	3	10
ecs.vgn6i-m8-vws.2xlarge	10	46	NVIDIA T4 * 1/2	16GB * 1/2	4	800,000	8/2	4	10
ecs.vgn6i-m16-vws.5xlarge	20	92	NVIDIA T4 * 1	16GB * 1	7.5	1,200,000	6	4	10

Note

The GPU column in the preceding table indicates the GPU model and GPU slicing information for each instance type. Each GPU can be sliced into multiple GPU partitions, and each GPU partition can be allocated as a vGPU to an instance. Example:
NVIDIA T4 * 1/4. NVIDIA T4 is the GPU model. 1/4 indicates that a GPU is sliced into four GPU partitions, and each GPU partition can be allocated as a vGPU to an instance.
You can go to the Instance Types Available for Each Region page to view the instance types available in each region.
For more information about these specifications, see the "Instance type specifications" section in Overview of instance families. Packet forwarding rates vary significantly based on business scenarios. We recommend that you perform business stress tests on instances to choose appropriate instance types.