Container Compute Service: GPU instance families supported by ACS

Last Updated: Mar 26, 2026

ACS supports multiple GPU families for AI and HPC workloads. To request a specific GPU family, set the alibabacloud.com/gpu-model-series label on your pods, as in the sketch below. The table in the next section summarizes the available families.
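A minimal sketch of such a pod follows. The label key and value come from this page; the nvidia.com/gpu resource name is an assumption based on the standard Kubernetes device-plugin convention (not confirmed by this page), and the pod name and image are hypothetical. Check the ACS pod creation guide for the exact fields.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tf-demo                     # hypothetical name
  labels:
    # Label documented on this page: selects the GPU family.
    alibabacloud.com/gpu-model-series: "GU8TF"
spec:
  containers:
    - name: worker
      image: registry.example.com/llm-server:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: "1"   # assumed device-plugin resource name
          cpu: "2"              # 2 vCPUs allow 2-16 GiB (see GU8TF table)
          memory: 16Gi
```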

Choose a GPU family

| Family | GPU memory | Key specs | Best for |
| --- | --- | --- | --- |
| GU8TF | 96 GB | FP8, NVLink, 1.6 Tbps RDMA | Large model inference (70B+), small to medium training |
| GU8TEF | 141 GB | FP8, NVLink, 1.6 Tbps RDMA | Large model inference (DeepSeek-LLM 67B), small to medium training |
| L20 (GN8IS) | 48 GB | FP8, TensorRT, P2P | Broad AI workloads, 70B+ inference |
| L20X (GX8SF) | 141 GB | NVLink, 3.2 Tbps RDMA | Large-scale model training and inference |
| P16EN | 96 GB | FP16, 700 GB/s interconnect, 1.6 Tbps RDMA | Large model inference (DeepSeek R1), small to medium training |
| G49E | 48 GB | RTX, TensorRT, P2P | AI and graphics workloads |
| T4 | 16 GB | FP16/INT8/INT4 Tensor Cores, 320 GB/s | Inference and graphics |
| A10 | 24 GB | RTX, TensorRT, Ampere | Deep learning, HPC, graphics |
| G59 | 32 GB | RTX, TensorRT, P2P | AI and HPC workloads |

GU8TF

Compute GPUs with 96 GB of memory per GPU and native FP8 support, enabling single-node inference for 70B and larger models. All 8 GPUs are interconnected via NVLink for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via Remote Direct Memory Access (RDMA).
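As a rough sanity check on the 70B claim: at FP8 (one byte per parameter), a 70B-parameter model needs about 70 GB for weights alone, so it fits on a single 96 GB GPU with headroom for the KV cache; substantially larger models can shard across the NVLink-connected GPUs.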

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (96 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (96 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (96 GB) | 16 | 16–128 | 1 | 30–256 |
| 1 (96 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (96 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 2 (96 GB) | 46 | 64, 128, 230 | N/A | 30–512 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (96 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 4 (96 GB) | 92 | 128, 256, 460 | N/A | 30–1,024 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (96 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |
| 8 (96 GB) | 184 | 256, 512, 920 | N/A | 30–2,048 |
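As a worked example against the last row group above, a sketch of a full-node GU8TF pod requesting 8 GPUs, 64 vCPUs, and one of the allowed discrete memory options (again assuming the standard nvidia.com/gpu resource name; the pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tf-train-demo                           # hypothetical name
  labels:
    alibabacloud.com/gpu-model-series: "GU8TF"     # label from this page
spec:
  containers:
    - name: trainer
      image: registry.example.com/trainer:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: "8"    # assumed device-plugin resource name
          cpu: "64"
          memory: 256Gi          # 64 vCPUs allow 64, 128, 256, or 512 GiB
```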

GU8TEF

Compute GPUs with 141 GB of memory per GPU and native FP8 support. Multi-GPU configurations support single-node inference for models such as DeepSeek-LLM 67B. All 8 GPUs are interconnected via NVLink for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via RDMA.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (141 GB) | 2 | 2–16 | 1 | 30–768 |
| 1 (141 GB) | 4 | 4–32 | 1 | 30–768 |
| 1 (141 GB) | 6 | 6–48 | 1 | 30–768 |
| 1 (141 GB) | 8 | 8–64 | 1 | 30–768 |
| 1 (141 GB) | 10 | 10–80 | 1 | 30–768 |
| 1 (141 GB) | 12 | 12–96 | 1 | 30–768 |
| 1 (141 GB) | 14 | 14–112 | 1 | 30–768 |
| 1 (141 GB) | 16 | 16–128 | 1 | 30–768 |
| 1 (141 GB) | 22 | 22, 32, 64, 128, 225 | N/A | 30–768 |
| 2 (141 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 2 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 2 (141 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 4 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 4 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 4 (141 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 8 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 8 (141 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 8 (141 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |

L20 (GN8IS)

Compute GPUs with 48 GB of memory per GPU, suited for a wide range of AI workloads. Supports the FP8 floating-point format and acceleration libraries such as TensorRT. Peer-to-peer (P2P) communication between GPUs is enabled. Multi-GPU configurations support single-node inference for 70B and larger models.
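Back-of-envelope: 70 GB of FP8 weights exceeds a single 48 GB card, so a 70B model needs at least a 2-GPU (96 GB) configuration, and 4 GPUs leave comfortable room for the KV cache and longer contexts.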

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

L20X (GX8SF)

Compute GPUs with 141 GB of memory per GPU, designed for large-scale model training and inference. All 8 GPUs are interconnected via NVLink. Inter-node communication runs at 3.2 Tbps via RDMA.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 8 (141 GB) | 184 | 1,800 | N/A | 30–6,144 |

P16EN

Compute GPUs with 96 GB of memory per GPU and FP16 support. Multi-GPU configurations support single-node inference for models such as DeepSeek R1. All 16 GPUs are interconnected at 700 GB/s for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via RDMA.
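For scale: a full node offers 16 × 96 GB = 1,536 GB of aggregate GPU memory. DeepSeek R1, at roughly 671B parameters, needs about 1.34 TB of weights at FP16, which fits within a single node with the remainder available for the KV cache.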

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–384 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–384 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–384 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–384 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–384 |
| 2 (96 GB) | 4 | 4–32 | 1 | 30–768 |
| 2 (96 GB) | 6 | 6–48 | 1 | 30–768 |
| 2 (96 GB) | 8 | 8–64 | 1 | 30–768 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–768 |
| 2 (96 GB) | 22 | 32, 64, 128, 225 | N/A | 30–768 |
| 4 (96 GB) | 8 | 8–64 | 1 | 30–1,536 |
| 4 (96 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 4 (96 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 8 (96 GB) | 16 | 16–128 | 1 | 30–3,072 |
| 8 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 8 (96 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 16 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–6,144 |
| 16 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 16 (96 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 16 (96 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |

G49E

Compute GPUs with 48 GB of memory per GPU, suited for AI and graphics workloads. Supports acceleration libraries such as RTX and TensorRT. P2P communication between GPUs is enabled.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

T4

GPUs based on the Turing architecture, with 16 GB of memory per GPU and 320 GB/s of memory bandwidth. Variable-precision Tensor Cores deliver 65 TFLOPS (FP16), 130 TOPS (INT8), and 260 TOPS (INT4), suited for inference and graphics workloads.

Pod resource constraints (single-node family):

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (16 GB) | 2 | 2–8 | 1 | 30–1,536 |
| 1 (16 GB) | 4 | 4–16 | 1 | 30–1,536 |
| 1 (16 GB) | 6 | 6–24 | 1 | 30–1,536 |
| 1 (16 GB) | 8 | 8–32 | 1 | 30–1,536 |
| 1 (16 GB) | 10 | 10–40 | 1 | 30–1,536 |
| 1 (16 GB) | 12 | 12–48 | 1 | 30–1,536 |
| 1 (16 GB) | 14 | 14–56 | 1 | 30–1,536 |
| 1 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 1 (16 GB) | 24 | 24, 48, 90 | N/A | 30–1,536 |
| 2 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 2 (16 GB) | 24 | 24, 48, 96 | N/A | 30–1,536 |
| 2 (16 GB) | 32 | 32, 64, 128 | N/A | 30–1,536 |
| 2 (16 GB) | 48 | 48, 96, 180 | N/A | 30–1,536 |

A10

GPUs based on the Ampere architecture, with 24 GB of memory per GPU. Supports RTX and TensorRT, suited for deep learning, high-performance computing (HPC), and graphics workloads.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (24 GB) | 2 | 2–8 | 1 | 30–256 |
| 1 (24 GB) | 4 | 4–16 | 1 | 30–256 |
| 1 (24 GB) | 6 | 6–24 | 1 | 30–256 |
| 1 (24 GB) | 8 | 8–32 | 1 | 30–256 |
| 1 (24 GB) | 10 | 10–40 | 1 | 30–256 |
| 1 (24 GB) | 12 | 12–48 | 1 | 30–256 |
| 1 (24 GB) | 14 | 14–56 | 1 | 30–256 |
| 1 (24 GB) | 16 | 16–60 | 1 | 30–256 |
| 2 (24 GB) | 16 | 16–64 | 1 | 30–512 |
| 2 (24 GB) | 32 | 32, 64, 120 | N/A | 30–512 |
| 4 (24 GB) | 32 | 32, 64, 128 | N/A | 30–1,024 |
| 4 (24 GB) | 64 | 64, 128, 240 | N/A | 30–1,024 |
| 8 (24 GB) | 64 | 64, 128, 256 | N/A | 30–2,048 |
| 8 (24 GB) | 128 | 128, 256, 480 | N/A | 30–2,048 |

G59

Compute GPUs with 32 GB of memory per GPU, suited for AI and HPC workloads. Supports RTX and TensorRT. P2P communication between GPUs is enabled. Network bandwidth is 1 Gbps per vCPU on most configurations and is capped at 100 Gbps on the largest (see the Networking column below).

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) | Networking |
| --- | --- | --- | --- | --- | --- |
| 1 (32 GB) | 2 | 2–16 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 4 | 4–32 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 6 | 6–48 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 8 | 8–64 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 10 | 10–80 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 12 | 12–96 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 14 | 14–112 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 16 | 16–128 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 | 1 Gbps per vCPU |
| 2 (32 GB) | 16 | 16–128 | 1 | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 46 | 64, 128, 256, 360 | N/A | 30–512 | 1 Gbps per vCPU |
| 4 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 92 | 128, 256, 512, 720 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 8 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 | 1 Gbps per vCPU |
| 8 (32 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–2,048 | 100 Gbps |
| 8 (32 GB) | 184 | 256, 512, 1,024, 1,440 | N/A | 30–2,048 | 100 Gbps |