Container Compute Service: GPU instance families supported by ACS

Last Updated: Nov 13, 2025

Alibaba Cloud Container Compute Service (ACS) supports multiple GPU types for different scenarios. To request a specific GPU model series, add the alibabacloud.com/gpu-model-series label to your pods. Refer to the following specifications to select the instance family that best meets your needs.
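The following is a minimal sketch of a pod that requests the GU8TF series described below. The gpu-model-series label key comes from this topic; the compute-class label, the exact label value, the nvidia.com/gpu resource name, and the pod name and image are assumptions to verify against your cluster.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-series-demo                            # hypothetical name
  labels:
    alibabacloud.com/compute-class: "gpu"          # assumed ACS compute class for GPU pods
    alibabacloud.com/gpu-model-series: "GU8TF"     # label key from this topic; value format assumed
spec:
  containers:
    - name: app
      image: registry.example.com/llm-inference:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: "1"                      # assumed GPU resource name in ACS
```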

GU8TF

This family features high-performance compute GPUs.

  • 96 GB of memory per GPU and native support for the FP8 floating-point format, enabling single-node inference for 70B and larger models.

  • Features a high-speed NVLink interconnect among all 8 GPUs, making it ideal for small- to medium-scale model training. Provides 1.6 Tbps of Remote Direct Memory Access (RDMA) bandwidth for internode communication.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (96 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (96 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (96 GB) | 16 | 16–128 | 1 | 30–256 |
| 1 (96 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (96 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 2 (96 GB) | 46 | 64, 128, 230 | N/A | 30–512 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (96 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 4 (96 GB) | 92 | 128, 256, 460 | N/A | 30–1,024 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (96 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |
| 8 (96 GB) | 184 | 256, 512, 920 | N/A | 30–2,048 |
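To show how a table row maps to a pod spec, the following hedged sketch uses the 1-GPU, 8-vCPU row above: because the memory increment is 1 GiB, any whole-GiB value from 8 to 64 is valid, and ephemeral storage can fall anywhere in the 30–256 GiB range. The pod and container names, image, and the nvidia.com/gpu resource name are assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tf-single-gpu                           # hypothetical name
  labels:
    alibabacloud.com/gpu-model-series: "GU8TF"     # value format assumed
spec:
  containers:
    - name: worker
      image: registry.example.com/trainer:latest   # placeholder image
      resources:
        requests:
          cpu: "8"                                 # a vCPU value from the 1-GPU rows
          memory: 32Gi                             # any 1 GiB step within 8–64 GiB
          ephemeral-storage: 100Gi                 # within the 30–256 GiB storage range
          nvidia.com/gpu: "1"                      # assumed GPU resource name
        limits:
          cpu: "8"
          memory: 32Gi
          ephemeral-storage: 100Gi
          nvidia.com/gpu: "1"
```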

GU8TEF

This family features high-performance compute GPUs.

  • 141 GB of memory per GPU and native support for the FP8 floating-point format. Multi-GPU configurations support single-node inference for models such as DeepSeek-LLM 67B.

  • Features a high-speed NVLink interconnect among all 8 GPUs, making it ideal for small- to medium-scale model training. Provides 1.6 Tbps of RDMA bandwidth for internode communication.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (141 GB) | 2 | 2–16 | 1 | 30–768 |
| 1 (141 GB) | 4 | 4–32 | 1 | 30–768 |
| 1 (141 GB) | 6 | 6–48 | 1 | 30–768 |
| 1 (141 GB) | 8 | 8–64 | 1 | 30–768 |
| 1 (141 GB) | 10 | 10–80 | 1 | 30–768 |
| 1 (141 GB) | 12 | 12–96 | 1 | 30–768 |
| 1 (141 GB) | 14 | 14–112 | 1 | 30–768 |
| 1 (141 GB) | 16 | 16–128 | 1 | 30–768 |
| 1 (141 GB) | 22 | 22, 32, 64, 128, 225 | N/A | 30–768 |
| 2 (141 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 2 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 2 (141 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 4 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 4 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 4 (141 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 8 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 8 (141 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 8 (141 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |
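For rows whose memory increment is N/A, the pod memory must be exactly one of the listed sizes rather than any value in a range. The following is a sketch for the largest GU8TEF shape, under the same naming assumptions as the earlier examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tef-eight-gpu                           # hypothetical name
  labels:
    alibabacloud.com/gpu-model-series: "GU8TEF"    # value format assumed
spec:
  containers:
    - name: worker
      image: registry.example.com/trainer:latest   # placeholder image
      resources:
        limits:
          cpu: "128"                               # 8-GPU row with discrete memory options
          memory: 512Gi                            # must be one of 128, 256, 512, or 1,024 GiB
          nvidia.com/gpu: "8"                      # assumed GPU resource name
```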

L20 (GN8IS)

This family features compute GPUs suitable for a wide range of AI workloads.

  • Supports common acceleration libraries such as TensorRT as well as the FP8 floating-point format. Peer-to-peer (P2P) communication between GPUs is enabled.

  • 48 GB of memory per GPU. Multi-GPU configurations support single-node inference for 70B and larger models.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

L20X (GX8SF)

This family features high-performance compute GPUs for large-scale AI workloads.

  • 141 GB of memory per GPU. Multi-GPU configurations support single-node inference for very large models.

  • Features a high-speed NVLink interconnect among all 8 GPUs, making it ideal for large model training and inference. Provides 3.2 Tbps of RDMA bandwidth for internode communication.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 8 (141 GB) | 184 | 1,800 | N/A | 30–6,144 |

P16EN

This family features high-performance compute GPUs.

  • 96 GB of memory per GPU with support for the FP16 floating-point format. Multi-GPU configurations support single-node inference for models such as DeepSeek R1.

  • Features a 700 GB/s high-speed interconnect among all 16 GPUs, making it ideal for small- to medium-scale model training. Provides 1.6 Tbps of RDMA bandwidth for internode communication.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–384 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–384 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–384 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–384 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–384 |
| 2 (96 GB) | 4 | 4–32 | 1 | 30–768 |
| 2 (96 GB) | 6 | 6–48 | 1 | 30–768 |
| 2 (96 GB) | 8 | 8–64 | 1 | 30–768 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–768 |
| 2 (96 GB) | 22 | 32, 64, 128, 225 | N/A | 30–768 |
| 4 (96 GB) | 8 | 8–64 | 1 | 30–1,536 |
| 4 (96 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 4 (96 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 8 (96 GB) | 16 | 16–128 | 1 | 30–3,072 |
| 8 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 8 (96 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 16 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–6,144 |
| 16 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 16 (96 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 16 (96 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |

G49E

This family features compute GPUs suitable for a wide range of AI and graphics workloads.

  • 48 GB of memory per GPU, with support for acceleration libraries such as RTX and TensorRT. P2P communication between GPUs is enabled.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

T4

This family features versatile GPUs based on the Turing architecture, suitable for inference and graphics workloads.

  • 16 GB of memory per GPU with 320 GB/s of memory bandwidth.

  • Variable-precision Tensor Cores support 65 TFLOPS (FP16), 130 TOPS (INT8), and 260 TOPS (INT4).

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (16 GB) | 2 | 2–8 | 1 | 30–1,536 |
| 1 (16 GB) | 4 | 4–16 | 1 | 30–1,536 |
| 1 (16 GB) | 6 | 6–24 | 1 | 30–1,536 |
| 1 (16 GB) | 8 | 8–32 | 1 | 30–1,536 |
| 1 (16 GB) | 10 | 10–40 | 1 | 30–1,536 |
| 1 (16 GB) | 12 | 12–48 | 1 | 30–1,536 |
| 1 (16 GB) | 14 | 14–56 | 1 | 30–1,536 |
| 1 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 1 (16 GB) | 24 | 24, 48, 90 | N/A | 30–1,536 |
| 2 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 2 (16 GB) | 24 | 24, 48, 96 | N/A | 30–1,536 |
| 2 (16 GB) | 32 | 32, 64, 128 | N/A | 30–1,536 |
| 2 (16 GB) | 48 | 48, 96, 180 | N/A | 30–1,536 |

A10

This family features powerful GPUs based on the Ampere architecture, suitable for deep learning, HPC, and graphics.

  • 24 GB of memory per GPU, with support for features such as RTX and TensorRT.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (24 GB) | 2 | 2–8 | 1 | 30–256 |
| 1 (24 GB) | 4 | 4–16 | 1 | 30–256 |
| 1 (24 GB) | 6 | 6–24 | 1 | 30–256 |
| 1 (24 GB) | 8 | 8–32 | 1 | 30–256 |
| 1 (24 GB) | 10 | 10–40 | 1 | 30–256 |
| 1 (24 GB) | 12 | 12–48 | 1 | 30–256 |
| 1 (24 GB) | 14 | 14–56 | 1 | 30–256 |
| 1 (24 GB) | 16 | 16–60 | 1 | 30–256 |
| 2 (24 GB) | 16 | 16–64 | 1 | 30–512 |
| 2 (24 GB) | 32 | 32, 64, 120 | N/A | 30–512 |
| 4 (24 GB) | 32 | 32, 64, 128 | N/A | 30–1,024 |
| 4 (24 GB) | 64 | 64, 128, 240 | N/A | 30–1,024 |
| 8 (24 GB) | 64 | 64, 128, 256 | N/A | 30–2,048 |
| 8 (24 GB) | 128 | 128, 256, 480 | N/A | 30–2,048 |

G59

This family features compute GPUs suitable for various AI and HPC workloads.

  • 32 GB of memory per GPU, with support for features such as RTX and TensorRT. P2P communication between GPUs is enabled.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) | Network bandwidth |
| --- | --- | --- | --- | --- | --- |
| 1 (32 GB) | 2 | 2–16 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 4 | 4–32 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 6 | 6–48 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 8 | 8–64 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 10 | 10–80 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 12 | 12–96 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 14 | 14–112 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 16 | 16–128 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 | 1 Gbps per vCPU |
| 2 (32 GB) | 16 | 16–128 | 1 | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 46 | 64, 128, 256, 360 | N/A | 30–512 | 1 Gbps per vCPU |
| 4 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 92 | 128, 256, 512, 720 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 8 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 | 1 Gbps per vCPU |
| 8 (32 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–2,048 | 100 Gbps per vCPU |
| 8 (32 GB) | 184 | 256, 512, 1,024, 1,440 | N/A | 30–2,048 | 100 Gbps per vCPU |