Container Compute Service: GPU instance families supported by ACS

Last Updated: Mar 26, 2026

ACS supports multiple GPU families for AI and HPC workloads. To request a specific GPU family, set the alibabacloud.com/gpu-model-series label on your pods, as in the sketch below. The table in the next section summarizes the available families.
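A minimal sketch of such a pod follows. The label key and value come from this page; the nvidia.com/gpu resource name is an assumption based on the standard Kubernetes device-plugin convention (not confirmed by this page), and the pod name and image are hypothetical. Check the ACS pod creation guide for the exact fields.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tf-demo                     # hypothetical name
  labels:
    # Label documented on this page: selects the GPU family.
    alibabacloud.com/gpu-model-series: "GU8TF"
spec:
  containers:
    - name: worker
      image: registry.example.com/llm-server:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: "1"   # assumed device-plugin resource name
          cpu: "2"              # 2 vCPUs allow 2-16 GiB (see GU8TF table)
          memory: 16Gi
```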

Choose a GPU family

| Family | GPU memory | Key specs | Best for |
| --- | --- | --- | --- |
| GU8TF | 96 GB | FP8, NVLink, 1.6 Tbps RDMA | Large model inference (70B+), small to medium training |
| GU8TEF | 141 GB | FP8, NVLink, 1.6 Tbps RDMA | Large model inference (DeepSeek-LLM 67B), small to medium training |
| L20 (GN8IS) | 48 GB | FP8, TensorRT, P2P | Broad AI workloads, 70B+ inference |
| L20X (GX8SF) | 141 GB | NVLink, 3.2 Tbps RDMA | Large-scale model training and inference |
| P16EN | 96 GB | FP16, 700 GB/s interconnect, 1.6 Tbps RDMA | Large model inference (DeepSeek R1), small to medium training |
| G49E | 48 GB | RTX, TensorRT, P2P | AI and graphics workloads |
| T4 | 16 GB | FP16/INT8/INT4 Tensor Cores, 320 GB/s | Inference and graphics |
| A10 | 24 GB | RTX, TensorRT, Ampere | Deep learning, HPC, graphics |
| G59 | 32 GB | RTX, TensorRT, P2P | AI and HPC workloads |

GU8TF

Compute GPUs with 96 GB of memory per GPU and native FP8 support, enabling single-node inference for 70B and larger models. All 8 GPUs are interconnected via NVLink for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via Remote Direct Memory Access (RDMA).
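As a rough sanity check on the 70B claim: at FP8 (one byte per parameter), a 70B-parameter model needs about 70 GB for weights alone, so it fits on a single 96 GB GPU with headroom for the KV cache; substantially larger models can shard across the NVLink-connected GPUs.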

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (96 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (96 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (96 GB) | 16 | 16–128 | 1 | 30–256 |
| 1 (96 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (96 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 2 (96 GB) | 46 | 64, 128, 230 | N/A | 30–512 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (96 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 4 (96 GB) | 92 | 128, 256, 460 | N/A | 30–1,024 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (96 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |
| 8 (96 GB) | 184 | 256, 512, 920 | N/A | 30–2,048 |
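As a worked example against the last row group above, a sketch of a full-node GU8TF pod requesting 8 GPUs, 64 vCPUs, and one of the allowed discrete memory options (again assuming the standard nvidia.com/gpu resource name; the pod name and image are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gu8tf-train-demo                           # hypothetical name
  labels:
    alibabacloud.com/gpu-model-series: "GU8TF"     # label from this page
spec:
  containers:
    - name: trainer
      image: registry.example.com/trainer:latest   # hypothetical image
      resources:
        limits:
          nvidia.com/gpu: "8"    # assumed device-plugin resource name
          cpu: "64"
          memory: 256Gi          # 64 vCPUs allow 64, 128, 256, or 512 GiB
```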

GU8TEF

Compute GPUs with 141 GB of memory per GPU and native FP8 support. Multi-GPU configurations support single-node inference for models such as DeepSeek-LLM 67B. All 8 GPUs are interconnected via NVLink for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via RDMA.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (141 GB) | 2 | 2–16 | 1 | 30–768 |
| 1 (141 GB) | 4 | 4–32 | 1 | 30–768 |
| 1 (141 GB) | 6 | 6–48 | 1 | 30–768 |
| 1 (141 GB) | 8 | 8–64 | 1 | 30–768 |
| 1 (141 GB) | 10 | 10–80 | 1 | 30–768 |
| 1 (141 GB) | 12 | 12–96 | 1 | 30–768 |
| 1 (141 GB) | 14 | 14–112 | 1 | 30–768 |
| 1 (141 GB) | 16 | 16–128 | 1 | 30–768 |
| 1 (141 GB) | 22 | 22, 32, 64, 128, 225 | N/A | 30–768 |
| 2 (141 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 2 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 2 (141 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 4 (141 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 4 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 4 (141 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 8 (141 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 8 (141 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 8 (141 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |

L20 (GN8IS)

Compute GPUs with 48 GB of memory per GPU, suited for a wide range of AI workloads. Supports the FP8 floating-point format and acceleration libraries such as TensorRT. Peer-to-peer (P2P) communication between GPUs is enabled. Multi-GPU configurations support single-node inference for 70B and larger models.
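Back-of-envelope: 70 GB of FP8 weights exceeds a single 48 GB card, so a 70B model needs at least a 2-GPU (96 GB) configuration, and 4 GPUs leave comfortable room for the KV cache and longer contexts.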

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

L20X (GX8SF)

Compute GPUs with 141 GB of memory per GPU, designed for large-scale model training and inference. All 8 GPUs are interconnected via NVLink. Inter-node communication runs at 3.2 Tbps via RDMA.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 8 (141 GB) | 184 | 1,800 | N/A | 30–6,144 |

P16EN

Compute GPUs with 96 GB of memory per GPU and FP16 support. Multi-GPU configurations support single-node inference for models such as DeepSeek R1. All 16 GPUs are interconnected at 700 GB/s for small to medium-scale model training. Inter-node communication runs at 1.6 Tbps via RDMA.
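For scale: a full node offers 16 × 96 GB = 1,536 GB of aggregate GPU memory. DeepSeek R1, at roughly 671B parameters, needs about 1.34 TB of weights at FP16, which fits within a single node with the remainder available for the KV cache.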

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (96 GB) | 2 | 2–16 | 1 | 30–384 |
| 1 (96 GB) | 4 | 4–32 | 1 | 30–384 |
| 1 (96 GB) | 6 | 6–48 | 1 | 30–384 |
| 1 (96 GB) | 8 | 8–64 | 1 | 30–384 |
| 1 (96 GB) | 10 | 10–80 | 1 | 30–384 |
| 2 (96 GB) | 4 | 4–32 | 1 | 30–768 |
| 2 (96 GB) | 6 | 6–48 | 1 | 30–768 |
| 2 (96 GB) | 8 | 8–64 | 1 | 30–768 |
| 2 (96 GB) | 16 | 16–128 | 1 | 30–768 |
| 2 (96 GB) | 22 | 32, 64, 128, 225 | N/A | 30–768 |
| 4 (96 GB) | 8 | 8–64 | 1 | 30–1,536 |
| 4 (96 GB) | 16 | 16–128 | 1 | 30–1,536 |
| 4 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,536 |
| 4 (96 GB) | 46 | 64, 128, 256, 450 | N/A | 30–1,536 |
| 8 (96 GB) | 16 | 16–128 | 1 | 30–3,072 |
| 8 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–3,072 |
| 8 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–3,072 |
| 8 (96 GB) | 92 | 128, 256, 512, 900 | N/A | 30–3,072 |
| 16 (96 GB) | 32 | 32, 64, 128, 256 | N/A | 30–6,144 |
| 16 (96 GB) | 64 | 64, 128, 256, 512 | N/A | 30–6,144 |
| 16 (96 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–6,144 |
| 16 (96 GB) | 184 | 256, 512, 1,024, 1,800 | N/A | 30–6,144 |

G49E

Compute GPUs with 48 GB of memory per GPU, suited for AI and graphics workloads. Supports acceleration libraries such as RTX and TensorRT. P2P communication between GPUs is enabled.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (48 GB) | 2 | 2–16 | 1 | 30–256 |
| 1 (48 GB) | 4 | 4–32 | 1 | 30–256 |
| 1 (48 GB) | 6 | 6–48 | 1 | 30–256 |
| 1 (48 GB) | 8 | 8–64 | 1 | 30–256 |
| 1 (48 GB) | 10 | 10–80 | 1 | 30–256 |
| 1 (48 GB) | 12 | 12–96 | 1 | 30–256 |
| 1 (48 GB) | 14 | 14–112 | 1 | 30–256 |
| 1 (48 GB) | 16 | 16–120 | 1 | 30–256 |
| 2 (48 GB) | 16 | 16–128 | 1 | 30–512 |
| 2 (48 GB) | 32 | 32, 64, 128, 230 | N/A | 30–512 |
| 4 (48 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 |
| 4 (48 GB) | 64 | 64, 128, 256, 460 | N/A | 30–1,024 |
| 8 (48 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 |
| 8 (48 GB) | 128 | 128, 256, 512, 920 | N/A | 30–2,048 |

T4

GPUs based on the Turing architecture, with 16 GB of memory per GPU and 320 GB/s of memory bandwidth. Variable-precision Tensor Cores deliver 65 TFLOPS (FP16), 130 TOPS (INT8), and 260 TOPS (INT4), suited for inference and graphics workloads.

Pod resource constraints (single-node family):

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (16 GB) | 2 | 2–8 | 1 | 30–1,536 |
| 1 (16 GB) | 4 | 4–16 | 1 | 30–1,536 |
| 1 (16 GB) | 6 | 6–24 | 1 | 30–1,536 |
| 1 (16 GB) | 8 | 8–32 | 1 | 30–1,536 |
| 1 (16 GB) | 10 | 10–40 | 1 | 30–1,536 |
| 1 (16 GB) | 12 | 12–48 | 1 | 30–1,536 |
| 1 (16 GB) | 14 | 14–56 | 1 | 30–1,536 |
| 1 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 1 (16 GB) | 24 | 24, 48, 90 | N/A | 30–1,536 |
| 2 (16 GB) | 16 | 16–64 | 1 | 30–1,536 |
| 2 (16 GB) | 24 | 24, 48, 96 | N/A | 30–1,536 |
| 2 (16 GB) | 32 | 32, 64, 128 | N/A | 30–1,536 |
| 2 (16 GB) | 48 | 48, 96, 180 | N/A | 30–1,536 |

A10

GPUs based on the Ampere architecture, with 24 GB of memory per GPU. Supports RTX and TensorRT, suited for deep learning, high-performance computing (HPC), and graphics workloads.

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) |
| --- | --- | --- | --- | --- |
| 1 (24 GB) | 2 | 2–8 | 1 | 30–256 |
| 1 (24 GB) | 4 | 4–16 | 1 | 30–256 |
| 1 (24 GB) | 6 | 6–24 | 1 | 30–256 |
| 1 (24 GB) | 8 | 8–32 | 1 | 30–256 |
| 1 (24 GB) | 10 | 10–40 | 1 | 30–256 |
| 1 (24 GB) | 12 | 12–48 | 1 | 30–256 |
| 1 (24 GB) | 14 | 14–56 | 1 | 30–256 |
| 1 (24 GB) | 16 | 16–60 | 1 | 30–256 |
| 2 (24 GB) | 16 | 16–64 | 1 | 30–512 |
| 2 (24 GB) | 32 | 32, 64, 120 | N/A | 30–512 |
| 4 (24 GB) | 32 | 32, 64, 128 | N/A | 30–1,024 |
| 4 (24 GB) | 64 | 64, 128, 240 | N/A | 30–1,024 |
| 8 (24 GB) | 64 | 64, 128, 256 | N/A | 30–2,048 |
| 8 (24 GB) | 128 | 128, 256, 480 | N/A | 30–2,048 |

G59

Compute GPUs with 32 GB of memory per GPU, suited for AI and HPC workloads. Supports RTX and TensorRT. P2P communication between GPUs is enabled. Network bandwidth is 1 Gbps per vCPU on most configurations and is capped at 100 Gbps on the largest (see the Networking column below).

Pod resource constraints:

| GPU count (memory per GPU) | vCPU | Memory options (GiB) | Memory increment (GiB) | Storage range (GiB) | Networking |
| --- | --- | --- | --- | --- | --- |
| 1 (32 GB) | 2 | 2–16 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 4 | 4–32 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 6 | 6–48 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 8 | 8–64 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 10 | 10–80 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 12 | 12–96 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 14 | 14–112 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 16 | 16–128 | 1 | 30–256 | 1 Gbps per vCPU |
| 1 (32 GB) | 22 | 22, 32, 64, 128 | N/A | 30–256 | 1 Gbps per vCPU |
| 2 (32 GB) | 16 | 16–128 | 1 | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–512 | 1 Gbps per vCPU |
| 2 (32 GB) | 46 | 64, 128, 256, 360 | N/A | 30–512 | 1 Gbps per vCPU |
| 4 (32 GB) | 32 | 32, 64, 128, 256 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 4 (32 GB) | 92 | 128, 256, 512, 720 | N/A | 30–1,024 | 1 Gbps per vCPU |
| 8 (32 GB) | 64 | 64, 128, 256, 512 | N/A | 30–2,048 | 1 Gbps per vCPU |
| 8 (32 GB) | 128 | 128, 256, 512, 1,024 | N/A | 30–2,048 | 100 Gbps |
| 8 (32 GB) | 184 | 256, 512, 1,024, 1,440 | N/A | 30–2,048 | 100 Gbps |