Elastic GPU Service: Scenarios

Last Updated: Mar 31, 2026

Elastic GPU Service supports video transcoding, image rendering, AI training, AI inference, and cloud graphics workstations. DeepGPU extends these capabilities with enhanced GPU acceleration for AI training and AI inference.

Elastic GPU Service use cases

Video transcoding

Applicable instance type: ebmgn6v (ECS Bare Metal)

GPU-accelerated transcoding handles high-concurrency, real-time video streams at 1080P, 2K, and 4K resolutions while keeping bandwidth consumption low.

During the 2019 Double 11 Global Shopping Festival gala, Elastic GPU Service:

  • Supported real-time video streaming across more than 5,000 concurrent channels, peaking at 6,200 channels per minute

  • Rendered more than 5,000 household images in real time, in seconds per image, using ebmgn6v ECS Bare Metal instances to power Taobao renderers, improving rendering performance by dozens of times

AI training

Applicable instance families: gn6v, gn6e | GPU: NVIDIA V100

The gn6v and gn6e instance families are built for deep learning acceleration. gn6v provides 16 GB of GPU memory per card; gn6e provides 32 GB. Both deliver up to 1,000 TFLOPS (teraflops) of mixed-precision computing per node.

These instance families integrate with container services to simplify deployment, operations, and resource scheduling across online and offline computing environments.
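The 1,000 TFLOPS per-node figure is easy to sanity-check, assuming (this document does not state it) eight V100 GPUs per node at NVIDIA's quoted 125 TFLOPS of Tensor Core mixed-precision throughput each:

```python
# Back-of-the-envelope check of the per-node mixed-precision figure.
# Assumptions (not stated in this document): 8 V100 GPUs per node, each
# delivering 125 TFLOPS of Tensor Core mixed-precision throughput.
GPUS_PER_NODE = 8
TFLOPS_PER_V100 = 125  # NVIDIA's quoted Tensor Core peak for V100

def node_peak_tflops(gpus=GPUS_PER_NODE, per_gpu=TFLOPS_PER_V100):
    """Aggregate peak mixed-precision throughput of one node."""
    return gpus * per_gpu

print(node_peak_tflops())  # 1000, matching the figure above
```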

AI inference

Applicable instance family: gn6i | GPU: NVIDIA Tesla T4

The gn6i instance family uses the NVIDIA Tesla T4 GPU, which delivers:

  • Up to 8.1 TFLOPS of single-precision floating-point performance

  • Up to 130 TOPS (tera-operations per second) of int8 fixed-point processing for quantized inference

  • Mixed-precision support

  • 75 W power consumption per GPU, delivering high output at low power draw

Like the training families, gn6i integrates with container services for simplified deployment and resource scheduling. Pre-built images with NVIDIA GPU drivers and popular deep learning frameworks are available on Alibaba Cloud Marketplace.
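The int8 TOPS figure matters because quantized inference maps float weights and activations onto 8-bit integers before the T4's fixed-point units process them. A minimal symmetric-quantization sketch in plain Python (illustrative only; production stacks such as TensorRT quantize per-tensor or per-channel automatically):

```python
# Minimal symmetric int8 quantization sketch. Real inference toolchains
# (e.g. TensorRT) do this automatically; shown here only to illustrate
# what "int8 fixed-point processing" operates on.

def quantize(values, num_bits=8):
    """Map floats to signed ints using a single symmetric scale."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the int8 representation."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize(weights)
approx = dequantize(q, scale)
# Round-trip error stays within one quantization step (scale).
```

The GPU then executes the matrix math on the small integers, which is what the 130 TOPS figure measures.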

Cloud graphics workstations

Applicable instance family: gn6i | GPU: NVIDIA Tesla T4 (Turing architecture)

Pair gn6i instances with WUYING Workspace to deliver cloud-based GPU graphics workstations. This setup is suited for graphics-intensive workflows across industries such as:

  • Film and television animation design

  • Industrial design

  • Medical imaging

  • High-performance computing result presentation

DeepGPU use cases

DeepGPU bundles enhanced GPU acceleration tools for AI workloads. Its components include:

  • Apsara AI Accelerator (AIACC): Includes AIACC-Training and AIACC-Inference

  • AIACC-ACSpeed: Communication acceleration optimized for PyTorch-based training

  • AIACC-AGSpeed: Graph optimization for PyTorch-based training

  • FastGPU: A set of tools for quickly building and deploying AI training and inference tasks on Alibaba Cloud
  • cGPU: GPU sharing technology that allows multiple containers to share a single GPU with isolated memory and compute power

AI training

AIACC

| Scenario | Applicable model | Storage |
| --- | --- | --- |
| Image classification and image recognition | MXNet models | Cloud Paralleled File System (CPFS) |
| Click-through rate (CTR) prediction | Wide&Deep models of TensorFlow | Hadoop Distributed File System (HDFS) |
| Natural language processing (NLP) | Transformer and BERT models of TensorFlow | CPFS |

AIACC-ACSpeed

AIACC-ACSpeed optimizes distributed training communication for PyTorch workloads, including large-model pretraining and fine-tuning.

| Scenario | Applicable model | Storage |
| --- | --- | --- |
| Image classification and image recognition | Neural network models such as ResNet and VGG-16, and AIGC models such as Stable Diffusion | CPFS |
| CTR prediction | Wide&Deep models | HDFS |
| NLP | Transformer and BERT models | CPFS |
| Pretraining and fine-tuning of large models | Large language models (LLMs) trained with frameworks such as Megatron-LM and DeepSpeed | CPFS |
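The collective that dominates communication time in the data-parallel scenarios above is all-reduce over gradients. The sketch below simulates a ring all-reduce, the classic pattern that communication libraries of this kind execute over NCCL in real training; the worker loop, chunking, and function names are illustrative, not ACSpeed's API:

```python
# Ring all-reduce simulated in one process: after the call, every
# "worker" holds the elementwise sum of all gradient vectors. Real
# distributed training runs this pattern over NCCL on GPUs.

def ring_allreduce(grads):
    n = len(grads)                        # number of workers
    dim = len(grads[0])
    out = [list(g) for g in grads]        # each worker's buffer
    bounds = [(i * dim // n, (i + 1) * dim // n) for i in range(n)]

    def step(t, offset, reducing):
        # Worker i sends chunk (i + offset - t) mod n to worker i + 1.
        for i in range(n):
            dst = (i + 1) % n
            lo, hi = bounds[(i + offset - t) % n]
            for k in range(lo, hi):
                out[dst][k] = out[dst][k] + out[i][k] if reducing else out[i][k]

    for t in range(n - 1):
        step(t, 0, reducing=True)         # scatter-reduce phase
    for t in range(n - 1):
        step(t, 1, reducing=False)        # all-gather phase
    return out

workers = [[1] * 6, [2] * 6, [3] * 6]
print(ring_allreduce(workers))            # every worker ends with [6, 6, ...]
```

Each worker sends and receives only one chunk per step, which keeps per-link traffic constant as the worker count grows; that property is what makes the ring pattern worth optimizing.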

AIACC-AGSpeed

AIACC-AGSpeed optimizes the computational graph for PyTorch workloads.

| Scenario | Applicable model |
| --- | --- |
| Image classification | ResNet and MobileNet models |
| Image segmentation | Unet3D models |
| NLP | BERT, GPT-2, and T5 models |
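Graph optimization of this kind typically fuses chains of elementwise operators so that intermediate tensors never materialize. A toy fusion pass (illustrative only; AGSpeed operates on real PyTorch computation graphs):

```python
# Toy operator fusion: collapse a chain of elementwise ops into one
# function so no intermediate buffers are materialized. Graph compilers
# apply the same idea, at tensor granularity, to real PyTorch graphs.

def fuse(ops):
    """Compose elementwise ops [f, g, h] into one op x -> h(g(f(x)))."""
    def fused(x):
        for op in ops:
            x = op(x)
        return x
    return fused

# Unfused, this chain would make three passes over the data with two
# temporary buffers; fused, it is a single pass per element.
graph = [lambda x: x * 2, lambda x: x + 1, lambda x: x * x]

fused_op = fuse(graph)
result = [fused_op(x) for x in [0, 1, 2]]
print(result)  # [1, 9, 25]
```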

AI inference

| Scenario | Applicable model | GPU | Performance improvement | Optimization |
| --- | --- | --- | --- | --- |
| Video Ultra HD inference | Ultra HD models | T4 | 1.7x | Video decoding, preprocessing, and postprocessing ported to GPU; dataset size automatically obtained in a single operation; deep convolution optimization |
| Online inference of image synthesis | Generative adversarial network (GAN) models | T4 | 3x | Preprocessing and postprocessing ported to GPU; dataset size automatically obtained in a single operation; deep convolution optimization |
| CTR prediction and inference | Wide&Deep models | M40 | 5.1x | Pipeline optimization; model splitting; child models optimized separately |
| NLP inference | BERT models | T4 | 2.3x | Pipeline optimization of preprocessing and postprocessing; dataset size automatically obtained in a single operation; deep kernel optimization |