This topic describes the use scenarios, customer requirements, architecture, and references for using GPUs to train AI models.
Use scenarios
In this scenario, you train AI image generation models on GPUs. Cloud Parallel File System (CPFS) and Apsara File Storage NAS (NAS) file systems store and share the model data, and Container Service for Kubernetes (ACK) manages the GPU-accelerated Elastic Compute Service (ECS) instances that run the training jobs.
Customer requirements
Build AI model training environments from images
Use CPFS to store model training data
Use Apsara AI acceleration tools to accelerate model training
Use Arena to submit training jobs
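To make the last requirement concrete, the following minimal Python sketch assembles the command line for submitting a GPU training job with Arena. It assumes the `arena submit pytorch` subcommand with the `--name`, `--gpus`, `--workers`, `--image`, and `--data` flags. The job name, image address, persistent volume claim name (`cpfs-training-data`), mount path, and training script are all hypothetical placeholders, not values from this topic. The sketch only builds and prints the command; it does not contact a cluster.

```python
import shlex


def build_arena_submit_command(name, image, gpus, workers, pvc_name, mount_path, train_command):
    """Return the argument list for an `arena submit pytorch` invocation.

    All values are supplied by the caller; nothing here is specific to a real cluster.
    """
    if gpus < 1:
        raise ValueError("a GPU training job needs at least one GPU per worker")
    if workers < 1:
        raise ValueError("a training job needs at least one worker")
    return [
        "arena", "submit", "pytorch",
        f"--name={name}",                   # job name shown by Arena
        f"--gpus={gpus}",                   # GPUs requested per worker
        f"--workers={workers}",             # number of worker replicas
        f"--image={image}",                 # image that holds the training environment
        f"--data={pvc_name}:{mount_path}",  # mount a PVC (for example, one backed by CPFS)
        train_command,                      # command that runs inside the container
    ]


if __name__ == "__main__":
    # Hypothetical example values; replace them with your own.
    argv = build_arena_submit_command(
        name="image-gen-train",
        image="registry.example.com/ai/train:latest",
        gpus=1,
        workers=2,
        pvc_name="cpfs-training-data",
        mount_path="/data",
        train_command="python /workspace/train.py --data-dir /data",
    )
    # Print a shell-safe command line for review before running it.
    print(shlex.join(argv))
```

Building the command as a list rather than a single string keeps each flag as one argument, so the same list can later be passed to `subprocess.run` on a machine where Arena is installed and configured.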
Architecture
References
For more information about how to use GPUs to train AI models in ACK clusters, see Use GPUs to train AI models in ACK clusters.