This topic describes the use cases, customer requirements, architecture, and references for using GPUs to train AI models.
You can use GPUs to train AI image generation models. Cloud Parallel File System (CPFS) and Apsara File Storage NAS (NAS) file systems store and share the model data, and Container Service for Kubernetes (ACK) manages the GPU-accelerated Elastic Compute Service (ECS) instances that run the training jobs.
Build basic training environments for AI image generation models
Use CPFS to store model training data
Use Apsara AI acceleration tools to accelerate model training
Use Arena to submit training jobs
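As a rough illustration of the last item, the following Python sketch assembles an `arena submit pytorch` command line for a GPU training job that reads its data from a shared volume. It is a minimal sketch, not the procedure from the referenced topic: the helper `build_arena_submit`, the job name, the image address, the persistent volume claim name `training-data`, the mount path, and the training script path are all hypothetical placeholders. It assumes that the CPFS or NAS file system is already exposed to the cluster as a persistent volume claim and that the Arena client is installed.

```python
import shlex


def build_arena_submit(name, image, gpus, workers, pvc, mount_path, train_cmd):
    """Build the argument list for an `arena submit pytorch` call.

    All values passed in are placeholders chosen for illustration.
    """
    return [
        "arena", "submit", "pytorch",
        f"--name={name}",              # job name shown by `arena list`
        f"--gpus={gpus}",              # GPUs requested per worker
        f"--workers={workers}",        # number of worker pods
        f"--image={image}",            # training container image
        f"--data={pvc}:{mount_path}",  # mount the shared volume into the pods
        train_cmd,                     # command executed inside the container
    ]


# Hypothetical values; replace them with the ones used in your cluster.
argv = build_arena_submit(
    name="image-gen-train",
    image="registry.example.com/demo/train:latest",
    gpus=1,
    workers=1,
    pvc="training-data",
    mount_path="/data",
    train_cmd="python /workspace/train.py --data-dir /data",
)

# Print the command so that it can be reviewed before it is run.
print(shlex.join(argv))
```

Building the command as a list keeps each option separate, so the same list can be passed to `subprocess.run(argv, check=True)` on a machine where Arena is configured, without extra shell quoting.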
For more information about how to use GPUs to train AI models in ACK clusters, see Use GPUs to train AI models in ACK clusters.