Container Service for Kubernetes: Use GPUs to train AI models in ACK clusters

Last Updated: Apr 07, 2024

This topic describes the use scenarios, customer requirements, architecture, and references for using GPUs to train AI models.

Use scenarios

In this scenario, GPUs train AI image generation models. Cloud Parallel File System (CPFS) and Apsara File Storage NAS (NAS) file systems store and share model data, and Container Service for Kubernetes (ACK) manages the GPU-accelerated Elastic Compute Service (ECS) instances that run the training jobs.

Customer requirements

  • Build AI model training environments from container images

  • Use CPFS to store model training data

  • Use Apsara AI acceleration tools to accelerate model training

  • Use Arena to submit training jobs
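
As a rough illustration of how these requirements fit together, the following sketch assembles an Arena command line that submits a single-worker GPU training job and mounts a CPFS-backed volume. It is a minimal sketch under stated assumptions, not the procedure from the referenced topic: the job name, image address, persistent volume claim (PVC) name, mount path, and training script are all hypothetical, and TensorFlow (`arena submit tfjob`) is assumed only for the example. The helper `build_arena_submit` is also hypothetical; it only builds the argument list and does not run Arena.

```python
import shlex


def build_arena_submit(name, image, command, gpus=1, workers=1, data_mounts=None):
    """Return the argv list for an `arena submit tfjob` invocation.

    Hypothetical helper for illustration; it does not execute anything.
    """
    argv = [
        "arena", "submit", "tfjob",
        f"--name={name}",        # job name shown by `arena list`
        f"--gpus={gpus}",        # GPUs requested per worker
        f"--workers={workers}",  # number of training workers
        f"--image={image}",      # container image holding the training environment
    ]
    # Each --data flag maps an existing PVC to a mount path inside the job pods.
    for pvc, mount_path in (data_mounts or {}).items():
        argv.append(f"--data={pvc}:{mount_path}")
    # The training command is passed as the final positional argument.
    argv.append(command)
    return argv


# All values below are placeholders (assumptions), not values from this topic.
argv = build_arena_submit(
    name="image-gen-train",
    image="registry.example.com/ai/train:latest",
    command="python /workspace/train.py --data-dir=/mnt/cpfs",
    gpus=1,
    data_mounts={"cpfs-training-data": "/mnt/cpfs"},
)

# Print a shell-safe command line that could be reviewed before running it.
print(shlex.join(argv))
```

Building the command as a list and quoting it with `shlex.join` keeps the training command, which contains spaces, as a single argument. To actually submit the job, the list could be passed to `subprocess.run(argv, check=True)` on a machine where the Arena client is installed and configured for the cluster.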

Architecture

[Figure: architecture diagram]

References

For more information about how to use GPUs to train AI models in ACK clusters, see Use GPUs to train AI models in ACK clusters.