Learn common development workflows for cloud-native AI and AI-integrated big data scenarios using PAI modules.
Common workflows
Access PAI modules from the workspace details page. The following workflows show how to use these modules for common scenarios.
- Cloud-native development

| Section | Description | Reference |
| --- | --- | --- |
| ① | High-quality datasets are essential for accurate models. Use dataset management to register public datasets, upload files from local machines or Alibaba Cloud storage, or create index datasets by scanning OSS folders. Dataset management enables centralized data organization and prepares data for labeling and training. | |
| ② | Data Science Workshop (DSW) is an interactive machine learning IDE for cloud-based development. Use Notebooks to access data, develop algorithms, and train and deploy models from anywhere. | |
| ③ | Image management provides PAI public images and supports custom images for centralized application image management. | |
| ④ | Deep Learning Containers (DLC) provides a flexible, stable, and high-performance training environment. DLC supports multiple algorithm frameworks, enables large-scale distributed deep learning, and supports custom frameworks. | |
| ⑤ | PAI supports datasets from NAS, OSS, and Git repositories. Specify datasets and code repositories when submitting training jobs. | |
| ⑥ | Model management enables centralized management of trained models and integrates with EAS for model deployment. | |
| ⑦ | EAS deploys models as online services using CPU or GPU resources. It features high throughput, low latency, one-click deployment for complex models, and real-time auto scaling. Note: EAS does not support DSW images or CPFS datasets. | |
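The EAS restriction noted above (no DSW images, no CPFS datasets) is easy to overlook when a model moves from development to deployment. The following is a minimal sketch of a pre-deployment check, not part of any PAI SDK. The `DeploymentPlan` class, the `find_eas_conflicts` function, and the string labels such as `"DSW"` and `"CPFS"` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical labels for illustration only; these are not PAI API values.
UNSUPPORTED_IMAGE_SOURCES = {"DSW"}
UNSUPPORTED_DATASET_STORAGE = {"CPFS"}


@dataclass
class DeploymentPlan:
    """A planned EAS deployment, described by where its parts come from."""
    image_source: str                        # e.g. "PAI public", "custom", "DSW"
    dataset_storage: List[str] = field(default_factory=list)  # e.g. ["OSS", "NAS"]


def find_eas_conflicts(plan: DeploymentPlan) -> List[str]:
    """Return a human-readable message for each part of the plan EAS does not support."""
    conflicts = []
    if plan.image_source.upper() in UNSUPPORTED_IMAGE_SOURCES:
        conflicts.append(f"image source {plan.image_source!r} is not supported by EAS")
    for storage in plan.dataset_storage:
        if storage.upper() in UNSUPPORTED_DATASET_STORAGE:
            conflicts.append(f"dataset storage {storage!r} is not supported by EAS")
    return conflicts


if __name__ == "__main__":
    plan = DeploymentPlan(image_source="DSW", dataset_storage=["OSS", "CPFS"])
    for message in find_eas_conflicts(plan):
        print(message)
```

A plan that uses a custom image with OSS or NAS datasets yields an empty list, so the check can gate a deployment step with a simple `if not find_eas_conflicts(plan):`.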
- AI with big data

| Section | Description | Reference |
| --- | --- | --- |
| ① | Store source data in MaxCompute tables, preprocess it in DataWorks, and reference it in PAI for model training. | |
| ② | Machine Learning Designer supports large-scale distributed training for traditional machine learning, deep learning, reinforcement learning, and stream/batch processing. It provides hundreds of algorithms, automatic parameter tuning, and drag-and-drop component assembly with minimal code. | |
| ③ | DataWorks schedules tasks based on time properties and scheduling parameters. | |
| ④ | Task management stores experiment data from Machine Learning Designer and custom task records, enabling experiment comparison across tasks. | |
| ⑤ | Model management enables centralized management of trained models and integrates with EAS for model deployment. | |
| ⑥ | EAS deploys models as online services using CPU or GPU resources. It features high throughput, low latency, one-click deployment for complex models, and real-time auto scaling. Note: EAS does not support DSW images or CPFS datasets. | |
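Step ③ above mentions scheduling parameters. As a rough illustration of the idea, the sketch below resolves a `$bizdate` placeholder in a partition expression. It assumes the commonly documented default in which `$bizdate` is the day before the scheduled run date, formatted as `yyyymmdd`; confirm this against the DataWorks documentation for your setup. The function names `resolve_bizdate` and `render_partition` are hypothetical, and this is not how DataWorks itself performs the substitution.

```python
from datetime import date, timedelta


def resolve_bizdate(scheduled_run: date) -> str:
    """Return the business date (assumed: one day before the scheduled run) as yyyymmdd."""
    return (scheduled_run - timedelta(days=1)).strftime("%Y%m%d")


def render_partition(template: str, scheduled_run: date) -> str:
    """Substitute the $bizdate placeholder in a partition expression."""
    return template.replace("$bizdate", resolve_bizdate(scheduled_run))


if __name__ == "__main__":
    # A task scheduled to run on 2024-03-01 would read the 2024-02-29 partition.
    print(render_partition("ds=$bizdate", date(2024, 3, 1)))
```

Parameterizing the partition this way lets the same training task read a fresh slice of the MaxCompute table on every scheduled run without editing the task definition.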