Model Gallery is a model-as-a-service (MaaS) component of Platform for AI (PAI). It features state-of-the-art (SOTA) models from various fields, and its low-code/no-code interface covers the entire model lifecycle. This topic describes how to use Model Gallery to deploy, fine-tune, distill, and build applications with DeepSeek models.
Before you begin
(Required) Activate PAI and create a workspace
A PAI workspace provides centralized management of computing resources, permissions, and AI assets. When you activate PAI, a default workspace is usually created; if no workspace exists in your account, create one manually. Services such as Object Storage Service (OSS) are activated by default, because cloud storage is required to store code, models, datasets, and other files.
(Note) Region and resource specifications
Most Alibaba Cloud resources, such as PAI workspaces and OSS buckets, are region-specific. Resources in some regions cannot interoperate with resources in other regions, so choose your region carefully.
The availability and specifications of computing resources can differ greatly between regions. If a particular region lacks resources, consider checking other regions for availability.
PAI offers two billing options: pay-as-you-go and subscription. Pay-as-you-go resources are drawn from a pool shared by all users, so a specification may occasionally be sold out. In that case, check other regions for availability.
PAI provides various resource specifications for different scenarios. Some specifications are available only to whitelisted users. For advice tailored to your scenario, consult your sales manager.
The platform also supports Lingjun resources, which use high-speed networks for communication and are essential for distributed training or deployment. Lingjun resources are likewise available only to whitelisted users. Contact your sales manager if you need them.
(Optional) Create a VPC for distributed training or deployment
Model deployment
One-click deployment of DeepSeek-V3 and DeepSeek-R1.
We also recommend DeepSeek-R1-Distill-Qwen-7B, a distilled model that is smaller in size, making it ideal for quick practice. It has low computational resource requirements and can be deployed using free trial resources.
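After deployment, the service exposes an API that you can call from your own code. The following is a minimal sketch of calling the deployed model through an OpenAI-compatible interface; the endpoint URL, token, and model name are placeholders that you should copy from the service details page, and whether an OpenAI-compatible route is available depends on the deployment method you chose.

```python
# A minimal sketch of calling the deployed service through its
# OpenAI-compatible API. The endpoint URL, token, and model name below are
# placeholders (assumptions); copy the real values from the service details page.
from openai import OpenAI

client = OpenAI(
    api_key="<EAS_SERVICE_TOKEN>",        # service token shown in the console
    base_url="<EAS_ENDPOINT>/v1",         # base URL of the deployed service
)

response = client.chat.completions.create(
    model="DeepSeek-R1-Distill-Qwen-7B",  # model name reported by the service
    messages=[{"role": "user", "content": "Explain gradient descent in one sentence."}],
    temperature=0.6,
)
print(response.choices[0].message.content)
```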
Fine-tuning and distillation
Fine-tuning: Use your private data to train the model, making its responses more accurate in your scenario.
Distillation: Transfer the knowledge of a larger teacher model to a smaller student model. The student retains much of the teacher's reasoning capability and accuracy while significantly reducing computing and storage costs. Distillation is also a form of fine-tuning.
Fine-tuning is not a simple task and cannot solve every problem. Success depends on many factors, such as the size and quality of the dataset, the training hyperparameters, and usually several rounds of experimentation. In many practical scenarios, fine-tuning may not be the best solution; a simpler retrieval-augmented generation (RAG) application may be enough. It all depends on your specific needs.
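If you do decide to distill, the first step is building a training dataset from the teacher model's answers to prompts from your own scenario. The sketch below shows one way to do this, assuming the teacher is already deployed behind an OpenAI-compatible endpoint; the endpoint, token, and the "instruction"/"output" field names are assumptions, so check the data format that the Model Gallery training job actually expects before uploading.

```python
# A hedged sketch of building a distillation dataset: collect the teacher
# model's answers to your own prompts, then use them as supervised fine-tuning
# data for the smaller student model. Endpoint, token, model name, and the
# "instruction"/"output" field names are assumptions, not a confirmed format.
import json
from openai import OpenAI

teacher = OpenAI(api_key="<EAS_SERVICE_TOKEN>", base_url="<EAS_ENDPOINT>/v1")

prompts = [
    "Summarize the refund policy for orders shipped internationally.",
    "Classify this ticket as billing, technical, or other: 'I was charged twice.'",
]

samples = []
for prompt in prompts:
    reply = teacher.chat.completions.create(
        model="DeepSeek-R1",  # the deployed teacher model
        messages=[{"role": "user", "content": prompt}],
    )
    samples.append({
        "instruction": prompt,
        "output": reply.choices[0].message.content,
    })

with open("distill_train.json", "w", encoding="utf-8") as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```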
Build AI applications
Models of the DeepSeek-R1 series are not good at structured output. Currently, DeepSeek-R1, DeepSeek-V3, and distilled models based on DeepSeek-R1 do not support function calling. If you want to try the function calling feature, we recommend the Qwen2.5-Instruct series instead.
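For reference, a function-calling request against an OpenAI-compatible endpoint looks roughly like the sketch below. The endpoint, token, model name, and tool definition are illustrative assumptions, and whether tool calling works also depends on how the Qwen2.5-Instruct service was deployed; DeepSeek-R1 and DeepSeek-V3 deployments will not honor the tools parameter.

```python
# A minimal function-calling sketch against a deployed Qwen2.5-Instruct
# service with an OpenAI-compatible API. Endpoint, token, model name, and the
# example tool are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(api_key="<EAS_SERVICE_TOKEN>", base_url="<EAS_ENDPOINT>/v1")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="Qwen2.5-7B-Instruct",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
    tools=tools,
)
# If the model decides to call the tool, the name and arguments arrive here:
print(response.choices[0].message.tool_calls)
```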
Develop applications with LangStudio
PAI provides the LangStudio module, which simplifies the development of enterprise-level LLM applications. LangStudio has built-in templates for the most popular AI application types, such as RAG and web search, so you can create the corresponding application with simple configuration (a minimal illustration of the RAG pattern follows the template list below).
DeepSeek + Knowledge base: Develop and deploy a RAG application flow
DeepSeek + Web Search: Chat With Web Search
DeepSeek + Knowledge base + Web Search: Chatbot with RAG and Web Search
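To make the RAG pattern behind these templates concrete, here is a deliberately simplified sketch. It is not the LangStudio flow itself: it retrieves the most relevant knowledge-base snippets with a naive keyword match and passes them to a deployed DeepSeek model as context. The endpoint, token, and model name are placeholders, and a real application would use a vector index rather than keyword overlap.

```python
# A hedged, simplified illustration of the RAG pattern: retrieve relevant
# snippets from a small in-memory knowledge base, then ask the deployed model
# to answer using only that context. Endpoint, token, and model name are
# placeholders; LangStudio's actual flow uses a vector index and managed components.
from openai import OpenAI

client = OpenAI(api_key="<EAS_SERVICE_TOKEN>", base_url="<EAS_ENDPOINT>/v1")

knowledge_base = [
    "Orders placed before 2 PM ship the same business day.",
    "Returns are accepted within 30 days of delivery.",
    "International shipping takes 7 to 14 business days.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    # Naive keyword-overlap retrieval, standing in for a vector search.
    words = set(question.lower().split())
    scored = sorted(knowledge_base, key=lambda d: -len(words & set(d.lower().split())))
    return scored[:top_k]

question = "How long does international shipping take?"
context = "\n".join(retrieve(question))

answer = client.chat.completions.create(
    model="DeepSeek-R1-Distill-Qwen-7B",
    messages=[
        {"role": "system", "content": f"Answer using only this context:\n{context}"},
        {"role": "user", "content": question},
    ],
)
print(answer.choices[0].message.content)
```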