NVIDIA Inference Microservice (NIM) provides pre-built containers for optimized AI model inference. These optimized models deliver significantly better performance than their open-source counterparts.
PAI Model Gallery offers multiple NIM models (Alibaba Cloud is an NVIDIA-authorized partner). To find them, filter by Model Source: NIM. You can deploy a model in Model Gallery or locally with Docker.
Available models
Deploy these NIM models directly in PAI Model Gallery:
| Model name | Supported instance types |
| --- | --- |
| qwen2.5-7b-instruct-NIM | ecs.gn7e series, ecs.gn8is series |
| MolMIM | General-purpose GPU instance types |
| Earth-2 FourCastNet | General-purpose GPU instance types |
| NVIDIA Retrieval QA Mistral 7B Embedding v2 | ecs.gn7e series |
| Eye Contact | General-purpose GPU instance types |
| NV-CLIP | ecs.gn7e series, ecs.gn7i series |
| AlphaFold2-Multimer | General-purpose GPU instance types |
| Snowflake Arctic Embed Large Embedding | ecs.gn7e series, ecs.gn7i series |
| NVIDIA Retrieval QA Mistral 4B Reranking v3 | ecs.gn7e series, ecs.gn7i series |
| NVIDIA Retrieval QA E5 Embedding v5 | ecs.gn7e series, ecs.gn7i series |
| Parakeet CTC Riva 1.1b | General-purpose GPU instance types |
| FastPitch HifiGAN Riva | General-purpose GPU instance types |
| VISTA-3D | General-purpose GPU instance types |
| AlphaFold2 | General-purpose GPU instance types |
| ProteinMPNN | General-purpose GPU instance types |
| megatron-1b-nmt | General-purpose GPU instance types |
Deploy in Model Gallery
1. Go to PAI Model Gallery.
2. In the filter pane, set Model Source to NIM.
3. Select a model to open its details page. Click Deploy.
   Prerequisite: You must be an NVIDIA AI Enterprise user or NVIDIA Developer Program user.
4. Configure runtime resources and click Deploy to create an online service. For invocation instructions, see the model introduction page.
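Once the online service is running, you can call it over HTTP. As a minimal sketch, the snippet below builds an OpenAI-compatible chat-completions request, which the qwen2.5-7b-instruct NIM serves; the endpoint and token values here are hypothetical placeholders — copy the real ones from your service's invocation information in the EAS console, and check the model introduction page for the exact route your model exposes.

```python
import json
import urllib.request

# Hypothetical values -- replace with the endpoint and token shown on
# your service's invocation information page.
ENDPOINT = "http://example-service.region.pai-eas.aliyuncs.com"
TOKEN = "YOUR_EAS_TOKEN"

def build_chat_request(prompt, model="qwen2.5-7b-instruct"):
    """Build an OpenAI-compatible chat-completions request for the service."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return urllib.request.Request(
        f"{ENDPOINT}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": TOKEN,
        },
    )

req = build_chat_request("Hello!")
# urllib.request.urlopen(req)  # uncomment to send against a live service
```

Embedding and reranking NIMs expose different routes (for example /v1/embeddings), so adapt the path and payload to the model you deployed.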

Deploy locally
Download the image and model files to deploy locally via Docker.
Prerequisite: You must be an NVIDIA AI Enterprise user or NVIDIA Developer Program user.
1. Configure the environment. See NVIDIA Getting Started.
2. On the model details page, click Download Address, accept the NIM terms and license, and obtain the image and model addresses.
3. Pull the image. Replace ${image_address} with your actual address.

   docker pull ${image_address}

4. Download the model file using ossutil.
5. Start the container. This example assumes the model file is in /local/model/. Replace ${model_mount_path} and ${image_address} with actual values.

   docker run --rm \
     --runtime=nvidia \
     --gpus all \
     -u $(id -u) \
     -v /local/model/:${model_mount_path} \
     ${image_address}
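Model loading can take a while after the container starts. As a sketch, you can poll the container's readiness route before sending traffic; this assumes the NIM default port 8000 and the /v1/health/ready route, and that you published the port to the host (for example, by adding -p 8000:8000 to the docker run command) — verify both against your model's NIM documentation.

```python
import urllib.request
import urllib.error

def nim_ready(host="localhost", port=8000, timeout=2.0):
    """Return True if the NIM readiness route answers 200, else False."""
    url = f"http://{host}:{port}/v1/health/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

# Typically False until the container finishes loading the model.
print(nim_ready())
```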
First-time setup
If this is your first time using PAI:
1. Go to Alibaba Cloud and click Log On. Log in or register a new account.
2. Complete identity verification, then go to Platform for AI (PAI). On first use, grant authorization (keep the default settings; this takes about 10 seconds). You can then deploy models in the default workspace.