AI models and deployment

This topic describes how to deploy the built-in models for Hologres AI nodes, lists the available models, and explains how to use them. Built-in models run on AI resources, which you must purchase before you deploy a model. After a model is deployed, you can call it using an AI Function.

Prerequisites

The built-in models of an AI node run on AI resources (GPUs), so you must purchase AI resources before you deploy or call a model. For more information, see Introduction to AI nodes and pricing.

Model deployment

Deployment notes

Select a model to deploy based on your application scenario. Each model requires a minimum allocation of AI resources for deployment.
You can deploy multiple models on a single instance. However, the total resources allocated to the models cannot exceed the total AI resources that you purchased. If the purchased AI resources are insufficient, scale them out before you deploy additional models.
For primary and secondary instances: You can deploy models and perform related operations, such as changing resources and deleting models, only on the primary instance. On a secondary instance, you can only view the models deployed on the primary instance and call them using AI Functions.
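
The resource rule in the notes above can be checked before you open the console. The following is a minimal sketch, not part of Hologres: it assumes that allocations simply add up per dimension (CPU cores, memory, GPU memory), and the `Resources` class, the helper functions, and the purchased pool size are all hypothetical. The per-model figures are the single-replica minimums from the model list in this topic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for one allocation or one purchased pool."""
    cpu_cores: int
    memory_gb: int
    gpu_memory_gb: int


def total(allocations):
    """Sum the per-model allocations dimension by dimension."""
    return Resources(
        cpu_cores=sum(a.cpu_cores for a in allocations),
        memory_gb=sum(a.memory_gb for a in allocations),
        gpu_memory_gb=sum(a.gpu_memory_gb for a in allocations),
    )


def shortfall(purchased, allocations):
    """Return how far the plan exceeds the purchased pool (0 where it fits)."""
    used = total(allocations)
    return Resources(
        cpu_cores=max(0, used.cpu_cores - purchased.cpu_cores),
        memory_gb=max(0, used.memory_gb - purchased.memory_gb),
        gpu_memory_gb=max(0, used.gpu_memory_gb - purchased.gpu_memory_gb),
    )


def fits(purchased, allocations):
    """True when the plan stays within the purchased pool in every dimension."""
    return shortfall(purchased, allocations) == Resources(0, 0, 0)


# Assumed purchased pool; the real figure comes from your own instance.
purchased = Resources(cpu_cores=32, memory_gb=128, gpu_memory_gb=96)

# Single-replica minimums (CPU cores, memory GB, GPU memory GB) from the model list.
qwen3_8b = Resources(7, 30, 32)
qwen3_embedding_4b = Resources(7, 30, 32)
qwen3_14b = Resources(7, 30, 48)

print(fits(purchased, [qwen3_8b, qwen3_embedding_4b]))
print(shortfall(purchased, [qwen3_8b, qwen3_embedding_4b, qwen3_14b]))
```

With the assumed pool, the first two models fit, and adding Qwen/Qwen3-14B leaves a GPU memory shortfall of 16 GB, which is the case where you would scale out before deploying.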

Deploy a model

Log on to the Hologres console and select a region in the upper-left corner.
In the navigation pane on the left, click Instances, and then click the ID of the target instance.
On the Instance Details page, click AI Node.
In the Models area, click Deploy Model.
In the Deploy Model dialog box, enter a Model Name and select a Model Type. The parameters for Resource Configurations are automatically filled in based on the selected Model Type. Each model has a recommended minimum resource allocation. For optimal performance, allocate at least the recommended minimum for the selected model.
After you complete the configuration, click OK to deploy the model.
You can view the deployment status of the model in the Models area and perform the following operations:
- Adjust model configuration: In the Actions column of the target model, click Modify Configuration.
- Delete model: In the Actions column of the target model, click Delete.
  Note
  When you delete a model, the system does not check whether any services are currently calling it. Confirm that no workload depends on the model before you delete it.

Use a model

After the model is successfully deployed, you can call it using an AI Function in Hologres. For more information, see AI Function.

Model list

Hologres provides built-in models for various AI scenarios. Deploy the models you need for your business. After deployment, call the models using an AI Function. The following table lists the built-in models in Hologres.

Model classification	Model name	Min. recommended CPU for single replica (cores)	Min. recommended memory for single replica (GB)	Min. recommended GPU cards for single replica	Min. recommended GPU memory for single replica (GB)	Supported instance versions	Remarks
PDF transformation model	ds4sd/docling-models	20	100	Single card/Multiple cards	48 GB	Hologres V4.0 and later
Text chunking	recursive-character-text-splitter	15	30	0	0	Hologres V3.2 and later	Select the CPU specification based on your business volume. You do not need to configure GPUs.
Multimodal model	Qwen/Qwen2.5-VL-3B-Instruct	7	24	Single card/Multiple cards	24 GB	Hologres V4.0 and later
Multimodal model	Qwen/Qwen2.5-VL-7B-Instruct	7	30	Single card/Multiple cards	48 GB	Hologres V4.0 and later
Multimodal model	Qwen/Qwen2.5-VL-32B-Instruct	7	30	Single card/Multiple cards	96 GB	Hologres V4.0 and later
Text model	clip-ViT-B-32-multilingual-v1	7	24	Single card	24 GB	Hologres V4.0 and later	Image patch size: 32×32. Number of parameters: 88M. Returned vector dimensions: 512
Text generation	Qwen/Qwen3-1.7B	7	30	Single card/Multiple cards	8 GB	Hologres V3.2 and later
Text generation	Qwen/Qwen3-4B	7	30	Single card/Multiple cards	16 GB	Hologres V3.2 and later
Text generation	Qwen/Qwen3-8B	7	30	Single card/Multiple cards	32 GB	Hologres V3.2 and later
Text generation	Qwen/Qwen3-14B	7	30	Single card/Multiple cards	48 GB	Hologres V3.2 and later
Text generation	Qwen/Qwen3-32B	7	30	Single card/Multiple cards	96 GB	Hologres V3.2 and later
Sentiment classification	iic/nlp_structbert_sentiment-classification_chinese-base	7	30	Single card	4 GB	Hologres V3.2 and later
Vector embedding	iic/nlp_gte_sentence-embedding_chinese-base	7	30	Single card	12 GB	Hologres V3.2 and later	Output vector dimensions: 768
Vector embedding	iic/nlp_gte_sentence-embedding_chinese-large	7	30	Single card	16 GB	Hologres V3.2 and later	Output vector dimensions: 1024
Vector embedding	iic/nlp_gte_sentence-embedding_chinese-small	7	30	Single card	8 GB	Hologres V3.2 and later	Output vector dimensions: 512
Vector embedding	Qwen/Qwen3-Embedding-0.6B	7	30	Single card	8 GB	Hologres V3.2 and later
Vector embedding	Qwen/Qwen3-Embedding-4B	7	30	Single card	32 GB	Hologres V3.2 and later
Vector embedding	Qwen/Qwen3-Embedding-8B	7	30	Single card	48 GB	Hologres V3.2 and later
Vector embedding	BAAI/bge-base-en-v1.5	7	30	Single card	12 GB	Hologres V3.2 and later	Output vector dimensions: 768
Vector embedding	BAAI/bge-base-zh-v1.5	7	30	Single card	12 GB	Hologres V3.2 and later	Output vector dimensions: 768
Vector embedding	BAAI/bge-large-en-v1.5	7	30	Single card	16 GB	Hologres V3.2 and later	Output vector dimensions: 1024
Vector embedding	BAAI/bge-large-zh-v1.5	7	30	Single card	16 GB	Hologres V3.2 and later	Output vector dimensions: 1024
Vector embedding	BAAI/bge-small-en-v1.5	7	30	Single card	8 GB	Hologres V3.2 and later	Output vector dimensions: 384
Vector embedding	BAAI/bge-small-zh-v1.5	7	30	Single card	8 GB	Hologres V3.2 and later	Output vector dimensions: 512
Text model	clip-ViT-B-32	7	24	Single card	24 GB	Hologres V4.0 and later	Image patch size: 32×32. Number of parameters: 88M. Returned vector dimensions: 512
Text model	clip-ViT-L-14	7	24	Single card	24 GB	Hologres V4.0 and later	Image patch size: 14×14. Number of parameters: 304M. Returned vector dimensions: 768
Vector embedding	clip-ViT-B-16	7	24	Single card	24 GB	Hologres V4.0 and later	Image patch size: 16×16. Number of parameters: 88M. Returned vector dimensions: 512
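
The "Supported instance versions" column can be used to narrow the list to the models available on a given instance. The following is a minimal sketch under stated assumptions: `CATALOG`, `parse_version`, and `deployable_models` are hypothetical names, only a few rows of the table are copied in, version strings are assumed to follow the `V<major>.<minor>` pattern, and "and later" is read as a greater-than-or-equal comparison.

```python
# A few rows copied from the table above: (model name, classification, minimum version).
CATALOG = [
    ("ds4sd/docling-models", "PDF transformation model", "V4.0"),
    ("recursive-character-text-splitter", "Text chunking", "V3.2"),
    ("Qwen/Qwen2.5-VL-7B-Instruct", "Multimodal model", "V4.0"),
    ("Qwen/Qwen3-8B", "Text generation", "V3.2"),
    ("BAAI/bge-base-en-v1.5", "Vector embedding", "V3.2"),
]


def parse_version(text):
    """Turn a string such as 'V3.2' into a comparable tuple such as (3, 2)."""
    return tuple(int(part) for part in text.lstrip("Vv").split("."))


def deployable_models(instance_version):
    """List the catalog models whose minimum version is at or below the instance version."""
    current = parse_version(instance_version)
    return [
        name
        for name, _classification, minimum in CATALOG
        if parse_version(minimum) <= current
    ]


print(deployable_models("V3.2"))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "V10.0" as older than "V4.0".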