Hologres: AI models and deployment guide

Last Updated: Oct 14, 2025

Hologres offers various built-in AI models for diverse AI applications. You can deploy these models from the Hologres console based on your business needs. This document explains which AI models are available and how to deploy them.

Supported models

Built-in models require Hologres V3.2 or later. Some models require V4.0 or later; check the Required instance versions value for each model.

The resource columns list the recommended minimums for a single-replica deployment.

| Model name | Category | Min. vCPUs | Min. memory (GB) | Min. GPUs | Min. GPU memory (GB) | Required instance versions | Notes |
|---|---|---|---|---|---|---|---|
| ds4sd/docling-models | PDF conversion model | 20 | 100 | 1 or more | 48 | V4.0 and later | None |
| Qwen/Qwen2.5-VL-3B-Instruct | Multimodal model | 7 | 24 | 1 or more | 24 | V4.0 and later | None |
| Qwen/Qwen2.5-VL-7B-Instruct | Multimodal model | 7 | 30 | 1 or more | 48 | V4.0 and later | None |
| Qwen/Qwen2.5-VL-32B-Instruct | Multimodal model | 7 | 30 | 1 or more | 96 | V4.0 and later | None |
| clip-ViT-B-32 | Image embedding model | 7 | 24 | 1 | 24 | V4.0 and later | Image patch size: 32×32; Parameters: 88 M; Output vector dimensions: 512 |
| clip-ViT-B-32-multilingual-v1 | Multilingual embedding model for images | 7 | 24 | 1 | 24 | V4.0 and later | Image patch size: 32×32; Parameters: 88 M; Output vector dimensions: 512 |
| clip-ViT-B-16 | Image embedding model | 7 | 24 | 1 | 24 | V4.0 and later | Image patch size: 16×16; Parameters: 88 M; Output vector dimensions: 512 |
| clip-ViT-L-14 | Image embedding model | 7 | 24 | 1 | 24 | V4.0 and later | Image patch size: 14×14; Parameters: 304 M; Output vector dimensions: 768 |
| Qwen/Qwen3-1.7B | LLM | 7 | 30 | 1 or more | 8 | V3.2 and later | None |
| Qwen/Qwen3-4B | LLM | 7 | 30 | 1 or more | 16 | V3.2 and later | None |
| Qwen/Qwen3-8B | LLM | 7 | 30 | 1 or more | 32 | V3.2 and later | None |
| Qwen/Qwen3-14B | LLM | 7 | 30 | 1 or more | 48 | V3.2 and later | None |
| Qwen/Qwen3-32B | LLM | 7 | 30 | 1 or more | 96 | V3.2 and later | None |
| iic/nlp_structbert_sentiment-classification_chinese-base | Sentiment classification | 7 | 30 | 1 | 4 | V3.2 and later | None |
| iic/nlp_gte_sentence-embedding_chinese-base | Text embedding model | 7 | 30 | 1 | 12 | V3.2 and later | Output vector dimensions: 768 |
| iic/nlp_gte_sentence-embedding_chinese-large | Text embedding model | 7 | 30 | 1 | 16 | V3.2 and later | Output vector dimensions: 1024 |
| iic/nlp_gte_sentence-embedding_chinese-small | Text embedding model | 7 | 30 | 1 | 8 | V3.2 and later | Output vector dimensions: 512 |
| Qwen/Qwen3-Embedding-0.6B | Text embedding model | 7 | 30 | 1 | 8 | V3.2 and later | None |
| Qwen/Qwen3-Embedding-4B | Text embedding model | 7 | 30 | 1 | 32 | V3.2 and later | None |
| Qwen/Qwen3-Embedding-8B | Text embedding model | 7 | 30 | 1 | 48 | V3.2 and later | None |
| recursive-character-text-splitter | Text chunking | 15 | 30 | 0 | 0 | V3.2 and later | Select CPU specifications as needed. Setting the number of GPUs is not required. |
| BAAI/bge-base-en-v1.5 | Long text embedding | 7 | 30 | 1 | 12 | V3.2 and later | Output vector dimensions: 768 |
| BAAI/bge-base-zh-v1.5 | Long text embedding | 7 | 30 | 1 | 12 | V3.2 and later | Output vector dimensions: 768 |
| BAAI/bge-large-en-v1.5 | Long text embedding | 7 | 30 | 1 | 16 | V3.2 and later | Output vector dimensions: 1024 |
| BAAI/bge-large-zh-v1.5 | Long text embedding | 7 | 30 | 1 | 16 | V3.2 and later | Output vector dimensions: 1024 |
| BAAI/bge-small-en-v1.5 | Long text embedding | 7 | 30 | 1 | 8 | V3.2 and later | Output vector dimensions: 384 |
| BAAI/bge-small-zh-v1.5 | Long text embedding | 7 | 30 | 1 | 8 | V3.2 and later | Output vector dimensions: 512 |
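
When shortlisting models, it can help to filter the table by instance version and by the GPU memory you can spare. The following Python sketch is illustrative only: it copies a few rows from the table above, and the names `ModelSpec`, `parse_version`, and `deployable` are hypothetical helpers, not part of any Hologres API. It also assumes that "1 or more" GPUs can be treated as 1 for planning purposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the supported-models table (single-replica minimums)."""
    name: str
    category: str
    vcpus: int
    memory_gb: int
    gpus: int            # "1 or more" is recorded as 1 (assumption)
    gpu_memory_gb: int
    min_version: tuple   # (major, minor), e.g. (3, 2) for V3.2


# A few rows copied from the table above; extend with the rest as needed.
MODELS = [
    ModelSpec("ds4sd/docling-models", "PDF conversion model", 20, 100, 1, 48, (4, 0)),
    ModelSpec("Qwen/Qwen2.5-VL-7B-Instruct", "Multimodal model", 7, 30, 1, 48, (4, 0)),
    ModelSpec("clip-ViT-L-14", "Image embedding model", 7, 24, 1, 24, (4, 0)),
    ModelSpec("Qwen/Qwen3-8B", "LLM", 7, 30, 1, 32, (3, 2)),
    ModelSpec("Qwen/Qwen3-Embedding-0.6B", "Text embedding model", 7, 30, 1, 8, (3, 2)),
    ModelSpec("recursive-character-text-splitter", "Text chunking", 15, 30, 0, 0, (3, 2)),
]


def parse_version(text):
    """Turn 'V3.2' or '4.0' into a comparable (major, minor) tuple."""
    major, minor = text.lstrip("Vv").split(".")[:2]
    return int(major), int(minor)


def deployable(models, instance_version, gpu_memory_gb):
    """Names of models whose version and GPU memory minimums are both met."""
    version = parse_version(instance_version)
    return [
        m.name
        for m in models
        if m.min_version <= version and m.gpu_memory_gb <= gpu_memory_gb
    ]


if __name__ == "__main__":
    # Example: a V3.2 instance with 32 GB of GPU memory to spare.
    for name in deployable(MODELS, "V3.2", 32):
        print(name)
```

Comparing versions as integer tuples avoids the pitfalls of string comparison, where "V10.0" would sort before "V4.0".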

Prerequisites

You have purchased AI resources.

Notes

  • Select and deploy models from the Supported models list. Allocate at least the recommended minimum AI resources listed for each model.

  • You can deploy multiple models on one instance, provided the total resource consumption does not exceed your purchased quota. Scale up if resources are insufficient.

  • For primary and secondary instances: you can deploy and manage models (modify resources, delete) only on the primary instance. Secondary instances can view the primary instance's models and call them through AI functions.
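
Before deploying several models on one instance, you can estimate whether the combined demand fits your purchased quota. The Python sketch below makes some assumptions: it treats resource demand as simply additive across models and replicas, which may not match how Hologres accounts for resources internally; the quota figures are invented for illustration; and `total_demand` and `shortfall` are hypothetical helper names. The per-model numbers are the single-replica minimums from the Supported models section.

```python
# Recommended single-replica minimums from the Supported models section:
# (vCPUs, memory GB, GPUs, GPU memory GB)
REQUIREMENTS = {
    "Qwen/Qwen3-8B": (7, 30, 1, 32),
    "Qwen/Qwen3-Embedding-0.6B": (7, 30, 1, 8),
    "recursive-character-text-splitter": (15, 30, 0, 0),
}
FIELDS = ("vcpus", "memory_gb", "gpus", "gpu_memory_gb")


def total_demand(plan):
    """Sum requirements for a plan of {model name: replica count}.

    Assumes demand is additive across models and replicas.
    """
    total = dict.fromkeys(FIELDS, 0)
    for name, replicas in plan.items():
        for field, value in zip(FIELDS, REQUIREMENTS[name]):
            total[field] += value * replicas
    return total


def shortfall(plan, quota):
    """Return {resource: missing amount} for every resource over quota."""
    demand = total_demand(plan)
    return {
        field: demand[field] - quota[field]
        for field in FIELDS
        if demand[field] > quota[field]
    }


if __name__ == "__main__":
    # Hypothetical purchased quota and deployment plan.
    quota = {"vcpus": 32, "memory_gb": 128, "gpus": 4, "gpu_memory_gb": 96}
    plan = {
        "Qwen/Qwen3-8B": 1,
        "Qwen/Qwen3-Embedding-0.6B": 2,
        "recursive-character-text-splitter": 1,
    }
    missing = shortfall(plan, quota)
    if missing:
        print("Scale up before deploying; short by:", missing)
    else:
        print("Plan fits within the purchased quota.")
```

An empty result from `shortfall` means the plan fits under these assumptions; a non-empty result names the resources to scale up.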

Deploy a model

  1. Log on to the Hologres console and select a region.

  2. In the left navigation menu, click Instances. Then, click the target instance ID.

  3. On the Instance Details page, click AI Node.

  4. In the Models section, click Deploy Model.

  5. In the Deploy Model dialog box, set Model Name and Model Type.

    The parameters for Resource Configurations are automatically populated based on the selected Model Type.

  6. After you complete the configurations, click OK to deploy the model.

    In the Models section, view the deployment status and perform the following operations:

    • Adjust model configurations: In the Actions column of the target model, click Adjust Configurations.

    • Delete the model: In the Actions column of the target model, click Delete.

      Note

      Hologres does not check whether other services depend on a model before it is deleted. Exercise extreme caution to prevent service downtime.

Next step

After deploying a model, you can call it via AI functions. For more information, see AI functions.