This topic describes how to deploy the built-in models for Hologres AI nodes. It also provides a list of the models and instructions on how to use them. To call a built-in model, you must first purchase AI resources. After a model is deployed, you can call it using an AI Function.
Prerequisites
Calling the built-in models of an AI node requires AI resources (GPUs), which you must purchase before you deploy a model. For more information, see Introduction to AI nodes and pricing.
Model deployment
Deployment notes
Select a model to deploy based on your application scenario. Each model requires a minimum allocation of AI resources for deployment.
You can deploy multiple models on a single instance. However, the total resources allocated to the models cannot exceed the total AI resources that you purchased. If you have insufficient AI resources, scale out your resources.
If you use primary and secondary instances, you can deploy models and perform related operations, such as changing resources and deleting models, only on the primary instance. On a secondary instance, you can only view the models deployed on the primary instance and call them using AI Functions.
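The resource constraint described in the notes above can be checked with simple arithmetic before you open the console. The following is a minimal sketch, not part of Hologres: the `Resources` class, the helper functions, the purchased quota, and the model labels are all hypothetical and exist only for illustration. The per-model figures are taken from the PDF transformation and text chunking rows of the model list in this topic, assuming "Single card" means one GPU card. How the console actually accounts for AI resources is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for one resource allocation (not a Hologres API)."""
    cpu_cores: int
    memory_gb: int
    gpu_cards: int

    def __add__(self, other):
        # Sum two allocations field by field.
        return Resources(
            self.cpu_cores + other.cpu_cores,
            self.memory_gb + other.memory_gb,
            self.gpu_cards + other.gpu_cards,
        )

    def fits_within(self, limit):
        # True only if every field is within the corresponding limit.
        return (
            self.cpu_cores <= limit.cpu_cores
            and self.memory_gb <= limit.memory_gb
            and self.gpu_cards <= limit.gpu_cards
        )


def total_allocation(allocations):
    """Add up the resources planned for all models on one instance."""
    total = Resources(0, 0, 0)
    for resources in allocations.values():
        total = total + resources
    return total


def shortfall(allocations, purchased):
    """Return how much of each resource is missing (0 where enough exists)."""
    total = total_allocation(allocations)
    return Resources(
        max(0, total.cpu_cores - purchased.cpu_cores),
        max(0, total.memory_gb - purchased.memory_gb),
        max(0, total.gpu_cards - purchased.gpu_cards),
    )


# Assumed purchased quota, for illustration only.
purchased = Resources(cpu_cores=32, memory_gb=128, gpu_cards=1)

# Planned deployments; the keys are placeholder labels, not real model names.
planned = {
    "pdf-transformation (placeholder)": Resources(20, 100, 1),
    "text-chunking (placeholder)": Resources(15, 30, 0),
}

if __name__ == "__main__":
    total = total_allocation(planned)
    if total.fits_within(purchased):
        print("Planned models fit within the purchased AI resources.")
    else:
        print("Insufficient AI resources, missing:", shortfall(planned, purchased))
```

In this example the two models together need more CPU cores and memory than the assumed quota provides, so the plan would require scaling out before both models could be deployed.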
Deploy a model
Log on to the Hologres console and select a region in the upper-left corner.
In the navigation pane on the left, click Instances, and then click the ID of the target instance.
On the Instance Details page, click AI Node.
In the Models area, click Deploy Model.
In the Deploy Model dialog box, enter a Model Name and select a Model Type. The parameters for Resource Configurations are automatically filled in based on the selected Model Type. Each model has a recommended minimum resource allocation. Allocate appropriate resources based on the selected model for optimal performance.
After you complete the configuration, click OK to deploy the model.
You can view the deployment status of the model in the Models area and perform the following operations:
Adjust model configuration: In the Actions column of the target model, click Modify Configuration.
Delete model: In the Actions column of the target model, click Delete.
Note: When you delete a model, the system does not check whether any services are currently calling it. Proceed with caution.
Use a model
After the model is successfully deployed, you can call it using an AI Function in Hologres. For more information, see AI Function.
Model list
Hologres provides built-in models for various AI scenarios. Deploy the models you need for your business. After deployment, call the models using an AI Function. The following table lists the built-in models in Hologres.
Model classification | Model name | Min. recommended CPU for single replica (Cores) | Min. recommended memory for single replica (GB) | Min. recommended cards for single replica | Min. recommended GPU memory for single replica (GB) | Supported instance versions | Remarks |
PDF transformation model | | 20 | 100 | Single card/Multiple cards | 48 | Hologres V4.0 and later | |
Text chunking | recursive-character-text-splitter | 15 | 30 | 0 | 0 | Hologres V3.2 and later | Select the CPU specification based on your business volume. You do not need to configure GPUs. |
Multimodal model | | 7 | 24 | Single card/Multiple cards | 24 | Hologres V4.0 and later | |
Multimodal model | | 7 | 30 | Single card/Multiple cards | 48 | Hologres V4.0 and later | |
Multimodal model | | 7 | 30 | Single card/Multiple cards | 96 | Hologres V4.0 and later | |
Text model | | 7 | 24 | Single card | 24 | Hologres V4.0 and later | Image patch size: 32×32. Number of parameters: 88M. Returned vector dimensions: 512 |
Text generation | | 7 | 30 | Single card/Multiple cards | 8 | Hologres V3.2 and later | |
Text generation | | 7 | 30 | Single card/Multiple cards | 16 | Hologres V3.2 and later | |
Text generation | | 7 | 30 | Single card/Multiple cards | 32 | Hologres V3.2 and later | |
Text generation | | 7 | 30 | Single card/Multiple cards | 48 | Hologres V3.2 and later | |
Text generation | | 7 | 30 | Single card/Multiple cards | 96 | Hologres V3.2 and later | |
Sentiment classification | | 7 | 30 | Single card | 4 | Hologres V3.2 and later | |
Vector embedding | | 7 | 30 | Single card | 12 | Hologres V3.2 and later | Output vector dimensions: 768 |
Vector embedding | | 7 | 30 | Single card | 16 | Hologres V3.2 and later | Output vector dimensions: 1024 |
Vector embedding | | 7 | 30 | Single card | 8 | Hologres V3.2 and later | Output vector dimensions: 512 |
Vector embedding | | 7 | 30 | Single card | 8 | Hologres V3.2 and later | |
Vector embedding | | 7 | 30 | Single card | 32 | Hologres V3.2 and later | |
Vector embedding | | 7 | 30 | Single card | 48 | Hologres V3.2 and later | |
Vector embedding | | 7 | 30 | Single card | 12 | Hologres V3.2 and later | Output vector dimensions: 768 |
Vector embedding | | 7 | 30 | Single card | 12 | Hologres V3.2 and later | Output vector dimensions: 768 |
Vector embedding | | 7 | 30 | Single card | 16 | Hologres V3.2 and later | Output vector dimensions: 1024 |
Vector embedding | | 7 | 30 | Single card | 16 | Hologres V3.2 and later | Output vector dimensions: 1024 |
Vector embedding | | 7 | 30 | Single card | 8 | Hologres V3.2 and later | Output vector dimensions: 384 |
Vector embedding | | 7 | 30 | Single card | 8 | Hologres V3.2 and later | Output vector dimensions: 512 |
Text model | | 7 | 24 | Single card | 24 | Hologres V4.0 and later | Image patch size: 32×32. Number of parameters: 88M. Returned vector dimensions: 512 |
Text model | | 7 | 24 | Single card | 24 | Hologres V4.0 and later | Image patch size: 14×14. Number of parameters: 304M. Returned vector dimensions: 768 |
Vector embedding | | 7 | 24 | Single card | 24 | Hologres V4.0 and later | Image patch size: 16×16. Number of parameters: 88M. Returned vector dimensions: 512 |