Model customization lets you enhance the performance of text embedding models using your business data. You can also train custom embedding dimensionality reduction models using embedding data that you provide. In a typical business scenario, you first vectorize text or queries using an embedding model. Then, you use an embedding dimensionality reduction model to reduce the embedding dimensions.
Background information
In intelligent search and retrieval-augmented generation (RAG) scenarios, the performance of the embedding model is critical to business outcomes. However, the effectiveness of general-purpose models in specific domains is often limited by the coverage of their training data. To improve retrieval performance, you can fine-tune a general-purpose model with your business data. Meanwhile, the dimensions of embedding models are increasing. This leads to a significant increase in storage and computing costs for large-scale data vectorization. For this reason, the AI Search Open Platform provides an embedding dimensionality reduction service. This service uses custom models to convert high-dimensional vectors into lower-dimensional ones. This saves costs without significantly reducing vectorization performance.
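To make the storage argument concrete, the following sketch estimates raw vector storage before and after dimensionality reduction. It assumes that vectors are stored as 32-bit floats and ignores index overhead. The corpus size and the two dimensions are illustrative values, not figures from the platform.

```python
BYTES_PER_FLOAT32 = 4  # assumption: each vector component is a 32-bit float


def vector_storage_bytes(num_vectors: int, dimensions: int) -> int:
    """Return the raw storage for num_vectors embeddings, excluding index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


# Illustrative values only: 10 million documents, 1024 dimensions reduced to 256.
num_docs = 10_000_000
original = vector_storage_bytes(num_docs, 1024)
reduced = vector_storage_bytes(num_docs, 256)

print(f"original: {original / 1e9:.2f} GB")
print(f"reduced:  {reduced / 1e9:.2f} GB")
print(f"saving:   {1 - reduced / original:.0%}")
```

Raw storage scales linearly with the dimension, so the relative saving depends only on the ratio of the two dimensions, not on the corpus size.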
Customize the embedding dimensionality reduction service
In the AI Search Open Platform console, choose Model Service > Model Customization, and click Create.
If you use a RAM user to create models, modify configurations, view model details, or perform other operations, you must grant the RAM user the required model service operation permissions in advance.
On the Model Customization page, you can configure the following parameters.
Parameter
Description
Model Name
The name of the model to be used when you invoke the embedding dimensionality reduction service.
Model Type
The type of the model to train. Select Vector Dimensionality Reduction (embedding-dim-reduction).
Base Model
The base model used for training, such as ops-embedding-dim-reduction-001.
Data Source
The source of the training data. Valid values: MaxCompute and OSS.
MaxCompute
Parameter
Description
Data Source
MaxCompute
Region
The region where the MaxCompute project is located.
Project Name
The name of the project in MaxCompute.
AccessKey ID
The AccessKey ID of the Alibaba Cloud account or RAM user that has the permissions to read from and write to MaxCompute.
You can go to the AccessKey Management page to obtain an AccessKey ID.
Secret
The AccessKey secret that corresponds to the AccessKey ID.
Table Name
The name of the table that stores the training data in MaxCompute.
Table Partition
The partition information of the table.
Training Fields
You must first grant the GetTableFields permission, which is used to retrieve the MaxCompute table schema, to the RAM user that is used to read and write the MaxCompute table. You can then select the primary key field and the embedding fields of the String type. The dimension of each embedding field must be from 1024 to 4096.
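Before you upload training data, it can help to check that each embedding field falls within the accepted dimension range. The following is a minimal sketch under two assumptions that the source does not confirm: the String-type field stores the vector as comma-separated numbers, and the bounds 1024 and 4096 are inclusive. The helper names are hypothetical.

```python
MIN_DIM = 1024  # lower bound stated for the embedding field dimension
MAX_DIM = 4096  # upper bound stated for the embedding field dimension


def parse_embedding(value: str, separator: str = ",") -> list[float]:
    """Parse a String-type embedding field into a list of floats.

    Assumption: components are separated by commas; adjust to your table's format.
    """
    return [float(part) for part in value.split(separator)]


def check_dimension(vector: list[float]) -> None:
    """Raise ValueError if the vector dimension is outside the accepted range."""
    if not MIN_DIM <= len(vector) <= MAX_DIM:
        raise ValueError(
            f"dimension {len(vector)} is outside the range {MIN_DIM}-{MAX_DIM}"
        )


# Hypothetical row value: a 1024-dimensional vector serialized as a string.
row_value = ",".join(["0.5"] * 1024)
vector = parse_embedding(row_value)
check_dimension(vector)  # passes silently for a valid dimension
print(len(vector))
```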
OSS
Parameter
Description
Data Source
OSS
Region
The region where the OSS bucket is located.
OSS Bucket
The name of the OSS bucket.
Doc Data
The data in OSS used for training.
OSS Endpoint
Generated by the system after you complete the preceding configurations.
Click OK. In the dialog box that appears, click Create and Train. The model is then pre-processed, and training begins after pre-processing is complete.
If you click Confirm Creation instead, the model is added to the model customization list in the To Be Trained state, and you can train it later.
In the model list, models in the Active state have completed training and can be invoked. Click Experience to test the trained model.
Customize the text embedding service
In the AI Search Open Platform console, choose Model Service > Model Customization and click Create.
If you use a RAM user to perform operations such as creating a model, modifying configurations, or viewing model details, you must grant the RAM user the required permissions for the model service in advance.
On the Model Customization page, configure the following parameters.
Parameter
Description
Model Name
A custom name for the model.
Model Type
The type of the model to train. Select Text Embedding (text-embedding).
Base Model
The foundation model used for training, such as ops-text-embedding-001.
Dimensionality Reduction
If you enable this option, embedding dimensionality reduction training is performed at the same time.
Base Model for Reduction
The model used for dimensionality reduction. This parameter is available only when you enable embedding dimensionality reduction.
Data Source
The source of the training data. Valid values: MaxCompute and OSS.
MaxCompute
Parameter
Description
Data Source
MaxCompute
Region
The region where the MaxCompute project is located.
Project Name
The name of the project in MaxCompute.
AccessKey ID
The AccessKey ID of the Alibaba Cloud account or RAM user that has the permissions to read from and write to MaxCompute.
You can go to the AccessKey Management page to obtain an AccessKey ID.
Secret
The AccessKey secret that corresponds to the AccessKey ID.
Table Name
The name of the table that stores the training data in MaxCompute.
Table Partition
The partition information of the table.
Training Fields
You must first grant the GetTableFields permission, which is used to retrieve the MaxCompute table schema, to the RAM user that is used to read and write the MaxCompute table. You can then select the primary key field and the text fields of the String type.
query-doc pair
For more information, see the sample data in the console.
OSS
Parameter
Description
Data Source
OSS
Region
The region where the OSS bucket is located.
OSS Bucket
The name of the OSS bucket.
Doc Data
The data in OSS used for training.
query-doc pair
For more information, see the sample data in the console.
OSS Endpoint
Generated by the system after you complete the preceding configurations.
Click OK. In the dialog box that appears, click Create and Train. The model starts pre-processing, and training begins after pre-processing is complete.
If you click Confirm Creation instead, the model appears in the model customization list in the To Be Trained state, and you can start training later.
In the model list, a model with an Active status has completed training and can be deployed.
Service invocation
When the model service meets your requirements, you can invoke it by calling its API. For more information, see Embedding Dimensionality Reduction Service API and Custom Deployment Service API.
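A call to the trained service might be assembled along the following lines. This is a minimal sketch, not the platform's documented interface: the URL, the bearer-token header, and the `input` payload key are hypothetical placeholders, and the `build_request` helper is introduced only for illustration. Take the actual endpoint, authentication scheme, and request body from the API references named above.

```python
import json
import urllib.request


def build_request(url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build, but do not send, a JSON POST request.

    Assumptions: the service accepts JSON over HTTPS and bearer-token
    authentication. Confirm both in the API reference.
    """
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Hypothetical placeholders: replace them with the endpoint, key, and
# request body documented for your service.
request = build_request(
    url="https://example.invalid/your-service-path",
    api_key="YOUR_API_KEY",
    payload={"input": [[0.1, 0.2, 0.3]]},
)
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```

Building the request separately from sending it lets you inspect the headers and body before any network call is made.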