OpenSearch LLM-Based Conversational Search Edition provides five built-in embedding models. You can select an embedding model for your instance based on your business requirements. This topic describes how to view the built-in embedding models.
Go to the details page of an OpenSearch LLM-Based Conversational Search Edition instance and click Model Management in the left-side navigation pane. On the page that appears, click the Vector Model tab to view information about the embedding models, including the model name, model type, and model overview. The following table summarizes the built-in models.
| Model name | Model type | Supported languages | Maximum input length (tokens) | Output vector dimensions |
| --- | --- | --- | --- | --- |
| ops-text-embedding-001 | General-purpose embedding model | More than 40 languages | 300 | 1536 |
| ops-text-embedding-002 | General-purpose embedding model | More than 100 languages | 8,192 | 1024 |
| ops-text-embedding-zh-001 | General-purpose embedding model | Chinese | 1,024 | 768 |
| ops-text-embedding-en-001 | General-purpose embedding model | English | 512 | 768 |
| ops-text-sparse-embedding-001 | Sparse embedding model | More than 100 languages | 8,192 | Depends on the length of the input text |
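The maximum input length matters when you choose a model. The following Python sketch, a minimal illustration rather than an OpenSearch API call, checks whether a piece of text fits a model's token budget; the limits are copied from the table above, and the whitespace-based token count is a crude stand-in for real tokenization.

```python
# Maximum input lengths (tokens), copied from the table above.
MODEL_LIMITS = {
    "ops-text-embedding-001": 300,
    "ops-text-embedding-002": 8192,
    "ops-text-embedding-zh-001": 1024,
    "ops-text-embedding-en-001": 512,
    "ops-text-sparse-embedding-001": 8192,
}

def fits_model(model: str, text: str) -> bool:
    """Check whether `text` fits within the model's token limit.
    Splitting on whitespace is a rough approximation of real tokenization."""
    approx_tokens = len(text.split())
    return approx_tokens <= MODEL_LIMITS[model]

print(fits_model("ops-text-embedding-en-001", "How do I renew my subscription?"))  # True
```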
General-purpose embedding model: A general-purpose embedding model is a dense embedding model that converts text into dense vectors. Dense vectors help the system understand long text and semantic descriptions, which improves search results (see the sketch after these descriptions).
Sparse embedding model: A sparse embedding model converts text into sparse vectors, which improves the results of searches that include filter conditions. A sparse embedding model must be used together with a dense embedding model. In most cases, using both models delivers better search results than using the dense embedding model alone.
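For intuition, the following Python sketch shows how dense and sparse scores can complement each other. It is a self-contained illustration, not OpenSearch code: `get_dense_embedding` is a hypothetical stand-in for an ops-text-embedding-* call, the sparse vectors are hand-written token-to-weight maps of the kind a sparse model conceptually produces, and the `alpha` blending weight is an assumption, not a documented parameter.

```python
import numpy as np

def get_dense_embedding(text: str) -> np.ndarray:
    """Hypothetical stand-in for a call to an ops-text-embedding-* model.
    Returns a pseudo-random 1,536-dimensional vector derived from the text
    so the example runs offline; a real call would query the service."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(1536)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dense (semantic) similarity: higher means closer in meaning."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def sparse_dot(q: dict[str, float], d: dict[str, float]) -> float:
    """Sparse (lexical) similarity over token -> weight maps;
    only tokens shared by query and document contribute."""
    return sum(w * d[t] for t, w in q.items() if t in d)

def hybrid_score(dense: float, sparse: float, alpha: float = 0.7) -> float:
    """Blend the two scores. `alpha` is an illustrative weight,
    not a documented OpenSearch parameter."""
    return alpha * dense + (1 - alpha) * sparse

query, doc = "How do I renew my subscription?", "Steps to extend a subscription plan"
dense = cosine_similarity(get_dense_embedding(query), get_dense_embedding(doc))

# Hypothetical sparse vectors; a real sparse model would assign the weights.
q_sparse = {"renew": 1.4, "subscription": 1.1}
d_sparse = {"subscription": 0.9, "plan": 0.5, "extend": 1.0}

print(f"dense only: {dense:.4f}")
print(f"dense+sparse hybrid: {hybrid_score(dense, sparse_dot(q_sparse, d_sparse)):.4f}")
```

The blended score rewards documents that are both semantically close (dense) and share exact query terms (sparse), which is why pairing the two model types usually outperforms the dense model alone.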