The text embedding service converts text into dense vector representations through an HTTP POST API. Use text embeddings to build semantic search, retrieval-augmented generation (RAG) pipelines, text classification systems, and similarity search applications.
Available models
Six models are available. Choose based on the languages your application handles, the input length you need, and the vector dimension that fits your index:
| Model | Service ID | Languages | Max input length | Vector dimension | QPS limit |
|---|---|---|---|---|---|
| OpenSearch text vectorization service -001 | ops-text-embedding-001 | Multilingual (40+) | 300 | 1536 | 50 |
| OpenSearch Text Embedding Service-Chinese-001 | ops-text-embedding-zh-001 | Chinese | 1024 | 768 | - |
| OpenSearch Text Embedding Service-English-001 | ops-text-embedding-en-001 | English | 512 | 768 | - |
| OpenSearch General Text Embedding Service-002 | ops-text-embedding-002 | Multilingual (100+) | 8192 | 1024 | - |
| GTE Text Embedding-Multilingual-Base | ops-gte-sentence-embedding-multilingual-base | Multilingual (70+) | 8192 | 768 | - |
| Qwen3 Text Embedding-0.6B | ops-qwen3-embedding-0.6b | Multilingual (100+) | 32k | 1024 | - |
Notes:
- ops-text-embedding-001 has a default QPS limit of 50, shared across your Alibaba Cloud account and all RAM users. To request a higher limit, submit a ticket.
- ops-text-embedding-002 offers broader language support and better retrieval performance than ops-text-embedding-001.
- ops-qwen3-embedding-0.6b is a 0.6B-parameter model from the Qwen3 series.
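Because ops-text-embedding-001 is capped at 50 QPS by default, a client may want to pace its own calls. The following is a minimal sketch of a single-threaded throttle; the class name MinIntervalLimiter is hypothetical and not part of any SDK. Since the limit is shared across the account and all RAM users, local pacing alone cannot guarantee that the shared quota is respected.

```python
import time


class MinIntervalLimiter:
    """Spaces calls at least 1/qps seconds apart (single-threaded sketch)."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self):
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the following call one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Call limiter.wait() immediately before each request. The clock and sleep arguments exist only so the pacing logic can be exercised without real delays.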
Prerequisites
Before you begin, ensure that you have:
- Authentication credentials (API key) for the AI Search Open Platform
- A service endpoint. You can call the API over the Internet or through a virtual private cloud (VPC). For details, see Get service registration address.
API reference
Request
Method: POST
URL:
{host}/v3/openapi/workspaces/{workspace_name}/text-embedding/{service_id}
| Path parameter | Description | Example |
|---|---|---|
| host | Service endpoint, accessible over the Internet or through a VPC | ****-hangzhou.opensearch.aliyuncs.com |
| workspace_name | The name of the workspace | default |
| service_id | The service ID of the model to use | ops-text-embedding-001 |
Constraints: The request body must not exceed 8 MB.
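The URL template and the body size constraint can be checked on the client before a request is sent. This is a sketch under stated assumptions: the helper names build_url and encode_body are hypothetical, and "8 MB" is interpreted here as 8 × 1024 × 1024 bytes, which the service documentation does not spell out.

```python
import json

MAX_BODY_BYTES = 8 * 1024 * 1024  # assumption: "8 MB" means 8 MiB


def build_url(host, workspace_name, service_id):
    """Fill the path parameters of the text-embedding endpoint."""
    return (
        f"{host.rstrip('/')}/v3/openapi/workspaces/"
        f"{workspace_name}/text-embedding/{service_id}"
    )


def encode_body(payload):
    """Serialize the payload to UTF-8 JSON and reject oversized bodies."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes, over the 8 MB limit"
        )
    return body
```

The size is measured on the encoded bytes rather than on the character count, because multi-byte characters make the two differ.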
Header parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| Content-Type | String | Yes | Must be application/json | application/json |
| Authorization | String | Yes | API key in Bearer token format | Bearer OS-d1**2a |
Body parameters
| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| input | Array or String | Yes | Text to embed. Pass up to 32 strings per request. Empty strings are not accepted. Maximum length per string depends on the model. | ["Science and technology are the primary productive forces", "opensearch product documentation"] |
| input_type | String | No | How the input will be used. Valid values: query or document (default). | document |
Choosing input_type:
- Use document when embedding text that goes into your vector index, for example article content or product descriptions stored for later retrieval.
- Use query when embedding a search query that will be compared against indexed documents at query time.
Setting input_type correctly lets the model optimize vector representations for each role. When omitted, the model uses document as the default.
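The body parameter rules (at most 32 strings per request, no empty strings, input_type limited to query or document) lend themselves to a small client-side check. The sketch below splits a longer list into request payloads; make_payloads is a hypothetical helper name, and per-string length limits are not checked because they depend on the model.

```python
MAX_BATCH = 32  # per-request limit from the body parameters table


def make_payloads(texts, input_type="document"):
    """Split texts into request payloads of at most MAX_BATCH strings each."""
    if input_type not in ("query", "document"):
        raise ValueError("input_type must be 'query' or 'document'")
    if any(not text for text in texts):
        raise ValueError("empty strings are not accepted")
    return [
        {"input": texts[i:i + MAX_BATCH], "input_type": input_type}
        for i in range(0, len(texts), MAX_BATCH)
    ]
```

Each returned dictionary can be sent as the JSON body of one request. When indexing a corpus you would typically pass input_type="document", and input_type="query" for search-time text.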
Response parameters
| Parameter | Type | Description | Example |
|---|---|---|---|
| request_id | String | The request ID | B4AB89C8-B135-****-A6F8-2BAB801A2CE4 |
| latency | Float or Int | Request duration in milliseconds | 10 |
| usage | Object | Metering information generated by this call | {"token_count": 3072} |
| usage.token_count | Int | Number of tokens consumed by this call | 3072 |
| result.embeddings | List | Array of embedding results, one entry per input string | See example below |
| result.embeddings[].index | Int | Position of the input string this result corresponds to (zero-based) | 0 |
| result.embeddings[].embedding | List(Float) | The vector for this input string | [0.003143, 0.009750, ..., -0.017395] |
The result.embeddings array preserves input order via the index field, so you can map each result back to its source string when processing a batch.
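As an illustration of mapping results back to inputs, the sketch below sorts the entries by their index field and returns the vectors in input order. The function name vectors_in_input_order is hypothetical; the sort is a defensive step that relies only on the documented index field rather than on the order of the array.

```python
def vectors_in_input_order(response_json):
    """Return the embedding vectors as a list aligned with the input strings."""
    items = sorted(
        response_json["result"]["embeddings"],
        key=lambda entry: entry["index"],
    )
    return [item["embedding"] for item in items]
```

With the vectors aligned this way, zip(texts, vectors_in_input_order(data)) pairs each source string with its embedding.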
Examples
Embed a single string
cURL
curl -X POST \
  "http://****-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-embedding/ops-text-embedding-001" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "input": "opensearch product documentation",
    "input_type": "query"
  }'
Python
import os
import requests
host = "http://****-hangzhou.opensearch.aliyuncs.com"
workspace = "default"
service_id = "ops-text-embedding-001"
api_key = os.environ.get("OPENSEARCH_API_KEY")
url = f"{host}/v3/openapi/workspaces/{workspace}/text-embedding/{service_id}"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
payload = {
    "input": "opensearch product documentation",
    "input_type": "query",
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
Embed multiple strings
Pass an array to embed up to 32 strings in a single request. The response embeddings array contains one entry per input string, matched by index.
cURL
curl -X POST \
  "http://****-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-embedding/ops-text-embedding-001" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "input": [
      "Science and technology are the primary productive forces",
      "opensearch product documentation"
    ],
    "input_type": "query"
  }'
Python
import os
import requests
host = "http://****-hangzhou.opensearch.aliyuncs.com"
workspace = "default"
service_id = "ops-text-embedding-001"
api_key = os.environ.get("OPENSEARCH_API_KEY")
url = f"{host}/v3/openapi/workspaces/{workspace}/text-embedding/{service_id}"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
payload = {
    "input": [
        "Science and technology are the primary productive forces",
        "opensearch product documentation",
    ],
    "input_type": "query",
}
response = requests.post(url, headers=headers, json=payload)
data = response.json()
# data["result"]["embeddings"] is a list of objects, one per input string.
# Each object has an "index" field that maps back to the original input position.
for item in data["result"]["embeddings"]:
    print(item["index"], item["embedding"][:5])
Replace <your-api-key> in cURL examples with your actual API key. In Python, set the OPENSEARCH_API_KEY environment variable before running the script.
Successful response
{
  "request_id": "B4AB89C8-B135-****-A6F8-2BAB801A2CE4",
  "latency": 38,
  "usage": {
    "token_count": 3072
  },
  "result": {
    "embeddings": [
      {
        "index": 0,
        "embedding": [
          -0.02868066355586052,
          0.022033605724573135,
          -0.0417383536696434,
          -0.044081952422857285,
          0.02141784131526947,
          -8.240503375418484E-4,
          -0.01309406291693449,
          -0.02169642224907875,
          -0.03996409475803375,
          0.008053945377469063,
          ...
          -0.05131729692220688,
          -0.016595875844359398
        ]
      }
    ]
  }
}
Error response
When a request fails, the response includes a code and message describing the problem:
{
  "request_id": "651B3087-8A07-****-B931-9C4E7B60F52D",
  "latency": 0,
  "code": "InvalidParameter",
  "message": "JSON parse error: Cannot deserialize value of type `InputType` from String \"xxx\""
}
For a full list of status codes, see Status codes.
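A client can turn the error shape above into an exception before reading the embeddings. The sketch below assumes that a top-level code field appears only in failed responses, which matches the two examples shown here but is not stated as a guarantee. EmbeddingError and extract_embeddings are hypothetical names.

```python
class EmbeddingError(Exception):
    """Raised when the response body carries an error code."""

    def __init__(self, code, message, request_id):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.request_id = request_id


def extract_embeddings(response_json):
    """Return result.embeddings, or raise if the body describes an error."""
    # Assumption: a top-level "code" field marks a failed request.
    if "code" in response_json:
        raise EmbeddingError(
            response_json["code"],
            response_json.get("message", ""),
            response_json.get("request_id"),
        )
    return response_json["result"]["embeddings"]
```

Including the request_id in the exception text makes it easier to quote when submitting a ticket.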