OpenSearch: Text embedding

Last Updated: Apr 03, 2026

The text embedding service converts text into dense vector representations through a POST API. Use text embeddings to build semantic search, retrieval-augmented generation (RAG) pipelines, text classification systems, and similarity search applications.

Available models

Six models are available. Choose based on the languages your application handles, the input length you need, and the vector dimension that fits your index:

| Model | Service ID | Languages | Max input length | Vector dimension | QPS limit |
|---|---|---|---|---|---|
| OpenSearch text vectorization service-001 | ops-text-embedding-001 | Multilingual (40+) | 300 | 1536 | 50 |
| OpenSearch Text Embedding Service-Chinese-001 | ops-text-embedding-zh-001 | Chinese | 1024 | 768 | |
| OpenSearch Text Embedding Service-English-001 | ops-text-embedding-en-001 | English | 512 | 768 | |
| OpenSearch General Text Embedding Service-002 | ops-text-embedding-002 | Multilingual (100+) | 8192 | 1024 | |
| GTE Text Embedding-Multilingual-Base | ops-gte-sentence-embedding-multilingual-base | Multilingual (70+) | 8192 | 768 | |
| Qwen3 Text Embedding-0.6B | ops-qwen3-embedding-0.6b | Multilingual (100+) | 32k | 1024 | |

Notes:

  • ops-text-embedding-001 has a default QPS limit of 50, shared across your Alibaba Cloud account and all RAM users. To request a higher limit, submit a ticket.

  • ops-text-embedding-002 offers broader language support and better retrieval performance than ops-text-embedding-001.

  • ops-qwen3-embedding-0.6b is a 0.6B-parameter model from the Qwen3 series.
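
The selection criteria above can be expressed as a small lookup. The following is a minimal sketch: the `MODELS` list copies the values from the table, while the `Model` type, the `pick_models` helper, and its parameter names are hypothetical and not part of the service. The "32k" limit is stored as 32,000, which is an assumption about the exact value.

```python
from typing import List, NamedTuple, Optional


class Model(NamedTuple):
    service_id: str
    languages: str          # as listed in the table
    max_input_length: int   # in the table's own units
    dimension: int


MODELS = [
    Model("ops-text-embedding-001", "Multilingual (40+)", 300, 1536),
    Model("ops-text-embedding-zh-001", "Chinese", 1024, 768),
    Model("ops-text-embedding-en-001", "English", 512, 768),
    Model("ops-text-embedding-002", "Multilingual (100+)", 8192, 1024),
    Model("ops-gte-sentence-embedding-multilingual-base", "Multilingual (70+)", 8192, 768),
    # Listed as "32k"; the exact value is an assumption.
    Model("ops-qwen3-embedding-0.6b", "Multilingual (100+)", 32_000, 1024),
]


def pick_models(
    min_input_length: int = 0,
    dimension: Optional[int] = None,
    multilingual: Optional[bool] = None,
) -> List[str]:
    """Return the service IDs that satisfy every constraint that was given."""
    matches = []
    for model in MODELS:
        if model.max_input_length < min_input_length:
            continue
        if dimension is not None and model.dimension != dimension:
            continue
        is_multilingual = model.languages.startswith("Multilingual")
        if multilingual is not None and is_multilingual != multilingual:
            continue
        matches.append(model.service_id)
    return matches
```

For example, `pick_models(min_input_length=8192, dimension=1024)` narrows the list to the two models whose index dimension is 1024 and that accept inputs of that length.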

Prerequisites

Before you begin, ensure that you have:

  • Authentication credentials (API key) for the AI Search Open Platform

  • A service endpoint. You can call the API over the Internet or through a virtual private cloud (VPC). For details, see Get service registration address.

API reference

Request

Method: POST

URL:

{host}/v3/openapi/workspaces/{workspace_name}/text-embedding/{service_id}

| Path parameter | Description | Example |
|---|---|---|
| host | Service endpoint, accessible over the Internet or through a VPC | ****-hangzhou.opensearch.aliyuncs.com |
| workspace_name | The name of the workspace | default |
| service_id | The service ID of the model to use | ops-text-embedding-001 |

Constraints: The request body must not exceed 8 MB.
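
As an illustration of the URL template and the size constraint, the sketch below fills in the path parameters and rejects an oversized body before it is sent. The helper names `build_url` and `encode_body` are hypothetical, `host` is assumed to include the scheme, and 8 MB is assumed to mean 8 × 1024 × 1024 bytes.

```python
import json

# 8 MB limit from the constraint above (assuming 1 MB = 1024 * 1024 bytes).
MAX_BODY_BYTES = 8 * 1024 * 1024


def build_url(host: str, workspace_name: str, service_id: str) -> str:
    """Fill in the path parameters of the text-embedding URL."""
    return (
        f"{host.rstrip('/')}/v3/openapi/workspaces/"
        f"{workspace_name}/text-embedding/{service_id}"
    )


def encode_body(payload: dict) -> bytes:
    """Serialize the payload as UTF-8 JSON and check its size locally."""
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes; the limit is {MAX_BODY_BYTES}"
        )
    return body
```

If you send the pre-encoded bytes with `requests`, pass them through the `data=` argument together with the `Content-Type: application/json` header, rather than through `json=`.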

Header parameters

| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| Content-Type | String | Yes | Must be application/json | application/json |
| Authorization | String | Yes | API key in Bearer token format | Bearer OS-d1**2a |

Body parameters

| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| input | Array or String | Yes | Text to embed. Pass up to 32 strings per request. Empty strings are not accepted. Maximum length per string depends on the model. | ["Science and technology are the primary productive forces", "opensearch product documentation"] |
| input_type | String | No | How the input will be used. Valid values: query or document (default). | document |

Choosing input_type:

  • Use document when embedding text that goes into your vector index — for example, article content or product descriptions stored for later retrieval.

  • Use query when embedding a search query that will be compared against indexed documents at query time.

Setting input_type correctly lets the model optimize vector representations for each role. When omitted, the model uses document as the default.
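
The body rules above (at most 32 strings, no empty strings, `input_type` limited to `query` or `document`) can be checked on the client before a request is made. This is a minimal sketch; `build_payload` is a hypothetical helper, and it only checks for exactly empty strings because the source does not say how whitespace-only strings are treated.

```python
from typing import Sequence, Union

MAX_STRINGS = 32
VALID_INPUT_TYPES = ("query", "document")


def build_payload(texts: Union[str, Sequence[str]], input_type: str = "document") -> dict:
    """Validate the input against the documented limits and build the JSON body."""
    if input_type not in VALID_INPUT_TYPES:
        raise ValueError(f"input_type must be one of {VALID_INPUT_TYPES}")
    items = [texts] if isinstance(texts, str) else list(texts)
    if len(items) > MAX_STRINGS:
        raise ValueError(f"at most {MAX_STRINGS} strings are allowed per request")
    if any(item == "" for item in items):
        raise ValueError("empty strings are not accepted")
    # Keep a single string as a string; send a sequence as a JSON array.
    return {"input": texts if isinstance(texts, str) else items, "input_type": input_type}


# Index-time text uses "document"; search-time text uses "query".
index_payload = build_payload(["article content", "product description"])
search_payload = build_payload("opensearch product documentation", input_type="query")
```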

Response parameters

| Parameter | Type | Description | Example |
|---|---|---|---|
| request_id | String | The request ID | B4AB89C8-B135-****-A6F8-2BAB801A2CE4 |
| latency | Float or Int | Request duration in milliseconds | 10 |
| usage | Object | Metering information generated by this call | {"token_count": 3072} |
| usage.token_count | Int | Number of tokens consumed by this call | 3072 |
| result.embeddings | List | Array of embedding results, one entry per input string | See example below |
| result.embeddings[].index | Int | Position of the input string this result corresponds to (zero-based) | 0 |
| result.embeddings[].embedding | List(Float) | The vector for this input string | [0.003143, 0.009750, ..., -0.017395] |

The result.embeddings array preserves input order via the index field, so you can map each result back to its source string when processing a batch.
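
The index-based mapping described above can be sketched as follows. `pair_with_inputs` is a hypothetical helper name; it relies only on the `index` and `embedding` fields from the response table and does not assume the entries arrive in any particular order.

```python
from typing import List, Sequence, Tuple


def pair_with_inputs(inputs: Sequence[str], response: dict) -> List[Tuple[str, List[float]]]:
    """Return (text, vector) pairs in input order, using each entry's index field."""
    vectors = [None] * len(inputs)
    for item in response["result"]["embeddings"]:
        vectors[item["index"]] = item["embedding"]
    if any(vector is None for vector in vectors):
        raise ValueError("the response is missing an embedding for at least one input")
    return list(zip(inputs, vectors))
```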

Examples

Embed a single string

cURL

curl -X POST \
  "http://****-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-embedding/ops-text-embedding-001" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "input": "opensearch product documentation",
    "input_type": "query"
  }'

Python

import os
import requests

host = "http://****-hangzhou.opensearch.aliyuncs.com"
workspace = "default"
service_id = "ops-text-embedding-001"
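# Indexing os.environ raises KeyError if the variable is unset,
# instead of silently sending "Bearer None".
api_key = os.environ["OPENSEARCH_API_KEY"]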

url = f"{host}/v3/openapi/workspaces/{workspace}/text-embedding/{service_id}"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
payload = {
    "input": "opensearch product documentation",
    "input_type": "query",
}

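# Without a timeout, requests can wait indefinitely; 30 seconds is an arbitrary choice.
response = requests.post(url, headers=headers, json=payload, timeout=30)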
print(response.json())

Embed multiple strings

Pass an array to embed up to 32 strings in a single request. The response embeddings array contains one entry per input string, matched by index.

cURL

curl -X POST \
  "http://****-hangzhou.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-embedding/ops-text-embedding-001" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <your-api-key>" \
  -d '{
    "input": [
      "Science and technology are the primary productive forces",
      "opensearch product documentation"
    ],
    "input_type": "query"
  }'

Python

import os
import requests

host = "http://****-hangzhou.opensearch.aliyuncs.com"
workspace = "default"
service_id = "ops-text-embedding-001"
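# Indexing os.environ raises KeyError if the variable is unset,
# instead of silently sending "Bearer None".
api_key = os.environ["OPENSEARCH_API_KEY"]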

url = f"{host}/v3/openapi/workspaces/{workspace}/text-embedding/{service_id}"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
payload = {
    "input": [
        "Science and technology are the primary productive forces",
        "opensearch product documentation",
    ],
    "input_type": "query",
}

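# Without a timeout, requests can wait indefinitely; 30 seconds is an arbitrary choice.
response = requests.post(url, headers=headers, json=payload, timeout=30)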
data = response.json()
# data["result"]["embeddings"] is a list of objects, one per input string.
# Each object has an "index" field that maps back to the original input position.
for item in data["result"]["embeddings"]:
    print(item["index"], item["embedding"][:5])

Replace <your-api-key> in cURL examples with your actual API key. In Python, set the OPENSEARCH_API_KEY environment variable before running the script.
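
Because a single request accepts at most 32 strings, larger collections have to be split into batches. The sketch below shows one way to do that. `chunked` and `embed_all` are hypothetical helpers, and `embed_batch` stands for any callable you supply that sends one request (for example, a wrapper around the `requests.post` call above) and returns the parsed JSON response.

```python
from typing import Callable, Iterator, List, Sequence

BATCH_SIZE = 32  # per-request limit on the number of input strings


def chunked(texts: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` strings."""
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str], str], dict],
    input_type: str = "document",
) -> List[List[float]]:
    """Embed any number of strings and return the vectors in input order."""
    vectors: List[List[float]] = []
    for batch in chunked(texts):
        data = embed_batch(batch, input_type)
        # Sort by index so each batch's vectors line up with its inputs.
        items = sorted(data["result"]["embeddings"], key=lambda item: item["index"])
        vectors.extend(item["embedding"] for item in items)
    return vectors
```

The batches are sent one after another, which keeps the request rate low. If you parallelize them, keep the QPS limit of the model in mind.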

Successful response

{
    "request_id": "B4AB89C8-B135-****-A6F8-2BAB801A2CE4",
    "latency": 38,
    "usage": {
        "token_count": 3072
    },
    "result": {
        "embeddings": [
            {
                "index": 0,
                "embedding": [
                    -0.02868066355586052,
                    0.022033605724573135,
                    -0.0417383536696434,
                    -0.044081952422857285,
                    0.02141784131526947,
                    -8.240503375418484E-4,
                    -0.01309406291693449,
                    -0.02169642224907875,
                    -0.03996409475803375,
                    0.008053945377469063,
                    ...
                    -0.05131729692220688,
                    -0.016595875844359398
                ]
            }
        ]
    }
}
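
Once vectors like the one above are returned, a similarity search application compares them with a similarity or distance measure. The sketch below uses cosine similarity as one common choice. The source does not say which measure these models are tuned for, so treat the metric as an assumption; `cosine_similarity` is a hypothetical helper written with the standard library only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

In a retrieval setting, one argument would be the vector of a text embedded with `input_type` set to `query` and the other the vector of a text embedded as `document`, both produced by the same model so that the dimensions match.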

Error response

When a request fails, the response includes a code and message describing the problem:

{
    "request_id": "651B3087-8A07-****-B931-9C4E7B60F52D",
    "latency": 0,
    "code": "InvalidParameter",
    "message": "JSON parse error: Cannot deserialize value of type `InputType` from String \"xxx\""
}

For a full list of status codes, see Status codes.
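
A client can separate the two response shapes shown above before reading the vectors. The following is a minimal sketch, assuming that a failed call is recognizable by a top-level `code` field and that a successful call carries no such field, as in the two samples. `EmbeddingAPIError` and `extract_embeddings` are hypothetical names.

```python
from typing import List


class EmbeddingAPIError(Exception):
    """Raised when the response body carries an error code."""

    def __init__(self, code: str, message: str, request_id: str):
        super().__init__(f"{code}: {message} (request_id={request_id})")
        self.code = code
        self.message = message
        self.request_id = request_id


def extract_embeddings(data: dict) -> List[List[float]]:
    """Return the vectors in input order, or raise if the body describes an error."""
    # Assumption: only error responses contain a top-level "code" field.
    if "code" in data:
        raise EmbeddingAPIError(
            data["code"], data.get("message", ""), data.get("request_id", "")
        )
    items = sorted(data["result"]["embeddings"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Keeping `request_id` on the exception makes it easier to quote the failing call when you submit a ticket.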