This document explains how to calculate the number of tokens used when accessing the AI Search Open Platform service via the API.
Token calculation
In a language model, a token is the smallest unit of text segmentation, which can be a word, phrase, punctuation mark, character, or other element. Different models use various segmentation methods, so the number of characters and tokens may not align directly. For instance, in the AI Search Open Platform:
"Apple"equates to 1 token"Test Case"equates to 2 tokens"OpenSearch"equates to 2 tokens
LLM services provided by the AI Search Open Platform are billed based on the number of input and output tokens processed. You can use the Token Calculation API to estimate the costs of service invocation.
Supported models list
The following models support using the Token Calculation service to obtain token counts.
Model classification | Service ID (service_id) |
OpenSearch SFT model | ops-qwen-turbo |
Qwen model | qwen-turbo, qwen-plus, qwen-max |
HTTP call interface
Prerequisites
The authentication information is obtained.
When you call an AI Search Open Platform service through the API, the caller's identity must be authenticated.
The service access address is obtained.
You can call a service over the Internet or through a virtual private cloud (VPC). For more information, see Obtain service access address.
General description
The request body cannot exceed 8 MB.
Request method
POST
URL
{host}/v3/openapi/workspaces/{workspace_name}/text-generation/{service_id}/tokenizer

host: The service address, which supports both public network and VPC access methods. For more information, see Obtain service access address.

workspace_name: The workspace name, for example, default.
service_id: The built-in service identifier, such as ops-qwen-turbo.
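For example, assuming the public Shanghai endpoint used in the Curl request example below, the default workspace, and the ops-qwen-turbo service, the assembled request URL is:
http://****-shanghai.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer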
Request parameters
Header parameters
API-KEY authentication
Parameter | Type | Required | Description | Example value |
Content-Type | String | Yes | The media type of the request. Set the value to application/json. | application/json |
Authorization | String | Yes | The API key used for authentication. | Bearer OS-d1**2a |
Body parameters
Parameter | Type | Required | Description | Example value |
messages | List | Yes | The conversation history between the user and the model. Each element in the list is in the format {"role": role, "content": content}. Valid roles: system, user, and assistant. | [{"role": "user", "content": "Test token calculation interface"}] |
Response parameters
Parameter | Type | Description | Example |
request_id | String | The request ID. | 310032DA-****-46CC-94D1-0FE789BAE3A7 |
latency | Float/Int | The request latency, in milliseconds. | 10 |
usage | Object | Details of the API call's metering information. | "usage":{"input_tokens":4} |
usage.input_tokens | Integer | The number of input tokens. | 4 |
result.token_ids | List<Integer> | The token IDs that correspond to the input text. | [81705,5839,100768,107736] |
result.tokens | List<String> | The actual tokens derived from the input text. | ["Test","token","calculation","interface"] |
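If you prefer a programmatic call, the following minimal Python sketch sends a tokenizer request and reads the fields above. It assumes the third-party requests library and reuses the placeholder endpoint, workspace, and API key from the Curl request example below; it is not an official SDK.
import requests

# Placeholders taken from the Curl request example below; replace them with your own values.
host = "http://****-shanghai.opensearch.aliyuncs.com"
url = f"{host}/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer"

headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer Your API-KEY",
}
body = {
    "messages": [
        {"role": "user", "content": "Test token calculation interface"}
    ]
}

response = requests.post(url, headers=headers, json=body)
data = response.json()

if "code" in data:
    # Abnormal response: the code and message fields describe the error.
    print(data["code"], data["message"])
else:
    print("input_tokens:", data["usage"]["input_tokens"])
    print("tokens:", data["result"]["tokens"])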
Curl request example
curl -XPOST -H "Content-Type:application/json" \
"http://****-shanghai.opensearch.aliyuncs.com/v3/openapi/workspaces/default/text-generation/ops-qwen-turbo/tokenizer" \
-H "Authorization: Bearer Your API-KEY" \
-d "{
    \"messages\":[
        {
            \"role\":\"user\",
            \"content\":\"Test token calculation interface\"
        }
    ]}"

Response example
Correct response example
{
"request_id":"9d197d47-d6b5-****-964e-12b893c47a8b",
"latency":11,
"usage":{
"input_tokens":4
},
"result":{
"token_ids":[81705,5839,100768,107736],
"tokens":["Test","token","calculation","interface"]
}
}

Abnormal response example
If an error occurs during the request, the output will provide the error reason through the code and message fields.
{
"request_id":"388476DB-C4D4-****-A7A6-7594F92885FA",
"latency":0,
"code":"InvalidParameter",
"message":"Messages must be end with role[user]."
}

Status codes
For more information, see Status code description of AI Search Open Platform.