For efficiency, the initial retrieval phase of a retrieval system may return results that are not sufficiently precise. A rerank model then sorts the retrieved documents more accurately so that the most relevant results appear at the top.
Model overview
Singapore
Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
qwen3-rerank | 500 | 4,000 | 120,000 | Over 100 major languages, such as Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | $0.1 | 1 million tokens, valid for 90 days after activating Model Studio | |
Beijing
Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
qwen3-vl-rerank | 100 | 8,000 | 800,000 | 33 major languages, such as Chinese, English, Japanese, Korean, French, and German | Image: $0.258; Text: $0.1 | No free quota | |
gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | $0.115 | |
Max input tokens per item: The maximum number of tokens allowed for each query or document. If the input exceeds this limit, it is truncated. The API computes results based on the truncated content, which may lead to inaccurate ranking.
Max number of documents: The maximum number of documents permitted in a single request.
Max input tokens per request: Calculated as Query tokens × Number of documents + Total document tokens. This total must not exceed the maximum input tokens allowed per request.
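The three limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the official SDK: `RerankLimits`, `request_tokens`, and `check_request` are hypothetical helper names, the limit values are copied from the model tables above, and the token counts are assumed to be supplied by the caller (how the service tokenizes text is not described here). The request total is computed on untruncated counts, which is the conservative choice.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RerankLimits:
    max_documents: int
    max_tokens_per_item: int
    max_tokens_per_request: int


# Limits copied from the model overview tables above.
LIMITS = {
    "qwen3-rerank": RerankLimits(500, 4_000, 120_000),
    "gte-rerank-v2": RerankLimits(500, 4_000, 30_000),
}


def request_tokens(query_tokens: int, document_tokens: list[int]) -> int:
    """Query tokens x number of documents + total document tokens."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def check_request(model: str, query_tokens: int, document_tokens: list[int]) -> list[str]:
    """Return human-readable problems; an empty list means the request fits."""
    limits = LIMITS[model]
    problems = []
    if len(document_tokens) > limits.max_documents:
        problems.append(f"too many documents: {len(document_tokens)} > {limits.max_documents}")
    if query_tokens > limits.max_tokens_per_item:
        problems.append("query exceeds the per-item limit and would be truncated")
    for position, count in enumerate(document_tokens):
        if count > limits.max_tokens_per_item:
            problems.append(f"document {position} exceeds the per-item limit and would be truncated")
    total = request_tokens(query_tokens, document_tokens)
    if total > limits.max_tokens_per_request:
        problems.append(f"request total {total} > {limits.max_tokens_per_request}")
    return problems
```

For example, a 20-token query with 100 documents of 400 tokens each counts as 20 × 100 + 40,000 = 42,000 tokens, which fits qwen3-rerank but exceeds the 30,000-token request limit of gte-rerank-v2.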
Input limitations
Model | Image | Video |
qwen3-vl-rerank | JPEG, PNG, WEBP, BMP, TIFF, ICO, DIB, ICNS, and SGI (URL or Base64 supported) | MP4, AVI, and MOV (URL only) |
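A client can screen media inputs against this table before building a request. The sketch below is illustrative only: `media_kind` and `is_supported` are hypothetical helpers, and inferring the format from the file extension (including the `.jpg` and `.tif` aliases) is an assumption, since the table lists formats rather than extensions. The service may inspect the actual file content instead.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions assumed to correspond to the formats listed in the table above.
IMAGE_EXTENSIONS = {
    ".jpeg", ".jpg", ".png", ".webp", ".bmp",
    ".tiff", ".tif", ".ico", ".dib", ".icns", ".sgi",
}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def media_kind(path_or_url: str) -> Optional[str]:
    """Return "image", "video", or None, judged by the file extension only."""
    suffix = PurePosixPath(urlparse(path_or_url).path).suffix.lower()
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None


def is_supported(path_or_url: str, *, as_base64: bool) -> bool:
    """Images may be sent by URL or Base64; videos by URL only."""
    kind = media_kind(path_or_url)
    if kind == "image":
        return True
    if kind == "video":
        return not as_base64
    return False
```

The `as_base64` flag captures the one asymmetry in the table: a video that would be acceptable as a URL is rejected when it is meant to be sent as Base64.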
Prerequisites
Get an API key and set the API key as an environment variable. To use the SDK, install the DashScope SDK.
HTTP
POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank
Request | qwen3-rerank, qwen3-vl-rerank, gte-rerank-v2 |
Request headers | |
Content-Type The content type of the request. Must be | |
Authorization The authentication credentials using a Model Studio API key. Example: | |
Request body | |
model The model name. Supported models include qwen3-rerank, gte-rerank-v2, and qwen3-vl-rerank. | |
input The input content. When you use | |
parameters object (Optional) Optional parameters. When you use |
Response | Successful response, Failed response. If a request fails, the |
request_id The unique identifier of the request. Use it for tracing and troubleshooting. | |
output The task output. | |
usage Token usage statistics for the request. | |
code The error code. Returned only when the request fails. See error codes for details. | |
message Detailed error message. Returned only when the request fails. See error codes for details. |
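The request described above can be assembled with the Python standard library alone. This is a minimal sketch for gte-rerank-v2, resting on several assumptions that the table does not confirm: the field names under `input` and `parameters` mirror the SDK argument names (`query`, `documents`, `top_n`, `return_documents`), the `Content-Type` is `application/json`, and the `Authorization` header uses the `Bearer` scheme. `build_rerank_request` and `send` are hypothetical helper names, and other models may expect a different body layout.

```python
import json
import urllib.request
from typing import List, Optional

RERANK_URL = "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"


def build_rerank_request(
    api_key: str,
    query: str,
    documents: List[str],
    *,
    model: str = "gte-rerank-v2",
    top_n: Optional[int] = None,
    return_documents: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST request with nested input/parameters."""
    body = {
        "model": model,
        # Assumed layout: the content to rank goes under "input" ...
        "input": {"query": query, "documents": documents},
        # ... and optional settings go under "parameters".
        "parameters": {"return_documents": return_documents},
    }
    if top_n is not None:
        body["parameters"]["top_n"] = top_n
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumed value
            "Authorization": f"Bearer {api_key}",  # assumed scheme
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping construction separate from sending makes the payload easy to inspect or log before any network call. A typical call would read the key from the environment variable set in Prerequisites and pass the built request to `send`.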
Use the SDK
Example
The following example shows how to call the rerank model API.
The SDK parameter names are mostly consistent with those of the HTTP API, but the structure differs: the HTTP API nests them under input and parameters, while the SDK takes them as flat arguments. Keep this difference in mind during development.
import dashscope


def text_rerank():
    # Rank three candidate documents against the query.
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a rerank model?",
        documents=[
            "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
            "Quantum computing is a cutting-edge field of computer science.",
            "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        top_n=2,  # return only the two highest-scoring documents
        return_documents=True  # include the document text in each result
    )
    print(resp)


if __name__ == '__main__':
    text_rerank()

Sample output
The SDK encapsulates the original HTTP response. For a successful request, the SDK always returns the code and message fields with empty strings as their values.
{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}

Error codes
If the model call fails and returns an error message, see Error messages for resolution.
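Because a successful SDK response carries empty `code` and `message` strings, a caller can treat any non-empty `code` (or a non-200 `status_code`) as a failure. The sketch below works on a plain dictionary shaped like the sample output above. `RerankError` and `ranked_documents` are hypothetical names, and converting the SDK response object into a dictionary is left to the caller.

```python
from typing import List, Tuple


class RerankError(RuntimeError):
    """Raised when a rerank response reports a failure."""


def ranked_documents(response: dict, documents: List[str]) -> List[Tuple[int, float, str]]:
    """Return (original index, relevance score, text) for each result, in response order."""
    status = response.get("status_code", 200)
    if status != 200 or response.get("code"):
        raise RerankError(
            f"{response.get('code')}: {response.get('message')} "
            f"(request_id={response.get('request_id')})"
        )
    ranked = []
    for item in response["output"]["results"]:
        index = item["index"]
        document = item.get("document")
        # The text is only echoed back when return_documents is enabled;
        # otherwise look it up in the list that was originally submitted.
        text = document["text"] if document else documents[index]
        ranked.append((index, item["relevance_score"], text))
    return ranked
```

Including `request_id` in the exception message keeps the identifier at hand for tracing when the error is reported.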