Initial retrieval often returns candidates in an imprecise order. A rerank model scores those candidates against the query and sorts them more accurately, so that the most relevant results surface first.
Model overview
Singapore
| Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
| --- | --- | --- | --- | --- | --- | --- | --- |
| qwen3-rerank | 500 | 4,000 | 120,000 | Over 100 major languages, such as Chinese, English, Spanish, French, Portuguese, Indonesian, Japanese, Korean, German, and Russian | $0.1 | 1 million tokens, valid for 90 days after activating Model Studio | |
Beijing
| Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per 1M tokens) | Free quota | Scenarios |
| --- | --- | --- | --- | --- | --- | --- | --- |
| qwen3-vl-rerank | 100 | 8,000 | 120,000 | 33 major languages, such as Chinese, English, Japanese, Korean, French, and German | Image: $0.258; Text: $0.1 | No free quota | |
| gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | $0.115 | | |
- Max input tokens per item: The maximum number of tokens allowed in the query or in any single document. Input that exceeds this limit is truncated, which may reduce ranking accuracy.
- Max number of documents: The maximum number of documents per request.
- Max input tokens per request: Calculated as Query tokens × Number of documents + Total document tokens. This total must not exceed the per-request token limit.
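The per-request formula above can be checked before a call is made. The following is a minimal sketch, assuming you already have token counts for the query and for each document from whatever tokenizer you use. The function names are hypothetical, and the limit is passed in from the model overview table.

```python
from typing import Sequence


def request_token_total(query_tokens: int, document_tokens: Sequence[int]) -> int:
    """Query tokens x number of documents + total document tokens."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def fits_request_limit(query_tokens: int,
                       document_tokens: Sequence[int],
                       max_tokens_per_request: int) -> bool:
    """Return True when the request stays within the per-request token limit."""
    return request_token_total(query_tokens, document_tokens) <= max_tokens_per_request


# Illustrative numbers only: a 20-token query and three documents.
doc_tokens = [100, 250, 400]
print(request_token_total(20, doc_tokens))         # 20 * 3 + 750
print(fits_request_limit(20, doc_tokens, 30_000))  # limit taken from the gte-rerank-v2 row
```

Because the query is counted once per document, a long query combined with many documents can reach the limit sooner than the document lengths alone suggest.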
Input limitations
| Model | Image | Video |
| --- | --- | --- |
| qwen3-vl-rerank | JPEG, PNG, WEBP, BMP, TIFF, ICO, DIB, ICNS, and SGI (URL or Base64 supported) | MP4, AVI, and MOV (URL only) |
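A client-side check can catch unsupported media before a request is sent. The sketch below is illustrative only: the function names are hypothetical, the mapping from format names to file extensions is an assumption, and Base64 image payloads are accepted without being inspected.

```python
import os
from urllib.parse import urlparse

# Assumption: these extensions correspond to the formats listed in the table above.
IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".webp", ".bmp", ".tiff", ".tif",
                    ".ico", ".dib", ".icns", ".sgi"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def is_http_url(value: str) -> bool:
    """Return True for http(s) URLs with a host part."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_media(kind: str, value: str):
    """Return None if the input looks acceptable, otherwise a short reason."""
    if kind not in ("image", "video"):
        return f"unsupported media kind: {kind}"
    if not is_http_url(value):
        # Only images may be sent inline as Base64; the payload is not inspected here.
        return None if kind == "image" else "video input must be a URL"
    allowed = IMAGE_EXTENSIONS if kind == "image" else VIDEO_EXTENSIONS
    extension = os.path.splitext(urlparse(value).path)[1].lower()
    if extension and extension not in allowed:
        return f"{extension} is not a listed {kind} format"
    return None
```

The check is deliberately lenient: a URL without a file extension passes, since the extension alone does not determine the actual format.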
Prerequisites
Get an API key and set it as an environment variable. To call the API through the SDK, also install the DashScope SDK.
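Reading the key from the environment keeps it out of source code. The following is a minimal sketch, assuming the key is stored in a variable named `DASHSCOPE_API_KEY`. Both that name and the helper `load_api_key` are assumptions for illustration, so adjust them to your own setup.

```python
import os

# Assumption: the key is stored under this variable name.
API_KEY_ENV_VAR = "DASHSCOPE_API_KEY"


def load_api_key(env_var: str = API_KEY_ENV_VAR) -> str:
    """Return the API key from the environment, or raise if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling the API.")
    return key
```

Failing early with a clear message is usually easier to diagnose than an authentication error returned by the service.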
HTTP
POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank
Request
Request headers
- Content-Type: The content type of the request. Must be
- Authorization: The authentication credentials using a Model Studio API key. Example:
Request body
- model: The model name. Supported models include qwen3-rerank, gte-rerank-v2, and qwen3-vl-rerank.
- input: The input content. When you use
- parameters (object, optional): Optional parameters. When you use
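To make the request layout concrete, here is a hedged sketch that assembles a POST request for gte-rerank-v2 with the standard library. It rests on several assumptions that this page does not confirm: that `Content-Type` is `application/json`, that the `Authorization` header uses the `Bearer <key>` form, and that `query` and `documents` sit under `input` while `top_n` and `return_documents` sit under `parameters`. The function names are hypothetical.

```python
import json
import urllib.request

RERANK_URL = "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"


def build_rerank_request(api_key, query, documents, top_n=None,
                         return_documents=False, model="gte-rerank-v2"):
    """Build (but do not send) a rerank request with a nested JSON body."""
    parameters = {"return_documents": return_documents}
    if top_n is not None:
        parameters["top_n"] = top_n
    payload = {
        "model": model,
        "input": {"query": query, "documents": documents},  # assumed nesting
        "parameters": parameters,                            # assumed nesting
    }
    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",    # assumed value
            "Authorization": f"Bearer {api_key}",  # assumed scheme
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Separating construction from sending makes it possible to inspect or log the body before any network call is made. Other models may expect a different body shape, so check the field descriptions for the model you use.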
Response
- request_id: Unique identifier for the request. Use it for tracing and troubleshooting.
- output: Task output.
- usage: Output statistics.
- code: The error code. Returned only when the request fails. See error codes for details.
- message: Detailed error message. Returned only when the request fails. See error codes for details.
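The fields above suggest a simple way to separate success from failure: a non-empty `code` signals an error. The sketch below works on an already decoded response dictionary. The helper name `unwrap_rerank_response` and the error code in the usage example are hypothetical, and the `output.results` layout is taken from the sample output on this page.

```python
def unwrap_rerank_response(resp: dict):
    """Return (results, usage) on success; raise with request_id on failure."""
    code = resp.get("code")
    if code:
        # code and message are only populated when the request fails.
        raise RuntimeError(
            f"rerank failed [{code}]: {resp.get('message', '')} "
            f"(request_id={resp.get('request_id')})"
        )
    return resp["output"]["results"], resp.get("usage", {})


# Usage with made-up responses shaped like the fields described above.
ok = {
    "request_id": "req-1",
    "output": {"results": [{"index": 0, "relevance_score": 0.9}]},
    "usage": {"total_tokens": 79},
}
results, usage = unwrap_rerank_response(ok)
print(results[0]["index"], usage["total_tokens"])
```

Including `request_id` in the raised error keeps the identifier available for tracing when a failure is reported.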
Use the SDK
Example
Example: Call the rerank model API.
SDK parameter names match those of the HTTP API, but the structure differs: the HTTP API nests fields under input and parameters, whereas the SDK takes them as flat arguments. Keep this difference in mind during development.
import dashscope


def text_rerank():
    # Rank three candidate documents against the query and keep the top two.
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a rerank model?",
        documents=[
            "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
            "Quantum computing is a cutting-edge field of computer science.",
            "The development of pre-trained language models has brought new advancements to rerank models."
        ],
        top_n=2,
        return_documents=True  # include the document text in each result
    )
    print(resp)


if __name__ == '__main__':
    text_rerank()
Example: Use qwen3-vl-rerank for multimodal sorting with an image query.
import dashscope
from http import HTTPStatus
import json


def vl_rerank():
    # The query is an image; the candidates mix text, image, and video documents.
    resp = dashscope.TextReRank.call(
        model="qwen3-vl-rerank",
        query={"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
        documents=[
            {"text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."},
            {"image": "https://img.alicdn.com/imgextra/i3/O1CN01rdstgY1uiZWt8gqSL_!!6000000006071-0-tps-1970-356.jpg"},
            {"video": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20250107/lbcemt/new+video.mp4"}
        ],
        top_n=2,
        return_documents=True
    )
    if resp.status_code == HTTPStatus.OK:
        # Pretty-print the successful response as JSON.
        print(json.dumps(resp, default=str, ensure_ascii=False, indent=4))
    else:
        # On failure, print the raw response, which carries code and message.
        print(resp)


if __name__ == '__main__':
    vl_rerank()
Sample output
The SDK wraps the HTTP response. For successful requests, the code and message fields are empty strings.
{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance."
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new advancements to rerank models."
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}
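In the sample output, each result's `index` matches the position of that document in the submitted `documents` list. A minimal sketch of mapping results back to the original inputs follows. The helper name `attach_documents` and the `min_score` cutoff are hypothetical additions for illustration, not part of the API.

```python
def attach_documents(results, documents, min_score=0.0):
    """Pair each result with its source document, highest score first."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [
        (documents[r["index"]], r["relevance_score"])
        for r in ranked
        if r["relevance_score"] >= min_score
    ]


documents = [
    "Rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance.",
    "Quantum computing is a cutting-edge field of computer science.",
    "The development of pre-trained language models has brought new advancements to rerank models.",
]
# Scores copied from the sample output above.
results = [
    {"index": 0, "relevance_score": 0.9334521178273196},
    {"index": 2, "relevance_score": 0.34100082626411193},
]
for text, score in attach_documents(results, documents, min_score=0.5):
    print(f"{score:.3f}  {text}")
```

Looking documents up by `index` also works when `return_documents` is not set, since the result then does not need to carry the document text.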
Error codes
If the model call fails and returns an error message, see Error messages for resolution.