To stay fast, text retrieval systems often return results that are not ranked precisely enough during the retrieval phase. A text rerank model performs a second, more precise sort of the retrieved documents. This step moves the results most relevant to a user's query to the top, which improves the application's accuracy.
Model overview
Model | Max number of documents | Max input tokens per item | Max input tokens per request | Supported languages | Price (per million input tokens) | Scenarios |
gte-rerank-v2 | 500 | 4,000 | 30,000 | Over 50 languages, such as Chinese, English, Japanese, Korean, Thai, Spanish, French, Portuguese, German, Indonesian, and Arabic | $0.115 | |
Max input tokens per item: The maximum number of tokens for each query or document is 4,000. If the input content exceeds this length, it is truncated. The API calculates the result based on the truncated content, which may lead to inaccurate sorting.
Max number of documents: The maximum number of documents in each request is 500.
Max input tokens per request: The total number of tokens for the query and all documents in a request cannot exceed 30,000.
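The three limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `check_rerank_limits` and `rough_token_count` are hypothetical helpers, and the word-based counter is only a stand-in because the service's tokenizer is not described here. Pass your own `count_tokens` function if you have a more accurate one.

```python
from typing import Callable, List

# Limits for gte-rerank-v2, taken from the model overview above.
MAX_DOCUMENTS = 500
MAX_TOKENS_PER_ITEM = 4_000
MAX_TOKENS_PER_REQUEST = 30_000


def rough_token_count(text: str) -> int:
    """Hypothetical stand-in for the real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def check_rerank_limits(
    query: str,
    documents: List[str],
    count_tokens: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Return human-readable limit violations; an empty list means none were found."""
    problems = []

    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"{len(documents)} documents exceeds the limit of {MAX_DOCUMENTS}"
        )

    query_tokens = count_tokens(query)
    doc_tokens = [count_tokens(doc) for doc in documents]

    # Items over the per-item limit are truncated by the API, not rejected.
    if query_tokens > MAX_TOKENS_PER_ITEM:
        problems.append(f"query has ~{query_tokens} tokens and would be truncated")
    for i, n in enumerate(doc_tokens):
        if n > MAX_TOKENS_PER_ITEM:
            problems.append(f"document {i} has ~{n} tokens and would be truncated")

    # Assumption: the total is compared using untruncated counts, which is conservative.
    total = query_tokens + sum(doc_tokens)
    if total > MAX_TOKENS_PER_REQUEST:
        problems.append(
            f"request has ~{total} tokens, over the limit of {MAX_TOKENS_PER_REQUEST}"
        )
    return problems
```

If the per-request check fails, one option is to split the candidate documents into several requests and merge the scored results afterwards.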
Prerequisites
You must get an API key and set the API key as an environment variable. If you use the SDK to make calls, you must also install the DashScope SDK.
HTTP
POST https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank
Request | Text rerank |
Headers | |
Content-Type The content type of the request. Set this parameter to | |
Authorization The identity authentication credentials for the request. This API uses a Model Studio API key for identity authentication. Example: Bearer sk-xxxx. | |
Request body | |
model The model name. The value must be | |
input The input content. | |
parameters object (Optional) Optional parameters. |
Response | Successful response | Failed response | If a request fails, the |
request_id The unique request ID. You can use this ID to trace and troubleshoot issues. | |
output The task output. | |
usage Usage statistics for the request. | |
code The error code for a failed request. This parameter is not returned if the request is successful. For more information, see Error messages. | |
message The detailed information about a failed request. This parameter is not returned if the request is successful. For more information, see Error messages. |
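As an illustration of the request described above, the sketch below builds the POST request with Python's standard library. Several details are assumptions rather than confirmed parts of the API: the `Content-Type` value `application/json`, the placement of `query` and `documents` under `input`, and `top_n` and `return_documents` under `parameters` (names borrowed from the SDK example). `build_rerank_request` and `send` are hypothetical helper names.

```python
import json
import urllib.request

RERANK_URL = (
    "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank"
)


def build_rerank_request(api_key, query, documents, top_n=None, return_documents=False):
    """Build (but do not send) a text rerank request."""
    # Assumed body layout: nested "input" and "parameters" objects.
    body = {
        "model": "gte-rerank-v2",
        "input": {"query": query, "documents": documents},
        "parameters": {"return_documents": return_documents},
    }
    if top_n is not None:
        body["parameters"]["top_n"] = top_n

    return urllib.request.Request(
        RERANK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",  # assumed value
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (performs a network call; the environment variable name is an assumption):
#   import os
#   req = build_rerank_request(os.environ["DASHSCOPE_API_KEY"],
#                              "What is a text rerank model",
#                              ["doc one", "doc two"], top_n=1)
#   print(send(req))
```

Keeping request construction separate from sending makes it possible to inspect or log the payload without contacting the service.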
Use the SDK
Example
The following example shows how to call the text rerank model API.
The parameter names in the SDK are mostly consistent with those in the HTTP API, but the parameter structure is encapsulated. For example, the HTTP API uses nested input and parameters structures, while the SDK uses a flat structure. Note this difference during development.
import dashscope


def text_rerank():
    resp = dashscope.TextReRank.call(
        model="gte-rerank-v2",
        query="What is a text rerank model",
        documents=[
            "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance",
            "Quantum computing is a cutting-edge field in computer science",
            "The development of pre-trained language models has brought new progress to text rerank models"
        ],
        top_n=2,                # return only the two highest-scoring documents
        return_documents=True   # include the original document text in each result
    )
    print(resp)


if __name__ == '__main__':
    text_rerank()
Sample output
The SDK encapsulates the original HTTP response. For a successful request, the SDK always returns the code and message fields with empty strings as their values.
{
    "status_code": 200,
    "request_id": "4b0805c0-6b36-490d-8bc1-4365f4c89905",
    "code": "",
    "message": "",
    "output": {
        "results": [
            {
                "index": 0,
                "relevance_score": 0.9334521178273196,
                "document": {
                    "text": "Text rerank models are widely used in search engines and recommendation systems. They sort candidate documents based on text relevance"
                }
            },
            {
                "index": 2,
                "relevance_score": 0.34100082626411193,
                "document": {
                    "text": "The development of pre-trained language models has brought new progress to text rerank models"
                }
            }
        ]
    },
    "usage": {
        "total_tokens": 79
    }
}
Error codes
If a call fails, see Error messages for troubleshooting.
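To connect the response fields to application code, the sketch below takes a response shaped like the sample output, raises if it carries an error code, and otherwise maps each result back to the submitted document through its `index`. It assumes the response is available as a plain dictionary. `RerankError` and `ranked_pairs` are hypothetical names, and the explicit sort is a defensive step rather than a documented requirement.

```python
from typing import List, Tuple


class RerankError(RuntimeError):
    """Hypothetical exception for a response that carries an error code."""


def ranked_pairs(response: dict, documents: List[str]) -> List[Tuple[str, float]]:
    """Return (document, relevance_score) pairs, highest score first."""
    # On success the SDK returns empty strings for code and message;
    # a non-empty code is treated here as a failed request.
    code = response.get("code") or ""
    if code:
        raise RerankError(
            f"{code}: {response.get('message', '')} "
            f"(request_id={response.get('request_id')})"
        )

    results = response["output"]["results"]
    # Sort explicitly instead of relying on the order in the response.
    ordered = sorted(results, key=lambda r: r["relevance_score"], reverse=True)

    # "index" is the position of the document in the submitted list.
    return [(documents[r["index"]], r["relevance_score"]) for r in ordered]
```

Including the `request_id` in the error text keeps the identifier at hand when tracing a failed call.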