You can call judge models over HTTP or by using the OpenAI SDK for Python. This topic describes how to call the APIs and provides the API parameters and sample code for judge models.
Chat completions
Sample code
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        # Pass only the token itself; the OpenAI SDK builds the
        # "Authorization: Bearer <token>" header from it.
        api_key=judge_model_token,
        base_url=base_url
    )
    completion = client.chat.completions.create(
        model='pai-judge',
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "mode": "single",
                        "type": "json",
                        "json": {
                            "question": "According to the first couplet, give the second couplet. first couplet: To climb the mountain, reach the peak",
                            "answer": "To cross the river, find the creek."
                        }
                    }
                ]
            }
        ]
    )
    print(completion.model_dump())

if __name__ == '__main__':
    main()
$ curl -X POST https://aiservice.cn-hangzhou.aliyuncs.com/v1/chat/completions \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{
        "model": "pai-judge",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "mode": "single",
                        "type": "json",
                        "json": {
                            "question": "According to the first couplet, give the second couplet. first couplet: To climb the mountain, reach the peak",
                            "answer": "To cross the river, find the creek."
                        }
                    }
                ]
            }
        ]
    }'
Response example
{ "id": "3b7c3822-1e51-4dc9-b2ad-18b9649a7f19", "choices": [ { "finish_reason": "stop", "index": 0, "logprobs": null, "message": { "content": "I think the overall score of the answer is [[2]] due to the following reasons:\n Advantages of the answer:\n1. Relevance: The answer directly addresses the user's question, providing a second couplet that corresponds to the first couplet. ThiS meets the relevance criteria.[[4]]\n2. Harmlessness: The answer does not contain offensive content. This meets the harmlessness criteria.[[5]]\n\n Disadvantages of the answer: \n1. Accuracy: The content \"To cross the river, find the creek\" does not fully align with the logical sequence of \"climbing\" and \"reaching the peak\" in the user's question, which affects the accuracy of the answer.[[2]]\n2. Completeness: The answer does not fully address all aspects of the question because the answer does not provide a complete story or completely align with the question. This affects the completeness of the answer.[[2]]\n3. Source reliability: The answer does not provide source information. Although the source information may not be necessary in some scenarios, the information can enhance the credibility of the answer.[[3]]\n4. Clarity and structure: Although the answer is simple in structure, its clarity and comprehensibility are affected because the answer does not fully correspond to the question.[[3]]\n5. Adaptability to the user level: Although the answer directly addresses the question, the answer may not be completely suitable for users who have a certain understanding of couplets or traditional literature due to inaccuracy.[[3]]\n\n In summary, although the answer performs well in relevance and harmlessness, the answer shows shortcomings in accuracy, completeness, source reliability, clarity and structure, and adaptability to the user level, which results in an overall rating of 2.", "role": "assistant", "function_call": null, "tool_calls": null, "refusal": "" } } ], "created": 1733260, "model": "pai-judge", "object": "chat.completion", "service_tier": "", "system_fingerprint": "", "usage": { "completion_tokens": 333, "prompt_tokens": 790, "total_tokens": 1123 } }
Request parameters
The APIs of judge models are compatible with those of OpenAI. You can configure the parameters described in the following table. For information about other parameters, see OpenAI documentation.
If you cannot access the web page, you may need to configure a proxy and then try again.
Parameter | Type | Required | Default value | Description |
model | string | Yes | None | The name of the model. Valid values: pai-judge and pai-judge-plus. For more information about the models, see Supported judge models. |
messages | array | Yes | None | The messages received by the judge model. For more information, see the messages parameter section of this topic. |
temperature | float | No | 0.2 | Controls the randomness and diversity of the answer provided by the model. Valid values: [0, 2). |
top_p | float | No | None | The probability threshold of the nucleus sampling method used in the generation process. |
stream | boolean | No | False | Specifies whether to enable the streaming output mode. |
stream_options | object | No | None | Specifies whether to display the number of tokens used in the streaming output mode. This parameter takes effect only when the stream parameter is set to True. If you want to count the number of tokens in streaming output mode, set this parameter to {"include_usage": true}. |
max_tokens | integer | No | 2048 | The maximum number of tokens that can be generated by the model. |
frequency_penalty | float | No | 0 | Positive values penalize new tokens based on how frequently they already occur in the text. This reduces the possibility that the model repeats the same line of text word for word. Valid values: [-2.0, 2.0]. |
presence_penalty | float | No | 0 | Penalizes content that has already appeared in the text. A larger value reduces the possibility of generating content that has already appeared. Valid values: [-2.0, 2.0]. |
seed | integer | No | None | The random seed used during content generation. This parameter controls the randomness of the generated content. Valid values: unsigned 64-bit integers. |
stop | string or array | No | None | If you set this parameter to a string or a token ID, the model stops generating content before the string or token ID is generated. This parameter helps achieve precise control over the content generation process. |
tools | array | No | None | The list of tools that can be called by the model. The model calls a tool from the tool list during each function call. |
tool_choice | string or object | No | None | The tool that can be called by the model. |
parallel_tool_calls | object | No | True | Specifies whether to enable the parallel function calling feature when a tool is used. |
user | string | No | None | The user identifier. |
logit_bias | map | No | None | The likelihood that the model generates specific tokens. |
logprobs | boolean | No | False | Specifies whether to return the log probability of each output token. Valid values: True and False. |
top_logprobs | integer | No | None | The maximum number of tokens that can be returned at each token position, each with a log probability. Valid values: 0 to 20. If you configure this parameter, you must set the logprobs parameter to True. |
n | integer | No | 1 | The number of chat completion choices that are generated for each input message. |
response_format | object | No | {"type": "text"} | The format of the object that the model must return. Valid values: {"type": "text"} and {"type": "json_object"}. |
service_tier | string | No | None | The latency tier that is used to process requests. |
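For example, most of these parameters can be passed directly to chat.completions.create. The following is a minimal sketch of a streaming call that also returns token usage. It assumes the client and messages variables from the preceding request example, and the stream_options value follows the OpenAI convention:
# Streaming call with optional generation parameters.
# `client` and `messages` are assumed to be defined as in the request example.
stream = client.chat.completions.create(
    model="pai-judge",
    messages=messages,
    temperature=0.2,
    max_tokens=1024,
    stream=True,
    stream_options={"include_usage": True},  # return token usage in the final chunk
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    if chunk.usage:  # populated only on the last chunk when include_usage is set
        print(f"\ntotal_tokens={chunk.usage.total_tokens}")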
messages parameter
The messages parameter specifies the information received by a judge model. The following sample code provides an example:
messages=[
{
"role": "user",
"content": [
{
"mode": "single",
"type": "json",
"json": {
"question": "Provide the second couplet to \"To climb the mountain, reach the peak\"",
"answer": "To cross the river, find the creek"
}
}
]
}
]
The messages parameter is an array. Each element is in the {"role": role, "content": content} format. role must be set to user. content specifies the content to be evaluated by the model and includes the following fields:
- mode: the evaluation mode. single specifies single-model evaluation. pairwise specifies dual-model competition.
- type: Set the value to json.
- json: the details of the content to be evaluated.
The following table describes the fields of the json parameter.
Parameter | Type | Required | Description | Default value |
question | string | Yes | The question. | None |
answer | string | Required for single-model evaluation | The answer. | None |
answer1 | string | Required for dual-model competition | The answer from Model 1. | None |
answer2 | string | Required for dual-model competition | The answer from Model 2. | None |
ref_answer | string | No | The reference answer. | None |
scene | string | No | The scenario name. | The name is automatically generated by the judge model. Example: Open questions. |
scene_desc | string | No | The scenario description. | The description is automatically generated by the judge model. For example, a user asks an open question, and the answer is open, such as casual conversation, consultation advice, and recommendation. |
metric | string | No | The scenario dimension. | The dimension is automatically generated by the judge model. Examples: relevance, accuracy, and completeness. |
max_score | integer | No | The scoring range. Valid values: 2 to 10. | 5 |
score_desc | string | No | The detailed description of each score. We recommend that you define the quality of answers that are scored from 1 to the value of the max_score parameter in sequence. | 1: The answer has major flaws, completely deviates from the standards, and should not appear in practice. 2: Some content of the answer meets the standards and can be accepted, but as a whole, the quality of the answer is not qualified. 3: The answer has both advantages and disadvantages, and the advantages outweigh the disadvantages within the required evaluation criteria. 4: The quality of the answer is acceptable, the answer generally meets the standards, and there are small issues that can be improved. This level represents the quality of a reference answer. 5: The answer is perfect and strictly meets the standards in all aspects. When a reference answer is provided, this level represents the quality of the answer that is better than the reference answer. |
steps | string | No | The evaluation steps. | The steps are automatically generated by the judge model. |
The parameters in content are used to fill in and generate prompt templates. The single-model evaluation and dual-model competition requests use the following templates. The fields of content are automatically filled in the corresponding positions of the template.
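For reference, a dual-model competition request sets mode to pairwise and provides answer1 and answer2. The following is a minimal sketch; the answer texts are hypothetical illustrations:
# Hypothetical pairwise (dual-model competition) request body.
messages = [
    {
        "role": "user",
        "content": [
            {
                "mode": "pairwise",  # dual-model competition
                "type": "json",
                "json": {
                    "question": "Provide the second couplet to \"To climb the mountain, reach the peak\"",
                    "answer1": "To cross the river, find the creek",  # answer from Model 1
                    "answer2": "To sail the sea, reach the shore",    # answer from Model 2 (hypothetical)
                },
            }
        ],
    }
]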
Single-model evaluation request template
Your task is to evaluate the quality of an AI assistant's answer.
You clearly understand that when a user provides a question about the [${scene}] scenario (defined as: ${scene_desc}), an AI assistant's answer should meet the following criteria (listed in order of importance from highest to lowest):
[Start of criteria]
${metric}
[End of criteria]
The scoring system uses a ${max_score}-level scale (1-${max_score}), with each score level defined as follows:
[Start of score description]
${score_desc}
[End of score description]
We have collected the following answer from an AI assistant to a user question.
Please comprehensively evaluate the answer based on the criteria that you use for the current scenario. The following items describe the question and answer from an AI assistant:
[Start of data]
***
[User question]: ${question}
***
[Answer]: ${answer}
***
[Reference answer]: ${ref_answer}
***
[End of data]
You must evaluate the preceding answer based on the following process:
${steps}
Think carefully and then provide your conclusion.
Dual-model competition request template
Your task is to evaluate the quality of AI assistants' answers.
You clearly understand that when a user provides a question about the [${scene}] scenario (defined as: ${scene_desc}), an AI assistant's answer should meet the following criteria (listed in order of importance from highest to lowest):
[Start of criteria]
${metric}
[End of criteria]
The scoring system uses a ${max_score}-level scale (1-${max_score}), with each score level defined as follows:
[Start of score description]
${score_desc}
[End of score description]
For a user question in the [${scene}] scenario, we have collected answers from two AI assistants.
Please comprehensively evaluate the answers and determine which answer is better or whether the quality of the two answers is the same based on the criteria that you use for the current scenario. The following items describe a question and the answers from the AI assistants:
[Start of data]
***
[User question]: ${question}
***
[Answer 1]: ${answer1}
***
[Answer 2]: ${answer2}
***
[Reference answer]: ${ref_answer}
***
[End of data]
You must evaluate and compare the two answers based on the following process:
${steps}
Think carefully and then provide your conclusion.
If you leave the ${scene} parameter empty, the judge model automatically configures the scenario based on the ${question} parameter that you specify, and generates the corresponding scenario description specified by the ${scene_desc} parameter and the scenario dimension specified by the ${metric} parameter.
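Conversely, if you specify a custom scenario name that is not built in, you must also provide scene_desc and metric (see the SceneError entries in the status codes below). The following is a hedged sketch of such a json payload; all field values are hypothetical illustrations:
# Hypothetical custom-scenario payload for the "json" field of content.
json_payload = {
    "question": "Summarize the plot of Hamlet in two sentences.",
    "answer": "Prince Hamlet feigns madness while seeking revenge on his uncle Claudius, who murdered Hamlet's father. The revenge succeeds, but at the cost of nearly every major character's life, including Hamlet's own.",
    "scene": "Literature summarization",  # custom scenario, so the two fields below are required
    "scene_desc": "The user asks for a concise summary of a literary work.",
    "metric": "1. Accuracy: the summary must match the work. 2. Completeness: the summary must cover the main plot.",
    "max_score": 5,
    "score_desc": "1: unusable. 2: poor. 3: acceptable. 4: good. 5: perfect.",  # one entry per score
    "steps": "1. Read the question. 2. Check the answer against each criterion. 3. Give an overall score.",
}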
Response parameters
Parameter | Type | Description |
id | string | The ID that is automatically generated by the system to identify the model call. |
model | string | The name of the model that is called. |
system_fingerprint | string | The configuration version of the model that is called. This parameter is not supported and an empty string "" is returned. |
choices | array | The details that are generated by the model. |
choices[i].finish_reason | string | The reason why the model stopped generating content. Valid values: stop (the evaluation finished normally) and length (the generation stopped because the token limit was reached). |
choices[i].message | object | The message returned by the model. |
choices[i].message.role | string | The role of the model. The value is assistant. |
choices[i].message.content | string | The content generated by the model. |
choices[i].index | integer | The sequence number of the content. Default value: 0. |
created | integer | The timestamp when the content was generated. Unit: seconds. |
usage | object | The number of tokens that are consumed during the request. |
usage.prompt_tokens | integer | The number of tokens that are converted from the input text. |
usage.completion_tokens | integer | The number of tokens that are converted from the response generated by the model. |
usage.total_tokens | integer | The sum of the values of usage.prompt_tokens and usage.completion_tokens. |
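The overall score is embedded in choices[0].message.content inside double brackets, as in the response example above ("the overall score of the answer is [[2]]"). The following is a minimal sketch for extracting it; the [[n]] convention is taken from that sample response and may vary:
import re

def extract_overall_score(content):
    """Extract the first [[n]] score from judge model output.

    Assumes the model embeds scores in double brackets, as in the sample
    response in this topic ("the overall score of the answer is [[2]]").
    """
    match = re.search(r"\[\[(\d+)\]\]", content)
    return int(match.group(1)) if match else None

# Usage with a chat completion response:
# score = extract_overall_score(completion.choices[0].message.content)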
Status codes
Status code | Code | Error message | Description |
200 | OK | None | The request is successful. |
400 | MessagesError | "messages" not in body or type of "messages" is not list. | The messages parameter cannot be left empty, and its value must be in the list format. |
400 | ContentError | Content should be like: {"content": [{"type": "json", "mode": "[single / pairwise]", "json": {"question": "<question>", "answer": "<answer>" ...}}] | The content is invalid. Configure the content based on the format shown in the error message. |
400 | ResponseFormatError | Response_format should be one of [{"type": "text"}, {"type": "json_object"}] | You must set the response_format parameter to {"type": "text"} or {"type": "json_object"}. |
400 | ModeError | Mode must be in [single, pairwise], mode: {mode}. | You must set the mode parameter to single or pairwise. |
400 | QuestionError | Question should not be empty | The question parameter cannot be left empty. |
400 | AnswerError | Answer should not be empty when mode=single. | The answer parameter cannot be left empty when the mode parameter is set to single. |
400 | AnswerError | Answer1 or answer2 should not be empty when mode=pairwise, answer1: {answer1}, answer2: {answer2}. | The answer1 and answer2 parameters cannot be left empty when the mode parameter is set to pairwise. |
400 | SceneError | Scene need to be specified a judge-native scece when scene_desc and metric is empty. | If the scene_desc and metric parameters are left empty, you must set the scene parameter to one of the built-in scenarios. |
400 | SceneError | Scene_desc and metric need not be specified when scene is not empty and not a inner scene, scene_desc: {scene_desc}, metric: {metric}. | If you configure the scene parameter and the value is not a built-in scenario, you must configure the scene_desc and metric parameters. |
400 | SceneError | Scene_desc and metric need not to be specified when scene is empty, scene_desc: {scene_desc}, metric: {metric}. | If you leave the scene parameter empty, you must also leave the scene_desc and metric parameters empty. |
400 | ScoreError | Score_desc need to be specified when max_score is not empty. | If you configure the max_score parameter, you must configure the score_desc parameter. |
400 | ScoreError | Score_desc need not to be specified when max_score is empty. | If you leave the max_score parameter empty, you must leave the score_desc parameter empty. |
401 | InvalidToken | Invalid Token provided. | The token is invalid. |
402 | InvalidBody | json load request body error | The request body is not in the JSON format. |
403 | GreenNetFilter | The output content contains high risk. risk_info: xxx | The output content has a high risk. |
404 | ModelNotFound | Model not found, model must in ['pai-judge', 'pai-judge-plus'] | The model that you access does not exist. Valid values: pai-judge and pai-judge-plus. |
500 | ModelServiceFailed | Scenario_division, load error, request_id: xxx, errmsg: xxx | Failed to call the scenario segmentation model. |
500 | ModelServiceFailed | Request_judge_model, load error, request_id: xxx, errmsg: xxx | Failed to call the judge model. |
500 | ModelServiceFailed | Request_judge_model_with_stream, load error, request_id: xxx, errmsg: xxx | Failed to call the judge model in streaming output mode. |
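When you call the service through the OpenAI SDK, these status codes surface as exceptions. The following is a hedged sketch of handling them; openai.APIStatusError is the SDK's generic type for non-2xx responses:
import openai

# `client` and `messages` are assumed to be defined as in the request example.
try:
    completion = client.chat.completions.create(model="pai-judge", messages=messages)
except openai.AuthenticationError:
    # 401 InvalidToken: check the JUDGE_MODEL_TOKEN environment variable.
    raise
except openai.APIStatusError as err:
    # Other non-2xx codes from the table above, such as 400 MessagesError
    # or 500 ModelServiceFailed.
    print(f"status={err.status_code}, body={err.response.text}")
    raise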
Files
Upload a file: POST /v1/files
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    upload_files = client.files.create(
        file=open("/home/xxx/input.jsonl", "rb"),
        purpose="batch",
    )
    print(upload_files.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X POST https://aiservice.cn-hangzhou.aliyuncs.com/v1/files \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
    -F purpose="batch" \
    -F file="@/home/xxx/input.jsonl"
Response example
{ "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "object": "file", "bytes": 698, "created_at": 1742454203, "filename": "input.jsonl", "purpose": "batch" }
Request parameters
Parameter | Type | Required | Description |
file | file | Yes | The file. |
purpose | string | Yes | The expected usage of the file. Valid values: assistants (the assistant and message files), vision (the assistant image files), batch (the API files for batch processing), and fine-tune (the fine-tuning files). |
Response parameters
For more information, see File description.
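The input file for batch processing (purpose="batch") is a .jsonl file with one request per line. The following is a hedged sketch of building such a file; the custom_id, method, url, and body fields follow the OpenAI batch input format, which matches the batch output file example shown later in this topic:
import json

# Hypothetical batch input: one JSON object per line.
requests = [
    {
        "custom_id": f"request-{i}",  # your own ID, echoed in the output file
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "pai-judge",
            "messages": [{
                "role": "user",
                "content": [{
                    "mode": "single",
                    "type": "json",
                    "json": {"question": q, "answer": a},
                }],
            }],
        },
    }
    for i, (q, a) in enumerate([("What is 2+2?", "2+2 equals 4.")], start=1)
]

with open("input.jsonl", "w", encoding="utf-8") as f:
    for req in requests:
        f.write(json.dumps(req, ensure_ascii=False) + "\n")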
Query a list of files: GET /v1/files
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    list_files = client.files.list(
        purpose="batch",
        order="desc",
        limit=10,
        after=""
    )
    print(list_files.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X GET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "object": "list", "data": [ { "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "object": "file", "bytes": 698, "created_at": 1742454203, "filename": "input.jsonl", "purpose": "batch" }, { "id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022", "object": "file", "bytes": 1420, "created_at": 1742455638, "filename": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022_success.jsonl", "purpose": "batch_output" } ] }
Request parameters
Parameter | Type | Required | Description |
purpose | string | No | Returns only files with the specific purpose. Only files for batch processing are supported. |
limit | string | No | The maximum number of files to return. Default value: 10000. |
order | string | No | The order in which the created_at parameter is used to sort the query results. Valid values: asc and desc (default). |
after | string | No | The cursor for pagination. This parameter defines an object ID that indicates a position in the query results. For example, if you send a request and receive 100 objects ending with obj_foo, you can set after to obj_foo in a subsequent call to obtain the query results after obj_foo. |
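The after cursor can be used to page through a long file list. A minimal sketch, assuming the client from the upload example:
# Page through all files, 100 at a time, using the `after` cursor.
after = ""
while True:
    page = client.files.list(purpose="batch", order="desc", limit=100, after=after)
    if not page.data:
        break
    for f in page.data:
        print(f.id, f.filename, f.bytes)
    after = page.data[-1].id  # continue after the last object of this page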
Response parameters
Parameter | Type | Description |
object | string | The object type. The value is list. |
data | array | The list of files. For more information, see File description. |
Query a file: GET /v1/files/{file_id}
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    retrieve_files = client.files.retrieve(
        file_id="file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
    )
    print(retrieve_files.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X GET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713 \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "object": "file", "bytes": 698, "created_at": 1742454203, "filename": "input.jsonl", "purpose": "batch" }
Request parameters
Parameter | Type | Required | Description |
file_id | string | Yes | The ID of the file. |
Response parameters
For more information, see File description.
Query or download the file content: GET /v1/files/{file_id}/content
Only files for batch processing are supported.
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    content_files = client.files.content(
        file_id="file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
    )
    # The return value is binary response content; print the decoded text.
    print(content_files.text)

if __name__ == '__main__':
    main()
$ curl -X GET https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022/content \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" > output.jsonl
Response example
{"id":"dcee3584-6f30-9541-a855-873a6d86b7d9","custom_id":"request-1","response":{"status_code":200,"request_id":"dcee3584-6f30-9541-a855-873a6d86b7d9","body":{"created":1737446797,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-dcee3584-6f30-9541-a855-873a6d86b7d9","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null} {"id":"dcee3584-6f30-9541-a855-873a6d86b7d9","custom_id":"request-2","response":{"status_code":200,"request_id":"dcee3584-6f30-9541-a855-873a6d86b7d9","body":{"created":1737446797,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-dcee3584-6f30-9541-a855-873a6d86b7d9","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
Request parameters
Parameter | Type | Required | Description |
file_id | string | Yes | The ID of the file. |
Response parameters
The response body is the raw content of the file, as shown in the preceding response example.
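Each line of a batch output file is a standalone JSON record, as shown in the preceding response example. The following is a hedged sketch of downloading and parsing such a file; the file ID is a hypothetical placeholder, and write_to_file is the SDK helper for saving the binary response:
import json

# Download the batch output file and extract each judged result.
# `client` is assumed to be configured as in the preceding request examples.
content = client.files.content(file_id="file-batch_output-xxx")  # hypothetical ID
content.write_to_file("output.jsonl")

with open("output.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        if record.get("error"):
            print(record["custom_id"], "failed:", record["error"])
            continue
        body = record["response"]["body"]
        print(record["custom_id"], body["choices"][0]["message"]["content"])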
Delete a file: DELETE /v1/files/{file_id}
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    delete_files = client.files.delete(
        file_id="file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022",
    )
    print(delete_files)

if __name__ == '__main__':
    main()
$ curl -X DELETE https://aiservice.cn-hangzhou.aliyuncs.com/v1/files/file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022 \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "file", "deleted": "true" }
Request parameters
Parameter | Type | Required | Description |
file_id | string | Yes | The ID of the file. |
Response parameters
Parameter | Type | Description |
id | string | The ID of the file that was deleted. |
object | string | The type of the deleted file. |
deleted | string | Indicates whether the file was deleted. |
File description
Parameter | Type | Description |
id | string | The ID of the file. |
object | string | The object type. The value is file. |
bytes | integer | The size of the file. Unit: bytes. |
created_at | integer | The time when the file was created. The value is a UNIX timestamp. Unit: seconds. |
filename | string | The name of the file. |
purpose | string | The expected usage of the file. |
Batch
Create a batch task: POST /v1/batches
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    create_batches = client.batches.create(
        endpoint="/v1/chat/completions",
        input_file_id="file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        completion_window="24h",
    )
    print(create_batches.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X POST https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{
        "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713",
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h"
    }'
Response example
{ "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "batch", "endpoint": "/v1/chat/completions", "errors": null, "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "completion_window": "24h", "status": "Creating", "output_file_id": null, "error_file_id": null, "created_at": 1742455213, "in_process_at": null, "expires_at": null, "FinalizingAt": null, "completed_at": null, "failed_at": null, "expired_at": null, "cancelling_at": null, "cancelled_at": null, "request_counts": { "total": 3, "completed": 0, "failed": 0 }, "metadata": null }
Request parameters
Parameter | Type | Required | Description |
input_file_id | string | Yes | The ID of the uploaded .jsonl input file for the batch task. The file can contain up to 50,000 requests and be up to 20 MB in size. |
endpoint | string | Yes | The endpoint to use for all requests in the batch task. Only /v1/chat/completions is supported. |
completion_window | string | Yes | The time range in which a batch of files is processed. Set the value to 24h. |
metadata | object | No | The custom metadata for the batch task. |
Response parameters
For more information, see Batch task description.
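A typical batch workflow is to upload the input file, create the task, poll its status, and download the output. The following is a hedged sketch; the Succeeded status value is taken from the response examples in this topic, and the other terminal status names are assumptions:
import time

# `client` is assumed to be configured as in the preceding request examples.
input_file = client.files.create(file=open("input.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    endpoint="/v1/chat/completions",
    input_file_id=input_file.id,
    completion_window="24h",
)

# Poll until the task reaches a terminal state. "Succeeded" appears in the
# response examples in this topic; "Failed", "Expired", and "Cancelled" are
# assumed terminal state names.
while batch.status not in ("Succeeded", "Failed", "Expired", "Cancelled"):
    time.sleep(30)
    batch = client.batches.retrieve(batch_id=batch.id)

# Download the output file on success.
if batch.status == "Succeeded" and batch.output_file_id:
    client.files.content(file_id=batch.output_file_id).write_to_file("output.jsonl")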
Query a list of batch tasks: GET /v1/batches
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    list_batches = client.batches.list(
        after="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
        limit=10,
    )
    print(list_batches.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X GET https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "object": "list", "data": [ { "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "batch", "endpoint": "/v1/chat/completions", "errors": null, "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "completion_window": "24h", "status": "Succeeded", "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022", "error_file_id": null, "created_at": 1742455213, "in_process_at": 1742455640, "expires_at": 1742455640, "FinalizingAt": 1742455889, "completed_at": 1742455889, "failed_at": null, "expired_at": null, "cancelling_at": null, "cancelled_at": null, "request_counts": { "total": 3, "completed": 3, "failed": 0 }, "metadata": null } ], "first_id": "", "last_id": "", "has_more": false }
Request parameters
Parameter | Type | Required | Description |
purpose | string | No | Returns only objects with the specific purpose. |
limit | string | No | The maximum number of batch tasks to return. |
order | string | No | The order in which the created_at parameter is used to sort the query results. Valid values: asc and desc (default). |
after | string | No | The cursor for pagination. This parameter defines an object ID that indicates a position in the query results. For example, if you send a request and receive 100 objects ending with obj_foo, you can set after to obj_foo in a subsequent call to obtain the query results after obj_foo. |
Response parameters
Parameter | Type | Description |
object | string | The object type. |
data | array | The list of batch tasks. For more information, see Batch task description. |
Query a batch task: GET /v1/batches/{batch_id}
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    retrieve_batches = client.batches.retrieve(
        batch_id="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
    )
    print(retrieve_batches.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X GET https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022 \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "batch", "endpoint": "/v1/chat/completions", "errors": null, "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "completion_window": "24h", "status": "Succeeded", "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022", "error_file_id": null, "created_at": 1742455213, "in_process_at": 1742455640, "expires_at": 1742455640, "FinalizingAt": 1742455889, "completed_at": 1742455889, "failed_at": null, "expired_at": null, "cancelling_at": null, "cancelled_at": null, "request_counts": { "total": 3, "completed": 3, "failed": 0 }, "metadata": null }
Request parameters
Parameter | Type | Required | Description |
batch_id | string | Yes | The ID of the batch task. |
Response parameters
For more information, see Batch task description.
Cancel a batch task: POST /v1/batches/{batch_id}/cancel
Cancels an in-progress batch task. The batch task remains in the cancelling state for up to 10 minutes before it changes to the cancelled state. Outputs that the task generated before the cancellation are available in the output file.
Request example
import os
from openai import OpenAI

def main():
    base_url = "https://aiservice.cn-hangzhou.aliyuncs.com/v1"
    judge_model_token = os.getenv("JUDGE_MODEL_TOKEN")
    client = OpenAI(
        api_key=judge_model_token,  # the SDK adds the "Bearer " prefix
        base_url=base_url
    )
    cancel_batches = client.batches.cancel(
        batch_id="batch_66f245a0-88d1-458c-8e1c-a819a5943022",
    )
    print(cancel_batches.model_dump_json(indent=4))

if __name__ == '__main__':
    main()
$ curl -X POST https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022/cancel \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "batch", "endpoint": "/v1/chat/completions", "errors": null, "input_file_id": "file-batch-EC043540BE1C7BE3F9F2F0A8F47D1713", "completion_window": "24h", "status": "Stopping", "output_file_id": "file-batch_output-66f245a0-88d1-458c-8e1c-a819a5943022", "error_file_id": null, "created_at": 1742455213, "in_process_at": 1742455640, "expires_at": 1742455640, "FinalizingAt": 1742455889, "completed_at": 1742455889, "failed_at": null, "expired_at": null, "cancelling_at": null, "cancelled_at": null, "request_counts": { "total": 3, "completed": 3, "failed": 0 }, "metadata": null }
Request parameters
Parameter | Type | Required | Description |
batch_id | string | Yes | The ID of the batch task. |
Response parameters
For more information, see Batch task description.
Delete a batch task: DELETE /v1/batches/{batch_id}
Request example
$ curl -X DELETE https://aiservice.cn-hangzhou.aliyuncs.com/v1/batches/batch_66f245a0-88d1-458c-8e1c-a819a5943022 \
    -H "Authorization: Bearer ${JUDGE_MODEL_TOKEN}"
Response example
{ "id": "batch_66f245a0-88d1-458c-8e1c-a819a5943022", "object": "batch", "deleted": "true" }
Request parameters
Parameter | Type | Required | Description |
batch_id | string | Yes | The ID of the batch task. |
Response parameters
Parameter | Type | Description |
id | string | The ID of the batch task. |
object | string | The type of the deleted batch task. |
deleted | string | Indicates whether the batch task was deleted. |
Batch task description
Parameter | Type | Description |
id | string | The ID of the batch task. |
object | string | The object type. The value is batch. |
endpoint | string | The data endpoint. |
errors | string | The error message. |
input_file_id | string | The ID of the input file. |
completion_window | string | The time range in which the batch task is processed. |
status | string | The running status. |
output_file_id | string | The ID of the output file. |
error_file_id | string | The ID of the error file. |
created_at | integer | The time when the batch task was created. |
in_process_at | integer | The time when the batch task started to be processed. |
expires_at | integer | The time when the batch task expires. |
finalizing_at | integer | The time when the batch task was finalized. |
completed_at | integer | The time when the batch task was completed. |
failed_at | integer | The time when the batch task failed. |
expired_at | integer | The actual expiration time. |
cancelling_at | integer | The time when the batch task started to be cancelled. |
cancelled_at | integer | The time when the batch task was cancelled. |
request_counts | object | The details of the request count. |
request_counts.total | integer | The total number of requests. |
request_counts.completed | integer | The number of successful requests. |
request_counts.failed | integer | The number of failed requests. |
metadata | object | The metadata. |