Model Studio offers a Batch interface that is compatible with OpenAI. It allows for the batch submission of tasks as files and supports asynchronous execution. This service processes large-scale data offline during non-peak hours and delivers results upon task completion or when the maximum wait time is reached, at a cost that is only 50% of real-time requests.
To use this feature in the console, see Batch Inference.
Prerequisites
You have activated Alibaba Cloud Model Studio and obtained an API Key.
We recommend that you set your API Key as an environment variable to reduce the risk of API Key leakage.
To use the OpenAI Python SDK, you must first run the following command to install it.
pip3 install -U openai
Supported models
Text generation models: qwen-max, qwen-plus, qwen-turbo
Billing
Batch calls cost 50% of real-time calls. For specific pricing, see Models.
Batch calling does not support discounts such as free quota or context cache.
Get started
Before starting a batch task, you can use batch-test-model to perform an end-to-end test, including validating input data, creating a task, querying task results, and downloading result files. Note:
The test file must meet the input file format requirements. It must not exceed 1 MB in size and must contain no more than 100 lines.
Concurrency limit: Up to 2 parallel tasks.
Resource usage: The test model will not perform inference, so it does not incur model inference fees.
Perform the following steps:
Prepare the test file
Download the sample file test_model.jsonl that contains request information. Make sure it is in the same directory as the Python script below.
Sample content: The model parameter is set to batch-test-model, and the url is set to /v1/chat/ds-test:
{"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Run the script
Execute this Python script
Edit file paths or other parameters according to your actual situation.
import os
import time
from pathlib import Path
from openai import OpenAI

# Initialize the client
client = OpenAI(
    # If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. To reduce the risk of API Key leakage, do not hardcode API Keys in production code.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # Alibaba Cloud Model Studio service base_url
)

def upload_file(file_path):
    print("Uploading JSONL file containing request information...")
    file_object = client.files.create(file=Path(file_path), purpose="batch")
    print(f"File uploaded successfully. File ID obtained: {file_object.id}\n")
    return file_object.id

def create_batch_job(input_file_id):
    print("Creating a Batch task based on the file ID...")
    # Note: The endpoint parameter must be consistent with the url field in the input file.
    # For the test model (batch-test-model) use /v1/chat/ds-test; for other models use /v1/chat/completions.
    batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
    print(f"Batch task created. Batch task ID obtained: {batch.id}\n")
    return batch.id

def check_job_status(batch_id):
    print("Checking Batch task status...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Batch task status: {batch.status}\n")
    return batch.status

def get_output_id(batch_id):
    print("Getting the output file ID for successfully executed requests in the Batch task...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Output file ID: {batch.output_file_id}\n")
    return batch.output_file_id

def get_error_id(batch_id):
    print("Getting the output file ID for failed requests in the Batch task...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Error file ID: {batch.error_file_id}\n")
    return batch.error_file_id

def download_results(output_file_id, output_file_path):
    print("Printing and downloading the successful request results of the Batch task...")
    content = client.files.content(output_file_id)
    # Print part of the content for testing
    print(f"First 1000 characters of the successful request results: {content.text[:1000]}...\n")
    # Save the result file locally
    content.write_to_file(output_file_path)
    print(f"Complete output results have been saved to {output_file_path}\n")

def download_errors(error_file_id, error_file_path):
    print("Printing and downloading the failed request information of the Batch task...")
    content = client.files.content(error_file_id)
    # Print part of the content for testing
    print(f"First 1000 characters of the failed request information: {content.text[:1000]}...\n")
    # Save the error information file locally
    content.write_to_file(error_file_path)
    print(f"Complete failed request information has been saved to {error_file_path}\n")

def main():
    # File paths
    input_file_path = "test_model.jsonl"  # Replace with your input file path
    output_file_path = "result.jsonl"     # Replace with your output file path
    error_file_path = "error.jsonl"       # Replace with your error file path
    try:
        # Step 1: Upload the JSONL file containing request information to get the input file ID
        input_file_id = upload_file(input_file_path)
        # Step 2: Create a Batch task based on the input file ID
        batch_id = create_batch_job(input_file_id)
        # Step 3: Check the Batch task status until it reaches a terminal state
        status = ""
        while status not in ["completed", "failed", "expired", "cancelled"]:
            status = check_job_status(batch_id)
            print("Waiting for task completion...")
            time.sleep(10)  # Wait 10 seconds before checking the status again
        # If the task fails, print the error message and exit
        if status == "failed":
            batch = client.batches.retrieve(batch_id)
            print(f"Batch task failed. Error message: {batch.errors}\n")
            print("See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
            return
        # Step 4: Download results. If the output file ID is not empty, print the first 1000 characters
        # of the successful results and save them to the local output file; if the error file ID is not
        # empty, print the first 1000 characters of the failed request information and save it to the
        # local error file.
        output_file_id = get_output_id(batch_id)
        if output_file_id:
            download_results(output_file_id, output_file_path)
        error_file_id = get_error_id(batch_id)
        if error_file_id:
            download_errors(error_file_id, error_file_path)
            print("See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("See error code documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")

if __name__ == "__main__":
    main()
Verify the test results
The task status is completed, and result.jsonl contains the fixed response {"content":"This is a test result."}:
{"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
{"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
In case of errors, see Error messages for solutions.
After the test, follow these steps to execute a batch task.
Prepare an input file according to the input file format requirements. In the file, set the model parameter to a supported model, and set the url to: /v1/chat/completions
Replace the endpoint in the Python script above to match the url in the input file.
Run the script and wait for the task to complete. If the task succeeds, an output result file result.jsonl is generated in the same directory. If the task fails, the program exits and prints the error message.
If an error file ID is returned, an error file error.jsonl is generated in the same directory for troubleshooting. Exceptions that occur during the process are caught and their error messages printed.
File format
Input file format
The input file for a batch task is a JSONL file with the following requirements:
Each line contains a request in the JSON format.
A single batch task can contain up to 50,000 requests.
The batch file's maximum size is 500 MB.
The maximum size for an individual line within the file is 1 MB.
Each line's content must comply with the context length limits specific to each model.
Set url in the file and endpoint in the code to /v1/chat/completions.
Single-line request example:
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Multi-line request example:
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hi, how can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
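The requirements above can be checked locally before uploading a file. The following is a minimal sketch; validate_batch_lines and its constants are illustrative helpers, not part of any SDK, and the full 500 MB file-size check is omitted for brevity:

```python
import json

MAX_REQUESTS = 50_000          # maximum requests per batch task
MAX_LINE_BYTES = 1024 * 1024   # maximum size of a single line (1 MB)

def validate_batch_lines(lines):
    """Check JSONL request lines against the documented input-file limits."""
    if len(lines) > MAX_REQUESTS:
        raise ValueError(f"too many requests: {len(lines)} > {MAX_REQUESTS}")
    seen_ids, models = set(), set()
    for i, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > MAX_LINE_BYTES:
            raise ValueError(f"line {i} exceeds 1 MB")
        req = json.loads(line)  # raises ValueError on malformed JSON
        for field in ("custom_id", "method", "url", "body"):
            if field not in req:
                raise ValueError(f"line {i} is missing '{field}'")
        if req["custom_id"] in seen_ids:
            raise ValueError(f"duplicate custom_id on line {i}")
        seen_ids.add(req["custom_id"])
        models.add(req["body"].get("model"))
    if len(models) > 1:
        raise ValueError(f"all requests must use the same model, found: {models}")
    return len(lines)

line = '{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"user","content":"What is 2+2?"}]}}'
print(validate_batch_lines([line]))  # 1
```

Running the check locally fails fast on malformed lines instead of waiting for the uploaded task to be rejected.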
Request parameters
Field | Type | Required | Description |
custom_id | String | Yes | The user-defined request ID. Each line represents a request with a unique custom_id. |
method | String | Yes | The request method. Currently, only POST is supported. |
url | String | Yes | The base URL. Must be consistent with the endpoint field when creating a Batch task. |
body | Object | Yes | The request body. |
body.model | String | Yes | The model used for this batch task. Important: All requests in a task must use the same model. |
body.messages | Array | Yes | The messages array. |
Convert CSV to JSONL
If you have a CSV file where the first column is custom_id and the second column is content, you can quickly create a JSONL file that meets the requirements using the Python code below. The CSV file must be placed in the same directory as the Python script.
You can also use the template file provided in this topic. The specific steps are as follows:
Download the template file and place it in the same directory as the Python script below;
The CSV template file has the first column as request ID (custom_id) and the second column as content. You can paste your queries into this file.
After running the Python script below, a JSONL file named input_demo.jsonl that meets the file format requirements will be generated in the same directory.
Edit file paths or other parameters according to your actual situation.
import csv
import json

def messages_builder_example(content):
    messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": content}]
    return messages

with open("input_demo.csv", "r") as fin:
    with open("input_demo.jsonl", "w", encoding="utf-8") as fout:
        csvreader = csv.reader(fin)
        for row in csvreader:
            body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
            # Use /v1/chat/completions as the url
            request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(",", ":"), ensure_ascii=False) + "\n")
Output file format
The output is a JSONL file with one JSON object per line, each corresponding to one request result.
Sample response
Single-line result example:
{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
Multi-line result example:
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course I can help. Whether you need information queries, learning materials, methods to solve problems, or any other assistance, I'm here to support you. Please tell me what kind of help you need?"}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
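Because each result line carries the originating custom_id, outputs can be joined back to the inputs that produced them. A minimal sketch; parse_batch_output is an illustrative helper, not an API call:

```python
import json

def parse_batch_output(jsonl_text):
    """Index batch results by custom_id, separating successes from failures."""
    ok, failed = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        result = json.loads(line)
        custom_id = result["custom_id"]
        response = result.get("response")
        if response and response.get("status_code") == 200:
            # Extract the assistant message content from the first choice
            ok[custom_id] = response["body"]["choices"][0]["message"]["content"]
        else:
            failed[custom_id] = result.get("error")
    return ok, failed

sample = '{"custom_id":"2","response":{"status_code":200,"body":{"choices":[{"message":{"content":"2+2 equals 4."}}]}},"error":null}'
ok, failed = parse_batch_output(sample)
print(ok["2"])  # 2+2 equals 4.
```

Indexing by custom_id rather than by line position keeps the mapping correct regardless of the order in which results appear in the output file.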
Response parameters
Field | Type | Required | Description |
id | String | Yes | The request ID. |
custom_id | String | Yes | The user-defined request ID. |
response | Object | No | The request result. |
error | Object | No | The error response result. |
error.code | String | No | The error code. |
error.message | String | No | The error message. |
completion_tokens | Integer | No | The number of tokens in the completion. |
prompt_tokens | Integer | No | The number of tokens in the prompt. |
model | String | No | The model used in this task. |
Convert JSONL to CSV
Compared to JSONL, CSV files usually contain only the necessary data values without additional key names or other metadata, making them suitable for automated scripts and batch tasks. If you need to convert a batch output JSONL file into a CSV file, you can use the following Python code.
Ensure that result.jsonl is in the same directory as the Python script below. After running the code, a CSV file named result.csv will be generated in the same directory.
If you need to adjust file paths or other parameters, please modify the code according to your actual situation.
import json
import csv
columns = ["custom_id",
"model",
"request_id",
"status_code",
"error_code",
"error_message",
"created",
"content",
"usage"]
def dict_get_string(dict_obj, path):
    obj = dict_obj
    try:
        for element in path:
            obj = obj[element]
        return obj
    except (KeyError, IndexError, TypeError):
        return None

with open("result.jsonl", "r") as fin:
    # newline="" prevents the csv module from writing blank rows on Windows
    with open("result.csv", "w", newline="", encoding="utf-8") as fout:
        rows = [columns]
        for line in fin:
            request_result = json.loads(line)
            row = [dict_get_string(request_result, ["custom_id"]),
                   dict_get_string(request_result, ["response", "body", "model"]),
                   dict_get_string(request_result, ["response", "request_id"]),
                   dict_get_string(request_result, ["response", "status_code"]),
                   dict_get_string(request_result, ["error", "error_code"]),
                   dict_get_string(request_result, ["error", "error_message"]),
                   dict_get_string(request_result, ["response", "body", "created"]),
                   dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
                   dict_get_string(request_result, ["response", "body", "usage"])]
            rows.append(row)
        writer = csv.writer(fout)
        writer.writerows(rows)
If a CSV file contains Chinese characters and appears garbled when opened in Excel, use a text editor (such as Sublime Text) to convert the file's encoding to GBK and then open it in Excel. Alternatively, create a new Excel file and specify UTF-8 as the encoding when importing the data.
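Alternatively, writing the CSV with Python's utf-8-sig codec prepends a byte-order mark that Excel uses to detect UTF-8, which usually avoids manual re-encoding. A minimal sketch; the file name result_excel.csv and its contents are illustrative:

```python
import csv

# "utf-8-sig" prepends a UTF-8 BOM so Excel recognizes the encoding automatically
with open("result_excel.csv", "w", newline="", encoding="utf-8-sig") as fout:
    writer = csv.writer(fout)
    writer.writerow(["custom_id", "content"])
    writer.writerow(["1", "2+2 equals 4."])

# The file now starts with the three-byte BOM EF BB BF
with open("result_excel.csv", "rb") as f:
    print(f.read(3) == b"\xef\xbb\xbf")  # True
```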
Detailed process
1. Prepare and upload files
Before creating a Batch task, upload a JSONL file that meets the input file format requirements through the file upload interface below, setting purpose to batch, and obtain the file_id.
You can upload a single file of up to 500 MB. The Model Studio storage under each Alibaba Cloud account supports up to 10,000 files, with a total size limit of 100 GB. The files currently have no expiration date.
OpenAI Python SDK
Sample request
import os
from pathlib import Path
from openai import OpenAI
client = OpenAI(
# If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
# test.jsonl is a local example file, purpose must be batch
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")
print(file_object.model_dump_json())
Content of test.jsonl:
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Sample response
{
"id": "file-batch-xxx",
"bytes": 437,
"created_at": 1742304153,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
curl
Sample request
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'
Content of test.jsonl:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}
Sample response
{
"id": "file-batch-xxx",
"bytes": 231,
"created_at": 1729065815,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
2. Create a batch task
Use the input_file_id parameter returned in 1. Prepare and upload files to create a batch task.
Rate limit: For each Alibaba Cloud account, 100 requests per minute, with a maximum of 100 running tasks (including unfinished tasks). If you exceed the maximum number of tasks, you must wait for tasks to complete before creating new ones.
OpenAI Python SDK
Sample request
import os
from openai import OpenAI
client = OpenAI(
# If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
batch = client.batches.create(
input_file_id="file-batch-xxx", # File ID returned from upload
endpoint="/v1/chat/completions", # For test model batch-test-model set to /v1/chat/ds-test, for other models set to /v1/chat/completions
completion_window="24h",
metadata={'ds_name':"Task Name",'ds_description':'Task Description'} # metadata, optional field, used to create task name and description
)
print(batch)
curl
Sample request
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-batch-xxx",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata":{"ds_name":"Task Name","ds_description":"Task Description"}
}'
Replace the value of input_file_id with the actual value.
Request parameters
Field | Type | Passing method | Required | Description |
input_file_id | String | Body | Yes | The ID of the input file for the batch task. Use the file ID returned by the prepare and upload files interface, such as file-batch-xxx. |
endpoint | String | Body | Yes | The path, which must be consistent with the url field in the input file. |
completion_window | String | Body | Yes | The wait time, from 24 hours to 336 hours, only integers are supported. Units: "h" (hour) and "d" (day), such as "24h" or "14d". |
metadata | Map | Body | No | The task extension metadata, additional information in key-value pairs. |
metadata.ds_name | String | Body | No | The task name, up to 20 characters. If this field is defined multiple times, the last value passed is used. |
metadata.ds_description | String | Body | No | The task description, up to 200 characters. If this field is defined multiple times, the last value passed is used. |
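The completion_window constraint (an integer number of hours or days between 24h and 336h) can be checked client-side before creating a task. A small illustrative helper, not part of any SDK:

```python
import re

def completion_window_hours(value):
    """Parse a completion_window such as '24h' or '14d' and check the 24h-336h range."""
    m = re.fullmatch(r"(\d+)([hd])", value)
    if not m:
        raise ValueError(f"invalid completion_window: {value!r}")
    # Convert days to hours; "h" values are taken as-is
    hours = int(m.group(1)) * (24 if m.group(2) == "d" else 1)
    if not 24 <= hours <= 336:
        raise ValueError(f"completion_window must be between 24h and 336h, got {hours}h")
    return hours

print(completion_window_hours("24h"))  # 24
print(completion_window_hours("14d"))  # 336
```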
Sample response
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "validating",
"output_file_id": null,
"error_file_id": null,
"created_at": 1742367779,
"in_progress_at": null,
"expires_at": null,
"finalizing_at": null,
"completed_at": null,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
},
"metadata": {
"ds_name": "Task Name",
"ds_description": "Task Description"
}
}
Response parameters
Field | Type | Description |
id | String | The batch task ID. |
object | String | The object type, fixed to batch. |
endpoint | String | The endpoint. |
errors | Map | The error message. |
input_file_id | String | The input file ID. |
completion_window | String | The wait time, from 24 hours to 336 hours, only integers are supported. Units: "h" (hour) and "d" (day), such as "24h" or "14d". |
status | String | Task status, including: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled. |
output_file_id | String | The output file ID for successful requests. |
error_file_id | String | The output file ID for failed requests. |
created_at | Integer | The timestamp (in seconds) when the task was created. |
in_progress_at | Integer | The timestamp (in seconds) when the task started running. |
expires_at | Integer | The timestamp (in seconds) when the task started to expire. |
finalizing_at | Integer | The timestamp (in seconds) when the task started to finalize. |
completed_at | Integer | The timestamp (in seconds) when the task was completed. |
failed_at | Integer | The timestamp (in seconds) when the task failed. |
expired_at | Integer | The timestamp (in seconds) when the task expired. |
cancelling_at | Integer | The timestamp (in seconds) when the task was set to cancelling. |
cancelled_at | Integer | The timestamp (in seconds) when the task was cancelled. |
request_counts | Map | The number of requests in different statuses. |
metadata | Map | The metadata information, in key-value pairs. |
metadata.ds_name | String | The name of the current task. |
metadata.ds_description | String | The description of the current task. |
3. Query and manage a batch task
Query task details
Query the details of a batch task by providing the task ID obtained from 2. Create a batch task. Only batch tasks created within 30 days can be queried.
Rate limit: 300 requests per minute per Alibaba Cloud account. Because batch tasks take some time to complete, you can query the status once per minute after creating a task.
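Following the once-per-minute guidance, status polling can be factored into a small helper that returns once the task reaches a terminal state. A sketch; wait_for_batch is an illustrative name, and client is an OpenAI client configured as in the samples below:

```python
import time

# Statuses after which the task will not change again
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(client, batch_id, poll_interval=60):
    """Poll a batch task until it reaches a terminal status, then return it."""
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        time.sleep(poll_interval)  # query roughly once per minute
```

A 60-second interval keeps well under the 300-requests-per-minute limit even with many tasks polling concurrently.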
OpenAI Python SDK
Sample request
import os
from openai import OpenAI
client = OpenAI(
# If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
batch = client.batches.retrieve("batch_id") # Replace batch_id with the Batch task ID
print(batch)
curl
Sample request
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.
Request parameters
Field | Type | Passing method | Required | Description |
batch_id | String | Path | Yes | The ID of the batch task to be queried, returned from 2. Create a batch task. The ID starts with "batch", such as "batch_xxx". |
Sample response
See the Sample response for creating a batch task.
Response parameters
See the response parameters for creating a batch task.
Use the returned output_file_id and error_file_id to download result files.
Query task list
Use the batches.list() method to query a list of batch tasks and retrieve the complete list page by page. Pass the last batch task ID from the previous query result as the after parameter to get the next page of data. Use the limit parameter to limit the number of tasks returned.
Rate limit: 100 requests per minute per Alibaba Cloud account.
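The pagination loop can be wrapped in a helper that follows the cursor until the list is exhausted. A sketch assuming the page object exposes the data and has_more fields documented below; list_all_batches is an illustrative name:

```python
def list_all_batches(client, page_size=100):
    """Collect all batch tasks by paging with the `after` cursor."""
    tasks, after = [], None
    while True:
        kwargs = {"limit": page_size}
        if after is not None:
            kwargs["after"] = after
        page = client.batches.list(**kwargs)
        tasks.extend(page.data)
        if not page.has_more or not page.data:
            break
        after = page.data[-1].id  # cursor: last task ID on this page
    return tasks
```

Stopping on an empty page as well as on has_more guards against an extra request when the final page is exactly full.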
OpenAI Python SDK
Sample request
import os
from openai import OpenAI
client = OpenAI(
# If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
batches = client.batches.list(after="batch_id", limit=20) # Replace batch_id with the Batch task ID
print(batches)
curl
Sample request
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?limit=20&after=batch_id' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id in after=batch_id with the actual value, and set limit to the desired number of tasks to return.
Request parameters
Field | Type | Passing method | Required | Description |
after | String | Query | No | The cursor for pagination. Set after to the last batch task ID (last_id) returned by the previous query to retrieve the next page. For example, if a query returns 20 tasks and the last task ID is batch_xxx, set after=batch_xxx in the next query to get the following page. |
limit | Integer | Query | No | The number of tasks returned per query. Valid range: [1,100]. Default value: 20. |
Sample response
{
"object": "list",
"data": [
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "completed",
"output_file_id": "file-batch_output-xxx",
"error_file_id": null,
"created_at": 1722234109,
"in_progress_at": 1722234109,
"expires_at": null,
"finalizing_at": 1722234165,
"completed_at": 1722234165,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 100,
"completed": 95,
"failed": 5
},
"metadata": {}
},
{ ... }
],
"first_id": "batch_xxx",
"last_id": "batch_xxx",
"has_more": true
}
Response parameters
Field | Type | Description |
object | String | The object type, fixed to list. |
data | Array | Batch task objects, same as the response parameters of 2. Create a batch task. |
first_id | String | The first task ID on the current page. |
last_id | String | The last task ID on the current page. |
has_more | Boolean | Indicates whether the current page is followed by another page. |
Cancel a task
Cancel a specific task by providing its task ID returned from 2. Create a batch task.
Rate limit: 100 requests per minute per Alibaba Cloud account.
OpenAI Python SDK
Sample request
import os
from openai import OpenAI
client = OpenAI(
# If you haven't configured environment variables, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it's not recommended to hardcode API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
batch = client.batches.cancel("batch_id") # Replace batch_id with the Batch task ID
print(batch)
curl
Sample request
curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.
Request parameters
Field | Type | Passing method | Required | Description |
batch_id | String | Path | Yes | The ID of the task to cancel, starting with "batch", such as "batch_xxx". |
Sample response
See the sample response for creating a Batch task.
Response parameters
See the response parameters for creating a Batch task.
4. Download result file
After a task is completed, you can download the result file.
To download the result file, you need a file_id, which is the output_file_id returned from querying task details or querying the task list. Only files whose file_id starts with file-batch_output can be downloaded.
OpenAI Python SDK
Use the content method to obtain the content of the result file, and the write_to_file method to save it locally.
Sample request
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the result file content
print(content.text)
# Save the result file to local
content.write_to_file("result.jsonl")
Sample response
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. Whether you need information queries, learning materials, methods to solve problems, or any other assistance, I am here to provide support. Please tell me what kind of help you need?"}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
curl
Use the GET method and specify the file_id in the URL to download the result file.
Sample request
curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonl
Request parameters
Field | Type | Passing method | Required | Description |
file_id | string | Path | Yes | The ID of the file to be downloaded, which is the output_file_id returned from querying task details or querying the task list. |
Sample response
The JSONL file containing the batch task results, see Output file format.
Extended features
Filter and query task list
OpenAI Python SDK
Sample request
import os
from openai import OpenAI
client = OpenAI(
# If environment variables are not configured, you can replace the line below with api_key="sk-xxx" using your Alibaba Cloud Model Studio API Key. However, it is not recommended to hard-code API Keys directly in your code in production environments to reduce the risk of API Key leakage.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # Alibaba Cloud Model Studio service base_url
)
batches = client.batches.list(after="batch_xxx", limit=2,extra_query={'ds_name':'Task Name','input_file_ids':'file-batch-xxx,file-batch-xxx','status':'completed,expired','create_after':'20250304000000','create_before':'20250306123000'})
print(batches)
curl
Sample request
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id in after=batch_id with the actual value. Set limit to the number of tasks to return. Fill ds_name with part of the task name. The value of input_file_ids can include multiple file IDs. Fill status with one or more Batch task statuses. Fill create_after and create_before with time points.
Request parameters
Field | Type | Passing method | Required | Description |
ds_name | String | Query | No | Filter tasks by name using fuzzy matching. Enter any consecutive character segment to match task names containing that content. For example, entering "Batch" can match "Batch task", "Batch task_20240319", and others. |
input_file_ids | String | Query | No | Filter up to 20 file IDs, separated by commas. The file IDs are returned from Prepare and upload files. |
status | String | Query | No | Filter multiple statuses, separated by commas, including: validating, failed, in_progress, finalizing, completed, expired, cancelling, cancelled. |
create_after | String | Query | No | Filter tasks created after this time point, format: yyyyMMddHHmmss, such as 20250304000000. |
create_before | String | Query | No | Filter tasks created before this time point, format: yyyyMMddHHmmss, such as 20250306123000. |
Sample response
See the sample response in Query task list.
Response parameters
See the response parameters in Query task list.
Error codes
If the call failed and an error message is returned, see Error messages.
FAQ
Do rate limits apply to batch requests?
A: No. Only real-time requests are subject to RPM (requests per minute) limits; batch calls are not.
Do I need to place an order to use batch calls?
A: No, you do not need to place a separate order. You pay for use of the batch interface directly on a pay-as-you-go basis.
How does the backend process submitted batch requests? Are they executed in the order of submission?
A: No, it is not a queuing mechanism but a scheduling mechanism. Batch tasks are scheduled and executed based on resource availability.
How long does it take for submitted batch requests to finish?
A: The running time of a batch task is determined by the system's allocation of resources. When resources are constrained, tasks might not finish within the specified maximum wait time.
Therefore, for strict real-time requirements, we recommend real-time calls; for processing large-scale data with flexible timing, we recommend batch calls.