Alibaba Cloud Model Studio provides an OpenAI-compatible Batch API. You can use this API to submit tasks in batches using files for asynchronous execution. This lets you process large-scale data offline during non-peak hours. Results are returned after the task is complete or the maximum wait time is reached. Batch calls cost 50% of the price of real-time calls.
For information about how to perform this operation in the console, see Batch inference.
Prerequisites
Activate Alibaba Cloud Model Studio and obtain an API key.
We recommend that you set the API key as an environment variable to reduce the risk of API key leakage.
If you use the OpenAI Python SDK to call the Batch API, run the following command to install the latest version of the OpenAI SDK.
pip3 install -U openai
Scope
Supported region: International (Singapore)
Supported models: qwen-max, qwen-plus, and qwen-turbo
Billing
Unit price: The unit price for the input and output tokens of all successful requests is 50% of the real-time inference price for the corresponding model. For more information, see Model list.
Billing scope:
Only successfully executed requests in a task are billed.
No fees are charged for file parsing failures, task execution failures, or row-level request errors.
For canceled tasks, requests that were successfully completed before the cancellation are billed.
Batch inference is a separate billable item and does not support discounts, such as subscriptions (savings plans) and the free quota, or features such as context cache.
Getting started
Before you start a formal batch task, you can use the test model batch-test-model to perform a complete end-to-end test. This test includes verifying input data, creating a task, querying the task, and downloading the result file. Note the following:
The test file must meet the requirements for an input file. The file size cannot exceed 1 MB, and the number of lines cannot exceed 100.
Concurrency limit: The maximum number of parallel tasks is 2.
Resource usage: The test model does not perform the inference process, so no model inference fees are incurred.
The procedure is as follows:
Prepare a test file
Download the sample file test_model.jsonl, which contains request information, to your local machine. Make sure that the file is in the same directory as the Python script that is described later in this topic.
Sample content: The model parameter is set to batch-test-model, and the url parameter is set to /v1/chat/ds-test.
{"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Run the script
Execute the following Python script.
You can modify the code as needed to adjust the file path or other parameters.
import os
import time
from pathlib import Path
from openai import OpenAI

# Initialize the client.
client = OpenAI(
    # If the environment variable is not set, you can replace the following line with api_key="sk-xxx".
    # However, we do not recommend hard-coding the API key in your code in a production environment to reduce the risk of leakage.
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1"  # The base_url of Alibaba Cloud Model Studio.
)

def upload_file(file_path):
    print("Uploading the JSONL file that contains the request information...")
    file_object = client.files.create(file=Path(file_path), purpose="batch")
    print(f"File uploaded successfully. File ID: {file_object.id}\n")
    return file_object.id

def create_batch_job(input_file_id):
    print("Creating a batch task based on the file ID...")
    # Note: The value of the endpoint parameter must be the same as the value of the url field in the input file.
    # For the test model (batch-test-model), set this to /v1/chat/ds-test. For other models, set this to /v1/chat/completions.
    batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
    print(f"Batch task created. Batch task ID: {batch.id}\n")
    return batch.id

def check_job_status(batch_id):
    print("Checking the batch task status...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Batch task status: {batch.status}\n")
    return batch.status

def get_output_id(batch_id):
    print("Getting the output file ID for successful requests in the batch task...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Output file ID: {batch.output_file_id}\n")
    return batch.output_file_id

def get_error_id(batch_id):
    print("Getting the error file ID for failed requests in the batch task...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Error file ID: {batch.error_file_id}\n")
    return batch.error_file_id

def download_results(output_file_id, output_file_path):
    print("Printing and downloading the results of successful requests in the batch task...")
    content = client.files.content(output_file_id)
    # Print some of the content for testing.
    print(f"Printing the first 1,000 characters of the successful results: {content.text[:1000]}...\n")
    # Save the result file to your local machine.
    content.write_to_file(output_file_path)
    print("The complete output results have been saved to the local output file result.jsonl\n")

def download_errors(error_file_id, error_file_path):
    print("Printing and downloading the information of failed requests in the batch task...")
    content = client.files.content(error_file_id)
    # Print some of the content for testing.
    print(f"Printing the first 1,000 characters of the failure information: {content.text[:1000]}...\n")
    # Save the error information file to your local machine.
    content.write_to_file(error_file_path)
    print("The complete failure information has been saved to the local error file error.jsonl\n")

def main():
    # File paths
    input_file_path = "test_model.jsonl"  # Replace with your input file path.
    output_file_path = "result.jsonl"  # Replace with your output file path.
    error_file_path = "error.jsonl"  # Replace with your error file path.
    try:
        # Step 1: Upload the JSONL file that contains the request information to get the input file ID.
        input_file_id = upload_file(input_file_path)
        # Step 2: Create a batch task based on the input file ID.
        batch_id = create_batch_job(input_file_id)
        # Step 3: Check the batch task status until it is complete.
        status = ""
        while status not in ["completed", "failed", "expired", "cancelled"]:
            status = check_job_status(batch_id)
            print("Waiting for the task to complete...")
            time.sleep(10)  # Wait 10 seconds and query the status again.
        # If the task fails, print the error message and exit.
        if status == "failed":
            batch = client.batches.retrieve(batch_id)
            print(f"Batch task failed. Error message: {batch.errors}\n")
            print("For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
            return
        # Step 4: Download the results. If an output file ID exists, print the first 1,000 characters
        # of the successful results and download the complete results to a local output file.
        output_file_id = get_output_id(batch_id)
        if output_file_id:
            download_results(output_file_id, output_file_path)
        # If an error file ID exists, print the first 1,000 characters of the failure information
        # and download the complete information to a local error file.
        error_file_id = get_error_id(batch_id)
        if error_file_id:
            download_errors(error_file_id, error_file_path)
            print("For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("For more information, see Error codes: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")

if __name__ == "__main__":
    main()
Verify the test results
The task status is completed.
The result file result.jsonl contains the fixed response {"content":"This is a test result."}.
{"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
{"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
If an error occurs, see Error messages for a solution.
After you verify the test, perform the following steps to run a formal batch task:
Prepare an input file that meets the requirements described in Input file. In the file, set the model parameter to a supported model and set the url parameter to /v1/chat/completions.
Replace the endpoint in the Python script.
ImportantMake sure that the endpoint in the script is the same as the url parameter in the input file.
Run the script and wait for the task to complete. If the task is successful, an output file named result.jsonl is generated in the same directory. If the task fails, the program exits and prints an error message.
If an error file ID exists, an error file named error.jsonl is generated in the same directory for you to review. Exceptions that occur during the process are caught and an error message is printed.
Data file format
Input file
Before you create a batch inference task, you must prepare a file that meets the following specifications:
Format: JSONL with UTF-8 encoding. Each line must be an independent JSON object.
Size limits: A single file can contain up to 50,000 requests and be up to 500 MB in size.
Line limit: Each JSON object can be up to 6 MB and cannot exceed the context length of the model.
Consistency: All requests in the same file must use the same model and, if applicable, the same thinking mode.
Unique identifier: Each request must contain a custom_id field that is unique within the file. This field is used for result matching.
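Because malformed lines and duplicate custom_id values cause row-level failures, it can help to check a file locally before uploading it. The following sketch is a hypothetical pre-flight check, not part of the official SDK; the limits are the ones listed above.

```python
import json
import os

def validate_batch_file(path):
    """Check a JSONL batch input file against the documented limits.
    Returns a list of human-readable problems; empty means the file passes."""
    errors = []
    # File-level limit: up to 500 MB per file.
    if os.path.getsize(path) > 500 * 1024 * 1024:
        errors.append("file exceeds 500 MB")
    seen_ids = set()
    models = set()
    with open(path, "r", encoding="utf-8") as f:
        for n, line in enumerate(f, start=1):
            # Line-level limit: each JSON object can be up to 6 MB.
            if len(line.encode("utf-8")) > 6 * 1024 * 1024:
                errors.append(f"line {n}: exceeds 6 MB")
            try:
                req = json.loads(line)
            except json.JSONDecodeError:
                errors.append(f"line {n}: not valid JSON")
                continue
            # Each request needs a custom_id that is unique within the file.
            cid = req.get("custom_id")
            if not cid or cid in seen_ids:
                errors.append(f"line {n}: missing or duplicate custom_id")
            seen_ids.add(cid)
            models.add(req.get("body", {}).get("model"))
    # File-level limit: up to 50,000 requests; one model per file.
    if len(seen_ids) > 50000:
        errors.append("more than 50,000 requests")
    if len(models) > 1:
        errors.append(f"multiple models in one file: {models}")
    return errors
```

Calling validate_batch_file("test.jsonl") returns an empty list when the file passes these checks. This does not replace server-side validation, but it catches the most common input errors before you spend an upload.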
Request examples
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-max","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
JSONL batch generation tool
You can use the tool below to quickly generate JSONL files. To prevent performance issues, process no more than 10,000 lines at a time. For larger amounts of data, process them in batches.
Request parameters
Field | Type | Required | Description |
custom_id | String | Yes | A user-defined request ID. Each line represents a request, and each request has a unique custom_id. This field is used for result matching. |
method | String | Yes | The request method. Currently, only POST is supported. |
url | String | Yes | The URL associated with the API. This must be the same as the endpoint field specified when you create a batch task. For the test model batch-test-model, set this to /v1/chat/ds-test. For other models, set this to /v1/chat/completions. |
body | Object | Yes | The request body for the model call. It includes all parameters required to call the model, such as model and messages. The parameters in the request body are the same as those supported by the real-time inference API. For more information about the parameters, see OpenAI compatible API. |
body.model | String | Yes | The model used for this batch job. Important For a single job, all batch requests must use the same model. The thinking mode, if supported, must also be consistent across all requests. |
body.messages | Array | Yes | A list of messages. |
Convert a CSV file to a JSONL file
If you have a CSV file with a request ID (`custom_id`) in the first column and content in the second, you can use the following Python script to quickly create a JSONL file that is formatted for batch tasks. The CSV file must be in the same folder as the Python script.
Alternatively, you can use the template file that is provided in this topic:
Download the template file and place it in the same folder as the Python script.
The template is a CSV file. The first column is for the request ID (`custom_id`) and the second column is for the content. Paste your content into this file.
Running the following Python script creates a JSONL file named input_demo.jsonl in the same folder. This file is formatted for batch tasks.
You can modify the file path or other parameters in the code as needed.
import csv
import json
def messages_builder_example(content):
    messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": content}]
    return messages
with open("input_demo.csv", "r") as fin:
    with open("input_demo.jsonl", 'w', encoding='utf-8') as fout:
        csvreader = csv.reader(fin)
        for row in csvreader:
            body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
            # The default value is /v1/chat/completions.
            request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(',', ':'), ensure_ascii=False) + "\n")
Output file
The output is a JSONL file. Each line is a JSON object that corresponds to a request result.
Response examples
Example of a single-line response:
{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
Example of a multi-line response:
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to help you with information queries, learning materials, problem-solving methods, or anything else you need. Just tell me how I can assist you."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
Response parameters
Field | Type | Required | Description |
id | String | Yes | Request ID. |
custom_id | String | Yes | User-defined request ID. |
response | Object | No | Request result. |
error | Object | No | Error response. |
error.code | String | No | Error code. |
error.message | String | No | Error message. |
completion_tokens | Integer | No | Number of tokens for the generated completion. |
prompt_tokens | Integer | No | Number of tokens in the prompt. |
model | String | No | Model used for inference in the task. |
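After you download an output file, the fields above can be used to match results back to your requests. The following sketch is a hypothetical post-processing helper, not part of the official SDK; it splits the JSONL text into successes and failures keyed by custom_id, using the response format shown above.

```python
import json

def split_results(jsonl_text):
    """Split batch output JSONL into (successes, failures), keyed by custom_id.
    Successes map custom_id to the generated text; failures map it to the
    error object from the result line."""
    ok, failed = {}, {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        result = json.loads(line)
        cid = result["custom_id"]
        response = result.get("response")
        if response and response.get("status_code") == 200:
            # Extract the generated text from the first choice.
            ok[cid] = response["body"]["choices"][0]["message"]["content"]
        else:
            failed[cid] = result.get("error")
    return ok, failed
```

For example, passing the downloaded content.text to split_results lets you retry only the custom_id values in the failures dictionary.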
Convert a JSONL file to a CSV file
CSV files are ideal for automated scripts and batch tasks because they contain only data values, without the extra keys or metadata that are found in JSONL files. You can use the following Python script to convert the JSONL output file from a batch task to a CSV file.
Place the result.jsonl file in the same folder as the Python script. Running the script creates a CSV file named result.csv.
You can modify the code to adjust the file path or other parameters as needed.
import json
import csv
columns = ["custom_id",
"model",
"request_id",
"status_code",
"error_code",
"error_message",
"created",
"content",
"usage"]
def dict_get_string(dict_obj, path):
    obj = dict_obj
    try:
        for element in path:
            obj = obj[element]
        return obj
    except (KeyError, IndexError, TypeError):
        # Return None if any element of the path is missing.
        return None
with open("result.jsonl", "r") as fin:
    with open("result.csv", 'w', encoding='utf-8') as fout:
        rows = [columns]
        for line in fin:
            request_result = json.loads(line)
            row = [dict_get_string(request_result, ["custom_id"]),
                   dict_get_string(request_result, ["response", "body", "model"]),
                   dict_get_string(request_result, ["response", "request_id"]),
                   dict_get_string(request_result, ["response", "status_code"]),
                   dict_get_string(request_result, ["error", "error_code"]),
                   dict_get_string(request_result, ["error", "error_message"]),
                   dict_get_string(request_result, ["response", "body", "created"]),
                   dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
                   dict_get_string(request_result, ["response", "body", "usage"])]
            rows.append(row)
        writer = csv.writer(fout)
        writer.writerows(rows)
If a CSV file displays garbled text when opened in Excel, you can use a text editor, such as Sublime Text, to change the encoding format of the file. Then, you can open the file in Excel. Alternatively, you can create a new file in Excel and specify the correct encoding format, such as UTF-8, when you import the data.
Procedure
1. Prepare and upload a file
Before you create a batch task, you must upload a JSONL file that meets the input file requirements through the file upload API. After the upload is complete, retrieve the file_id and set the `purpose` parameter to batch.
The maximum size for a single file that you can upload for a batch task is 500 MB. The Model Studio storage space under your Alibaba Cloud account supports a maximum of 10,000 files, with a total size not exceeding 100 GB. The files do not have an expiration date.
OpenAI Python SDK
Request example
import os
from pathlib import Path
from openai import OpenAI
client = OpenAI(
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx".
# Do not hard code the API key in production environments to reduce the risk of leaks.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
# test.jsonl is a local sample file. The purpose must be batch.
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")
print(file_object.model_dump_json())
Content of the test file test.jsonl:
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Response example
{
"id": "file-batch-xxx",
"bytes": 437,
"created_at": 1742304153,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
curl
Request example
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'
Content of the test file test.jsonl:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}
Response example
{
"id": "file-batch-xxx",
"bytes": 231,
"created_at": 1729065815,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
2. Create a batch task
You can create a batch task by passing the file ID returned by the Prepare and upload a file API to the input_file_id parameter.
The API rate limit is 1,000 calls per minute per Alibaba Cloud account. The maximum number of running tasks is 1,000. This includes all tasks that have not finished. If you exceed the maximum, you must wait for a task to finish before you can create another one.
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx".
# Do not hard code the API key in production environments to reduce the risk of leaks.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.create(
input_file_id="file-batch-xxx", # The ID returned after uploading the file
endpoint="/v1/chat/completions", # For the test model batch-test-model, enter /v1/chat/ds-test. For other models, enter /v1/chat/completions.
completion_window="24h",
metadata={'ds_name':"Task Name",'ds_description':'Task Description'} # Metadata. This is an optional field used to create a task name and description.
)
print(batch)
curl
Request example
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-batch-xxx",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata":{"ds_name":"Task Name","ds_description":"Task Description"}
}'
Replace the value of input_file_id with the actual value.
Input parameter settings
Field | Type | Passed In | Required | Description |
input_file_id | String | Body | Yes | Specifies the file ID to use as the input file for the batch task. Use the file ID returned by the Prepare and upload a file API, such as file-batch-xxx. |
endpoint | String | Body | Yes | The access path. It must be consistent with the url field in the input file. For the test model batch-test-model, set this to /v1/chat/ds-test. For other models, set this to /v1/chat/completions. |
completion_window | String | Body | Yes | The completion window for the task. The minimum is 24h and the maximum is 336h. Only integers are supported. The units "h" and "d" are supported, such as "24h" or "14d". |
metadata | Map | Body | No | Extended metadata for the task. Attach information in key-value pairs. |
metadata.ds_name | String | Body | No | The name of the task. Example: Task Name. Limit: The length cannot exceed 100 characters. If this field is defined multiple times, the last value passed is used. |
metadata.ds_description | String | Body | No | The description of the task. Example: Task Description. Limit: The length cannot exceed 200 characters. If this field is defined multiple times, the last value passed is used. |
Response example
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "validating",
"output_file_id": null,
"error_file_id": null,
"created_at": 1742367779,
"in_progress_at": null,
"expires_at": null,
"finalizing_at": null,
"completed_at": null,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
},
"metadata": {
"ds_name": "Task Name",
"ds_description": "Task Description"
}
}
Response parameters
Field | Type | Description |
id | String | The batch task ID. |
object | String | The object type. The value is fixed to |
endpoint | String | The access path. |
errors | Map | The error message. |
input_file_id | String | The file ID. |
completion_window | String | The completion window for the task. The minimum is 24h and the maximum is 336h. Only integers are supported. The units "h" and "d" are supported, such as "24h" or "14d". |
status | String | The status of the task. Valid values include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled. |
output_file_id | String | The ID of the output file for successfully executed requests. |
error_file_id | String | The ID of the output file for failed requests. |
created_at | Integer | The UNIX timestamp (in seconds) when the task was created. |
in_progress_at | Integer | The UNIX timestamp (in seconds) when the task started running. |
expires_at | Integer | The UNIX timestamp (in seconds) when the task expires. |
finalizing_at | Integer | The UNIX timestamp (in seconds) when the task started finalizing. |
completed_at | Integer | The UNIX timestamp (in seconds) when the task was completed. |
failed_at | Integer | The UNIX timestamp (in seconds) when the task failed. |
expired_at | Integer | The UNIX timestamp (in seconds) when the task expired. |
cancelling_at | Integer | The UNIX timestamp (in seconds) when the task was set to cancelling. |
cancelled_at | Integer | The UNIX timestamp (in seconds) when the task was cancelled. |
request_counts | Map | The number of requests in different states. |
metadata | Map | Additional information in key-value pairs. |
metadata.ds_name | String | The name of the current task. |
metadata.ds_description | String | The description of the current task. |
3. Query and manage batch tasks
Query batch task details
You can query the details of a specific batch task by passing the batch task ID that was returned when you created the batch task. You can query only tasks that were created within the last 30 days.
The API rate limit is 1,000 calls per minute per Alibaba Cloud account. Because a batch task takes time to execute, we recommend that you call this query API once per minute after you create the task to retrieve its status.
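The once-per-minute recommendation can be sketched as a simple polling loop. The helper below is illustrative, not part of the SDK: the retrieve callable is injected (for example, client.batches.retrieve from the examples in this topic), and the terminal states are the ones listed in the status response parameter.

```python
import time

# Terminal task states, per the status response parameter in this topic.
TERMINAL = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(retrieve, batch_id, interval=60, sleep=time.sleep):
    """Poll until the batch task reaches a terminal state.
    retrieve: callable taking a batch ID and returning an object with .status,
    such as client.batches.retrieve. interval: seconds between polls."""
    while True:
        batch = retrieve(batch_id)
        if batch.status in TERMINAL:
            return batch
        sleep(interval)
```

A typical call would be wait_for_batch(client.batches.retrieve, "batch_xxx"); injecting retrieve and sleep keeps the loop testable without a live task.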
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx".
# Do not hard code the API key in production environments to reduce the risk of leaks.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.retrieve("batch_id") # Replace batch_id with the ID of the batch task.
print(batch)
curl
Request example
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.
Input parameter settings
Field | Type | Passed In | Required | Description |
batch_id | String | Path | Yes | The ID of the batch task to query. This is the ID returned when you created the batch task. The ID starts with `batch`, for example, `batch_xxx`. |
Response example
For more information, see the response example for Create a batch task.
Response parameters
For more information, see the response parameters for Create a batch task.
You can retrieve the content of the files specified by output_file_id and error_file_id in the response parameters by downloading the batch result file.
Query a list of batch tasks
You can use the batches.list() method to query a list of batch tasks. You can use paging to retrieve the complete list incrementally.
Use the after parameter: Pass the ID of the last batch task from the previous page to retrieve the next page of data.
Use the limit parameter: Set the number of tasks to return.
You can filter the query using parameters such as input_file_ids.
The API rate limit is 100 calls per minute per Alibaba Cloud account.
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx".
# Do not hard code the API key in production environments to reduce the risk of leaks.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batches = client.batches.list(after="batch_xxx", limit=2,extra_query={'ds_name':'Task Name','input_file_ids':'file-batch-xxx,file-batch-xxx','status':'completed,expired','create_after':'20250304000000','create_before':'20250306123000'})
print(batches)
curl
Request example
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id in after=batch_id with the actual value. Set the limit parameter to the number of tasks to return. For ds_name, enter a partial task name. For input_file_ids, you can enter multiple file IDs. For status, enter multiple batch task statuses. For create_after and create_before, enter specific points in time.
Input parameter settings
Field | Type | Passed In | Required | Description |
after | String | Query | No | A cursor for pagination. Set this to the value of the last_id field (the ID of the last batch task) returned on the previous page. For example, if the current query returns 20 rows of data and the last batch task ID (last_id) is batch_xxx, you can set after=batch_xxx to retrieve the next page. |
limit | Integer | Query | No | The number of batch tasks to return for each query. The range is [1, 100]. The default is 20. |
ds_name | String | Query | No | Filters by task name using a partial match. Enter any part of a task name to match tasks that contain that string. For example, entering "Batch" matches "Batch Task" and "Batch Task_20240319". |
input_file_ids | String | Query | No | Filters by multiple file IDs, separated by commas. You can enter up to 20 IDs. These can be file IDs returned by Prepare and upload a file. |
status | String | Query | No | Filters by multiple statuses, separated by commas. Valid values include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled. |
create_after | String | Query | No | Filters for tasks created after this point in time. Format: yyyyMMddHHmmss, for example, 20250304000000. |
create_before | String | Query | No | Filters for tasks created before this point in time. Format: yyyyMMddHHmmss, for example, 20250306123000. |
Response example
{
"object": "list",
"data": [
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "completed",
"output_file_id": "file-batch_output-xxx",
"error_file_id": null,
"created_at": 1722234109,
"in_progress_at": 1722234109,
"expires_at": null,
"finalizing_at": 1722234165,
"completed_at": 1722234165,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 100,
"completed": 95,
"failed": 5
},
"metadata": {}
},
{ ... }
],
"first_id": "batch_xxx",
"last_id": "batch_xxx",
"has_more": true
}
Response parameters
Field | Type | Description |
object | String | The type. The value is fixed to `list`. |
data | Array | A batch task object. For more information, see the response parameters for creating a batch task. |
first_id | String | The ID of the first batch task on the current page. |
last_id | String | The ID of the last batch task on the current page. |
has_more | Boolean | Indicates whether there is a next page. |
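Together, last_id and has_more support cursor-based pagination through the full task list. The following sketch is illustrative, not part of the SDK; list_fn stands in for a callable such as client.batches.list from the examples above.

```python
def list_all_batches(list_fn, limit=100):
    """Collect all batch tasks by following the pagination cursor.
    list_fn: callable accepting after and limit keyword arguments and
    returning a page with .data, .has_more, and .last_id attributes."""
    after = None
    all_batches = []
    while True:
        page = list_fn(after=after, limit=limit)
        all_batches.extend(page.data)
        if not page.has_more:
            return all_batches
        # Cursor for the next page: the ID of the last task on this page.
        after = page.last_id
```

Passing list_fn as a parameter keeps the pagination logic separate from the SDK client, so it can be reused or tested with stubbed pages.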
Cancel a batch task
You can cancel a specific batch task by passing the batch task ID that was returned when you created the batch task.
The API rate limit is 1,000 calls per minute per Alibaba Cloud account.
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If the environment variable is not set, replace the next line with your Model Studio API key: api_key="sk-xxx".
# Do not hard code the API key in production environments to reduce the risk of leaks.
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.cancel("batch_id") # Replace batch_id with the ID of the batch task.
print(batch)
curl
Request example
curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.
Input parameter settings
Field | Type | Parameter passing | Required | Description |
batch_id | String | Path | Yes | The ID of the batch task to cancel. The ID starts with `batch`, for example, `batch_xxx`. |
Response example
For more information, see the response example for Create a batch task.
Response parameters
For more information, see the response parameters for Create a batch task.
4. Download the batch result file
After a batch inference task is complete, you can use the API to download the result file.
You can obtain the `file_id` of the file to download from the `output_file_id` parameter that is returned by the Query Batch Task Details or Query Batch Task List operation. The `file_id` starts with `file-batch_output`.
OpenAI Python SDK
You can retrieve the content of a Batch job result file with the content method and save it to a local file with the write_to_file method.
Request example
import os
from openai import OpenAI
client = OpenAI(
api_key=os.getenv("DASHSCOPE_API_KEY"),
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the content of the result file
print(content.text)
# Save the result file locally
content.write_to_file("result.jsonl")
Response example
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. Whether you need to look up information, find learning materials, solve problems, or need any other help, I am here to support you. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
curl
You can use the GET method and specify file_id in the URL to download the Batch job result file.
Request example
curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonl
Parameter settings
Field | Type | Parameter passing | Required | Description |
file_id | String | Path | Yes | The ID of the file to download, which is the value of `output_file_id` returned when you query the batch task. |
Response
A JSONL file that contains the batch task results. For more information about the format, see Output file.
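Each line of the downloaded file is an independent JSON record keyed by `custom_id`, as shown in the response example above. The following sketch parses the file into a mapping from `custom_id` to the reply text or row-level error. The helper name is illustrative; the field access follows the response example above:

```python
import json

def parse_batch_results(jsonl_text: str) -> dict:
    """Map each custom_id to its reply text, or to the error for failed rows."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        if record.get("error"):
            # Row-level request error: no choices are present.
            results[record["custom_id"]] = {"ok": False, "error": record["error"]}
        else:
            body = record["response"]["body"]
            results[record["custom_id"]] = {
                "ok": True,
                "content": body["choices"][0]["message"]["content"],
            }
    return results

sample = ('{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,'
          '"body":{"choices":[{"index":0,"message":{"content":"2+2 equals 4."}}]}},'
          '"error":null}')
print(parse_batch_results(sample)["2"]["content"])  # 2+2 equals 4.
```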
Extended features
Error codes
If a call fails and returns an error message, see Error messages to resolve the issue.
FAQ
Is there a basic rate limit for models that use batch pricing?
No. Only real-time calls have a Requests Per Minute (RPM) limit. Batch calls do not have an RPM limit.
Do I need to place an order to use batch calls? If so, where?
No. Batch is a call method and does not require a separate order. This method uses a pay-as-you-go billing model, and you are billed directly for your batch API calls.
How are submitted batch call requests processed? Are they executed in the order they are submitted?
No. They are not processed in a queue. Instead, a scheduling mechanism is used. Batch request tasks are scheduled and executed based on resource availability.
How long does it take for a submitted batch call request to complete?
The running time of a batch task depends on the resource allocation of the system.
If system resources are limited, tasks might not be completed within the configured maximum wait time.
For scenarios that require fast model inference, use real-time calls. For scenarios that process large amounts of data and can tolerate some delay, use batch calls.
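Because completion time is not guaranteed, clients typically poll the task status at a modest interval. The following is a minimal polling sketch. The helper and its parameters are illustrative; the terminal status values follow the status list documented above, and `fetch_status` is any callable that returns the current status string, for example `lambda: client.batches.retrieve(batch_id).status`:

```python
import time

# Statuses after which a batch task will not change again.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(fetch_status, poll_interval=60, max_wait=86400):
    """Poll fetch_status() until a terminal status is reached or
    max_wait seconds elapse, then return the last observed status."""
    waited = 0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES or waited >= max_wait:
            return status
        time.sleep(poll_interval)
        waited += poll_interval
```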