Alibaba Cloud Model Studio provides an OpenAI-compatible Batch API that lets you submit batch tasks using files for asynchronous execution. This service processes large-scale data offline during off-peak hours and returns the results after a task is complete or the maximum waiting time is reached, at 50% of the cost of real-time calls.
To perform this operation in the console, see Batch processing.
Prerequisites
Activate Model Studio and create an API key.
Export the API key as an environment variable to reduce the risk of API key leaks.
If you use the OpenAI Python SDK to call the Batch API, run the following command to install the latest version of the OpenAI SDK.
pip3 install -U openai
Availability
Beijing region
Supported models:
Text generation models: Stable and some latest versions of Qwen Max, Plus, Flash, and Long. Also supports the QwQ series (qwq-plus) and third-party models such as deepseek-r1 and deepseek-v3.
Multimodal models: Stable and some latest versions of Qwen VL Max, Plus, and Flash. Also supports the Qwen OCR model.
Text embedding models: The text-embedding-v4 model.
Singapore region
Supported models: qwen-max, qwen-plus, and qwen-turbo.
Getting started
Before you start a formal batch job, use the test model batch-test-model to run an end-to-end test. This test covers the full workflow: validating input data, creating a job, querying the job, and downloading the result file. Note:
The test file must meet the requirements for an input file. It also must not exceed 1 MB in size or have more than 100 lines.
Concurrency limit: A maximum of 2 parallel jobs.
Resource usage: The test model does not go through the inference process and does not incur model inference fees.
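As a sketch, the two test-file limits above (1 MB, 100 lines) can be checked locally before upload. The helper name check_test_file is illustrative, not part of any SDK:

```python
import os

MAX_BYTES = 1 * 1024 * 1024  # test files are limited to 1 MB
MAX_LINES = 100              # and to 100 lines

def check_test_file(path):
    """Return True if the test file is within the batch-test-model limits."""
    if os.path.getsize(path) > MAX_BYTES:
        return False
    with open(path, "r", encoding="utf-8") as f:
        return sum(1 for _ in f) <= MAX_LINES
```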
The steps are as follows:
Prepare the test file
Download the sample file test_model.jsonl, which contains request information, to your local machine. Make sure that it is in the same directory as the Python script below.
Sample content: Set the model parameter to batch-test-model and the url parameter to /v1/chat/ds-test.
{"custom_id":"1","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/ds-test","body":{"model":"batch-test-model","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Run the script
Run this Python script.
To adjust the file path or other parameters, modify the code as needed.
import os
import time
from pathlib import Path
from openai import OpenAI

# Initialize the client
client = OpenAI(
    # If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
    # The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    # The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # The base_url for the Alibaba Cloud Model Studio service
)

def upload_file(file_path):
    print("Uploading the JSONL file that contains request information...")
    file_object = client.files.create(file=Path(file_path), purpose="batch")
    print(f"File uploaded successfully. File ID: {file_object.id}\n")
    return file_object.id

def create_batch_job(input_file_id):
    print("Creating a batch job based on the file ID...")
    # Note: The value of the endpoint parameter here must be the same as the url field in the input file. For the test model (batch-test-model), enter /v1/chat/ds-test. For text embedding models, enter /v1/embeddings. For other models, enter /v1/chat/completions.
    batch = client.batches.create(input_file_id=input_file_id, endpoint="/v1/chat/ds-test", completion_window="24h")
    print(f"Batch job created successfully. Batch job ID: {batch.id}\n")
    return batch.id

def check_job_status(batch_id):
    print("Checking the batch job status...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Batch job status: {batch.status}\n")
    return batch.status

def get_output_id(batch_id):
    print("Getting the output file ID for successful requests in the batch job...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Output file ID: {batch.output_file_id}\n")
    return batch.output_file_id

def get_error_id(batch_id):
    print("Getting the error file ID for failed requests in the batch job...")
    batch = client.batches.retrieve(batch_id=batch_id)
    print(f"Error file ID: {batch.error_file_id}\n")
    return batch.error_file_id

def download_results(output_file_id, output_file_path):
    print("Printing and downloading the results of successful requests from the batch job...")
    content = client.files.content(output_file_id)
    # Print some content for testing
    print(f"Printing the first 1,000 characters of the successful results: {content.text[:1000]}...\n")
    # Save the result file to your local machine
    content.write_to_file(output_file_path)
    print("The complete output results have been saved to the local output file result.jsonl\n")

def download_errors(error_file_id, error_file_path):
    print("Printing and downloading the information for failed requests from the batch job...")
    content = client.files.content(error_file_id)
    # Print some content for testing
    print(f"Printing the first 1,000 characters of the failed request information: {content.text[:1000]}...\n")
    # Save the error information file to your local machine
    content.write_to_file(error_file_path)
    print("The complete failed request information has been saved to the local error file error.jsonl\n")

def main():
    # File paths
    input_file_path = "test_model.jsonl"  # Replace with your input file path
    output_file_path = "result.jsonl"     # Replace with your output file path
    error_file_path = "error.jsonl"       # Replace with your error file path
    try:
        # Step 1: Upload the JSONL file containing request information to get the input file ID
        input_file_id = upload_file(input_file_path)
        # Step 2: Create a batch job based on the input file ID
        batch_id = create_batch_job(input_file_id)
        # Step 3: Check the batch job status until it is finished
        status = ""
        while status not in ["completed", "failed", "expired", "cancelled"]:
            status = check_job_status(batch_id)
            print("Waiting for the job to complete...")
            time.sleep(10)  # Wait 10 seconds before checking the status again
        # If the job fails, print the error message and exit
        if status == "failed":
            batch = client.batches.retrieve(batch_id)
            print(f"Batch job failed. Error information: {batch.errors}\n")
            print("For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
            return
        # Step 4: Download the results. If an output file ID exists, print the first 1,000 characters of the successful results and download the complete results to a local output file.
        # If an error file ID exists, print the first 1,000 characters of the failed request information and download the complete information to a local error file.
        output_file_id = get_output_id(batch_id)
        if output_file_id:
            download_results(output_file_id, output_file_path)
        error_file_id = get_error_id(batch_id)
        if error_file_id:
            download_errors(error_file_id, error_file_path)
            print("For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("For more information, see the error codes documentation: https://www.alibabacloud.com/help/en/model-studio/developer-reference/error-code")

if __name__ == "__main__":
    main()
Verify the test results
The job status shows completed.
Result file result.jsonl: Contains the fixed response {"content":"This is a test result."}.
{"id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","custom_id":"1","response":{"status_code":200,"request_id":"a2b1ae25-21f4-4d9a-8634-99a29926486c","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-bca7295b-67c3-4b1f-8239-d78323bb669f","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
{"id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","custom_id":"2","response":{"status_code":200,"request_id":"39b74f09-a902-434f-b9ea-2aaaeebc59e0","body":{"created":1743562621,"usage":{"completion_tokens":6,"prompt_tokens":20,"total_tokens":26},"model":"batch-test-model","id":"chatcmpl-1e32a8ba-2b69-4dc4-be42-e2897eac9e84","choices":[{"finish_reason":"stop","index":0,"message":{"content":"This is a test result."}}],"object":"chat.completion"}},"error":null}
If an error occurs, see Error messages to resolve it.
After you verify the test, follow these steps to run a formal batch job.
Prepare the input file according to the input file requirements. Set the model parameter in the file to a supported model and set the url parameter to: /v1/chat/completions
Replace the endpoint in the Python script above.
Important: Make sure that the endpoint in the script matches the url parameter in the input file.
Run the script and wait for the job to complete. If the job is successful, an output file named result.jsonl is generated in the same directory. If the job fails, the program exits and prints an error message.
If an error file ID exists, an error file named error.jsonl is generated in the same directory for you to review. Exceptions that occur during the process are caught, and an error message is printed.
Data file format
Input file
Prepare a UTF-8 encoded .jsonl file that meets the following requirements:
Format: One JSON object per line, each describing an individual request.
Size limit: Up to 50,000 requests per file and no larger than 500 MB.
Line limit: Each JSON object up to 6 MB and within the model's context window.
Consistency: All requests in a file must target the same API endpoint (url) and use the same model (body.model).
Unique identifier: Each request requires a custom_id that is unique within the file, which can be used to reference results after completion.
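The requirements above can be checked locally before upload. The following is a minimal validator sketch; validate_batch_input is a hypothetical helper, and it checks structure and size only, not whether each request fits the model's context window:

```python
import json
import os

MAX_REQUESTS = 50_000                # up to 50,000 requests per file
MAX_FILE_BYTES = 500 * 1024 * 1024   # file no larger than 500 MB
MAX_LINE_BYTES = 6 * 1024 * 1024     # each JSON object up to 6 MB

def validate_batch_input(path):
    """Raise ValueError on the first rule violation; return the request count."""
    if os.path.getsize(path) > MAX_FILE_BYTES:
        raise ValueError("file exceeds 500 MB")
    seen_ids, urls, models = set(), set(), set()
    count = 0
    with open(path, "rb") as f:
        for n, raw in enumerate(f, start=1):
            if len(raw) > MAX_LINE_BYTES:
                raise ValueError(f"line {n} exceeds 6 MB")
            req = json.loads(raw.decode("utf-8"))
            cid = req["custom_id"]
            if cid in seen_ids:
                raise ValueError(f"duplicate custom_id {cid!r} on line {n}")
            seen_ids.add(cid)
            urls.add(req["url"])
            models.add(req["body"]["model"])
            count += 1
    if count > MAX_REQUESTS:
        raise ValueError("more than 50,000 requests in file")
    if len(urls) > 1 or len(models) > 1:
        raise ValueError("all requests must share the same url and model")
    return count
```

Running the validator before uploading saves a round trip: a file that violates these rules fails during the job's validating phase anyway.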
Request example
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello!"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
JSONL batch generation tool
Use this tool to quickly generate JSONL files. To avoid performance issues, do not process more than 10,000 rows at a time. If you have a large data volume, process the data in batches.
Request parameters
Field | Type | Required | Description |
custom_id | String | Yes | A custom request ID. Each line represents a request, and each request has a unique custom_id. |
method | String | Yes | The request method. Currently, only POST is supported. |
url | String | Yes | The URL associated with the API. It must be the same as the endpoint field used when creating the batch job. |
body | Object | Yes | The request body for the model call. It contains all the parameters required to call the model, such as model and messages. The parameters in the request body are consistent with those supported by the real-time inference API. For more information about the parameters, see OpenAI compatible API. You can also add more parameters as needed. |
body.model | String | Yes | The model used for this batch job. Important All batch requests in the same job must use the same model. The thinking mode, if supported, must also be consistent. |
body.messages | Array | Yes | A list of messages. |
Convert a CSV file to a JSONL file
If you have a CSV file where the first column is the request ID (custom_id) and the second column is the content, use the following Python code to quickly create a JSONL file for a batch job. The CSV file must be in the same directory as the Python script below.
You can also use the template file provided in this topic. The steps are as follows:
Download the template file to your local machine and place it in the same directory as the Python script below.
In this CSV template file, the first column is the request ID (custom_id) and the second column is the content. You can paste your business questions into this file.
After you run the following Python script, a JSONL file named input_demo.jsonl for a batch job is generated in the same directory.
To adjust the file path or other parameters, modify the code as needed.
import csv
import json
def messages_builder_example(content):
messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": content}]
return messages
with open("input_demo.csv", "r") as fin:
with open("input_demo.jsonl", 'w', encoding='utf-8') as fout:
csvreader = csv.reader(fin)
for row in csvreader:
body = {"model": "qwen-turbo", "messages": messages_builder_example(row[1])}
# The default value is /v1/chat/completions.
request = {"custom_id": row[0], "method": "POST", "url": "/v1/chat/completions", "body": body}
            fout.write(json.dumps(request, separators=(',', ':'), ensure_ascii=False) + "\n")
Output file
A JSONL file. Each line is a JSON object that corresponds to a request result.
Response example
A single-line content example:
{"id":"73291560-xxx","custom_id":"1","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
A multi-line content example:
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-max","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to support you, whether you need to query information, find learning materials, get solutions to problems, or need any other help. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-max","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}
Response parameters
Field | Type | Required | Description |
id | String | Yes | The request ID. |
custom_id | String | Yes | The custom request ID. |
response | Object | No | The request result. |
error | Object | No | The error information returned when the request fails. |
error.code | String | No | The error code. |
error.message | String | No | The error message. |
completion_tokens | Integer | No | The number of tokens required to complete the generation. |
prompt_tokens | Integer | No | The number of tokens in the prompt. |
model | String | No | The model used for inference in this job. |
Convert a JSONL file to a CSV file
Compared to JSONL files, CSV files usually contain only the necessary data values without extra keys or metadata, making them ideal for automated scripts and batch jobs. To convert the JSONL output from a batch job to a CSV file, use the following Python code.
Make sure that the result.jsonl file is in the same directory as the Python script below. After you run the script, a CSV file named result.csv is generated.
To adjust the file path or other parameters, modify the code as needed.
import json
import csv
columns = ["custom_id",
"model",
"request_id",
"status_code",
"error_code",
"error_message",
"created",
"content",
"usage"]
def dict_get_string(dict_obj, path):
obj = dict_obj
try:
for element in path:
obj = obj[element]
return obj
    except (KeyError, IndexError, TypeError):
return None
with open("result.jsonl", "r") as fin:
with open("result.csv", 'w', encoding='utf-8') as fout:
rows = [columns]
for line in fin:
request_result = json.loads(line)
row = [dict_get_string(request_result, ["custom_id"]),
dict_get_string(request_result, ["response", "body", "model"]),
dict_get_string(request_result, ["response", "request_id"]),
dict_get_string(request_result, ["response", "status_code"]),
dict_get_string(request_result, ["error", "error_code"]),
dict_get_string(request_result, ["error", "error_message"]),
dict_get_string(request_result, ["response", "body", "created"]),
dict_get_string(request_result, ["response", "body", "choices", 0, "message", "content"]),
dict_get_string(request_result, ["response", "body", "usage"])]
rows.append(row)
writer = csv.writer(fout)
        writer.writerows(rows)
If a CSV file contains Chinese characters and you encounter garbled text when you open it with Excel, use a text editor such as Sublime to convert the CSV file's encoding to GBK and then open it in Excel. Alternatively, create a new Excel file and specify the correct encoding format, UTF-8, when you import the data.
Procedure
1. Prepare and upload the file
Before you create a batch job, prepare a JSONL file that meets the input file requirements. Upload the file using the file upload API operation to obtain a file_id. Use the `purpose` parameter to specify the file's purpose as batch.
The maximum size of a single file that you can upload for a batch job is 500 MB. The maximum number of files allowed in your Model Studio storage space under your Alibaba Cloud account is 10,000, and the total size cannot exceed 100 GB. Files do not currently have an expiration date.
OpenAI Python SDK
Request example
import os
from pathlib import Path
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
# test.jsonl is a local sample file. The purpose must be batch.
file_object = client.files.create(file=Path("test.jsonl"), purpose="batch")
print(file_object.model_dump_json())
Content of the test file test.jsonl:
{"custom_id":"1","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"Hello! How can I help you?"}]}}
{"custom_id":"2","method":"POST","url":"/v1/chat/completions","body":{"model":"qwen-plus","messages":[{"role":"system","content":"You are a helpful assistant."},{"role":"user","content":"What is 2+2?"}]}}
Response example
{
"id": "file-batch-xxx",
"bytes": 437,
"created_at": 1742304153,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
curl
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"test.jsonl"' \
--form 'purpose="batch"'
Content of the test file test.jsonl:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "qwen-turbo", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What is 2+2?"}]}}
Response example
{
"id": "file-batch-xxx",
"bytes": 231,
"created_at": 1729065815,
"filename": "test.jsonl",
"object": "file",
"purpose": "batch",
"status": "processed",
"status_details": null
}
2. Create a batch
Create a batch job by setting the input_file_id parameter to the file ID returned by the Prepare and upload the file API operation.
API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute. The maximum number of running jobs is 1,000 (including all unfinished jobs). If you exceed the maximum, you must wait for a job to finish before creating a new one.
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.create(
input_file_id="file-batch-xxx", # The ID returned after uploading the file
endpoint="/v1/chat/completions", # For the test model batch-test-model, enter /v1/chat/ds-test. For other models, enter /v1/chat/completions.
completion_window="24h",
metadata={'ds_name':"Job Name",'ds_description':'Job Description'} # Metadata, an optional field, used to create a job name and description
)
print(batch)
curl
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files
# === Delete this comment before running ===
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_file_id": "file-batch-xxx",
"endpoint": "/v1/chat/completions",
"completion_window": "24h",
"metadata":{"ds_name":"Job Name","ds_description":"Job Description"}
}'
Replace the value of input_file_id with the actual value.
Input parameter settings
Field | Type | Parameter passing method | Required | Description |
input_file_id | String | Body | Yes | Specifies the file ID to be used as the input file for the batch job. Use the file ID returned by the Prepare and upload the file API operation, such as file-batch-xxx. |
endpoint | String | Body | Yes | The access path. It must be the same as the url field in the input file. For the test model batch-test-model, use /v1/chat/ds-test. For text embedding models, use /v1/embeddings. For other models, use /v1/chat/completions. |
completion_window | String | Body | Yes | The waiting time. The minimum waiting time is 24h, and the maximum is 336h. Only integers are supported. The supported units are "h" and "d", such as "24h" or "14d". |
metadata | Map | Body | No | Extended metadata for the job. Attach information as key-value pairs. |
metadata.ds_name | String | Body | No | The name of the job. Limit: Up to 100 characters in length. If this field is defined multiple times, the last value passed is used. |
metadata.ds_description | String | Body | No | The description of the job. Limit: Up to 200 characters in length. If this field is defined multiple times, the last value passed is used. |
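The completion_window rules in the table above (integer value, units "h" or "d", between 24h and 336h in total) can be expressed as a small check. The helper name valid_completion_window is illustrative:

```python
import re

def valid_completion_window(value):
    """Check completion_window: an integer plus 'h' or 'd', totaling 24-336 hours."""
    m = re.fullmatch(r"(\d+)([hd])", value)
    if m is None:
        return False  # not an integer followed by a supported unit
    hours = int(m.group(1)) * (24 if m.group(2) == "d" else 1)
    return 24 <= hours <= 336
```

For example, "24h" and "14d" (14 × 24 = 336 hours) are valid, while "12h" is below the minimum and "15d" is above the maximum.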
Response example
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "validating",
"output_file_id": null,
"error_file_id": null,
"created_at": 1742367779,
"in_progress_at": null,
"expires_at": null,
"finalizing_at": null,
"completed_at": null,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 0,
"completed": 0,
"failed": 0
},
"metadata": {
"ds_name": "Job Name",
"ds_description": "Job Description"
}
}
Response parameters
Field | Type | Description |
id | String | The batch job ID. |
object | String | The object type. The value is fixed to batch. |
endpoint | String | The access path. |
errors | Map | The error message. |
input_file_id | String | The file ID. |
completion_window | String | The waiting time. The minimum waiting time is 24h, and the maximum is 336h. Only integers are supported. The supported units are "h" and "d", such as "24h" or "14d". |
status | String | The job status, which can be validating, failed, in_progress, finalizing, completed, expired, cancelling, or cancelled. |
output_file_id | String | The output file ID for successful requests. |
error_file_id | String | The output file ID for failed requests. |
created_at | Integer | The UNIX timestamp (in seconds) when the job was created. |
in_progress_at | Integer | The UNIX timestamp (in seconds) when the job started running. |
expires_at | Integer | The UNIX timestamp (in seconds) when the job will expire. |
finalizing_at | Integer | The UNIX timestamp (in seconds) when the job entered the finalizing state. |
completed_at | Integer | The timestamp (in seconds) when the job was completed. |
failed_at | Integer | The timestamp (in seconds) when the job failed. |
expired_at | Integer | The timestamp (in seconds) when the job timed out. |
cancelling_at | Integer | The timestamp (in seconds) when the job was set to cancelling. |
cancelled_at | Integer | The timestamp (in seconds) when the job was canceled. |
request_counts | Map | The number of requests in different states. |
metadata | Map | Additional information as key-value pairs. |
metadata.ds_name | String | The name of the current job. |
metadata.ds_description | String | The description of the current job. |
3. Query and manage batch jobs
Query batch job details
Pass the batch job ID returned by Create a batch job to query information about a specific batch job. You can only query batch jobs created within the last 30 days.
API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute. Because a batch job takes some time to execute, we recommend calling this query API operation once per minute to retrieve job information after you create a batch job.
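The once-per-minute polling recommendation above can be sketched as a small loop. Here fetch_status is a placeholder that abstracts the SDK call (for example, lambda: client.batches.retrieve(batch_id).status); the injectable sleep parameter just keeps the sketch testable:

```python
import time

# Statuses after which a batch job will not change further
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}

def wait_for_batch(fetch_status, interval=60, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status; return that status."""
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)  # wait before querying again (60 s matches the recommendation)
```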
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.retrieve("batch_id") # Replace batch_id with the ID of the batch job
print(batch)
curl
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files
# === Delete this comment before running ===
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id with the actual value.
Input parameter settings
Field | Type | Parameter passing method | Required | Description |
batch_id | String | Path | Yes | The ID of the batch job to query (the batch job ID returned by Create a batch job), which starts with "batch", for example, "batch_xxx". |
Response example
For more information, see the response example for Create a batch job.
Response parameters
For more information, see the response parameters for Create a batch job.
The content of output_file_id and error_file_id in the response parameters can be retrieved using Download the batch result file.
Query the batch job list
You can use the batches.list() method to query the batch job list and use the paging mechanism to retrieve the complete job list.
Use the after parameter: Pass the ID of the last batch job from the previous page to retrieve the next page of data.
Use the limit parameter: Set the number of jobs to return.
You can filter queries using parameters such as input_file_ids.
API rate limit: Each Alibaba Cloud account can make up to 100 calls per minute.
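The paging mechanism described above can be sketched as a generator. Here list_page is a placeholder that abstracts the SDK call and returns one page of results with data, has_more, and last_id fields; with the real SDK you might pass something like lambda after: client.batches.list(after=after, limit=100):

```python
def iter_all_batches(list_page):
    """Yield every batch job across pages by following the `after` cursor."""
    after = None
    while True:
        page = list_page(after)          # fetch one page (first page: no cursor)
        for job in page["data"]:
            yield job
        if not page["has_more"]:
            return                       # no further pages
        after = page["last_id"]          # cursor for the next page
```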
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batches = client.batches.list(after="batch_xxx", limit=2,extra_query={'ds_name':'Job Name','input_file_ids':'file-batch-xxx,file-batch-xxx','status':'completed,expired','create_after':'20250304000000','create_before':'20250306123000'})
print(batches)
curl
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches?xxxxxx
# === Delete this comment before running ===
curl --request GET 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches?after=batch_xxx&limit=2&ds_name=Batch&input_file_ids=file-batch-xxx,file-batch-xxx&status=completed,failed&create_after=20250303000000&create_before=20250320000000' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"
Replace batch_id in after=batch_id with the actual value. Set the limit parameter to the number of jobs to return. Set ds_name to a fragment of the job name. For the value of input_file_ids, enter multiple file IDs. Set status to multiple batch job statuses. Set the values of create_after and create_before to specific points in time.
Input parameter settings
Field | Type | Parameter passing method | Required | Description |
after | String | Query | No | A cursor for paging. Set this to the ID of the last batch job on the previous page to retrieve the next page. For example, if the current query returns 20 rows of data and the last batch job ID (last_id) is batch_xxx, set after=batch_xxx in the next query to fetch the subsequent page. |
limit | Integer | Query | No | The number of batch jobs to return for each query. The range is [1, 100]. The default value is 20. |
ds_name | String | Query | No | Performs a fuzzy search based on the job name. Enter any continuous character fragment to match job names that contain it. For example, entering "Batch" can match "Batch Job" and "Batch Job_20240319". |
input_file_ids | String | Query | No | Filters by multiple file IDs, separated by commas. You can enter up to 20 IDs. Use the file IDs returned in Prepare and upload the file. |
status | String | Query | No | Filters by multiple statuses, separated by commas. The statuses include validating, failed, in_progress, finalizing, completed, expired, cancelling, and cancelled. |
create_after | String | Query | No | Filters for jobs created after this point in time. The format is yyyyMMddHHmmss, for example, 20250304000000. |
create_before | String | Query | No | Filters for jobs created before this point in time. The format is yyyyMMddHHmmss, for example, 20250306123000. |
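The create_after and create_before values are plain timestamp strings. As a minimal sketch (assuming the yyyyMMddHHmmss format inferred from the example values above; to_batch_timestamp is a hypothetical helper), you can build them from a datetime:

```python
from datetime import datetime

def to_batch_timestamp(dt: datetime) -> str:
    """Format a datetime as the yyyyMMddHHmmss string used by create_after/create_before."""
    return dt.strftime("%Y%m%d%H%M%S")

# For example, March 4, 2025 at midnight:
print(to_batch_timestamp(datetime(2025, 3, 4)))  # 20250304000000
```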
Response example
{
"object": "list",
"data": [
{
"id": "batch_xxx",
"object": "batch",
"endpoint": "/v1/chat/completions",
"errors": null,
"input_file_id": "file-batch-xxx",
"completion_window": "24h",
"status": "completed",
"output_file_id": "file-batch_output-xxx",
"error_file_id": null,
"created_at": 1722234109,
"in_progress_at": 1722234109,
"expires_at": null,
"finalizing_at": 1722234165,
"completed_at": 1722234165,
"failed_at": null,
"expired_at": null,
"cancelling_at": null,
"cancelled_at": null,
"request_counts": {
"total": 100,
"completed": 95,
"failed": 5
},
"metadata": {}
},
{ ... }
],
"first_id": "batch_xxx",
"last_id": "batch_xxx",
"has_more": true
}
Response parameters
Field | Type | Description |
object | String | The type. The value is fixed to list. |
data | Array | The batch job object. For more information, see the response parameters for creating a batch job. |
first_id | String | The ID of the first batch job on the current page. |
last_id | String | The ID of the last batch job on the current page. |
has_more | Boolean | Indicates whether there is a next page. |
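Putting the after cursor and has_more flag together, you can walk through every page of batch jobs. The following is a minimal sketch, assuming a client configured as in the examples above; the iter_all_batches helper is hypothetical:

```python
def iter_all_batches(client, page_size=20, **filters):
    """Yield every batch job, following the has_more / last_id cursor pagination."""
    after = None
    while True:
        kwargs = {"limit": page_size}
        if after is not None:
            kwargs["after"] = after  # resume from the last job of the previous page
        page = client.batches.list(extra_query=filters, **kwargs)
        for job in page.data:
            yield job
        if not page.has_more:
            break
        after = page.last_id  # cursor for the next page
```

For example, `for job in iter_all_batches(client, status="completed"): print(job.id)` lists all completed jobs without manual paging.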
Cancel a batch job
Pass the batch job ID returned by Create a batch job to cancel the specified batch job.
API rate limit: Each Alibaba Cloud account can make up to 1,000 calls per minute.
OpenAI Python SDK
Request example
import os
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1", # The base_url for the Alibaba Cloud Model Studio service
)
batch = client.batches.cancel("batch_id") # Replace batch_id with the ID of the batch job
print(batch)
curl
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel
# === Delete this comment before running ===
curl --request POST 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1/batches/batch_id/cancel' \
-H "Authorization: Bearer $DASHSCOPE_API_KEY"Replace batch_id with the actual value.Input parameter settings
Field | Type | Parameter passing method | Required | Description |
batch_id | String | Path | Yes | The ID of the batch job to cancel, which starts with "batch", for example, "batch_xxx". |
Response example
For more information, see the response example for Create a batch job.
Response parameters
For more information, see the response parameters for Create a batch job.
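Because cancellation is asynchronous (the job passes through the cancelling status before reaching cancelled, and requests that already completed are still billed), a client typically polls after issuing the request. A minimal sketch, assuming a client configured as above; cancel_and_wait is a hypothetical helper:

```python
import time

def cancel_and_wait(client, batch_id, poll_seconds=10, timeout_seconds=600):
    """Request cancellation, then poll until the job leaves the 'cancelling' state."""
    batch = client.batches.cancel(batch_id)
    deadline = time.time() + timeout_seconds
    while batch.status == "cancelling" and time.time() < deadline:
        time.sleep(poll_seconds)
        batch = client.batches.retrieve(batch_id)  # re-read the job status
    return batch
```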
4. Download the batch result file
After the batch inference job is complete, use the API operation to download the result file.
Get the file_id for downloading the file from the output_file_id in the response parameters of the Query batch job details or Query the batch job list API. Only files with a file_id that starts with file-batch_output can be downloaded.
OpenAI Python SDK
Use the content method to retrieve the content of the batch job result file and use the write_to_file method to save it to your local machine.
Request example
import os
from openai import OpenAI
client = OpenAI(
# If you have not configured environment variables, replace the following line with api_key="sk-xxx" using your Model Studio API key. However, we do not recommend hard-coding the API key into your code in a production environment to reduce the risk of leaks.
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
api_key=os.getenv("DASHSCOPE_API_KEY"),
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1
base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)
content = client.files.content(file_id="file-batch_output-xxx")
# Print the content of the result file
print(content.text)
# Save the result file to your local machine
content.write_to_file("result.jsonl")Response example
{"id":"c308ef7f-xxx","custom_id":"1","response":{"status_code":200,"request_id":"c308ef7f-0824-9c46-96eb-73566f062426","body":{"created":1742303743,"usage":{"completion_tokens":35,"prompt_tokens":26,"total_tokens":61},"model":"qwen-plus","id":"chatcmpl-c308ef7f-0824-9c46-96eb-73566f062426","choices":[{"finish_reason":"stop","index":0,"message":{"content":"Hello! Of course. I am here to support you, whether you need to query information, find learning materials, get solutions to problems, or need any other help. Please tell me what you need help with."}}],"object":"chat.completion"}},"error":null}
{"id":"73291560-xxx","custom_id":"2","response":{"status_code":200,"request_id":"73291560-7616-97bf-87f2-7d747bbe84fd","body":{"created":1742303743,"usage":{"completion_tokens":7,"prompt_tokens":26,"total_tokens":33},"model":"qwen-plus","id":"chatcmpl-73291560-7616-97bf-87f2-7d747bbe84fd","choices":[{"finish_reason":"stop","index":0,"message":{"content":"2+2 equals 4."}}],"object":"chat.completion"}},"error":null}curl
Use the GET method and specify the file_id in the URL to download the batch job result file.
Request example
# ======= Important =======
# The API keys for the Singapore and Beijing regions are different. To obtain an API key, see https://www.alibabacloud.com/help/en/model-studio/get-api-key
# The following is the base_url for the Singapore region. If you use a model in the Beijing region, replace the base_url with https://dashscope.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content
# === Delete this comment before running ===
curl -X GET https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files/file-batch_output-xxx/content \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" > result.jsonlInput parameter settings
Field | Type | Parameter passing method | Required | Description |
file_id | String | Path | Yes | The ID of the file to download. This is the value of the output_file_id field returned by Query batch job details or Query the batch job list, and it starts with file-batch_output. |
Returned result
A JSONL file of the batch job results. For more information about the format, see Output file.
Extended features
Billing
Unit price: The input and output tokens for successful requests are billed at 50% of the price of the standard synchronous (real-time) API for that model. For pricing details, see Models.
Scope:
Only successfully executed requests in a task are billed.
File parsing failures, execution failures, or row-level request errors do not incur charges.
For canceled tasks, requests that successfully completed before the cancellation are still billed.
Batch calls are billed separately and do not support savings plans, new-user free quotas, or features such as context cache.
Error codes
If the call fails and an error message is returned, see Error messages to resolve the issue.
FAQ
Do I need to place an order to use batch calls? If so, where?
A: Batch is a calling method and does not require an additional order. This calling method uses a pay-as-you-go billing model, where you pay directly for batch API operation calls.
How are submitted batch call requests processed in the background? Are they executed in the order they are submitted?
A: It is not a queuing mechanism, but a scheduling mechanism. Batch request jobs are scheduled and executed based on resource availability.
How long does it actually take for a submitted batch call request to be completed?
A: The execution time of a batch job depends on the system's resource allocation.
When system resources are tight, a job may not be fully completed within the set maximum waiting time.
Therefore, for scenarios with strict requirements on the timeliness of model inference, we recommend using real-time calls. For scenarios that involve processing large-scale data and have a certain tolerance for timeliness, we recommend using batch calls.