
Alibaba Cloud Model Studio: Fine-tune Qwen

Last Updated: Mar 17, 2026

Fine-tune Qwen in Alibaba Cloud Model Studio using an HTTP API.

Important

This topic is applicable only to the International Edition (Singapore region).

Prerequisites

  • You understand fine-tuning concepts, procedures, and data format requirements.

  • You have activated Model Studio and obtained an API key. See Create an API key.

Fine-tuning overview

Fine-tuning can help you:

  • Improve performance for specific industries or businesses

  • Reduce output latency

  • Suppress hallucinations

  • Align outputs with human values or preferences

  • Replace larger models with fine-tuned lightweight models

During fine-tuning, the model learns business- and scenario-specific features from your training data, such as domain knowledge, tone, communication style, and self-awareness. Because the fine-tuned model has already internalized many industry- or scenario-specific examples, its zero-shot or one-shot performance surpasses the base model's few-shot performance. This reduces input tokens and lowers output latency.

Overall procedure


Supported models

Text generation

| Name | Model code | Full-parameter SFT (sft) | Efficient SFT (efficient_sft) |
| --- | --- | --- | --- |
| Qwen3-32B | qwen3-32b | Supported | Supported |
| Qwen3-14B | qwen3-14b | Supported | Supported |
| Qwen3-VL-8B-Instruct | qwen3-vl-8b-instruct | Supported | Supported |
| Qwen3-VL-8B-Thinking | qwen3-vl-8b-thinking | Supported | Supported |

Compare training modes

| | Full-parameter training | Efficient training (LoRA, recommended) |
| --- | --- | --- |
| Scenarios | Learn new capabilities; achieve optimal overall performance. | Optimize performance in specific scenarios; time- and cost-sensitive. |
| Training duration | Longer, with slower convergence. | Shorter, with faster convergence. |

Billing (both modes)

Billed by the volume of training data:

Model training fee = (Total tokens in training data + Total tokens in mixed training data) × Number of epochs × Training unit price (minimum billing unit: 1 token)

Unit price for training

The following tables list the unit prices for training pre-trained models. The unit price for training a custom model matches that of the corresponding pre-trained model.

Qwen

| Service | Code | Price |
| --- | --- | --- |
| Qwen3-32B | qwen3-32b | $0.008/1,000 tokens |
| Qwen3-14B | qwen3-14b | $0.0016/1,000 tokens |

Qwen-VL

| Service | Code | Price |
| --- | --- | --- |
| Qwen3-VL-8B-Instruct | qwen3-vl-8b-instruct | $0.002/1,000 tokens |
| Qwen3-VL-8B-Thinking | qwen3-vl-8b-thinking | $0.002/1,000 tokens |
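As a hypothetical worked example of the billing formula above, suppose a job fine-tunes qwen3-14b on 1,000,000 tokens of training data, with no mixed training data, for 3 epochs:

```python
# Hypothetical example of the billing formula; the token counts and epoch
# count are made up, and the unit price is qwen3-14b's ($0.0016/1,000 tokens).
training_tokens = 1_000_000   # total tokens in training data
mixed_tokens = 0              # total tokens in mixed training data
n_epochs = 3
unit_price = 0.0016 / 1000    # USD per token

fee = (training_tokens + mixed_tokens) * n_epochs * unit_price
print(f"${fee:.2f}")  # $4.80
```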

Dataset tips

Size requirements

SFT datasets require at least 1,000 high-quality entries. If evaluation results are unsatisfactory, collect more training data.

If you do not have enough data, consider building an agent application with a knowledge base. In many complex business scenarios, fine-tuning and knowledge base retrieval work best together.

For example, in a customer service scenario, fine-tune the model to adjust its tone, expression habits, and self-awareness, then use a knowledge base to dynamically inject domain knowledge into the context.

Try retrieval-augmented generation (RAG) first. After collecting enough data, use fine-tuning to further improve performance.

You can expand your dataset using the following strategies:

  1. Use a larger, high-performing model to generate content for specific businesses or scenarios.

  2. Manually collect data from various sources, such as application scenarios, web scraping, social media, forums, public datasets, partners, industry resources, and user contributions.

Data diversity and balance

For domain-specific use cases, domain expertise is the most important factor. For Q&A scenarios, generalization matters more. Design data samples based on your business modules or scenarios. Training quality depends on data volume, domain specificity, and diversity.

For example, in an AI assistant scenario, a professional and diverse dataset should include the following:

| Business | Diverse scenarios and use cases |
| --- | --- |
| E-commerce customer service | Promotion pushes, pre-sales consultation, in-sales guidance, after-sales service, follow-up visits, complaint handling, and more. |
| Financial services | Loan consultation, investment and financial advice, credit card services, bank account management, and more. |
| Online healthcare | Symptom consultation, appointment scheduling, visit instructions, drug information queries, health tips, and more. |
| AI secretary | IT information, administrative information, HR information, employee benefit Q&A, company calendar queries, and more. |
| Travel assistant | Travel planning, entry and exit guides, travel insurance consultation, destination customs and culture introductions, and more. |
| Corporate legal counsel | Contract review, intellectual property protection, compliance checks, labor law Q&A, cross-border transaction consultation, case-specific legal analysis, and more. |

Balance the data volume across scenarios to match actual usage ratios. This prevents bias toward any single feature type and improves generalization.
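As an illustrative check on that balance, you can count entries per scenario tag and compare the shares against your real-world usage ratios. The tag names and counts below are hypothetical:

```python
from collections import Counter

# Hypothetical per-entry scenario labels for an e-commerce assistant dataset
scenarios = (["pre_sales"] * 420 + ["after_sales"] * 390 +
             ["promotions"] * 130 + ["complaints"] * 60)

counts = Counter(scenarios)
total = len(scenarios)
for name, n in counts.most_common():
    print(f"{name:12s} {n:4d}  {n / total:.1%}")
# Scenarios whose share falls far below their real-world usage ratio
# (e.g. complaints here) are candidates for augmentation before training.
```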

Upload a training dataset

Prepare a dataset

SFT training set

SFT uses training data in Chat Markup Language (ChatML) format, which supports multi-turn conversations and multiple role settings.

The OpenAI name and weight parameters are not supported. All assistant outputs are trained.
# A line of training data (JSON format), the typical structure is as follows when expanded:
{"messages": [
  {"role": "system", "content": "System input 1"}, 
  {"role": "user", "content": "User input 1"}, 
  {"role": "assistant", "content": "Expected model output 1"}, 
  {"role": "user", "content": "User input 2"}, 
  {"role": "assistant", "content": "Expected model output 2"}
  ...
]}

For details about the system, user, and assistant roles, see Overview of text generation models. Sample training sets: SFT-ChatML_format_example.jsonl and SFT-ChatML_format_example.xlsx. XLS and XLSX formats support only single-turn conversations.

Within a single training entry, every assistant message supports the "loss_weight" parameter, which sets its relative importance during training. Valid values range from 0.0 to 1.0; higher values indicate greater importance.

This parameter is in invitational preview. To use it, contact your account manager.
 {"role": "assistant", "content": "Expected model output 1", "loss_weight": 1.0}, 
 {"role": "assistant", "content": "Expected model output 2", "loss_weight": 0.5}
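As a minimal sketch, a ChatML training entry in the format above can be written to a JSONL file with a few lines of Python. The conversation content and file name here are hypothetical:

```python
import json

# One training entry per line; json.dumps guarantees valid single-line JSON.
entry = {"messages": [
    {"role": "system", "content": "You are a helpful customer-service assistant."},
    {"role": "user", "content": "Where is my order?"},
    {"role": "assistant", "content": "Could you share your order number, please?"},
]}

with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```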

SFT for thinking model

The training data supports multi-turn conversations and multiple role settings, but only the final assistant output is trained.

The \n characters before and after the <think> tags must be retained.
# A line of training data (JSON format), the typical structure is as follows when expanded:
{"messages": [
  {"role": "system", "content": "System input 1"},
  {"role": "user", "content": "User input 1"},
  {"role": "assistant", "content": "Model output 1"},  # Intermediate assistant outputs must not contain <think> tags
  ...
  {"role": "user", "content": "User input 2"},
  {"role": "assistant", "content": "<think>\nExpected thinking content 2\n</think>\n\nExpected output 2"}  # Thinking content may appear only in the final assistant output
]}

For details about the system, user, and assistant roles, see Overview of text generation models. Sample training set: SFT-deep_thinking_content_example.jsonl.

To train the model not to output thinking content, omit the <think> tags from the final assistant message in the training samples. If you use this format, do not enable thinking mode when calling the trained model.

{"role": "assistant", "content": "Expected model output 2"}  # Tells the model not to enable thinking

The final assistant message of a single training entry supports the "loss_weight" parameter, which sets its relative importance during training. Valid values range from 0.0 to 1.0; higher values indicate greater importance.

This parameter is in invitational preview. To use it, contact your account manager.
 {"role": "assistant", "content": "<think>\nExpected thinking content 2\n</think>\n\nExpected output 2", "loss_weight": 1.0}
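A small validator can catch the most common thinking-format mistakes before upload. This is an illustrative sketch, not an official tool; it checks only the rules stated above (no <think> in intermediate assistant turns, and retained \n characters around the tags in the final turn):

```python
import json
import re

def check_thinking_entry(line: str) -> bool:
    """Return True if one JSONL line follows the thinking-model SFT format."""
    msgs = json.loads(line)["messages"]
    assistants = [m for m in msgs if m["role"] == "assistant"]
    for m in assistants[:-1]:
        if "<think>" in m["content"]:   # intermediate turns must not think
            return False
    final = assistants[-1]["content"]
    if "<think>" in final:
        # must look like <think>\n...\n</think>\n\n at the start of the output
        return bool(re.match(r"<think>\n.*\n</think>\n\n", final, re.S))
    return True  # omitting <think> is allowed (disables thinking)

line = json.dumps({"messages": [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "<think>\nGreet the user.\n</think>\n\nHello!"},
]})
print(check_thinking_entry(line))  # True
```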

SFT for image understanding (Qwen-VL)

The OpenAI name and weight parameters are not supported. All assistant outputs are trained.

For more information about the differences between the system, user, and assistant roles, see Overview of text generation models. Sample training data in ChatML format:

# A line of training data (JSON format), the typical structure is as follows when expanded:
{"messages":[
  {"role":"user",
    "content":[
      {"text":"User input 1"},
      {"image":"Image file name 1"}]},
  {"role":"assistant",
    "content":[
      {"text":"Expected model output 1"}]},
  {"role":"user",
    "content":[
      {"text":"User input 2"}]},
  {"role":"assistant",
    "content":[
      {"text":"Expected model output 2"}]},
  ...
  ...
  ...
 ]}
Note

If you train a thinking model, you must follow the data format requirements for SFT for thinking model.

The following are the requirements for ZIP files:

  1. Format: ZIP. Maximum size: 2 GB. The folder and file names within the ZIP file must contain only ASCII letters (a–z, A–Z), numbers (0–9), underscores (_), and hyphens (-).

  2. The training text data file must be named data.jsonl and placed in the root directory of the ZIP file. Ensure that the data.jsonl file appears immediately when you open the ZIP file.

  3. A single image cannot exceed 1024 pixels in width or height. The maximum size is 10 MB. Supported formats: .bmp, .jpeg /.jpg, .png, .tif /.tiff, and .webp.

  4. Image file names cannot be duplicated, even if the files are stored in different folders.

  5. ZIP file directory structure:

    Single-level directory (recommended)

    The image files and the data.jsonl file reside in the root directory of the ZIP file.

    Trainingdata_vl.zip
       |--- data.jsonl # Note: Do not wrap in an outer folder
       |--- image1.png
       |--- image2.jpg

    Multi-level directory

    1. The data.jsonl file must be in the root directory of the ZIP file.

    2. In the data.jsonl file, you can declare only the image file name, not the file path. For example:

      Correct: image1.jpg. Incorrect: jpg_folder/image1.jpg.

    3. Image file names must be globally unique within the ZIP file.

    Trainingdata_vl.zip
        |--- data.jsonl # Note: Do not wrap in an outer folder
        |--- jpg_folder
        |   └── image1.jpg
        |--- png_folder
            └── image2.png
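The packaging rules above can be sketched with the standard zipfile module. This is an illustrative example, not an official tool; the conversation content and file names are hypothetical:

```python
import json
import zipfile

# Sketch of the required ZIP layout for a Qwen-VL training set:
# data.jsonl at the archive root, images referenced by file name only.
entry = {"messages": [
    {"role": "user", "content": [
        {"text": "Describe the image."},
        {"image": "image1.png"},          # file name only, never a path
    ]},
    {"role": "assistant", "content": [
        {"text": "A red bicycle leaning against a brick wall."},
    ]},
]}

with zipfile.ZipFile("Trainingdata_vl.zip", "w") as z:
    # data.jsonl must sit in the ZIP root, not inside a wrapper folder
    z.writestr("data.jsonl", json.dumps(entry, ensure_ascii=False) + "\n")
    # in practice, add your real image files with z.write("image1.png");
    # every image file name must be unique across the whole archive
```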

Upload the training file

HTTP

For Windows CMD, replace $DASHSCOPE_API_KEY with %DASHSCOPE_API_KEY%. For PowerShell, replace it with $env:DASHSCOPE_API_KEY.
curl -X POST https://dashscope-intl.aliyuncs.com/compatible-mode/v1/files \
-H "Authorization: Bearer $DASHSCOPE_API_KEY" \
--form 'file=@"path/to/your/sample.jsonl"' \
--form 'purpose="fine-tune"'
Note

Limits:

  • The maximum file size is 1 GB.

  • The total storage quota for all active (not deleted) files is 5 GB.

  • The maximum number of active (not deleted) files is 100.

  • There is no time limit for file storage.

See OpenAI compatible - File.

Response:

{
    "id": "file-ft-e73cafa11cef43a0ab75fb8e",
    "object": "file",
    "bytes": 23149,
    "filename": "qwen-fine-tune-sample.jsonl",
    "purpose": "fine-tune",
    "status": "processed",
    "created_at": 1769138847
}

Fine-tuning

Create a fine-tuning job

HTTP

For Windows CMD, replace ${DASHSCOPE_API_KEY} with %DASHSCOPE_API_KEY%. For PowerShell, replace it with $env:DASHSCOPE_API_KEY.
curl --location "https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes" \
--header "Authorization: Bearer ${DASHSCOPE_API_KEY}" \
--header 'Content-Type: application/json' \
--data '{
    "model":"qwen3-14b",
    "training_file_ids":[
        "<Replace with the file ID of training dataset 1>",
        "<Replace with the file ID of training dataset 2>"
    ],
    "hyper_parameters":
    {
        "n_epochs": 1,
        "batch_size": 16,
        "learning_rate": "1.6e-5",
        "split": 0.9,
        "warmup_ratio": 0.0,
        "eval_steps": 1,
        "save_strategy": "epoch",
        "save_total_limit": 10
    },
    "training_type":"sft"
}'

Input parameters

| Field | Required | Type | Location | Description |
| --- | --- | --- | --- | --- |
| training_file_ids | Yes | Array | Body | Training set file IDs. |
| validation_file_ids | No | Array | Body | Validation set file IDs. |
| model | Yes | String | Body | Base model ID, or the ID of a model generated by a previous fine-tuning job. |
| hyper_parameters | No | Map | Body | Hyperparameters for fine-tuning. Default values are used if omitted. |
| training_type | No | String | Body | Fine-tuning method. Valid values: sft, efficient_sft. |
| job_name | No | String | Body | Job name. |
| model_name | No | String | Body | Name of the fine-tuned model. The model ID is generated by the system. |

Sample response

{
    "request_id": "635f7047-003e-4be3-b1db-6f98e239f57b",
    "output":
    {
        "job_id": "ft-202511272033-8ae7",
        "job_name": "ft-202511272033-8ae7",
        "status": "PENDING",
        "finetuned_output": "qwen3-14b-ft-202511272033-8ae7",
        "model": "qwen3-14b",
        "base_model": "qwen3-14b",
        "training_file_ids":
        [
            "9e9ffdfa-c3bf-436e-9613-6f053c66aa6e"
        ],
        "validation_file_ids":
        [],
        "hyper_parameters":
        {
            "n_epochs": 1,
            "batch_size": 16,
            "learning_rate": "1.6e-5",
            "split": 0.9,
            "warmup_ratio": 0.0,
            "eval_steps": 1,
            "save_strategy": "epoch",
            "save_total_limit": 10
        },
        "training_type": "sft",
        "create_time": "2025-11-27 20:33:15",
        "workspace_id": "llm-8v53etv3hwb8orx1",
        "user_identity": "1654290265984853",
        "modifier": "1654290265984853",
        "creator": "1654290265984853",
        "group": "llm",
        "max_output_cnt": 10
    }
}

hyper_parameters supported settings

| Parameter | Default | Recommended settings | Type | Description |
| --- | --- | --- | --- | --- |
| n_epochs | 1 | Adjust based on fine-tuning results. | Integer | Number of times the model iterates through the training data. Higher values increase training duration and cost. |
| learning_rate | sft: 1e-5 level; efficient_sft: 1e-4 level. The specific value varies by model. | Use the default value. | Float | Controls the intensity of model weight updates. Too high: parameters change drastically, degrading performance. Too low: performance may not change significantly. |
| freeze_vit | true | Adjust as needed. | Boolean | Freezes the visual backbone so that its weights remain unchanged during training. Applies only to Qwen-VL models. |
| batch_size | Varies by model; the larger the model, the smaller the default. | Use the default value. | Integer | Number of data entries per training iteration. Smaller values prolong training time. |
| eval_steps | 50 | Adjust as needed. | Integer | Interval (in steps) for evaluating training accuracy and loss. Controls the display frequency of Validation Loss and Token Accuracy. |
| logging_steps | 5 | Adjust as needed. | Integer | Interval (in steps) for printing fine-tuning logs. |
| lr_scheduler_type | cosine | Recommended: linear or Inverse_sqrt. | String | Strategy for dynamically adjusting the learning rate during training. |
| max_length | 2048 | 8192 | Integer | Maximum token length per training entry. Entries exceeding this limit are discarded. |
| max_split_val_dataset_sample | 1000 | Use the default value. | Integer | If "validation_file_ids" is not set, Model Studio automatically splits off a validation set of up to 1,000 entries. If "validation_file_ids" is set, this parameter is ignored. |
| split | 0.8 | Use the default value. | Float | If "validation_file_ids" is not set, Model Studio automatically uses 80% of the training file as the training set and 20% as the validation set. If "validation_file_ids" is set, this parameter has no effect. |
| warmup_ratio | 0.05 | Use the default value. | Float | Proportion of total training steps dedicated to learning rate warmup, during which the learning rate increases linearly from a small initial value to the configured rate. Limits parameter changes during early training, improving stability. Too high: equivalent to a low learning rate; performance may not change. Too low: equivalent to a high learning rate; may degrade performance. Does not apply to the "Constant" scheduler type. |
| weight_decay | 0.1 | Use the default value. | Float | L2 regularization strength. Helps maintain model generalization. If too high, fine-tuning effects are insignificant. |

Parameters for efficient SFT (efficient_sft only)

Note

When you perform a second round of efficient fine-tuning on a model that was already efficiently fine-tuned, lora_rank, lora_alpha, and lora_dropout must match the values used in the previous round.

| Parameter | Default | Recommended settings | Type | Description |
| --- | --- | --- | --- | --- |
| lora_rank | 8 | 64 | Integer | Rank of the low-rank matrices in LoRA. Higher ranks improve fine-tuning results but slightly slow down training. |
| lora_alpha | 32 | Use the default value. | Integer | Scaling factor that balances the original model weights against the LoRA corrections. Larger values give more weight to the LoRA corrections, making the model more task-specific; smaller values preserve more pre-trained knowledge. |
| lora_dropout | 0.1 | Use the default value. | Float | Dropout rate applied to the LoRA low-rank matrices. The recommended value enhances generalization. If too high, fine-tuning effects are insignificant. |

Parameters for publishing checkpoints

| Parameter | Default | Recommended settings | Type | Description |
| --- | --- | --- | --- | --- |
| save_strategy | epoch | Can be set to epoch or steps. When set to steps, set save_steps to adjust the saving interval. | String | Controls the interval and maximum number of checkpoints saved during fine-tuning. |
| save_steps | 50 | To modify it, set it to an integer multiple of eval_steps. | Integer | Number of training steps after which a checkpoint is saved. |
| save_total_limit | 1 | 10 | Integer | Maximum number of checkpoints to save for export. |

Query job details

Use the returned job_id to query the job status.

HTTP

curl 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json'

Input parameters

| Field | Type | Location | Required | Description |
| --- | --- | --- | --- | --- |
| job_id | String | Path Parameter | Yes | Job ID. |

Sample successful response

{
    "request_id": "d100cddb-ac85-4c82-bd5c-9b5421c5e94d",
    "output":
    {
        "job_id": "ft-202511272033-8ae7",
        "job_name": "ft-202511272033-8ae7",
        "status": "RUNNING",
        "finetuned_output": "qwen3-14b-ft-202511272033-8ae7",
        "model": "qwen3-14b",
        "base_model": "qwen3-14b",
        "training_file_ids":
        [
            "9e9ffdfa-c3bf-436e-9613-6f053c66aa6e"
        ],
        "validation_file_ids":
        [],
        "hyper_parameters":
        {
            "n_epochs": 1,
            "batch_size": 16,
            "learning_rate": "1.6e-5",
            "split": 0.9,
            "warmup_ratio": 0.0,
            "eval_steps": 1,
            "save_strategy": "epoch",
            "save_total_limit": 10
        },
        "training_type": "sft",
        "create_time": "2025-11-27 20:33:15",
        "workspace_id": "llm-8v53etv3hwb8orx1",
        "user_identity": "1654290265984853",
        "modifier": "1654290265984853",
        "creator": "1654290265984853",
        "group": "llm",
        "max_output_cnt": 10
    }
}

| Job status | Meaning |
| --- | --- |
| PENDING | Training is about to begin. |
| QUEUING | Job is queued. Only one job can run at a time. |
| RUNNING | Job is running. |
| CANCELING | Job is being canceled. |
| SUCCEEDED | Job succeeded. |
| FAILED | Job failed. |
| CANCELED | Job was canceled. |

Note

After training succeeds, finetuned_output contains the ID of the fine-tuned model, which you can use for model deployment.

Get job logs

HTTP

curl 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>/logs?offset=0&line=1000' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json' 

Use the offset and line parameters to retrieve a specific range of logs: offset specifies the starting position, and line specifies the maximum number of log entries to return.

Sample response:

{
    "request_id":"1100d073-4673-47df-aed8-c35b3108e968",
    "output":{
        "total":57,
        "logs":[
            "{Log output 1}",
            "{Log output 2}",
            ...
            ...
            ...
        ]
    }
}

Query and publish checkpoints

Only SFT fine-tuning (efficient_sft and sft) supports saving and publishing checkpoints of intermediate states.

Query checkpoints

curl 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>/checkpoints' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json'

Input parameters

| Field | Type | Location | Required | Description |
| --- | --- | --- | --- | --- |
| job_id | String | Path Parameter | Yes | Job ID. |

Sample successful response

{
    "request_id": "c11939b5-efa6-4639-97ae-ed4597984647",
    "output": [
        {
            "create_time": "2025-11-11T16:25:42",
            "full_name": "ft-202511272033-8ae7-checkpoint-20",
            "job_id": "ft-202511272033-8ae7",
            "checkpoint": "checkpoint-20",
            "model_name": "qwen3-14b-instruct-ft-202511272033-8ae7",
            "status": "SUCCEEDED"
        }
    ]
}

| Checkpoint publishing status | Description |
| --- | --- |
| PENDING | The checkpoint is pending export. |
| PROCESSING | The checkpoint is being exported. |
| SUCCEEDED | The checkpoint was exported. |
| FAILED | The checkpoint failed to export. |

Note

The checkpoint parameter refers to the checkpoint ID, which is used to specify the checkpoint to export in the model publishing API. The model_name parameter refers to the model ID, which can be used for model deployment. The finetuned_output parameter returns the model_name of the last checkpoint.

Publish a model

Note

You can export a checkpoint after fine-tuning is complete. Export the checkpoint before deploying the model in Model Studio.

Exported checkpoints are stored in cloud storage and cannot be accessed or downloaded.

curl --request GET 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>/export/<checkpoint_id>?model_name=<model_name>' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json'

Input parameters

| Field | Type | Location | Required | Description |
| --- | --- | --- | --- | --- |
| job_id | String | Path Parameter | Yes | Job ID. |
| checkpoint_id | String | Path Parameter | Yes | Checkpoint ID. |
| model_name | String | Query Parameter | Yes | Expected model ID after export. |

Sample successful response

{
    "request_id": "ed3faa41-6be3-4271-9b83-941b23680537",
    "output": true
}

Export is asynchronous. Monitor the export status by querying the checkpoint list.

Additional operations

List jobs

curl 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json' 

Cancel a job

Terminate a running fine-tuning job.
curl --request POST 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>/cancel' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json' 

Delete a job

Running jobs cannot be deleted.
curl --request DELETE 'https://dashscope-intl.aliyuncs.com/api/v1/fine-tunes/<job_id>' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY} \
--header 'Content-Type: application/json' 

Model deployment

Note

Fine-tuned models support only Model Unit deployment.

Go to the Model Deployment console (Singapore) to deploy a model. For more information about billing and other details, see Billing by usage duration (Model Unit).

Call the model

After deploying a model, call it through the OpenAI-compatible API, the DashScope API, or the Assistant SDK.

Set the model parameter to the model’s code. Go to the Model Deployment console (Singapore) to view the Model Code.

curl 'https://dashscope-intl.aliyuncs.com/api/v1/services/aigc/text-generation/generation' \
--header 'Authorization: Bearer '${DASHSCOPE_API_KEY}  \
--header 'Content-Type: application/json' \
--data '{
    "model": "<Replace with the model instance Code after successful deployment>",
    "input":{
        "messages":[
            {
                "role": "user",
                "content": "Who are you?"
            }
        ]
    },
    "parameters": {
        "result_format": "message"
    }
}'

FAQ

Can I upload and deploy my own models?

Uploading and deploying your own models is not currently supported. Follow the latest updates from Alibaba Cloud Model Studio.

However, Platform for AI (PAI) supports the deployment of your own models. See Deploy large language models in PAI-LLM.