
Platform for AI: Create an experiment

Last Updated: Apr 01, 2026

AutoML in Platform for AI (PAI) automates hyperparameter search across multiple trials. Define a search space and let AutoML find the optimal hyperparameter combination—no tuning code required.

How it works

Each experiment runs multiple trials in parallel or sequence. For each trial, AutoML selects a hyperparameter combination from your configured search space, then submits either a Deep Learning Containers (DLC) job or one or more MaxCompute jobs to train the model. After all trials complete, AutoML compares evaluation metrics across trials and identifies the optimal combination.
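
The trial loop described above can be sketched in a few lines of Python. The search space, command template, and scoring function here are illustrative stand-ins, not PAI APIs; a real trial submits a DLC or MaxCompute job and reads back its evaluation metric:

```python
import itertools
import random
from string import Template

# Illustrative search space; the names and values are examples, not a PAI schema.
search_space = {
    "batch_size": [32, 64, 128],
    "lr": [0.001, 0.01, 0.1],
}

# AutoML fills the ${...} placeholders in the startup command for each trial.
command_template = Template("python mnist.py --batch_size=${batch_size} --lr=${lr}")

def run_trial(params):
    """Stand-in for submitting a job with one hyperparameter combination."""
    command = command_template.substitute(params)  # e.g. "... --batch_size=64 --lr=0.01"
    return random.random()  # a real trial would run `command` and report its metric

# Exhaustive loop shown for simplicity; a search algorithm (TPE, GP, ...)
# would instead pick the next combination based on prior trial results.
best_score, best_params = float("-inf"), None
for values in itertools.product(*search_space.values()):
    params = dict(zip(search_space, values))
    score = run_trial(params)
    if score > best_score:  # Optimization: Maximize
        best_score, best_params = score, params
```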

For a deeper explanation, see How AutoML works.

Prerequisites

Before you begin, make sure you have:

If you plan to use DLC jobs, also complete the following:

If you plan to use MaxCompute jobs, also complete the following:

Create an experiment

  1. Log on to the PAI console.

  2. In the left-side navigation pane, click Workspaces. On the Workspaces page, click the name of the workspace you want to manage.

  3. In the left-side navigation pane, choose Model Training > AutoML.

  4. On the AutoML page, click Create Experiment.

  5. On the Create Experiment page, configure the following sections: Basic information, Execution configurations, Trial configuration, and Search configurations. Each section is described below.

  6. Click Submit.

The experiment appears in the experiment list.

Basic information

  • Name: The name of the experiment.

  • Description: A brief description to distinguish this experiment from others.

  • Visibility: Who can see the experiment. Visible to Me makes it visible only to your account and workspace administrators; Visible to Current Workspace makes it visible to all users in the workspace.

Execution configurations

Select a Job Type: DLC or MaxCompute.

DLC

DLC jobs use Deep Learning Containers to run the training process. Configure the following parameters:

  • Resource Group: The resource group for running DLC jobs. Select a public resource group or a dedicated resource group you have purchased. See Create a dedicated resource group and purchase general computing resources and Create resource quotas.

  • Framework: The training framework. Valid values: TensorFlow, PyTorch.

  • Datasets: The datasets to use for training.

  • Code: The code repository that contains the training script. DLC downloads the code to a specified working directory, so make sure you have access to the repository.

  • Node Image: The container image for worker nodes. Options: Alibaba Cloud Image (PAI-provided images for different resource types, Python versions, and deep learning frameworks; see Public images), Custom Image (images you have added to PAI), or Image Address (a Docker registry URL for a custom, community, or PAI image).

  • Instance Type: The compute instance type for the job. Pricing varies by instance type; see Billing of Deep Learning Containers (DLC).

  • Nodes: The number of compute nodes. Each node is billed separately, so factor in per-node costs when setting this value.

  • vCPUs, Memory (GiB), Shared Memory (GiB), GPUs: Available when you select a dedicated resource group. Set each value based on the purchased resource specifications.

  • Advanced Settings: Additional settings for PyTorch jobs to improve training flexibility. See Configure advanced settings.

  • Node Startup Command: The command to start each node. Include ${<hyperparameter-variable>} placeholders for each hyperparameter you want to tune.

  • Hyperparameter: The hyperparameter list, auto-populated from the variables in your startup command. For each hyperparameter, set Constraint Type and Search Space.

Node startup command example

Reference hyperparameter variables using ${variable_name} syntax. In the following example, ${batch_size} and ${lr} are the hyperparameter variables AutoML searches over:

python /mnt/data/examples/search/dlc_mnist/mnist.py \
  --data_dir=/mnt/data/examples/search/data \
  --save_model=/mnt/data/examples/search/model/model_${exp_id}_${trial_id} \
  --batch_size=${batch_size} \
  --lr=${lr} \
  --metric_filepath=/mnt/data/examples/search/metric/metric_${exp_id}_${trial_id}

After you enter the command, AutoML automatically loads batch_size and lr into the Hyperparameter list. Set the Constraint Type and Search Space for each.

  • Constraint Type: the constraint applied to the hyperparameter. Hover over the icon next to Constraint Type to view the supported types and their descriptions.

  • Search Space: the value range for the hyperparameter. The configuration method varies by constraint type. Click the icon and add values as prompted.
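
On the receiving end of that startup command, the training script parses the substituted hyperparameter values and writes its metric where AutoML can read it. A minimal sketch is below; the argument names mirror the example command above, but the sample argv and the JSON metric format are illustrative assumptions, not PAI requirements:

```python
import argparse
import json
import os
import tempfile

# Argument names match the example startup command; ${batch_size} and ${lr}
# arrive already substituted with concrete values.
parser = argparse.ArgumentParser()
parser.add_argument("--data_dir", required=True)
parser.add_argument("--save_model", required=True)
parser.add_argument("--batch_size", type=int, required=True)  # filled from ${batch_size}
parser.add_argument("--lr", type=float, required=True)        # filled from ${lr}
parser.add_argument("--metric_filepath", required=True)

# In the real job, argv comes from the node startup command; a sample argv
# keeps this sketch self-contained and runnable.
metric_path = os.path.join(tempfile.gettempdir(), "metric_exp0_trial0.json")
args = parser.parse_args([
    "--data_dir", "/tmp/data",
    "--save_model", "/tmp/model",
    "--batch_size", "64",
    "--lr", "0.01",
    "--metric_filepath", metric_path,
])

# ... training with args.batch_size and args.lr would run here ...
accuracy = 0.97  # placeholder result

# Write the evaluation metric for this trial; the JSON shape is an assumption.
with open(args.metric_filepath, "w") as f:
    json.dump({"accuracy": accuracy}, f)
```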

MaxCompute

MaxCompute jobs run SQL commands or PAI commands from Machine Learning Designer components to perform hyperparameter tuning. Configure the following parameters:

  • Command: The SQL command or PAI command to run. Include ${<hyperparameter-variable>} placeholders for each hyperparameter you want to tune. For configuration examples, see Appendix: References.

  • Hyperparameter: The hyperparameter list, auto-populated from the variables in your command. For each hyperparameter, set Constraint Type and Search Space.

Command example

In the following example, ${centerCount} and ${distanceType} are the hyperparameter variables:

pai -name kmeans
    -project algo_public
    -DinputTableName=pai_kmeans_test_input
    -DselectedColNames=f0,f1
    -DappendColNames=f0,f1
    -DcenterCount=${centerCount}
    -Dloop=10
    -Daccuracy=0.01
    -DdistanceType=${distanceType}
    -DinitCenterMethod=random
    -Dseed=1
    -DmodelName=pai_kmeans_test_output_model_${exp_id}_${trial_id}
    -DidxTableName=pai_kmeans_test_output_idx_${exp_id}_${trial_id}
    -DclusterCountTableName=pai_kmeans_test_output_couter_${exp_id}_${trial_id}
    -DcenterTableName=pai_kmeans_test_output_center_${exp_id}_${trial_id};

  • Constraint Type: the constraint applied to the hyperparameter. Hover over the icon next to Constraint Type to view the supported types and their descriptions.

  • Search Space: the value range for the hyperparameter. The configuration method varies by constraint type. Click the icon and add values as prompted.

Trial configuration

The trial configuration defines how AutoML evaluates each trial's result.

  • Metric Type: The source and format of the evaluation metric. Valid values: summary (a TensorFlow summary file in Object Storage Service (OSS)), table (a MaxCompute table), stdout (the job's standard output), json (a JSON file in OSS).

  • Method: How to compute the final metric from the intermediate values logged during the job. Valid values: final (use the last logged value), best (use the best logged value), avg (use the average of all logged values).

  • Metric Weight: When optimizing multiple metrics simultaneously, configure a name and weight for each metric. AutoML uses the weighted sum as the final score. Weights can be negative and do not need to sum to 1.

  • Metric Source: The location or identifier of the metric data. Configuration varies by Metric Type; see Metric source configuration by metric type below.

  • Optimization: Whether to maximize or minimize the final metric. Valid values: Maximize, Minimize.

  • Model Name: The path where the trained model is saved. The path must include ${exp_id}_${trial_id} to keep models from different trials separate. Example: oss://examplebucket/examples/search/pai/model/model_${exp_id}_${trial_id}
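
Concretely, the Method setting reduces each metric's logged values to a single number, and Metric Weight combines those numbers as a weighted sum. A minimal sketch of that arithmetic follows; the function names and the sample metrics are illustrative, not a PAI API:

```python
def reduce_metric(values, method, optimization="Maximize"):
    """Collapse the intermediate values logged during a job into one number."""
    if method == "final":
        return values[-1]                 # last logged value
    if method == "best":                  # best depends on the optimization direction
        return max(values) if optimization == "Maximize" else min(values)
    if method == "avg":
        return sum(values) / len(values)  # average of all logged values
    raise ValueError(f"unknown method: {method}")

def final_score(metrics, weights, method="final"):
    """Weighted sum across metrics; weights may be negative, need not sum to 1."""
    return sum(weights[name] * reduce_metric(vals, method)
               for name, vals in metrics.items())

# Two metrics logged over one trial: reward accuracy, penalize latency.
logged = {"accuracy": [0.81, 0.88, 0.92], "latency_ms": [120, 95, 90]}
weights = {"accuracy": 1.0, "latency_ms": -0.001}
score = final_score(logged, weights, method="final")  # 1.0*0.92 + (-0.001)*90
```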

Metric source configuration by metric type

  • summary or json: an OSS file path. Example: oss://examplebucket/examples/search/pai/model/model_${exp_id}_${trial_id}

  • table: a SQL statement that returns the metric value. Example: select GET_JSON_OBJECT(summary, '$.calinhara') as vrc from pai_ft_cluster_evaluation_out_${exp_id}_${trial_id}

  • stdout: a command keyword. Set to cmdx or cmdx;xxx (for example, cmd1;worker).

Search configurations

  • Search Algorithm: The algorithm AutoML uses to select the next hyperparameter combination based on prior trial results. See Supported search algorithms.

  • Maximum Trials: The total number of trials to run in the experiment.

  • Maximum Concurrent Trials: The number of trials that can run at the same time. Higher concurrency speeds up exploration but weakens sequential algorithms such as TPE and GP, which rely on prior results to guide the next selection. If you use a sequential algorithm, set this value to 1 or another small number to preserve the benefit of guided search.

Supported search algorithms

AutoML supports six search algorithms. Choose based on your dataset size, compute budget, and whether you want sequential or parallel exploration.

  • TPE (Tree-structured Parzen Estimator): A good default choice. Uses prior trial results to focus the search on promising regions of the hyperparameter space. Best for sequential exploration with a limited trial budget. Set Maximum Concurrent Trials to a low value (1–2) to preserve the benefit of guided search.

  • Random: Use when running a large number of parallel trials, or as a baseline. Each trial is independent, so it scales well with high concurrency.

  • GridSearch: Use when the search space is small and discrete. Exhaustively evaluates every combination.

  • Evolution: Use when the search space is large and complex. Applies evolutionary strategies to iteratively improve the hyperparameter population.

  • GP (Gaussian Process): Similar to TPE; uses a probabilistic model to balance exploration and exploitation. Suitable for small to medium search spaces. Like TPE, it works best with low concurrency.

  • PBT (Population Based Training): Use for training jobs that support periodic checkpointing. PBT dynamically reallocates resources and updates hyperparameters during training, rather than between trials.

For full details, see the "Supported search algorithms" section in Limits and usage notes of AutoML.
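
The contrast between exhaustive and sampled exploration can be shown in a few lines. The search space below reuses the hyperparameter names from the MaxCompute k-means example; this mimics only the trial-selection behavior of GridSearch and Random, not PAI's implementations:

```python
import itertools
import random

space = {"centerCount": [2, 4, 8], "distanceType": ["euclidean", "cosine"]}

# GridSearch: every combination, so the trial count is fixed by the space
# itself (3 x 2 = 6 here) and grows multiplicatively with each hyperparameter.
grid_trials = [dict(zip(space, combo))
               for combo in itertools.product(*space.values())]

# Random: each trial samples independently, so the trial count is whatever
# budget you set in Maximum Trials, and trials can run at high concurrency.
random.seed(0)
random_trials = [{k: random.choice(v) for k, v in space.items()}
                 for _ in range(4)]
```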

What's next

Appendix: References

The following example shows two chained MaxCompute commands for K-means clustering hyperparameter tuning, using the K-means Clustering and Clustering model evaluation components. The commands run in the listed order. For the full procedure, see Best practices for MaxCompute k-means clustering.

cmd1 — K-means clustering

pai -name kmeans
    -project algo_public
    -DinputTableName=pai_kmeans_test_input
    -DselectedColNames=f0,f1
    -DappendColNames=f0,f1
    -DcenterCount=${centerCount}
    -Dloop=10
    -Daccuracy=0.01
    -DdistanceType=${distanceType}
    -DinitCenterMethod=random
    -Dseed=1
    -DmodelName=pai_kmeans_test_output_model_${exp_id}_${trial_id}
    -DidxTableName=pai_kmeans_test_output_idx_${exp_id}_${trial_id}
    -DclusterCountTableName=pai_kmeans_test_output_couter_${exp_id}_${trial_id}
    -DcenterTableName=pai_kmeans_test_output_center_${exp_id}_${trial_id};

cmd2 — Clustering model evaluation

PAI -name cluster_evaluation
    -project algo_public
    -DinputTableName=pai_cluster_evaluation_test_input
    -DselectedColNames=f0,f1
    -DmodelName=pai_kmeans_test_output_model_${exp_id}_${trial_id}
    -DoutputTableName=pai_ft_cluster_evaluation_out_${exp_id}_${trial_id};
