PAI-TensorFlow allows you to configure hyperparameters by using a TXT file or by passing them as command parameters. This makes it easy to try different learning rates and batch sizes when you test a model.
GPU-accelerated servers will be phased out. You can submit TensorFlow tasks that run on CPU servers. If you want to use GPU-accelerated instances for model training, go to Deep Learning Containers (DLC) to submit jobs. For more information, see Submit training jobs.
Hyperparameter files
You can configure hyperparameters by using a local file. The following example shows the format of the file, with one key=value pair per line:
batch_size=10
learning_rate=0.01
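Before you upload a hyperparameter file, it can help to check locally that every line follows the key=value format. The following is a minimal sketch in plain Python. The helper name parse_hyperparameters is hypothetical and is not part of PAI-TensorFlow, and the way PAI itself parses the file is not described here.

```python
def parse_hyperparameters(text):
    """Parse key=value lines into a dict of strings (hypothetical helper)."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines.
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: %r" % line)
        params[key.strip()] = value.strip()
    return params


# Same content as the sample file above.
sample = "batch_size=10\nlearning_rate=0.01\n"
params = parse_hyperparameters(sample)
print(params)
```

All values stay strings at this stage, which mirrors how the sample training script reads them.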
PAI-TensorFlow SDK for Python provides the interface required to obtain the hyperparameters. Define the hyperparameters as flags in the model training script, and then use tf.app.flags.FLAGS to read the values that are passed to the running script. Procedure:
The hyperparameter file is stored in oss://xxx.oss-cn-beijing.aliyuncs.com/tf/hyper_para.txt. Use the following sample code to read hyperparameters:
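import tensorflow as tf

# Define each hyperparameter as a string flag. The flag names here match the keys in the hyperparameter file.
tf.app.flags.DEFINE_string("learning_rate", "", "learning_rate")
tf.app.flags.DEFINE_string("batch_size", "", "batch size")
FLAGS = tf.app.flags.FLAGS

# Both values are strings, so they can be concatenated directly.
print("learning rate:" + FLAGS.learning_rate)
print("batch size:" + FLAGS.batch_size)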
Use -DhyperParameters to specify the path of the hyperparameter file so that the hyperparameters are passed to the script that is running. Example:
pai -name tensorflow1120_ext
-Dscript='oss://xxx.oss-cn-beijing.aliyuncs.com/tf/hello_hyperpara.py'
-Dbuckets='oss://xxx.oss-cn-beijing.aliyuncs.com/'
-DhyperParameters='oss://xxx.oss-cn-beijing.aliyuncs.com/tf/hyper_para.txt'
-Darn='acs:ram::111***:role/***role';
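Because the sample script defines both flags with DEFINE_string, the values arrive as strings and usually need to be converted before they are used in training. The following is a minimal sketch of that conversion in plain Python. The helper name convert_hyperparameters is hypothetical, and the literal inputs stand in for FLAGS.learning_rate and FLAGS.batch_size.

```python
def convert_hyperparameters(learning_rate_str, batch_size_str):
    """Convert string flag values to numeric types (hypothetical helper)."""
    # float() and int() raise ValueError if a value is empty or malformed,
    # which surfaces a bad hyperparameter file early.
    return float(learning_rate_str), int(batch_size_str)


# Stand-ins for FLAGS.learning_rate and FLAGS.batch_size.
learning_rate, batch_size = convert_hyperparameters("0.01", "10")
print(learning_rate, batch_size)
```

An alternative is to declare the flags with a numeric type in the script. The string approach shown in the sample keeps the flag definitions uniform and leaves the conversion explicit.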
STRING-type parameters
You can pass PAI-TensorFlow parameters as a string by using userDefinedParameters. Example:
pai -name tensorflow1120_ext
-Dscript='oss://xxx.oss-cn-beijing.aliyuncs.com/tf/hello_hyperpara.py'
-Dbuckets='oss://xxx.oss-cn-beijing.aliyuncs.com/'
-DuserDefinedParameters="--batch_size=10 --learning_rate=0.01"
-Darn='acs:ram::111***:role/***role';
Each input parameter in the string is a key-value pair prefixed with "--", and the pairs are separated by spaces.
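To illustrate the format, the following sketch parses the same string with the argparse module of the Python standard library instead of tf.app.flags, so that it runs without TensorFlow. It assumes that the string reaches the script as ordinary command-line arguments, which is not spelled out here. The function name parse_user_defined and the default values are hypothetical.

```python
import argparse
import shlex


def parse_user_defined(param_string):
    """Parse a "--key=value --key=value" string (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # Defaults are illustrative only.
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    # parse_known_args ignores any extra arguments the script does not declare.
    args, _unknown = parser.parse_known_args(shlex.split(param_string))
    return args


# Same value as -DuserDefinedParameters in the example above.
args = parse_user_defined("--batch_size=10 --learning_rate=0.01")
print(args.batch_size, args.learning_rate)
```

Declaring a type for each argument performs the string-to-number conversion during parsing, so the rest of the script can use the values directly.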