MXNet is a deep learning framework that supports imperative and symbolic programming. You can run MXNet on CPU or GPU clusters. This topic describes how to use the MXNet component in Machine Learning Platform for AI (PAI).
Configure the component
- Machine Learning Platform for AI console
| Tab | Parameter | Description |
| --- | --- | --- |
| Parameters Setting | Python Code Files | The program execution file. Multiple files can be packaged into a TAR.GZ file and then uploaded. |
| Parameters Setting | Primary Python File | The primary file in a code file package. |
| Parameters Setting | Data Source Directory | The path to data sources in Object Storage Service (OSS). |
| Parameters Setting | Configuration File Hyperparameters and Custom Parameters | MXNet allows you to use commands to pass in hyperparameter settings. You can test different learning rates and batch sizes during model experiments. |
| Parameters Setting | Output Directory | The output directory of the model. |
| Parameters Setting | Limit Job Runtime | After you select this option, you can enter the maximum scheduled time of job execution. Valid values: 1 to 168. Unit: hours. |
| Tuning | GPUs | The number of GPUs. Default value: 1. |
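The following sketch illustrates how a primary Python file might read hyperparameters that are passed in as command-line arguments. It assumes that the settings reach the script as flags. The flag names (`--lr`, `--batch-size`, `--num-epochs`), their default values, and the function name `parse_hyperparameters` are hypothetical and are not defined by PAI.

```python
import argparse


def parse_hyperparameters(argv=None):
    """Parse hyperparameters passed to the entry script on the command line."""
    parser = argparse.ArgumentParser(description="Hypothetical MXNet training entry point")
    # Flag names and defaults are illustrative assumptions, not PAI-defined names.
    parser.add_argument("--lr", type=float, default=0.05, help="learning rate")
    parser.add_argument("--batch-size", type=int, default=128, help="mini-batch size")
    parser.add_argument("--num-epochs", type=int, default=10, help="number of training epochs")
    # parse_known_args ignores any extra arguments instead of exiting with an error.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Example: try a smaller learning rate and batch size for one experiment.
args = parse_hyperparameters(["--lr", "0.01", "--batch-size", "64"])
print(f"lr={args.lr} batch_size={args.batch_size} num_epochs={args.num_epochs}")
```

Keeping the defaults in the script means that an experiment only needs to pass in the values that it changes.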
- PAI command
PAI -name mxnet_ext
    -Dscript="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-code/mxnet_cifar10_demo.tar.gz"
    -project algo_public_dev
    -DentryFile="train_cifar10.py"
    -Dbuckets="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com"
    -DcheckpointDir="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-model/"
    -DhyperParameters="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-code/hyperparam.txt.single"
    -Darn="acs:ram::1664081855183111:role/role-for-pai";

You do not need to configure all parameters. We recommend that you do not copy the preceding command directly. The following table describes the parameters.
| Parameter | Required | Description | Default value |
| --- | --- | --- | --- |
| script | Yes | The MXNet algorithm file. This file can be a single file or a TAR.GZ file. | oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/smoke_mxnet/mnist_ext.py |
| entryFile | No | The entry file for the algorithm. If the script is a TAR.GZ file, this parameter is required. | No default value |
| buckets | No | The input bucket. Separate multiple buckets with commas (,). Each bucket must end with a forward slash (/). | No default value |
| hyperParameters | No | The path of the command line hyperparameters. | No default value |
| gpuRequired | No | The number of used GPUs. | 100 |
| checkpointDir | No | The directory of the checkpoint. | No default value |
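The rules in the parameter table can be checked before a command is submitted. The following sketch assembles a command string from a dictionary and enforces three of those rules: script is required, entryFile is required when the script is a TAR.GZ file, and each bucket must end with a forward slash (/). The helper name `build_mxnet_command` and the placeholder paths are hypothetical. The sketch also assumes that the order of the parameters in the command does not matter.

```python
def build_mxnet_command(params, project=None):
    """Assemble a PAI mxnet_ext command string from a parameter dictionary."""
    script = params.get("script")
    if not script:
        raise ValueError("script is required")
    # A TAR.GZ package needs an entry file so the job knows what to run.
    if script.endswith(".tar.gz") and not params.get("entryFile"):
        raise ValueError("entryFile is required when script is a TAR.GZ file")
    # Buckets are comma-separated, and each one must end with "/".
    buckets = params.get("buckets")
    if buckets:
        for bucket in buckets.split(","):
            if not bucket.endswith("/"):
                raise ValueError(f"bucket must end with '/': {bucket}")
    parts = ["PAI -name mxnet_ext"]
    if project:
        parts.append(f"-project {project}")
    for key, value in params.items():
        parts.append(f'-D{key}="{value}"')
    return " ".join(parts) + ";"


# Placeholder values for illustration only; replace them with your own paths.
command = build_mxnet_command(
    {
        "script": "oss://your-bucket.example/code/demo.tar.gz",
        "entryFile": "train.py",
        "buckets": "oss://your-bucket.example/",
    },
    project="your_project",
)
print(command)
```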
- Upload the Python execution file and training dataset to OSS. In this example, a bucket named tfmnist is created in the China (Shanghai) region.
- Drag an MXNet component to the canvas and connect it to a Read File Data component. Specify the region of the OSS bucket and complete RAM authorization.
- Configure the MXNet component parameters. Set the paths of the Python execution file and the data source file based on the following figure.
  - Select a TAR.GZ file for Python Code Files.
  - Select the entry file in the TAR.GZ file for Primary Python File.
  - Select a file in the .txt.single format for Configuration File Hyperparameters and Custom Parameters.
  - The checkpoint directory is the output directory of the model.
- Click Run in the upper-left corner of the canvas and wait for the task to complete.
- Right-click the MXNet component and select View Log to view the operational logs.
- The following figure shows the model generated in the checkpoint directory.
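Before the upload in the first step, multiple Python files can be packaged into one TAR.GZ file. The following sketch uses the Python standard library `tarfile` module to do this. The function name `package_code` and the directory and file names in the usage comment are hypothetical. The sketch also assumes that the entry file should sit at the root of the archive, which this topic does not state.

```python
import os
import tarfile


def package_code(source_dir, archive_path):
    """Pack every .py file under source_dir into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in sorted(files):
                if name.endswith(".py"):
                    full_path = os.path.join(root, name)
                    # Store paths relative to source_dir so that top-level
                    # files land at the archive root (an assumption).
                    relative_path = os.path.relpath(full_path, source_dir)
                    archive.add(full_path, arcname=relative_path)
    # Reopen the archive and return its member names for a quick check.
    with tarfile.open(archive_path, "r:gz") as archive:
        return sorted(archive.getnames())


# Usage (hypothetical paths):
# names = package_code("mxnet_demo", "mxnet_demo.tar.gz")
# print(names)
```

The returned list makes it easy to confirm that the file to be selected for Primary Python File is in the package before the package is uploaded.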