MXNet is a deep learning framework that supports imperative and symbolic programming. You can run MXNet on CPU or GPU clusters. This topic describes how to use the MXNet component in Machine Learning Platform for AI (PAI).

Configure the component

You can configure the component by using one of the following methods:
  • Machine Learning Platform for AI console
    Tab | Parameter | Description
    Parameters Setting | Python Code Files | The program execution file. Multiple files can be packaged into a TAR.GZ file and then uploaded.
    Parameters Setting | Primary Python File | The primary file in a code file package.
    Parameters Setting | Data Source Directory | The path to data sources in Object Storage Service (OSS).
    Parameters Setting | Configuration File Hyperparameters and Custom Parameters | MXNet allows you to pass in hyperparameter settings as command-line arguments, so you can test different learning rates and batch sizes during model experiments.
    Parameters Setting | Output Directory | The output directory of the model.
    Parameters Setting | Limit Job Runtime | After you select this option, you can enter the maximum runtime of the job. Valid values: 1 to 168. Unit: hours.
    Tuning | GPUs | The number of GPUs. Default value: 1.
  • PAI command
    PAI -name mxnet_ext
        -Dscript="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-code/mxnet_cifar10_demo.tar.gz"
        -project algo_public_dev
        -DentryFile="train_cifar10.py"
        -Dbuckets="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com"
        -DcheckpointDir="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-model/"
        -DhyperParameters="oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/mxnet-ext-code/hyperparam.txt.single"
        -Darn="acs:ram::1664081855183111:role/role-for-pai";
    You do not need to configure all parameters. The preceding command is only a sample, so we recommend that you replace its values with your own instead of copying it directly. The following table describes the parameters.
    Parameter | Required | Description | Default value
    script | Yes | The MXNet algorithm file. This file can be a single file or TAR.GZ file. | oss://imagenet.oss-cn-shanghai-internal.aliyuncs.com/smoke_mxnet/mnist_ext.py
    entryFile | No | The entry file for the algorithm. If the script is a TAR.GZ file, this parameter is required. | No default value
    buckets | No | The input bucket. Separate multiple buckets with commas (,). Each bucket must end with a forward slash (/). | No default value
    hyperParameters | No | The path of the command line hyperparameters. | No default value
    gpuRequired | No | The number of used GPUs. | 100
    checkpointDir | No | The directory of the checkpoint. | No default value
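The rules in the preceding table can be checked before a command is submitted. The following Python sketch is only an illustration: the helper names validate_params and build_pai_command are hypothetical and are not part of PAI, the arn parameter is taken from the sample command rather than from the table, and the bucket name in the usage example is a placeholder.

```python
from typing import Dict, List, Optional

# Parameter names from the table above, plus arn from the sample command.
KNOWN_PARAMS = ("script", "entryFile", "buckets", "hyperParameters",
                "gpuRequired", "checkpointDir", "arn")


def validate_params(params: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for key in params:
        if key not in KNOWN_PARAMS:
            problems.append(f"unknown parameter: {key}")
    script = params.get("script", "")
    if not script:
        problems.append("script is required")
    elif script.endswith(".tar.gz") and not params.get("entryFile"):
        problems.append("entryFile is required when script is a TAR.GZ file")
    # Buckets are comma-separated and each one must end with a forward slash.
    for bucket in filter(None, params.get("buckets", "").split(",")):
        if not bucket.endswith("/"):
            problems.append(f"bucket must end with '/': {bucket}")
    return problems


def build_pai_command(params: Dict[str, str], project: Optional[str] = None) -> str:
    """Assemble the command text after validation (hypothetical helper)."""
    problems = validate_params(params)
    if problems:
        raise ValueError("; ".join(problems))
    parts = ["PAI -name mxnet_ext"]
    if project:
        parts.append(f"-project {project}")
    parts += [f'-D{key}="{params[key]}"' for key in KNOWN_PARAMS if key in params]
    return "\n    ".join(parts) + ";"


if __name__ == "__main__":
    # Placeholder bucket; replace it with your own OSS bucket.
    base = "oss://example-bucket.oss-cn-shanghai-internal.aliyuncs.com/"
    print(build_pai_command({
        "script": base + "code/demo.tar.gz",
        "entryFile": "train_cifar10.py",
        "buckets": base,
        "checkpointDir": base + "model/",
    }))
```

The sketch only assembles and checks the command text. It does not submit anything to PAI.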

Example

The CIFAR-10 dataset is provided by MXNet and contains 60,000 32 × 32 color images in 10 different categories. This dataset is commonly used to train machine learning algorithms to recognize objects and sort them into airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, or trucks. For more information, see The CIFAR-10 dataset.
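As a quick sanity check on these figures, the following sketch works out the size of one image and of the whole dataset. It assumes 3 color channels with one byte per channel value. The class names follow the dataset's own label order, in which automobile corresponds to cars in the preceding paragraph.

```python
# CIFAR-10 class names in label order (index 0 to 9).
CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
           "dog", "frog", "horse", "ship", "truck"]

NUM_IMAGES = 60_000
HEIGHT, WIDTH, CHANNELS = 32, 32, 3  # 3 channels: red, green, blue

# Assumes one byte (uint8) per channel value.
bytes_per_image = HEIGHT * WIDTH * CHANNELS
images_per_class = NUM_IMAGES // len(CLASSES)
total_megabytes = NUM_IMAGES * bytes_per_image / 1_000_000

if __name__ == "__main__":
    print(bytes_per_image, images_per_class, total_megabytes)
```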
  1. Upload the Python execution file and training dataset to OSS. In this example, a bucket named tfmnist is created in the China (Shanghai) region.
     [Figure: Data source preparation]
  2. Drag an MXNet component to the canvas and connect it to a Read File Data component. Specify the region for the OSS bucket and complete RAM authorization.
     [Figure: Example]
  3. Configure MXNet component parameters. You can set the paths of the Python execution file and the data source file based on the following figure.
     [Figure: Configure MXNet component parameters]
    • Select a TAR.GZ file for Python Code Files.
    • Select an entry file in the TAR.GZ file for Primary Python File.
    • Select a file in the .txt.single format for Configuration File Hyperparameters and Custom Parameters.
    • The checkpoint directory is the output directory of the model.
  4. Click Run in the upper-left corner of the canvas and wait for the job to complete.
  5. Right-click the MXNet component and select View Log to view the operational logs.
     [Figure: Operational log]
  6. The following figure shows the model generated in the checkpoint directory.
     [Figure: Results model]
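The TAR.GZ package selected in step 3 can be built with the tarfile module of the Python standard library. The following sketch is a minimal illustration: the helper names package_code and has_entry_file are hypothetical, and storing the entry file at the root of the archive is an assumption based on the sample value train_cifar10.py, not a documented requirement.

```python
import os
import tarfile
from typing import List


def package_code(src_dir: str, archive_path: str) -> List[str]:
    """Pack every file under src_dir into a TAR.GZ archive and return the member names."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for root, _dirs, files in os.walk(src_dir):
            for fname in sorted(files):
                full_path = os.path.join(root, fname)
                # Store paths relative to src_dir so that top-level files
                # sit at the archive root (an assumption, see above).
                tar.add(full_path, arcname=os.path.relpath(full_path, src_dir))
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()


def has_entry_file(archive_path: str, entry_file: str) -> bool:
    """Check that the file to use as Primary Python File exists in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return entry_file in tar.getnames()
```

Write the archive to a location outside src_dir so that it is not packed into itself. Calling has_entry_file before the upload catches a mismatch between the package and the Primary Python File setting early.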