Platform for AI: Before you begin

Last Updated: Feb 01, 2024

This topic describes what you need to prepare before submitting training jobs, including compute resources, an image, a dataset, and a code build. Platform for AI (PAI) allows you to specify datasets stored in Apsara File Storage NAS (NAS) file systems, Cloud Parallel File Storage (CPFS) file systems, or Object Storage Service (OSS) buckets and code builds stored in Git repositories.

Prerequisites

If you use OSS to store data, make sure that the role that you use is granted the permissions to access OSS. Otherwise, I/O errors may occur when the system accesses the data stored in your OSS bucket. For more information about how to grant a service-linked role the permissions to access OSS, see Grant the permissions that are required to use DLC.
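A permission problem is easier to diagnose before a job starts than in the middle of training. The following is a minimal sketch of a pre-flight check, assuming the dataset is already mounted into the container at a local path. The function name `check_readable` and the example path `/mnt/data` are hypothetical and are not part of PAI.

```python
import os


def _raise(error):
    # os.walk silently skips unreadable directories unless told otherwise.
    raise error


def check_readable(mount_path, sample_bytes=1):
    """Return the first file under mount_path that can be read.

    Listing the directory and reading a byte both touch the storage
    backend, so a missing permission surfaces here as an OSError.
    """
    for root, _dirs, files in os.walk(mount_path, onerror=_raise):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                handle.read(sample_bytes)
            return path
    raise FileNotFoundError(f"no files found under {mount_path}")
```

For example, calling `check_readable("/mnt/data")` at the top of a training script stops the job immediately if the mount cannot be listed or read.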

Limits

OSS is a distributed object storage service, not a file system. When you use OSS to store data, some file system features are not supported. For example, you cannot append data to existing files in OSS buckets or overwrite them.
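One way to work within this limit is to treat every output as a new, complete file: build it on local disk, then copy it once to a name that has not been used before. The sketch below illustrates that pattern. It assumes the OSS bucket is mounted at a local directory passed as `output_dir`. The function name `save_checkpoint` and the file naming scheme are hypothetical.

```python
import os
import shutil
import tempfile


def save_checkpoint(data, output_dir, step):
    """Write `data` (bytes) as a new file named after the training step."""
    target = os.path.join(output_dir, f"checkpoint-{step:08d}.bin")
    if os.path.exists(target):
        # Refuse to overwrite an existing file; use a new step number instead.
        raise FileExistsError(target)
    # Build the file on local disk, where appends and rewrites are allowed.
    with tempfile.NamedTemporaryFile(delete=False) as local:
        local.write(data)
        local_path = local.name
    try:
        # Transfer the finished file in a single sequential copy.
        shutil.copyfile(local_path, target)
    finally:
        os.remove(local_path)
    return target
```

Logs that would normally be appended to can be handled the same way, for example by writing one file per epoch.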

Step 1: Prepare resources

Before you submit a training job, you need to prepare computing resources for the training. Select one of the following resources:

  • The public resource group

    After you complete Deep Learning Containers (DLC) authorization, the system automatically prepares a public resource group for you. You do not need to manually create a resource group. For more information, see Grant the permissions that are required to use DLC. You can select the public resource group when you configure a job on the Create Job page in your workspace.

  • General computing resources

    You can create a dedicated resource group, purchase the required general computing resources, and allocate computing resources in the dedicated resource group by creating resource quotas and associating them with workspaces. After you associate a resource quota with a workspace, you can use the resource quota to run training jobs in the workspace. For more information, see Resource quota for general computing resources.

  • Intelligent computing LINGJUN resources

    If you want to leverage the high performance offered by LINGJUN resources, you need to prepare the intelligent computing LINGJUN resources for the training jobs and associate the resources with the workspace. For more information, see Resource quota for intelligent computing LINGJUN resources.

Step 2: Prepare an image

Before you submit a training job, you need to prepare the image for the training environment. Select one of the following image types:

  • Community image: If you use a general development environment, you can select a public standard image from open source communities without further configuration.

  • Alibaba Cloud image: PAI provides official images based on different frameworks that are optimized for Alibaba Cloud services. These images are suitable for training jobs that use Alibaba Cloud services and help you achieve improved compatibility and performance.

  • Custom image: If you have specific requirements on training environments or dependencies, you can create a custom image to meet your business requirements.

The following table lists the available community images and Alibaba Cloud images when you submit a distributed training job.

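| Type | Framework | Image |
| --- | --- | --- |
| Community image | TensorFlow | tensorflow-training:2.3-cpu-py36-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:2.3-gpu-py36-cu101-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:1.15-cpu-py36-ubuntu18.04 |
| Community image | TensorFlow | tensorflow-training:1.15-gpu-py36-cu100-ubuntu18.04 |
| Community image | PyTorch | pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| Community image | PyTorch | pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| Alibaba Cloud image | TensorFlow | tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| Alibaba Cloud image | PyTorch | pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| Alibaba Cloud image | PyTorch | pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| Alibaba Cloud image | PyTorch | pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| Alibaba Cloud image | PyTorch | pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |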
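The image names above appear to follow one pattern: framework, framework version (with a PAI suffix for Alibaba Cloud images), CPU/GPU variant, Python version, an optional CUDA version, and the operating system. The sketch below splits a name into those fields. The field layout is inferred from the names listed in this topic and is an assumption rather than a documented specification. `parse_image` is a hypothetical helper.

```python
import re

# Layout inferred from the listed names, for example:
#   tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04
#   pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04
TAG_PATTERN = re.compile(
    r"^(?P<framework>[a-z]+)-training:"
    r"(?P<version>[0-9.]+(?:PAI[0-9]*)?)-"
    r"(?P<device>mkl-cpu|cpu|gpu)-"
    r"py(?P<python>[0-9]+)-"
    r"(?:cu(?P<cuda>[0-9]+)-)?"
    r"(?P<os>ubuntu[0-9.]+)$"
)


def parse_image(image):
    """Split an image name (or full image URL) into its fields."""
    name = image.rsplit("/", 1)[-1]  # drop any registry prefix
    match = TAG_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized image name: {image}")
    fields = match.groupdict()
    py = fields["python"]
    fields["python"] = f"{py[0]}.{py[1:]}"  # "36" becomes "3.6"
    if fields["cuda"] is not None:
        cu = fields["cuda"]
        fields["cuda"] = f"{cu[:-1]}.{cu[-1]}"  # "101" becomes "10.1"
    return fields
```

A helper like this can be used to filter a list of images, for example to keep only GPU images built for a given CUDA version.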

Community image

Images

Standard images provided by the community. They support resources of various types.

registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04

Replace ${region} with a specific region. Example values:

  • cn-hangzhou

  • cn-shanghai

  • cn-qingdao

  • cn-beijing

  • cn-zhangjiakou

  • cn-huhehaote

  • cn-shenzhen

  • cn-chengdu

  • cn-hongkong

  • ap-southeast-1
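The substitution can also be scripted. The following sketch fills in the region with Python's `string.Template`, which uses the same `${region}` placeholder syntax. The `image_url` helper is hypothetical. The `registry-vpc` host is taken from the cn-hangzhou URLs listed in this topic, and its use for other regions is an assumption.

```python
from string import Template

# Same shape as the image URLs listed in this topic.
IMAGE_TEMPLATE = Template("${host}.${region}.aliyuncs.com/pai-dlc/${image}")


def image_url(region, image, vpc=False):
    """Build the full image URL for a region.

    Set vpc=True to use the registry-vpc host instead of the
    public registry host.
    """
    host = "registry-vpc" if vpc else "registry"
    return IMAGE_TEMPLATE.substitute(host=host, region=region, image=image)


if __name__ == "__main__":
    print(image_url("cn-shanghai", "pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04"))
```

The helper does not validate the region, because the values above are examples and not an exhaustive list.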

The following table lists the URLs of the community images when ${region} is set to cn-hangzhou.

| ${region} | Framework | CPU/GPU | Python version | Image URL |
| --- | --- | --- | --- | --- |
| cn-hangzhou | TensorFlow 2.3 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 2.3 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:2.3.0-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4-gpu-py36-cu100-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.7 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.7 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04 |

Image versions

This section describes the operating systems, Python versions, and third-party libraries supported by each community image.

  • tensorflow-training:2.3-cpu-py36-ubuntu18.04

    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      asn1crypto 0.24.0

      astunparse 1.6.3

      cachetools 4.2.0

      certifi 2020.12.5

      cryptography 2.1.4

      gast 0.3.3

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      oauthlib 3.1.0

      opt-einsum 3.3.0

      pip 20.2.4

      protobuf 3.14.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      requests 2.25.1

      requests-oauthlib 1.3.0

      rsa 4.6

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.15.0

      tensorboard 2.4.0

      tensorboard-plugin-wit 1.7.0

      tensorflow 2.3.2

      tensorflow-estimator 2.3.0

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      urllib3 1.26.2

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:2.3-gpu-py36-cu101-ubuntu18.04

    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • CUDA version: 10.1

    • Third-party libraries: The following table lists the supported third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      asn1crypto 0.24.0

      astunparse 1.6.3

      cachetools 4.2.0

      certifi 2020.12.5

      cryptography 2.1.4

      grpcio 1.34.0

      gast 0.3.3

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      google-pasta 0.2.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      keyrings.alt 3.0

      keyring 10.6.0

      Markdown 3.3.3

      numpy 1.18.5

      oauthlib 3.1.0

      opt-einsum 3.3.0

      python-apt 1.6.5+ubuntu0.5

      pip 20.2.4

      protobuf 3.14.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      requests 2.25.1

      requests-oauthlib 1.3.0

      rsa 4.6

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.15.0

      tensorboard 2.4.0

      tensorboard-plugin-wit 1.7.0

      tensorflow-gpu 2.3.2

      tensorflow-estimator 2.3.0

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      urllib3 1.26.2

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:1.15-cpu-py36-ubuntu18.04

    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      asn1crypto 0.24.0

      astor 0.8.1

      cryptography 2.1.4

      gast 0.2.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      Keras-Applications 1.0.8

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      opt-einsum 3.3.0

      pip 20.3.3

      protobuf 3.14.0

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.11.0

      tensorboard 1.15.0

      tensorflow 1.15.5

      tensorflow-estimator 1.15.1

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

  • tensorflow-training:1.15-gpu-py36-cu100-ubuntu18.04

    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.6.9

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      asn1crypto 0.24.0

      astor 0.8.1

      cryptography 2.1.4

      gast 0.2.2

      google-pasta 0.2.0

      grpcio 1.34.0

      h5py 2.10.0

      idna 2.6

      importlib-metadata 3.3.0

      Keras-Preprocessing 1.1.2

      Keras-Applications 1.0.8

      keyring 10.6.0

      keyrings.alt 3.0

      Markdown 3.3.3

      numpy 1.18.5

      opt-einsum 3.3.0

      pip 20.3.3

      protobuf 3.14.0

      pycrypto 2.6.1

      pygobject 3.26.1

      pyxdg 0.25

      SecretStorage 2.3.1

      setuptools 51.1.1

      six 1.11.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.5

      tensorflow-estimator 1.15.1

      termcolor 1.1.0

      typing-extensions 3.7.4.3

      werkzeug 1.0.1

      wheel 0.30.0

      wrapt 1.12.1

      zipp 3.4.0

      python-apt 1.6.5+ubuntu0.5

  • pytorch-training:1.6.0-gpu-py37-cu101-ubuntu18.04

    • Operating system: Ubuntu 18.04.4 LTS

    • Python version: 3.7.7

    • CUDA version: 10.1

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      backcall 0.2.0

      beautifulsoup4 4.9.1

      certifi 2020.6.20

      cffi 1.14.0

      cryptography 2.9.2

      conda 4.8.3

      conda-build 3.18.11

      conda-package-handling 1.7.0

      decorator 4.4.2

      filelock 3.0.12

      glob2 0.7

      ipython-genutils 0.2.0

      idna 2.9

      ipython 7.16.1

      jedi 0.17.1

      Jinja2 2.11.2

      libarchive-c 2.9

      MarkupSafe 1.1.1

      mkl-fft 1.1.0

      mkl-service 2.3.0

      mkl-random 1.1.1

      numpy 1.18.5

      olefile 0.46

      PyYAML 5.3.1

      parso 0.7.0

      pexpect 4.8.0

      pickleshare 0.7.5

      Pillow 7.2.0

      pip 20.0.2

      pkginfo 1.5.0.1

      prompt-toolkit 3.0.5

      psutil 5.7.0

      ptyprocess 0.6.0

      pycosat 0.6.3

      pycparser 2.20

      Pygments 2.6.1

      pyOpenSSL 19.1.0

      PySocks 1.7.1

      pytz 2020.1

      ruamel-yaml 0.15.87

      requests 2.23.0

      soupsieve 2.0.1

      setuptools 46.4.0.post20200518

      six 1.14.0

      traitlets 4.3.3

      torch 1.6.0

      torchvision 0.7.0

      tqdm 4.46.0

      urllib3 1.25.8

      wheel 0.34.2

      wcwidth 0.2.5

  • pytorch-training:1.7.1-gpu-py37-cu110-ubuntu18.04

    • Operating system: Ubuntu 18.04.5 LTS

    • Python version: 3.8.5

    • CUDA version: 11.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      backcall 0.2.0

      beautifulsoup4 4.9.3

      brotlipy 0.7.0

      certifi 2020.12.5

      cffi 1.14.3

      cryptography 3.2.1

      conda 4.9.2

      conda-build 3.21.4

      conda-package-handling 1.7.2

      dnspython 2.1.0

      decorator 4.4.2

      filelock 3.0.12

      glob2 0.7

      ipython-genutils 0.2.0

      idna 2.10

      ipython 7.19.0

      Jinja2 2.11.2

      jedi 0.17.2

      libarchive-c 2.9

      mkl-service 2.3.0

      MarkupSafe 1.1.1

      mkl-fft 1.2.0

      mkl-random 1.1.1

      numpy 1.19.2

      olefile 0.46

      PyYAML 5.3.1

      parso 0.7.0

      pexpect 4.8.0

      pickleshare 0.7.5

      Pillow 8.1.0

      pip 20.2.4

      pkginfo 1.7.0

      prompt-toolkit 3.0.8

      psutil 5.7.2

      ptyprocess 0.7.0

      pycosat 0.6.3

      pycparser 2.20

      Pygments 2.7.4

      pyOpenSSL 19.1.0

      PySocks 1.7.1

      python-etcd 0.4.5

      pytz 2020.5

      ruamel-yaml 0.15.87

      requests 2.24.0

      soupsieve 2.1

      setuptools 50.3.1.post20201107

      six 1.15.0

      typing-extensions 3.7.4.3

      torch 1.7.1

      torchelastic 0.2.1

      torchvision 0.8.2

      tqdm 4.51.0

      traitlets 5.0.5

      urllib3 1.25.11

      wheel 0.35.1

      wcwidth 0.2.5
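The version listings above can be compared against a running container to confirm that the environment matches. The sketch below parses lines such as `numpy 1.18.5` and reports differences from the installed packages. The helper names are hypothetical. On Python 3.6 and 3.7 the code falls back to the `importlib-metadata` backport, which this topic lists for the TensorFlow community images. Whether the backport is present in other images is an assumption to verify.

```python
import re

try:
    from importlib import metadata  # standard library in Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # backport for Python 3.6/3.7


def _normalize(name):
    # PEP 503 style: case-insensitive, with "-", "_" and "." equivalent.
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_listing(text):
    """Turn lines such as 'numpy 1.18.5' into {'numpy': '1.18.5'}."""
    expected = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            expected[_normalize(parts[0])] = parts[1]
    return expected


def installed_versions():
    """Collect {name: version} for every installed distribution."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            found[_normalize(name)] = dist.version
    return found


def find_mismatches(expected, installed):
    """Return {name: (expected, installed_or_None)} for each difference."""
    mismatches = {}
    for name, version in expected.items():
        actual = installed.get(name)
        if actual != version:
            mismatches[name] = (version, actual)
    return mismatches
```

Inside a container, `find_mismatches(parse_listing(listing_text), installed_versions())` returns an empty dictionary when every listed library is installed at the listed version.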

Alibaba Cloud image

Images

Official images provided by Alibaba Cloud.

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04
registry.${region}.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04

registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04
registry.${region}.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04

Replace ${region} with a specific region. Example values:

  • cn-hangzhou

  • cn-shanghai

  • cn-qingdao

  • cn-beijing

  • cn-zhangjiakou

  • cn-huhehaote

  • cn-shenzhen

  • cn-chengdu

  • cn-hongkong

  • ap-southeast-1

The following table lists the URLs of the Alibaba Cloud images when ${region} is set to cn-hangzhou.

| ${region} | Framework | CPU/GPU | Python version | Image URL |
| --- | --- | --- | --- | --- |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py27-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py27-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py27-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | MKL-CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-mkl-cpu-py36-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.12 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.12.2PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 2.7 (py27) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py27-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | CPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-cpu-py36-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.0PAI2011-gpu-py36-cu100-ubuntu16.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | TensorFlow 1.15 | GPU | 3.6 (py36) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/tensorflow-training:1.15.4PAI-gpu-py36-cu101-ubuntu18.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.3 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.3.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.4 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.4.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.5 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.5.1PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI2011-gpu-py37-cu100-ubuntu16.04 |
| cn-hangzhou | PyTorch 1.6 | GPU | 3.7 (py37) | registry-vpc.cn-hangzhou.aliyuncs.com/pai-dlc/pytorch-training:1.6.0PAI2011-gpu-py37-cu100-ubuntu16.04 |

Image versions

This section describes the operating systems, Python versions, and third-party libraries supported by each Alibaba Cloud image.

  • tensorflow-training:1.12.2PAI-cpu-py27-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

  • tensorflow-training:1.12.2PAI-mkl-cpu-py27-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

  • tensorflow-training:1.12.2PAI-gpu-py27-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      futures 3.3.0

      gast 0.4.0

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.12.2pai2011

      requests 2.13.0

      setuptools 36.4.0

      six 1.15.0

      tensorboard 1.12.2

      tensorflow-gpu 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

  • tensorflow-training:1.12.2PAI-cpu-py36-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

  • tensorflow-training:1.12.2PAI-mkl-cpu-py36-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

  • tensorflow-training:1.12.2PAI-gpu-py36-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.4.0

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.12.2pai2011

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.12.2

      tensorflow-gpu 1.12.2PAI2011

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

  • tensorflow-training:1.15.0PAI-gpu-py27-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 2.7.18 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.15

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      backports.weakref 1.0.post1

      certifi 2020.6.20

      crcmod 1.7

      Cython 0.29.14

      enum34 1.1.6

      funcsigs 1.0.2

      functools32 3.2.3.post2

      futures 3.3.0

      gast 0.2.2

      google-pasta 0.2.0

      opt-einsum 2.3.2

      tensorflow-estimator 1.15.1

      grpcio 1.27.2

      h5py 2.10.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.1.1

      mkl-fft 1.0.15

      mkl-random 1.1.0

      mkl-service 2.3.0

      mock 3.0.5

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.9.1

      paiio 0.1.0

      pip 9.0.1

      protobuf 3.14.0

      pycryptodome 3.9.7

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.15.0

      requests 2.13.0

      setuptools 44.1.1

      six 1.15.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.0

      termcolor 1.1.0

      toposort 1.5

      Werkzeug 1.0.1

      wheel 0.35.1

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

      wrapt 1.12.1

  • tensorflow-training:1.15.0PAI-gpu-py36-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.6.12 Anaconda

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aliyun-python-sdk-core 2.13.29

      aliyun-python-sdk-core-v3 2.13.11

      aliyun-python-sdk-kms 2.14.0

      astor 0.8.1

      cached-property 1.5.2

      certifi 2020.12.5

      crcmod 1.7

      Cython 0.29.21

      gast 0.2.2

      grpcio 1.31.0

      h5py 3.1.0

      importlib-metadata 3.4.0

      jmespath 0.10.0

      Keras-Applications 1.0.8

      Keras-Preprocessing 1.1.2

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      numpy 1.16.4

      opencv-python 4.2.0.32

      oss2 2.12.1

      paiio 0.1.0

      pip 20.2.4

      protobuf 3.14.0

      pycryptodome 3.9.9

      pyodps 0.10.4

      pypai 1.1.0+tensorflow.gpu.1.15.0

      requests 2.13.0

      setuptools 50.3.1.post20201107

      six 1.15.0

      tensorboard 1.15.0

      tensorflow-gpu 1.15.0

      termcolor 1.1.0

      toposort 1.5

      typing-extensions 3.7.4.3

      Werkzeug 1.0.1

      wheel 0.35.1

      zipp 3.4.0

      subprocess32 3.5.4

      tao-wrapper 0.1.1

      whale 0.0.2

      google-pasta 0.2.0

      opt-einsum 3.3.0

      tensorflow-estimator 1.15.1

      wrapt 1.12.1

  • pytorch-training:1.3.1PAI-gpu-py37-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      sailfish 1.0.1

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.3.1+ali

      torchsummary 1.5.1

      torchvision 0.4.2

      tqdm 4.36.1

      urllib3 1.24.2

      Werkzeug 1.0.1

      wheel 0.35.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.4.0PAI-gpu-py37-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.4.0+ali

      torchsummary 1.5.1

      torchvision 0.5.0

      tqdm 4.36.1

      urllib3 1.24.2

      wheel 0.35.1

      Werkzeug 1.0.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.5.1PAI-gpu-py37-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aiohttp 3.7.3

      apex 0.1

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      rsa 4.7

      requests 2.22.0

      requests-oauthlib 1.3.0

      ruamel-yaml 0.15.46

      six 1.15.0

      sailfish 1.0.1

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.5.1+ali

      torchsummary 1.5.1

      torchvision 0.6.1

      tqdm 4.36.1

      urllib3 1.24.2

      wheel 0.35.1

      Werkzeug 1.0.1

      yarl 1.5.1

      zipp 3.4.0

  • pytorch-training:1.6.0PAI-gpu-py37-cu100-ubuntu16.04

    • Operating system: Ubuntu 16.04.6 LTS

    • Python version: 3.7.4

    • CUDA version: 10.0

    • Third-party libraries: The following table lists the third-party libraries and versions.

      Third-party library and version

      absl-py 0.11.0

      aiohttp 3.7.3

      asn1crypto 1.2.0

      async-timeout 3.0.1

      attrs 20.3.0

      blinker 1.4

      cachetools 4.2.0

      certifi 2020.12.5

      cffi 1.13.0

      cryptography 2.8

      click 7.1.2

      conda 4.9.2

      conda-package-handling 1.6.0

      future 0.18.2

      grpcio 1.31.0

      google-auth 1.24.0

      google-auth-oauthlib 0.4.2

      importlib-metadata 2.0.0

      idna 2.8

      multidict 4.7.6

      Markdown 3.3.3

      mkl-fft 1.2.0

      mkl-random 1.1.1

      mkl-service 2.3.0

      nvidia-dali 0.15.0

      numpy 1.19.2

      oauthlib 3.1.0

      PySocks 1.7.1

      Pillow 8.1.0

      pip 20.2.4

      protobuf 3.13.0

      pyasn1 0.4.8

      pyasn1-modules 0.2.8

      pycosat 0.6.3

      pycparser 2.19

      PyJWT 2.0.0

      pyOpenSSL 19.0.0

      ruamel-yaml 0.15.46

      requests 2.22.0

      requests-oauthlib 1.3.0

      rsa 4.7

      six 1.15.0

      setuptools 50.3.1.post20201107

      typing-extensions 3.7.4.3

      tensorboard 2.3.0

      tensorboard-plugin-wit 1.6.0

      torch 1.6.0+ali

      torchsummary 1.5.1

      torchvision 0.7.0

      tqdm 4.36.1

      urllib3 1.24.2

      Werkzeug 1.0.1

      wheel 0.35.1

      yarl 1.5.1

      zipp 3.4.0
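
The library lists above can be compared against what is actually installed in a running container. The following is a minimal sketch, assuming the documented list is pasted as plain `name version` lines. The helper names `parse_library_list` and `find_mismatches` are hypothetical and not part of PAI, and the `installed` dictionary stands in for versions that you would collect yourself, for example from `pip freeze` inside the image.

```python
def parse_library_list(text):
    """Parse lines of the form 'name version' into a {name: version} dict."""
    libraries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            # Skip blank lines and headers such as 'Third-party library and version'.
            continue
        name, version = parts
        libraries[name.lower()] = version
    return libraries


def find_mismatches(expected, installed):
    """Return {name: (expected_version, installed_version_or_None)} for each difference."""
    mismatches = {}
    for name, version in expected.items():
        found = installed.get(name)
        if found != version:
            mismatches[name] = (version, found)
    return mismatches


# A few entries copied from the pytorch-training:1.6.0PAI list above.
documented = parse_library_list("""
Third-party library and version
torch 1.6.0+ali
torchvision 0.7.0
numpy 1.19.2
""")

# Hypothetical installed versions, used here only for illustration.
installed = {"torch": "1.6.0+ali", "torchvision": "0.7.0", "numpy": "1.19.5"}

print(find_mismatches(documented, installed))
```

A library that is documented but not installed is reported with `None` as its installed version, which makes missing packages easy to tell apart from version drift.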

Custom image

A custom image is an image that you uploaded to PAI. If you use a custom image, we recommend that you go to the AI Computing Asset Management > Images page and add the custom image as an AI asset so that multiple training jobs can use it. For more information, see View and add images.

Important

If you use a custom image to submit training jobs that run on LINGJUN resources, review the usage notes first. For more information, see RDMA (intelligent computing LINGJUN resources).

Step 3: Prepare a dataset

Before you submit a deep learning job, you need to upload the dataset that the job requires to an OSS bucket or a NAS file system, and then register the dataset so that the job can use it.

Supported dataset types

Datasets of the following types are supported: OSS, General-purpose NAS, Extreme NAS, CPFS, and CPFS for Lingjun.

  • You can enable the dataset acceleration feature for OSS and CPFS datasets. When you submit a distributed training job, you can use the dataset acceleration feature to improve data read efficiency.

  • If you use LINGJUN resources to run DLC jobs, you can enable dataset acceleration only for OSS datasets.
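
The two rules above can be expressed as a small check. This is a minimal sketch that assumes dataset types are passed as plain strings such as `"OSS"` and `"CPFS"`. The function name `supports_acceleration` and the `uses_lingjun` flag are hypothetical and do not correspond to a PAI API.

```python
# Dataset types for which the text above says acceleration can be enabled.
ACCELERATED_TYPES = {"OSS", "CPFS"}


def supports_acceleration(dataset_type, uses_lingjun=False):
    """Return True if dataset acceleration can be enabled under the rules above."""
    if uses_lingjun:
        # DLC jobs on LINGJUN resources: acceleration only for OSS datasets.
        return dataset_type == "OSS"
    return dataset_type in ACCELERATED_TYPES


print(supports_acceleration("CPFS"))
print(supports_acceleration("CPFS", uses_lingjun=True))
```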

Create a dataset

For information about how to configure the parameters, see Create and manage datasets. Take note of the following items:

  • When you create a dataset for training jobs, you need to select Alibaba Cloud Storage Service and set Property to Folder.

  • Unlike NAS, OSS is a distributed object storage service, not a file system. When you use OSS to store data, some file system features are not supported. For example, you cannot append data to or overwrite existing files in OSS buckets.

  • If you select a CPFS dataset, you also need to configure the virtual private cloud (VPC). The VPC must be the same as the one that you configured for the CPFS dataset. Otherwise, exceptions may occur and the DLC training jobs are removed from the queue after you submit the jobs.
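
Because appending to or overwriting files in OSS is not supported, one possible pattern is to build each output file on local disk and then copy it to the dataset path once, under a name that does not exist yet. The following is a minimal sketch of that pattern, not a PAI-provided method. The function name `write_then_copy` is hypothetical, and a temporary directory stands in for the path where an OSS dataset would be mounted.

```python
import os
import shutil
import tempfile


def write_then_copy(lines, target_dir, filename):
    """Build the file locally, then copy it once to target_dir under a new name."""
    target_path = os.path.join(target_dir, filename)
    if os.path.exists(target_path):
        # Refuse to overwrite: choose a new file name instead.
        raise FileExistsError(target_path)

    local_dir = tempfile.mkdtemp()
    try:
        local_path = os.path.join(local_dir, filename)
        with open(local_path, "a") as handle:  # appending is fine on local disk
            for line in lines:
                handle.write(line + "\n")
        shutil.copyfile(local_path, target_path)  # single whole-file write
    finally:
        shutil.rmtree(local_dir)
    return target_path


# Stand-in for the mounted dataset directory (an assumption for this example).
mount_dir = tempfile.mkdtemp()
path = write_then_copy(["step=1 loss=0.9", "step=2 loss=0.7"], mount_dir, "metrics-0001.log")
print(path)
```

Numbering the output files, as in `metrics-0001.log`, keeps each write a fresh object, so no step needs to modify a file that already exists.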

Step 4: Prepare a code build

Before you submit a deep learning job, you need to add the code required by the job to a code build. We recommend that you go to the AI Computing Asset Management > Source Code Repositories page and add the code build as an AI asset. This way, the code build can be used by multiple training jobs. For more information, see Code builds.

References

After you complete the preparations, you can create a training job. For more information, see Submit a training job.