TensorFlow 1.15.0 for Python 3 is a built-in component of E-MapReduce (EMR) Data Science clusters. You can use this component without additional configuration. On the master node of a Data Science cluster, you can purchase only vCPU resources to run TensorFlow jobs. On a core node, you can purchase vCPU or vGPU resources to run TensorFlow jobs. This topic describes how to view the TensorFlow version, switch the TensorFlow version, and install a Python package.

Usage guide

View the TensorFlow version

  1. Log on to the master node of your cluster in SSH mode. For more information, see Connect to the master node of an EMR cluster in SSH mode.
  2. Run the pip3 list command to view the TensorFlow version.
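
Because pip3 list prints every installed package, it can help to filter the output down to the TensorFlow line. The following is a minimal sketch, not part of EMR: tf_version_from_list is a hypothetical helper name, and it assumes the default column layout of pip3 list (package name in the first column, version in the second).

```shell
# Hypothetical helper: read `pip3 list` output from standard input and
# print only the version of the package named exactly "tensorflow".
# Assumes the default two-column layout: <package> <version>.
tf_version_from_list() {
  awk 'tolower($1) == "tensorflow" { print $2 }'
}
```

On the master node, this could be used as pip3 list 2>/dev/null | tf_version_from_list. Related packages such as tensorflow-estimator are ignored because the match is on the whole package name.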

Switch the TensorFlow version

  1. Download a compressed package that is used to switch the TensorFlow version.
    In this example, the package name is install_tf_header.tar.gz.
  2. Use a file transfer tool to upload install_tf_header.tar.gz to a directory of the master node in your Data Science cluster.
    Note: In this example, the compressed package is uploaded to the /root directory.
  3. Log on to the master node of your cluster in SSH mode. For more information, see Connect to the master node of an EMR cluster in SSH mode.
  4. Run the following commands to switch the TensorFlow version:
    1. Decompress the package.
      tar -zxvf install_tf_header.tar.gz
    2. Switch the TensorFlow version.
      • Command syntax
        sh install_tf_header.sh <version>

        <version> specifies the TensorFlow version that you want to switch to.

      • Example: Run the following command to switch the TensorFlow version to 2.0.3:
        sh install_tf_header.sh 2.0.3
  5. Run the pip3 list command to view the TensorFlow version.

    If the switch succeeded, the output shows TensorFlow 2.0.3.
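
A mistyped version argument is easy to catch before the script runs. The sketch below is an illustration only: is_tf_version and switch_tf are hypothetical names, and the assumption that install_tf_header.sh expects a three-part version such as 2.0.3 comes from the example above, not from the script itself.

```shell
# Hypothetical check: succeed only if the argument has the form X.Y.Z,
# where X, Y, and Z are non-negative integers (for example, 2.0.3).
is_tf_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Hypothetical wrapper: validate the argument, then call the switch script.
# Assumes install_tf_header.sh is in the current directory.
switch_tf() {
  if ! is_tf_version "$1"; then
    echo "usage: switch_tf <major.minor.patch>" >&2
    return 1
  fi
  sh install_tf_header.sh "$1"
}
```

With these functions defined in the directory where the package was decompressed, switch_tf 2.0.3 would run the same command as the example, while switch_tf 2.0 would stop with a usage message.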

Install a Python package

  1. Download a Python package.
    In this example, the name of the Python package is install_app_onds.tar.gz.
  2. Use a file transfer tool to upload install_app_onds.tar.gz to a directory of the master node in your Data Science cluster.
    Note: In this example, the package is uploaded to the /root directory.
  3. Log on to the master node of your cluster in SSH mode. For more information, see Connect to the master node of an EMR cluster in SSH mode.
  4. Run the following commands to install the Python package on all nodes of your Data Science cluster:
    1. Decompress the package.
      tar -zxvf install_app_onds.tar.gz
    2. Install the Python package.
      • Command syntax
        sh install_app_onds.sh <package_name> <version>
        where:
        • package_name specifies the name of the Python package that you want to install.
        • version specifies the version of the Python package that you want to install.
      • Example: Run the following command to install version 8.0.0 of the gnureadline package (GNU Readline) on all nodes of your Data Science cluster:
        sh install_app_onds.sh gnureadline 8.0.0
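
When several packages are needed, the install command has to be repeated once per package. The following sketch generates those commands from a list of "package_name version" pairs. It is an illustration under stated assumptions: print_install_commands is a hypothetical helper, the list format is invented for this example, and the helper only prints commands so that they can be reviewed before anything is run.

```shell
# Hypothetical helper: read "package_name version" pairs from standard
# input and print one install command per pair. Blank lines, lines that
# start with #, and lines without a version are skipped.
# Nothing is executed here.
print_install_commands() {
  while read -r name version rest; do
    case "$name" in
      ''|'#'*) continue ;;            # blank line or comment
    esac
    [ -n "$version" ] || continue     # no version given on this line
    printf 'sh install_app_onds.sh %s %s\n' "$name" "$version"
  done
}
```

For example, printf 'gnureadline 8.0.0\n' | print_install_commands prints the command shown in the example above. The printed commands are meant to be run from the directory that contains install_app_onds.sh.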