PAI-TensorFlow allows you to read data from Object Storage Service (OSS) buckets and MaxCompute tables.
GPU-accelerated servers will be phased out. You can submit TensorFlow tasks that run on CPU servers. If you want to use GPU-accelerated instances for model training, go to Deep Learning Containers (DLC) to submit jobs. For more information, see Submit training jobs.
Read OSS data
| Procedure | Description |
| --- | --- |
| Upload data to OSS. | Before you use deep learning frameworks to process the data, you must upload the data to an OSS bucket. |
| Grant permissions on OSS. | To read data from an OSS bucket by using Platform for AI (PAI), you must assign the AliyunODPSPAIDefaultRole role to the account that you use. For more information, see Grant the permissions that are required to use Machine Learning Designer. |
| Authorize a RAM role. | You can authorize a RAM role to allow PAI to access OSS. For more information, see the "Grant your RAM user or RAM role the permissions to access OSS" section of the Grant the permissions that are required to use Machine Learning Designer topic. |
| Use PAI-TensorFlow to read OSS data. | Connect the Read File Data component to the TensorFlow component. |
The following table describes the permissions of the default role AliyunODPSPAIDefaultRole.
| Permission | Description |
| --- | --- |
| oss:PutObject | Uploads an object. |
| oss:GetObject | Queries an object. |
| oss:ListObjects | Queries objects. |
| oss:DeleteObjects | Deletes objects. |
The following sections describe how PAI-TensorFlow reads OSS data.
Inefficient I/O approaches
You can run TensorFlow code on your on-premises machine or in the cloud in a distributed manner. The following list describes the differences between the two approaches:

- Read data from your on-premises machine: The server directly obtains graphs from the client for computing.
- Read data from the cloud: The server obtains graphs and distributes the graphs to workers for computing.
Usage notes
Do not use built-in approaches of Python to read data from your on-premises machine.
PAI supports the built-in I/O approaches of Python. To use these approaches, you must compress the data source and code into a package and upload the package to OSS. This approach loads data into memory for computing and is inefficient. We recommend that you do not use it. The following sample code shows how this approach works:

```python
import csv

# The file is parsed row by row in the Python interpreter, so reading
# is not overlapped with computing.
csv_reader = csv.reader(open('csvtest.csv'))
for row in csv_reader:
    print(row)
```
Do not use third-party libraries to read data.
You can read data by using third-party libraries, such as TFLearn and pandas. However, these libraries read data through an encapsulated Python layer, which makes data reading in PAI inefficient.
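For example, the following minimal sketch, which assumes the same hypothetical csvtest.csv file as the previous example, reads data by using pandas. The entire file is parsed in Python before any TensorFlow computation can start:

```python
import pandas as pd

# pandas parses the whole CSV into Python memory before computing starts,
# so reading is not overlapped with computing.
df = pd.read_csv('csvtest.csv')
print(df.head())
```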
Do not perform preload operations to read data.
You may find that GPUs are not significantly faster than on-premises CPUs. A possible cause is that I/O operations waste computing resources. A preload operation first reads data into memory and then performs session operations, such as feeding, for computing. This wastes computing resources and cannot process large amounts of data because of the memory limit.
For example, if a hard disk contains an image dataset, you must load the image dataset before computing starts. The loading requires 0.1s and the computing requires 0.9s. The GPU is idle for 0.1s every second. This reduces efficiency.
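The following minimal sketch illustrates the preload-and-feed pattern that this note advises against. The images.npy file and the tensor shapes are hypothetical:

```python
import numpy as np
import tensorflow as tf

# Preload: the entire dataset is read into host memory before any
# computation starts. 'images.npy' is a hypothetical file, for example
# an array of shape (num_images, 784).
data = np.load('images.npy')

x = tf.placeholder(tf.float32, shape=[None, 784])
mean = tf.reduce_mean(x)  # stand-in for a real model

with tf.Session() as sess:
    # Feed: each session.run call copies the preloaded data from Python
    # memory into the runtime, so the accelerator idles during the copy.
    print(sess.run(mean, feed_dict={x: data}))
```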
Efficient I/O approaches
An efficient I/O approach converts data to Operations (ops) and calls the session.run method to read the data. A read thread loads the images from the source file system to a memory queue, and a compute thread directly retrieves the data from the memory queue for computing. This prevents computing resources from being idled and wasted.
The following sample code shows how to read data by using ops:
```python
import argparse
import os

import tensorflow as tf

FLAGS = None

def main(_):
    # Path of the OSS object. FLAGS.buckets is passed in by PAI.
    dirname = os.path.join(FLAGS.buckets, "csvtest.csv")
    # Convert the file list into a queue of file names.
    filename_queue = tf.train.string_input_producer([dirname])
    # Read the file line by line.
    reader = tf.TextLineReader()
    key, value = reader.read(filename_queue)
    # Split each line into five string fields.
    record_defaults = [[''], [''], [''], [''], ['']]
    d1, d2, d3, d4, d5 = tf.decode_csv(value, record_defaults, field_delim=',')
    init = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init)
        # Start the threads that fill the file name queue.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        for i in range(4):
            print(sess.run(d2))
        coord.request_stop()
        coord.join(threads)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--buckets', type=str, default='', help='input data path')
    parser.add_argument('--checkpointDir', type=str, default='', help='output model path')
    FLAGS, _ = parser.parse_known_args()
    tf.app.run(main=main)
```
Description of parameters in the preceding code:
- dirname: the path of the OSS object. The value of this parameter can also be an array of paths.
- reader: PAI-TensorFlow provides APIs for different types of readers. Select a reader based on your business requirements.
- tf.train.string_input_producer: converts the list of files into a queue.
- tf.decode_csv: provides an op that splits each row of data. You can use this op to obtain specific fields in each row.

To retrieve data by using an op, you must call the tf.train.Coordinator() and tf.train.start_queue_runners(sess=sess, coord=coord) methods in a session.
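Because dirname can be an array, you can pass multiple OSS objects at a time. The following minimal sketch expands a wildcard pattern into a file list, assuming that the PAI-TensorFlow runtime exposes OSS paths through tf.gfile, which the reader ops above also rely on. The bucket path is hypothetical:

```python
import tensorflow as tf

# Hypothetical OSS path prefix; in the script above, this value comes
# from FLAGS.buckets.
pattern = "oss://my-bucket/data/*.csv"

# Expand the pattern into a list of object names and queue them all.
filenames = tf.gfile.Glob(pattern)
filename_queue = tf.train.string_input_producer(filenames)
```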
Read MaxCompute data
You can use the TensorFlow component in Machine Learning Designer to read data from and write data to MaxCompute.
The following table uses the iris sample dataset to describe how to read MaxCompute data.
| Procedure | Description |
| --- | --- |
| Connect components. | Drag the components to the canvas and connect them. |
| Configure the Read MaxCompute Table component. | Click the Read MaxCompute Table component on the canvas. On the Select Table tab in the right-side pane, enter the name of the table that stores the iris sample dataset in the Table Name field. |
| Configure the TensorFlow component. | If both the input and output are MaxCompute tables, you only need to connect the tables to input port 2 and output port 5. To read data from and write data to MaxCompute tables, you must create the tables and configure the data sources, code files, and output model paths. You can also submit the task by using a PAI command. In the command, replace the descriptions in braces {} with actual values. |
| Read data from and write data to MaxCompute tables. | We recommend that you call the TableRecordDataset method to read and write MaxCompute data. For more information about this method and examples, see TableRecordDataset. |
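The following minimal sketch shows what reading a MaxCompute table with TableRecordDataset can look like. The table path, column names, and default values are hypothetical, and the exact signature can differ across PAI-TensorFlow versions, so treat the TableRecordDataset topic as the authoritative reference:

```python
import tensorflow as tf

# Hypothetical MaxCompute table path and columns.
table = "odps://my_project/tables/iris_data"

# One default value per selected column; the defaults also determine
# each column's dtype.
dataset = tf.data.TableRecordDataset(
    [table],
    record_defaults=(0.0, 0.0, 0.0, 0.0, ""),
    selected_cols="f1,f2,f3,f4,label",
)
dataset = dataset.batch(32)

iterator = dataset.make_one_shot_iterator()
f1, f2, f3, f4, label = iterator.get_next()

with tf.Session() as sess:
    print(sess.run(label))
```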