This topic describes how to use the data conversion features of Machine Learning Platform for AI (PAI) to convert images to TFRecord files, which you can then use to train models with the training components that PAI provides. If you label data by using the smart labeling feature of PAI, PAI generates a labeled dataset that you can convert to TFRecord files by calling the data conversion component. If you label data on another platform, you must first run PAI commands to convert the labeled data to a labeled dataset that PAI supports, and then convert that dataset to TFRecord files.
Convert a labeled dataset for single-label or multi-label image classification
pai -name easy_vision_ext
-Dbuckets='oss://{bucket_name}.{oss_host}/{path}/'
-Darn='acs:ram::*******:role/aliyunodpspaidefaultrole'
-DossHost='{oss_host}'
-Dcmd convert
-Dlabel_file 'oss://{bucket_name}/path/to/your/{label_file}'
-Dconvert_param_config '
--class_list_file oss://{bucket_name}/path/to/your/{class_list_file}
--max_image_size 600
--write_parallel_num 8
--num_samples_per_tfrecord 128
--test_ratio 0.1
--model_type CLASSIFICATION
'
-Doutput_tfrecord 'oss://{bucket_name}/path/to/output/data_prefix'
-Dcluster='{
\"worker\" : {
\"count\" : 1,
\"cpu\" : 800
}
}'
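The value of convert_param_config is a flat string of `--name value` pairs. When the conversion options are kept in a script, the string can be assembled programmatically. The following is a minimal sketch, not part of PAI: the helper name `build_convert_param_config` is hypothetical, and the option names and values are copied from the preceding command.

```python
def build_convert_param_config(params):
    """Join a dict of options into a '--name value --name value' string.

    Hypothetical helper for illustration; it does not validate option names.
    """
    parts = []
    for name, value in params.items():
        parts.append(f"--{name} {value}")
    return " ".join(parts)


# Options taken from the image classification example above.
config = build_convert_param_config({
    "class_list_file": "oss://{bucket_name}/path/to/your/{class_list_file}",
    "max_image_size": 600,
    "write_parallel_num": 8,
    "num_samples_per_tfrecord": 128,
    "test_ratio": 0.1,
    "model_type": "CLASSIFICATION",
})
print(config)
```

The resulting string can be pasted between the single quotation marks of -Dconvert_param_config.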
Convert a labeled dataset for text detection or recognition
pai -name easy_vision_ext
-Dbuckets='oss://{bucket_name}.{oss_host}/{path}/'
-Darn='acs:ram::*******:role/aliyunodpspaidefaultrole'
-DossHost='{oss_host}'
-Dcmd convert
-Dlabel_file 'oss://{bucket_name}/path/to/your/{label_file}'
-Dconvert_param_config '
--model_type TEXT_END2END
--default_class text
--max_image_size 2000
--char_replace_map_path oss://{bucket_name}/path/to/your_char_replace_map
--default_char_dict_path oss://{bucket_name}/path/to/your_char_dict
--test_ratio 0.1
--write_parallel_num 8
--num_samples_per_tfrecord 64
'
-Doutput_tfrecord 'oss://pai-vision-data-hz/test/convert/recipt_text_end2end/data'
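The text detection and recognition command refers to two auxiliary files: a character replacement map and a character dictionary. Based on the parameter descriptions in this topic, the replacement map is a CSV file with the columns original and replaced, and the dictionary contains one character per row, where the ID of a character is its row number minus 1. The following sketch writes both files locally under stated assumptions: the file names are hypothetical, the CSV is assumed to start with a header row, and uploading the files to OSS is not shown.

```python
import csv
import os
import tempfile


def write_char_dict(path, chars):
    # One character per row; the character on row N (1-based) gets ID N - 1.
    with open(path, "w", encoding="utf-8", newline="") as f:
        for ch in chars:
            f.write(ch + "\n")


def write_char_replace_map(path, pairs):
    # Two columns named original and replaced (header row is an assumption).
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["original", "replaced"])
        writer.writerows(pairs)


def load_char_ids(path):
    # Rebuild the character-to-ID mapping: zero-based row index.
    with open(path, encoding="utf-8") as f:
        return {line.rstrip("\n"): idx for idx, line in enumerate(f)}


out_dir = tempfile.mkdtemp()
dict_path = os.path.join(out_dir, "char_dict.txt")          # hypothetical name
map_path = os.path.join(out_dir, "char_replace_map.csv")    # hypothetical name

write_char_dict(dict_path, ["a", "b", "c"])
write_char_replace_map(map_path, [("A", "a"), ("B", "b")])

char_ids = load_char_ids(dict_path)
print(char_ids)
```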
PAI command parameters
Parameter | Required | Description | Format | Default value |
---|---|---|---|---|
cmd | Yes | The operation that you want to perform. Set the value to convert. | STRING | convert |
buckets | No | The name of the Object Storage Service (OSS) bucket. Add a forward slash (/) at the end of the bucket name. If you specify multiple buckets, separate the bucket names with commas (,). | "oss://bucket_name/?role_arn=xxx&host=yyy" "oss://bucket_1/?role_arn=xxx&host=yyy,oss://bucket_2/" | N/A |
label_file | Yes | The OSS path of the labeled dataset. For more information, see Overview. | oss://your_bucket/xxx.csv | N/A |
convert_param_config | No | The information about the conversion task. For more information, see the following table. You can also replace convert_param_config with convert_config. | --parama valuea --paramb valueb | "" |
output_tfrecord | No | The OSS path of the TFRecord file. | oss://your_dir/prefix | "" |
cluster | No | The information about the workers that are used to perform conversion in a distributed manner. | JSON string | "{\"worker\":{\"count\":3, \"cpu\": 800, \"gpu\":0, \"memory\": 20000}}" |
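The cluster parameter takes a JSON string, which is easy to get wrong when typed by hand. As a rough sketch, the string can be generated with the Python standard library. The helper name `build_cluster` is hypothetical, its defaults mirror the default value listed in the preceding table, and the backslash-escaped variant is included only because the examples in this topic write the quotation marks that way. Whether your client needs the escaped form is an assumption to verify.

```python
import json


def build_cluster(count=3, cpu=800, gpu=0, memory=20000):
    """Return the worker specification as a JSON string (hypothetical helper)."""
    spec = {"worker": {"count": count, "cpu": cpu, "gpu": gpu, "memory": memory}}
    return json.dumps(spec)


cluster = build_cluster(count=1)

# Mirror the \" style used in the command examples of this topic.
escaped = cluster.replace('"', '\\"')

print(f"-Dcluster='{escaped}'")
```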
The following table describes the parameters that you can specify in convert_param_config.
Parameter | Required | Description | Format | Default value |
---|---|---|---|---|
model_type | Yes | The type of model to which the converted data applies. The values used in this topic are CLASSIFICATION, TEXT_END2END, TEXT_RECOGNITION, and VIDEO_CLASSIFICATION. Note: If model_type is set to TEXT_END2END or TEXT_RECOGNITION, the char_replace_map_path and default_char_dict_path parameters take effect. If model_type is set to VIDEO_CLASSIFICATION, the decode_type, sample_fps, reshape_size, decode_batch_size, and decode_keep_size parameters take effect. | STRING | N/A |
class_list_file | No | The OSS path of the category file. The file contains a list of category names. A category can alternatively be written in the following format: Category name: Name of the mapping category. | oss://path/to/your/classlit | "" |
test_ratio | No | The proportion of the dataset that is set aside as test data. Valid values: 0 to 1. If the value is set to 0, all data is used for training. If the value is set to 0.1, 10% of the data is used for validation. | FLOAT | 0.1 |
max_image_size | No | The maximum length, in pixels, of the longest edge of an image. If you set this parameter and an image exceeds the limit, the image is resized before it is saved to the TFRecord file. This reduces storage space and accelerates data reading. | INT | N/A |
max_test_image_size | No | The maximum length, in pixels, of the longest edge of an image that is used for testing. This parameter applies only to test data and defaults to the value of max_image_size. | INT | ${max_image_size} |
default_class | No | The name of the default category. A category name that is not included in the category file specified by class_list_file is mapped to the default category. | STRING | None |
error_class | No | The name of the invalid category. Objects or bounding boxes that belong to this category are not used for training. | STRING | N/A |
ignore_class | No | The name of the category that is used only for model detection. Bounding boxes that belong to this category are not used for training. | STRING | N/A |
converter_class | No | The name of the conversion class. | STRING | QinceConverter |
seperator | No | The separator used in the split() method to break the label file into substrings. | STRING | N/A |
image_format | No | The encoding format of the images in the TFRecord file. | STRING | jpg |
read_parallel_num | No | The number of concurrent reads. | INT | 10 |
write_parallel_num | No | The number of concurrent writes to the TFRecord file. | INT | 1 |
num_samples_per_tfrecord | No | The number of images saved in each TFRecord file. | INT | 256 |
user_defined_converter_path | No | The HTTP or OSS path of the user-defined converter code. Example: http://path/to/your/converter.py. | STRING | N/A |
user_defined_generator_path | No | The HTTP or OSS path of the user-defined generator code. Example: http://path/to/your/generator.py. | STRING | N/A |
generator_class | No | The name of the user-defined generator class. | STRING | N/A |
char_replace_map_path | No | The OSS path of the CSV file for character map replacement. The file contains two columns named original and replaced. | STRING | N/A |
default_char_dict_path | No | The OSS path of the file for mapping characters to IDs. The file contains rows of characters. One row represents one character. The ID of a character equals the row number minus 1. | STRING | N/A |
decode_type | No | The video decoding format, specified as an integer code. | INT | 4 |
sample_fps | No | The number of frames extracted for sampling per second. | FLOAT | 5 |
reshape_size | No | The size of the output frame, in pixels. | INT | 224 |
decode_batch_size | No | The number of images contained in each batch for decoding. | INT | 10 |
decode_keep_size | No | The number of overlapped frames in different batches. | INT | 0 |
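To make the effect of max_image_size and test_ratio concrete, the following sketch works through the arithmetic for one image and one dataset. It is an illustration under assumptions, not the converter's actual implementation: the aspect ratio is assumed to be preserved when an image is resized, the rounding rules are assumed, and the function names `resized_dims` and `split_counts` are hypothetical.

```python
def resized_dims(width, height, max_image_size):
    """Scale an image so that its longest edge is at most max_image_size.

    Assumption: the aspect ratio is kept and sizes are rounded to integers.
    """
    longest = max(width, height)
    if longest <= max_image_size:
        return width, height  # already within the limit, no resize
    scale = max_image_size / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def split_counts(num_samples, test_ratio):
    """Return (training count, test count) for a given test_ratio.

    Assumption: the test count is rounded to the nearest integer.
    """
    num_test = round(num_samples * test_ratio)
    return num_samples - num_test, num_test


print(resized_dims(1200, 800, 600))   # longest edge 1200 is scaled down to 600
print(split_counts(1000, 0.1))        # 10% of 1000 samples are held out
```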