The prediction component can use models trained by EasyVision to perform offline prediction. This topic describes the input data formats and command parameters for offline prediction.

Overview

EasyVision can read data from and write results to MaxCompute tables. It can also read OSS files for prediction and write the results back to OSS files. The offline prediction process can be viewed as an assembly line in which each atomic operation is processed asynchronously and concurrently by multiple threads on each worker. During I/O operations, each worker obtains input data from its own shard and writes data to the corresponding output shard. For example, when images are read from a table for model prediction, the system splits the data in the input table based on the number of workers. Each worker reads its own data, decodes the Base64 data, performs model prediction, and writes the results to the output table. Base64 decoding and model prediction are performed asynchronously by multiple threads, which makes full use of the CPU and GPU computing power for concurrent processing. The following figure shows the Base64 decoding and model prediction processes.

Figure: Offline prediction process
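Conceptually, each worker behaves like a small pipeline of thread-backed stages connected by queues. The following minimal Python sketch illustrates this idea for a single worker. It is a conceptual illustration only, not EasyVision source code, and every name in it is made up.

import base64
import queue
import threading

# Conceptual sketch only: a worker's input shard flows through a decode stage
# and a predict stage that run in separate threads, so CPU-bound Base64
# decoding and model inference can overlap.
_SENTINEL = object()

def decode_stage(rows, out_q):
    for row in rows:                        # rows read from this worker's input shard
        out_q.put(base64.b64decode(row))    # Base64 decoding (CPU-bound)
    out_q.put(_SENTINEL)

def predict_stage(in_q, results):
    while True:
        item = in_q.get()
        if item is _SENTINEL:
            break
        # A real worker would run model inference here (GPU-bound);
        # this placeholder only records the payload size.
        results.append(len(item))

if __name__ == "__main__":
    shard = [base64.b64encode(b"fake image bytes").decode() for _ in range(5)]
    q, results = queue.Queue(maxsize=8), []
    threads = [threading.Thread(target=decode_stage, args=(shard, q)),
               threading.Thread(target=predict_stage, args=(q, results))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)  # a real worker would write results to its output shard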

EasyVision provides video-level prediction models to process video data. You can also call image-related models to predict video frames. The offline processing framework of EasyVision automatically decodes video data, predicts single-frame images, and summarizes all video frame results.

You may want to load your own trained models for offline prediction, which the default prediction code provided by EasyVision cannot handle. Therefore, EasyVision allows you to customize the prediction code while reusing the existing I/O features of ev_predict, such as data download and decoding, to run offline prediction with your own model. EasyVision also allows you to insert a custom process before prediction so that input data is processed before it is sent to the predictor. For more information, see Custom input data.
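For reference, the sketch below shows roughly what such a custom predictor might look like. This is a hypothetical example: the class name, constructor, and method signature are assumptions, not the actual interface that EasyVision requires.

import numpy as np

# Hypothetical sketch only: the base class, constructor arguments, and method
# signature that EasyVision actually requires for a custom predictor
# (user_predictor_cls) may differ from what is shown here.
class MyPredictor(object):
    def __init__(self, model_path):
        # model_path points to the model directory, for example an OSS path.
        self.model_path = model_path

    def predict(self, input_data_list):
        # Each element is an input_data dictionary such as {"image": np.ndarray}.
        results = []
        for input_data in input_data_list:
            image = np.asarray(input_data["image"])
            # Placeholder "inference": return the image shape instead of real model output.
            results.append({"image_shape": list(image.shape)})
        return results

If a class like this were implemented in module.py, user_predictor_cls would be set to module.MyPredictor and model_type would be set to self_define.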

Input data formats

  • Data read from MaxCompute tables
    The input table can contain one or more columns. One column contains the image data, which can be the URL of the image or the Base64-encoded binary data of the image. The data in this column is of the STRING type. Schema example:
    +------------------------------------------------------------------------------------+
    | Field           | Type       | Label | Comment                                     |
    +------------------------------------------------------------------------------------+
    | id              | string     |       |                                             |
    | url             | string     |       |                                             |
    +------------------------------------------------------------------------------------+
  • Data read from OSS files
    Each line of the OSS input file is the URL or OSS path of an image. Example:
    oss://your/path/to/image.jpg
    http://your.path/to/image.jpg
  • Custom input data

    When data is read from a MaxCompute table, only the URL or Base64-encoded data of the image is obtained. When data is read from an OSS file, only the URL of the image is obtained, and the image is then downloaded and decoded. In either case, only the NumPy array of the image is available, and the data is passed to the processes and predictors during prediction in the {"image": np.ndarray} format. As more users employ custom predictors and processes, this single input format can no longer meet their requirements. Therefore, the OSS reading mode has been extended to support custom data formats.

    The custom format can be either the original OSS file format or a JSON string format, in which each line of the OSS file is a JSON string. A JSON string can contain multiple key-value pairs. All key-value pairs are saved in a dictionary and passed to custom predictors and processes, so you can easily obtain the corresponding values based on your custom keys.

    If a value is an OSS path or URL, the system automatically downloads the file content in multiple threads and converts the value to a Python file-like object. You can directly call file methods such as read() or readlines() to obtain the file content. If the value points to a file with an image extension, the system automatically decodes the image, and the value that you obtain from the input_data dictionary based on the key is of the numpy.ndarray type.

    Input data example:
    {"image":"http://your/path/to/image.jpg", "prior":"oss://your/path/to/prior.txt", "config": {"key1":1, "key2":"value2"}}
    {"image":"http://your/path/to/image.jpg", "prior":"oss://your/path/to/prior.txt", "config": {"key1":1, "key2":"value2"}}
    The preceding input data will be converted into data in the input_data dictionary.
    • The value of the image field is the decoded data of an image.
    • The value of the prior field is a file.
    • The value of the config field is a dictionary parsed from the JSON object.
    The input_data dictionary is in the following format. All custom processes and predictors can obtain the values based on these keys, as illustrated in the sketch after the note below.
    input_dict = {
      "image": np.ndarray,
      "prior" : file_like_object,
      "config": {"key1":1, "key2":"value2"}
    }
    Note All built-in predictors of EasyVision use the image key to obtain input images. If you want to use a custom input format to call the built-in predictors of EasyVision, the image key must be used for image data.
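    As an illustration of how a custom process or predictor might consume this dictionary, the following hypothetical Python snippet reads the three keys from the example above. Only the keys image, prior, and config come from the example; the function name and return value are assumptions.

    # Hypothetical sketch: reading the input_data dictionary built from the JSON line above.
    def handle_input(input_data):
        image = input_data["image"]               # numpy.ndarray, already decoded
        prior_text = input_data["prior"].read()   # file-like object downloaded from OSS
        key1 = input_data["config"]["key1"]       # plain dict parsed from the JSON object
        return {"height": image.shape[0],
                "width": image.shape[1],
                "prior_bytes": len(prior_text),
                "key1": key1}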

Parameters

Parameter Required Description Type Default value
model_path Yes The OSS path of the model. Example: "oss://your_bucket/your_model_dir". STRING No default value
model_type Yes The type of the model. Valid values:
  • feature_extractor: feature extraction
  • classifier: image classification
  • multilabel_classifier: multi-label classification
  • detector: object detection
  • text_detector: text detection
  • text_recognizer: text recognition
  • text_detection_recognition: text detection and recognition
  • text_spotter: end-to-end text recognition
  • segmentor: image segmentation
  • self_define: custom prediction
If model_type is set to self_define, the prediction class specified by user_predictor_cls is loaded.
STRING No default value
buckets Yes The OSS bucket information. If you use a custom model, you must specify the information about the OSS bucket that stores the model. Example: "oss://{bucket_name}.{oss_host}/{path}". STRING No default value
arn Yes The Alibaba Cloud Resource Name (ARN) of the RAM role. Example: "acs:ram::*********:role/aliyunodpspaidefaultrole". STRING No default value
feature_name No The name of the feature. This parameter is required when model_type is set to feature_extractor. Example: resnet_v1_50/logits. STRING Empty string ("")
input_table No The name of the input table. Example: "odps://prj_name/tables/table_name" for a non-partitioned table and "odps://prj_name/tables/table_name/pt=xxx" for a partitioned table. STRING Empty string ("")
image_col No The name of the column that contains the image data. STRING "image"
image_type No The data format of an image. Valid values:
  • base64: indicates that the image is stored in the table as Base64-encoded data.
  • url: indicates that the URL or OSS path of the image is stored in the table.
STRING "base64"
reserved_columns No The names of reserved data columns. Separate multiple names with commas (,). Example: "col1,col2,col3". STRING Empty string ("")
result_column No The name of the result column. STRING "prediction_result"
output_table No The output table, which is in the same format as the input table. If the table does not exist, an output table is automatically created and partitions are created for it. You can also create an output table and partitions in advance. STRING Empty string ("")
lifecycle No The lifecycle of the output table. INT 10
num_worker No The number of prediction workers. A larger number of workers increases the overall speed of offline prediction. INT 2
cpuRequired No The CPU resources for each worker. A value of 100 represents one CPU core. INT 1600
gpuRequired No The GPU resources for each worker. A value of 100 represents one GPU card, and the maximum value is 100. A value of 0 indicates that a CPU cluster is used. INT 100
input_oss_file No The path of the input OSS file. Example: oss://your_bucket/filelist.txt. Each line in the file can be in one of the following formats:
  • The OSS path or URL of an image to be predicted.
  • A JSON string. For more information, see Custom input data.
STRING Empty string ("")
output_oss_file No The path of the output OSS file that stores prediction results. The system generates num_worker result files prefixed with this file name and then merges them into a single result file. STRING Empty string ("")
output_dir No The folder where the output files are stored. Example: "oss://your_bucket/dir". If you use a custom output format, all result image files are saved to this folder. STRING Empty string ("")
user_resource No The path of the upload resource, which can be a TAR.GZ, ZIP, or Python file. OSS paths or HTTP URLs are supported. Example: oss://xxx/a.tar.gz or http://a.com/c.zip. STRING Empty string ("")
user_predictor_cls No The module path of the class to which the custom predictor belongs. If you implement Predictor A in module.py, the module path of Predictor A is module.A. STRING Empty string ("")
user_process_config No The configuration of the custom process. The following fields are used to configure a custom process. You can also add other custom fields.
  • job_name: the name of the custom process.
  • num_threads: the number of concurrent threads for the custom process.
  • batch_size: the batch size of the data to be processed.
  • user_process_cls: the module path of the class to which the custom process belongs. For example, if you implement Process A in module.py, the module path of Process A is module.A. A sketch of such a class is shown after this table.
Example: '{"job_name":"myprocess", "user_process_cls":"module.ClassA", "num_threads":2, "batch_size":1}'
JSON string Empty string ("")
queue_size No The length of the cache queue. INT 1024
batch_size No The batch size for prediction. INT 1
preprocess_thread_num No The number of concurrent preprocessing (image decoding and download) threads. INT 4
predict_thread_num No The number of concurrent threads for prediction. INT 2
is_input_video No Specifies whether the input is a video. Valid values:
  • true
  • false
BOOL false
use_image_predictor No Specifies whether the predictor supports only image input. BOOL true
decode_type No The method that is used to decode the video. Valid values:
  • 1: decodes intra-coded frames only.
  • 2: decodes keyframes only.
  • 3: decodes all frames except bidirectional (B) frames.
  • 4: decodes all frames.
INT 4
sample_fps No The frame extraction frequency, in frames per second. FLOAT 5
reshape_size No The size of the output frames. A value of -1 indicates the original size. INT -1
decode_batch_size No The batch size for decoding. INT 10
decode_keep_size No The number of overlapping frames between consecutive decoding batches. INT 0
enableDynamicCluster No Specifies whether to enable the dynamic cluster feature, which allows failover when a single worker fails. If errors occur frequently, you can enable this feature. Valid values:
  • true
  • false
BOOL false
useSparseClusterSchema No Specifies whether to enable Sparse Cluster. If enableDynamicCluster is set to true, you must set this parameter to true. Valid values:
  • true
  • false
BOOL false
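For user_process_config, the user_process_cls field points to a process class implemented in the resource that you upload through user_resource. The following Python sketch is hypothetical: the class name matches the example above, but the base class, method name, and the way configuration fields are passed to the class are assumptions rather than the actual EasyVision interface.

# Hypothetical sketch only: the real interface that EasyVision expects for
# user_process_cls may differ. Custom fields in user_process_config (anything
# besides job_name, num_threads, batch_size, and user_process_cls) are modeled
# here as keyword arguments to the constructor.
class ClassA(object):
    def __init__(self, **custom_fields):
        self.custom_fields = custom_fields

    def process(self, input_data_batch):
        # input_data_batch holds up to batch_size input_data dictionaries.
        for input_data in input_data_batch:
            input_data["preprocessed"] = True  # mark the data before it reaches the predictor
        return input_data_batch

If a class like this were saved in module.py and uploaded through user_resource, user_process_cls would be set to module.ClassA, as in the example above.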