Use a trained model to detect and recognize text in images as a batch prediction job.
Data format
For the supported input formats, see Input data formats.
Run predictions
Invoke PAI commands using the SQL Script component, MaxCompute client, or ODPS SQL node in DataWorks. For more information, see Connect using local client (odpscmd) or Develop an ODPS SQL task.
pai -name ev_predict_ext
-Dmodel_path='Path to model'
-Dmodel_type='text_spotter'
-Dinput_oss_file='oss://path/to/filelist.txt'
-Doutput_oss_file='oss://path/to/result.txt'
-Dimage_type='url'
-Dnum_worker=2
-DcpuRequired=800
-DgpuRequired=100
-Dbuckets='OSS buckets'
-Darn='RoleArn'
-DossHost='OSS domain name'
See Parameters for parameter descriptions.
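When the same job is submitted repeatedly with different paths, it can help to generate the command text from a dictionary of parameters. The following is a minimal sketch in Python, not part of PAI: `build_pai_command` is a hypothetical helper, and the values are the placeholders from the command above. It only renders text; submitting the command still goes through the SQL Script component, the MaxCompute client, or an ODPS SQL node.

```python
def build_pai_command(name, params):
    """Render a PAI command: the component name, then one -Dkey=value per line."""
    lines = [f"pai -name {name}"]
    for key, value in params.items():
        # Strings are single-quoted and numbers are written bare,
        # mirroring the command shown above.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        lines.append(f"-D{key}={rendered}")
    return "\n".join(lines)


# Placeholder values copied from the example command; replace them with real ones.
params = {
    "model_path": "Path to model",
    "model_type": "text_spotter",
    "input_oss_file": "oss://path/to/filelist.txt",
    "output_oss_file": "oss://path/to/result.txt",
    "image_type": "url",
    "num_worker": 2,
    "cpuRequired": 800,
    "gpuRequired": 100,
    "buckets": "OSS buckets",
    "arn": "RoleArn",
    "ossHost": "OSS domain name",
}

command = build_pai_command("ev_predict_ext", params)
print(command)
```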
Output format
Each line of the output file contains an image path, a comma, and the prediction result for that image as a JSON string.
oss://path/to/your/image1.jpg, JSON result string
oss://path/to/your/image2.jpg, JSON result string
oss://path/to/your/image3.jpg, JSON result string
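A result file in this layout can be read line by line. The sketch below assumes that the first comma on a line separates the image path from the JSON string, so the path itself must not contain a comma. `parse_result_line` is a hypothetical helper, and the sample line is a shortened illustration rather than real output.

```python
import json


def parse_result_line(line):
    """Split one output line into (image_path, result_dict)."""
    # Assumption: the first comma separates the path from the JSON payload.
    path, _, payload = line.partition(",")
    return path.strip(), json.loads(payload)


# Shortened illustrative line; a real result carries all the detection fields.
sample = (
    'oss://path/to/your/image1.jpg, '
    '{"detection_texts": ["This is an example"], "detection_texts_scores": [0.88]}'
)

path, result = parse_result_line(sample)
print(path, result["detection_texts"])
```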
Result structure:
{
"detection_keypoints": [[[243.57516479492188, 198.84210205078125], [243.91038513183594, 247.62425231933594], [385.5513916015625, 246.61660766601562], [385.2197570800781, 197.79345703125]], [[292.2718200683594, 114.44700622558594], [292.2237243652344, 164.684814453125], [571.1962890625, 164.931640625], [571.2444458007812, 114.67433166503906]]],
"detection_boxes": [[243.5308074951172, 197.69570922851562, 385.59625244140625, 247.7247772216797], [292.1929931640625, 114.28043365478516, 571.2748413085938, 165.09771728515625]],
"detection_scores": [0.9942291975021362, 0.9940272569656372],
"detection_classes": [1, 1],
"detection_class_names": ["text", "text"],
"detection_texts_ids": [[1, 2, 2008, 12], [1, 2, 2008, 12]],
"detection_texts": ["This is an example", "This is an example"],
"detection_texts_scores": [0.88, 0.88]
}
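The fields in this structure appear to be parallel arrays: index i of each array describes the same detected text region. Under that assumption, a result can be regrouped into one record per detection and filtered by detection score. This is a sketch only; `to_records` and the `min_score` threshold are hypothetical, and the values are rounded from the example above.

```python
def to_records(result, min_score=0.5):
    """Group the parallel arrays into one dict per detection, keeping confident ones."""
    rows = zip(
        result["detection_boxes"],
        result["detection_scores"],
        result["detection_texts"],
        result["detection_texts_scores"],
    )
    records = []
    for box, score, text, text_score in rows:
        # Drop detections whose detection confidence is below the threshold.
        if score >= min_score:
            records.append(
                {"box": box, "score": score, "text": text, "text_score": text_score}
            )
    return records


# Values rounded from the example result.
result = {
    "detection_boxes": [[243.5, 197.7, 385.6, 247.7], [292.2, 114.3, 571.3, 165.1]],
    "detection_scores": [0.9942, 0.9940],
    "detection_texts": ["This is an example", "This is an example"],
    "detection_texts_scores": [0.88, 0.88],
}

for record in to_records(result):
    print(record["text"], record["score"], record["box"])
```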
Output parameters:
| Parameter | Description | Shape | Data type |
| --- | --- | --- | --- |
| detection_boxes | Bounding box coordinates in [top, left, bottom, right] format | [num_detections, 4] | FLOAT |
| detection_scores | Detection confidence score | [num_detections] | FLOAT |
| detection_classes | Detection category ID | [num_detections] | INT |
| detection_class_names | Detection category name | [num_detections] | STRING |
| detection_keypoints | Four corner points in (y, x) coordinate format | [num_detections, 4, 2] | FLOAT |
| detection_texts_ids | Character ID array for recognized text | [num_detections, max_text_length] | INT |
| detection_texts | Recognized text content | [num_detections] | STRING |
| detection_texts_scores | Recognition confidence score | [num_detections] | FLOAT |
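Both detection_boxes and detection_keypoints put the vertical coordinate first, while many drawing and image libraries expect x before y. The sketch below shows one way to reorder them; `box_to_xywh` and `keypoints_to_xy` are hypothetical helpers, and the numbers are made up for readability.

```python
def box_to_xywh(box):
    """Convert [top, left, bottom, right] to (x, y, width, height)."""
    top, left, bottom, right = box
    return (left, top, right - left, bottom - top)


def keypoints_to_xy(keypoints):
    """Swap each (y, x) corner point to (x, y) order."""
    return [(x, y) for y, x in keypoints]


# Made-up values for illustration.
box = [10.0, 20.0, 50.0, 80.0]
corners = [[10.0, 20.0], [10.0, 80.0], [50.0, 80.0], [50.0, 20.0]]

print(box_to_xywh(box))
print(keypoints_to_xy(corners))
```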