This topic describes the /green/image/asyncscan operation that you can call to submit asynchronous optical character recognition (OCR) tasks. You can submit the OCR tasks to detect and obtain text in images.

Submit asynchronous OCR tasks

Operation: /green/image/asyncscan

You can call this operation to submit asynchronous OCR tasks. For more information about how to construct an HTTP request, see Request structure. You can also use an SDK, which constructs the HTTP request for you. For more information, see SDK overview.

  • Billing method:

    You are charged for calling this operation. For more information about the billing method, see Content Moderation Pricing.

  • Response time:

    A synchronous moderation request must complete within 6 seconds. If the moderation is not complete within 6 seconds, a timeout error is returned. If you do not need moderation results in real time, you can send asynchronous moderation requests. Otherwise, we recommend that you send synchronous moderation requests because synchronous operations are easier to call, and that you set the timeout period for these calls to 6 seconds.

  • Return results:

    If you send asynchronous moderation requests, the moderation results are not returned in real time. To obtain moderation results, you can poll the moderation results at regular intervals or enable callback notification. The moderation results are retained for up to 1 hour.

    • Enable callback notification to obtain OCR results: When you submit asynchronous OCR tasks, you can specify a callback URL for receiving OCR results in the callback parameter of the OCR request. For more information about the callback parameter, see Request parameters.
    • Poll OCR results at regular intervals: You do not need to specify the callback parameter when you submit asynchronous OCR tasks. After you submit the tasks, you can call the /green/image/results operation to query OCR results. For more information about the /green/image/results operation, see Query asynchronous OCR results.
  • Limits on images:
    • The images must use HTTP or HTTPS URLs.
    • The images must be in the PNG, JPG, JPEG, BMP, GIF, or WEBP format.
    • An image can be up to 10 MB in size. This rule is applicable to both synchronous and asynchronous moderation operations. If you have special requirements, for example, you want to moderate images larger than 10 MB in size, submit a ticket to raise the threshold.
    • The duration for downloading an image is limited to 3s. If an image fails to be downloaded within 3s, a timeout error is returned.
    • We recommend that you submit images of at least 256 × 256 pixels. Smaller images may reduce the quality of the moderation results.
    • The response time of an operation for moderating images depends on the duration for downloading these images. Make sure that you use a stable and reliable storage service to store the images to be moderated. We recommend that you use Alibaba Cloud Object Storage Service (OSS) or Content Delivery Network (CDN).
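
The limits above can be checked on the client before a task is submitted. The following Python sketch is illustrative only. The function name check_image_limits and its arguments are hypothetical. The sketch assumes that the image format can be read from the file extension in the URL path, which does not hold for every URL, and that 10 MB means 10 × 1024 × 1024 bytes.

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg", "bmp", "gif", "webp"}
MAX_SIZE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB counted in binary units
MIN_SIDE_PIXELS = 256              # recommended minimum, not a hard limit


def check_image_limits(url, size_bytes=None, width=None, height=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme.lower() not in ("http", "https"):
        problems.append("URL must use HTTP or HTTPS")
    # Assumption: the format is visible as a file extension in the URL path.
    ext = parsed.path.rsplit(".", 1)[-1].lower() if "." in parsed.path else ""
    if ext not in ALLOWED_EXTENSIONS:
        problems.append("unsupported or unknown image format")
    if size_bytes is not None and size_bytes > MAX_SIZE_BYTES:
        problems.append("image is larger than 10 MB")
    if width is not None and height is not None:
        if width < MIN_SIDE_PIXELS or height < MIN_SIDE_PIXELS:
            problems.append("image is smaller than the recommended 256 x 256 pixels")
    return problems
```

The size and dimension checks run only when the caller supplies those values, because the URL alone does not reveal them.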

Request parameters

Parameter Type Required Description
bizType String No The business scenario. You can create a business scenario in the Alibaba Cloud Content Moderation console. For more information, see Customize policies for machine-assisted moderation. Alternatively, you can submit a ticket to ask Alibaba Cloud engineers to help you create a business scenario.
scenes String array Yes The moderation scenario. Set the value to ocr.
callback String No The callback URL for notifying you of asynchronous moderation results. HTTP and HTTPS URLs are supported. If you do not specify this parameter, you must poll moderation results at regular intervals.
If you specify the callback parameter in the moderation request, make sure that the specified HTTP or HTTPS URL supports the POST method, uses UTF-8 to encode the transmitted data, and supports the checksum and content parameters. Content Moderation sends the checksum and content parameters in each callback notification based on the following rules and format:
  • checksum: The SHA-256 hash of the string that is formed by concatenating UID, Seed, and Content (UID + Seed + Content). UID is the ID of your Alibaba Cloud account. You can query the ID in the Alibaba Cloud console. To detect data tampering, compute the same SHA-256 hash when your server receives a callback notification and compare it with the received checksum parameter.
    Note UID must be the ID of your Alibaba Cloud account. It cannot be the ID of a RAM user.
  • content: A JSON-formatted string. Parse the string to obtain the callback data in the JSON format. For more information about the format of the content parameter, see the sample success responses of each operation that you can call to query asynchronous moderation results.
Note Your server must return HTTP status code 200 to confirm that it received a callback notification. Any other HTTP status code indicates that the notification was not received. In this case, Content Moderation pushes the callback notification again until your server receives it, for up to 16 times. After 16 times, Content Moderation stops pushing the callback notification. If this happens, we recommend that you check the status of the callback URL.
seed String No A random string that is used to generate a signature for the callback notification request. This parameter is required if you specify the callback parameter.
tasks JSON array Yes The list of OCR tasks. Each element in the JSON array is an OCR task structure that corresponds to one image. The JSON array can contain up to 100 elements, which means that you can submit up to 100 images at a time. For more information about the structure of each element, see task.
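
The checksum rule described for the callback parameter can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, and the sketch assumes that UID, Seed, and Content are concatenated without separators, encoded as UTF-8, and that the checksum is transmitted as a hexadecimal digest. Confirm these details against a real callback notification before relying on them.

```python
import hashlib
import hmac


def expected_checksum(uid, seed, content):
    """SHA-256 over UID + Seed + Content."""
    # Assumptions: plain concatenation, UTF-8 bytes, lowercase hex digest.
    return hashlib.sha256((uid + seed + content).encode("utf-8")).hexdigest()


def verify_callback(uid, seed, content, checksum):
    """Return True if the received checksum matches the locally computed one."""
    expected = expected_checksum(uid, seed, content).encode("utf-8")
    received = checksum.lower().encode("utf-8")
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(expected, received)
```

The content value must be hashed exactly as received, before it is parsed as JSON, because re-serializing the parsed data could change its bytes.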
Table 1. task
Parameter Type Required Description
dataId String No The ID of the image to be moderated. Make sure that each ID is unique in a request.
url String Yes The URL of the image to be moderated.
interval Integer No The interval at which frames are captured. This parameter applies only to GIF and long image moderation.
  • A GIF image can be regarded as an array of frames. One frame out of every n frames is captured for moderation, where n is the value of the interval parameter. The system captures frames from GIF images only when this parameter is specified.
  • Long images can be in portrait or horizontal mode.
    • A long portrait image has a height greater than 400 pixels and a height-to-width ratio greater than 2.5. Its total number of frames is the height divided by the width, rounded up to the nearest integer.
    • A long horizontal image has a width greater than 400 pixels and a width-to-height ratio greater than 2.5. Its total number of frames is the width divided by the height, rounded up to the nearest integer.

By default, only the first frame of a GIF image or a long image is moderated. To moderate more frames without moderating all of them, use the interval parameter to specify the interval at which the system captures frames. This helps reduce moderation costs.

Note The interval and maxFrames parameters must be used together. For example, if the interval parameter is set to 2 and the maxFrames parameter is set to 100 for a GIF image or a long image, one out of every two frames is moderated, up to a maximum of 100 frames. The fee is calculated based on the actual number of moderated frames.
maxFrames Integer No The maximum number of frames to be captured. This parameter applies only to GIF and long image moderation. Default value: 1.

If the product of the interval and maxFrames values is smaller than the total number of frames in a GIF image or a long image, the interval for capturing frames is automatically changed to the total number of frames divided by the maxFrames value, rounded up to an integer. This helps improve the overall moderation results.
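
The frame rules for long images and the automatic interval adjustment can be worked through with a short Python sketch. The function names are hypothetical. The sketch assumes that "round up" means the mathematical ceiling and that an image that does not meet either long-image rule counts as a single frame.

```python
import math


def long_image_total_frames(width, height):
    """Total number of frames of a long image; 1 if the image is not long."""
    if height > 400 and height / width > 2.5:
        return math.ceil(height / width)   # long portrait image
    if width > 400 and width / height > 2.5:
        return math.ceil(width / height)   # long horizontal image
    return 1  # assumption: other images are treated as one frame


def effective_interval(total_frames, interval, max_frames):
    """Interval actually used once the automatic adjustment is applied."""
    if interval * max_frames < total_frames:
        return math.ceil(total_frames / max_frames)
    return interval
```

For example, an image of 400 × 1,100 pixels has a height-to-width ratio of 2.75 and is therefore split into 3 frames. A GIF image of 300 frames with interval set to 2 and maxFrames set to 100 gets an effective interval of 3, because 2 × 100 is smaller than 300.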

Response parameters

Parameter Type Description
code Integer The HTTP status code returned for the OCR task.
msg String The message returned for the OCR task.
dataId String The ID of the moderation object.
Note If you specify the dataId parameter in the moderation request, the dataId parameter is returned in the response.
taskId String The ID of the OCR task.
url String The URL of the moderated image.
extras JSON structure The extra parameters that you specified in the extras parameter of the moderation request.
Note This parameter may be subject to changes. Use the latest value of this parameter.

Examples

Sample requests
{
    "scenes": [
        "ocr"
    ],
    "tasks": [
        {
            "dataId": "test_data_xxxx",
            "url": "https://test_image_xxxx.png"
        }
    ]
}
Sample success responses
{
    "code": 200,
    "msg": "OK",
    "requestId": "92AD868A-F5D2-4AEA-96D4-E1273B8E074C",
    "data": [
        {
            "code": 200,
            "msg": "OK",
            "dataId": "test_data_xxxx",
            "taskId": "aaa25f95-4892-4d6b-aca9-7939bc6e9baa-1486198766695",
            "url": "https://test_image_xxxx.png"
        }
    ]
}
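
Because a request can carry at most 100 tasks, larger image sets must be split across several requests. The following Python sketch builds request bodies in the shape of the sample request above and collects task IDs from a response in the shape of the sample success response. The function names and the dataId scheme (img-0, img-1, and so on) are hypothetical. Treating a per-task code of 200 as "accepted" is an assumption based on the sample. Signing and sending the HTTP request are out of scope here.

```python
MAX_TASKS_PER_REQUEST = 100  # limit stated for the tasks parameter


def build_request_bodies(urls, callback=None, seed=None):
    """Split image URLs into request bodies of at most 100 tasks each."""
    if callback is not None and seed is None:
        raise ValueError("seed is required when callback is specified")
    bodies = []
    for start in range(0, len(urls), MAX_TASKS_PER_REQUEST):
        batch = urls[start:start + MAX_TASKS_PER_REQUEST]
        body = {
            "scenes": ["ocr"],
            # Hypothetical dataId scheme; IDs only need to be unique per request.
            "tasks": [
                {"dataId": f"img-{start + i}", "url": url}
                for i, url in enumerate(batch)
            ],
        }
        if callback is not None:
            body["callback"] = callback
            body["seed"] = seed
        bodies.append(body)
    return bodies


def accepted_task_ids(response):
    """Collect the taskId of every task whose per-task code is 200."""
    return [
        task["taskId"]
        for task in response.get("data", [])
        if task.get("code") == 200
    ]
```

The returned task IDs are the values to pass to the /green/image/results operation when polling is used instead of a callback.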

Query asynchronous OCR results

Operation: /green/image/results

You can call this operation to query asynchronous OCR results. For more information about how to construct an HTTP request, see Request structure. You can also use an SDK, which constructs the HTTP request for you. For more information, see SDK overview.

  • Billing method:

    This operation is free of charge.

  • Response time:

    We recommend that you wait at least 30 seconds after you send an asynchronous moderation request before you query the moderation results. Content Moderation retains moderation results for up to 4 hours. After 4 hours, the results are deleted and can no longer be queried.

Request parameters

Parameter Type Required Description
body JSON array Yes The list of IDs of asynchronous moderation tasks that you want to query. You can specify up to 1,000 task IDs.
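
Polling can be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive client. The query argument is a hypothetical caller-supplied function that sends a list of task IDs to /green/image/results and returns the parsed JSON response. The sketch assumes that a per-task code of 200 means that the result is ready and that any other code means the task should be queried again. The retry delay and the number of rounds are arbitrary illustrative values.

```python
import time

MAX_IDS_PER_QUERY = 1000  # limit stated for the body parameter


def poll_results(task_ids, query, initial_delay=30.0, retry_delay=5.0,
                 max_rounds=10, sleep=time.sleep):
    """Return (finished, pending): results keyed by taskId and unfinished IDs."""
    pending = list(task_ids)
    finished = {}
    sleep(initial_delay)  # recommended wait of at least 30 seconds
    for _ in range(max_rounds):
        for start in range(0, len(pending), MAX_IDS_PER_QUERY):
            response = query(pending[start:start + MAX_IDS_PER_QUERY])
            for item in response.get("data", []):
                # Assumption: code 200 marks a finished task.
                if item.get("code") == 200:
                    finished[item["taskId"]] = item
        pending = [task_id for task_id in pending if task_id not in finished]
        if not pending:
            break
        sleep(retry_delay)
    return finished, pending
```

Passing sleep as an argument keeps the function testable without real waiting.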

Response parameters

Parameter Type Description
code Integer The HTTP status code returned for the OCR task.
msg String The message returned for the OCR task.
dataId String The ID of the moderation object.
Note If you specify the dataId parameter in the moderation request, the dataId parameter is returned in the response.
taskId String The ID of the OCR task.
url String The URL of the moderated image.
results Array The return results of the OCR task. If HTTP status code 200 is returned after a successful call, the array in the return results contains one or more elements. Each element is a structure. For more information about the structure of each element, see result.
Table 2. result
Parameter Type Description
scene String The moderation scenario. The value is fixed to ocr.
label String The category of the OCR results. Valid values:
  • normal: The image does not contain text.
  • ocr: The image contains text.
suggestion String The machine-assisted moderation result of the moderated image. Valid values:
  • pass: The image does not require further actions.
  • review: The image requires human review.
rate Floating point The probability that the moderated image falls into the detected category. You can ignore this parameter in the OCR scenario.
ocrLocations Array The information about each text entry that is detected in the moderated static image, including the text, text size, and text location. For more information about the structure of each element, see ocrLocation.
ocrData Array The combination of all text in the moderated static image. In general, the text combination is stored as the first element of the array.
frames Array The frames that are captured from the moderated GIF image and the text that is detected in each frame.
Table 3. ocrLocation
Parameter Type Description
text String The single text entry that is detected in the moderated image.
x Floating point The x-coordinate of the upper-left corner of the text area, which is its distance from the y-axis. The upper-left corner of the image is the coordinate origin. Unit: pixels.
y Floating point The y-coordinate of the upper-left corner of the text area, which is its distance from the x-axis. The upper-left corner of the image is the coordinate origin. Unit: pixels.
w Floating point The width of the text area. Unit: pixels.
h Floating point The height of the text area. Unit: pixels.
Table 4. ocrDetailInfo
Parameter Type Description
wordNum Integer The number of phrases.
wordsInfo Structure The information about phrases. For more information about the structure, see wordsInfo.
Table 5. wordsInfo
Parameter Type Description
charInfo Array The information about words of the phrase. For more information about the structure, see charInfo.
direction Integer The direction of the phrase. Valid values:
  • 0: The phrase is horizontally positioned.
  • 1: The phrase is vertically positioned.
pos Array The coordinates of the phrase. For more information about the structure, see pos.
prob Integer The confidence level.
word String The words of the phrase.
Table 6. charInfo
Parameter Type Description
h Integer The height of the word. Unit: pixels.
prob Integer The confidence level.
w Integer The width of the word. Unit: pixels.
word String The content of the word.
x Integer The x-coordinate of the word. Unit: pixels.
y Integer The y-coordinate of the word. Unit: pixels.
Table 7. pos
Parameter Type Description
x Integer The x-coordinate of the phrase. Unit: pixels.
y Integer The y-coordinate of the phrase. Unit: pixels.

Examples

Sample requests
[
    "aaa25f95-4892-4d6b-aca9-7939bc6e9baa-1486198766695"
]
Sample success responses
{
    "code": 200,
    "data": [
        {
            "code": 200,
            "dataId": "test_data_xxxx",
            "extras": {

            },
            "msg": "OK",
            "results": [
                {
                    "label": "ocr",
                    "ocrData": [
                        "This topic describes how to call an operation to submit asynchronous image moderation tasks."
                    ],
                    "ocrLocations": [
                        {
                            "h": 19,
                            "text": "This topic describes how to call an operation to submit asynchronous image moderation tasks.",
                            "w": 362,
                            "x": 31,
                            "y": 11
                        }
                    ],
                    "rate": 99.91,
                    "scene": "ocr",
                    "suggestion": "review"
                }
            ],
            "taskId": "aaa25f95-4892-4d6b-aca9-7939bc6e9baa-1486198766695",
            "url": "https://test_image_xxxx.png"
        }
    ],
    "msg": "OK",
    "requestId": "992C7849-AA45-4055-8F82-8D44D64C15E3"
}
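
A response in the shape of the sample above can be reduced to the detected text entries and their bounding boxes. The following Python sketch is illustrative. The function name extract_text_boxes is hypothetical, and the sketch reads only the fields that appear in the sample response (data, code, taskId, results, scene, ocrLocations).

```python
def extract_text_boxes(response):
    """Map each taskId to a list of (text, x, y, w, h) tuples."""
    boxes_by_task = {}
    for item in response.get("data", []):
        if item.get("code") != 200:
            continue  # skip tasks without a successful result
        boxes = []
        for result in item.get("results", []):
            if result.get("scene") != "ocr":
                continue
            for loc in result.get("ocrLocations", []):
                boxes.append((loc["text"], loc["x"], loc["y"], loc["w"], loc["h"]))
        boxes_by_task[item["taskId"]] = boxes
    return boxes_by_task
```

Each tuple gives the upper-left corner (x, y) and the size (w, h) of a text area in pixels, with the upper-left corner of the image as the origin.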