This topic describes how to submit asynchronous optical character recognition (OCR) tasks and how to obtain the OCR results asynchronously. It is intended to help you construct the HTTP requests for these operations.

  • For more information about how to construct an HTTP request, see Request structure.
  • You can also use an SDK that encapsulates the HTTP request. For more information, see SDK overview.

Submit asynchronous OCR tasks

Operation: /green/image/asyncscan

You can call this operation to submit asynchronous OCR tasks to detect text in images.

Note You are charged for calling this operation. For more information about the billing method, see Content Moderation pricing.

Notes for synchronous and asynchronous moderation

A synchronous moderation request must be completed within 6 seconds. Otherwise, a timeout error is returned. If you do not need the results in real time, you can send asynchronous moderation requests. In other cases, we recommend that you send synchronous moderation requests because the synchronous operations are easier to call. We recommend that you set the timeout period to 6 seconds when you call the synchronous operations.

If you send asynchronous moderation requests, the moderation results are not returned in real time. You can poll the moderation results of asynchronous requests at regular intervals or configure a callback notification. The moderation results are retained for up to 1 hour.

Limits on images
  • The images must use HTTP or HTTPS URLs.
  • The images must be in the following formats: PNG, JPG, JPEG, BMP, GIF, or WEBP.
  • An image can be up to 10 MB in size. This limit applies to both synchronous and asynchronous calls. If you have special requirements, for example, moderating images larger than 10 MB, you can submit a ticket to raise the limit.
  • An image must be downloaded within 3 seconds. If the download takes longer, a timeout error is returned.
  • We recommend that you submit images of at least 256 × 256 pixels to ensure accurate moderation results.
  • The response time of an operation for moderating images depends on the download duration of images. Make sure that you use a stable and reliable storage service to store the images to be moderated. We recommend that you use Alibaba Cloud Object Storage Service (OSS) or Content Delivery Network (CDN).
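
The limits above can be checked on the client before a task is submitted. The following Python sketch is illustrative only: the function name check_image is hypothetical, 10 MB is assumed to mean 10 × 1024 × 1024 bytes, and the image format is guessed from the URL path suffix, which is a heuristic rather than a rule of the service.

```python
from urllib.parse import urlparse

ALLOWED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp", ".gif", ".webp")
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # assumption: 10 MB counted in binary units
MIN_RECOMMENDED_SIDE = 256          # recommended minimum, not a hard limit


def check_image(url, size_bytes=None, width=None, height=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must use HTTP or HTTPS")
    # Heuristic (assumption): infer the format from the path suffix.
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("unsupported or unknown image format")
    if size_bytes is not None and size_bytes > MAX_IMAGE_BYTES:
        problems.append("image exceeds 10 MB")
    if width is not None and height is not None:
        if min(width, height) < MIN_RECOMMENDED_SIDE:
            problems.append("image is smaller than the recommended 256 x 256 pixels")
    return problems
```

A check such as this one only filters out obvious mistakes. The service still applies its own limits, including the 3-second download timeout, when it fetches the image.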

Request parameters

Parameter Type Required Description
bizType String No The business scenario. You can customize content moderation policies to apply different moderation standards or algorithm policies to different business scenarios. You can create a business scenario by specifying bizType in the Alibaba Cloud Content Moderation console. Alternatively, you can submit a ticket to ask Alibaba Cloud engineers to help you create a business scenario.
scenes String array Yes The moderation scenario. Set the value to ocr.
callback String No The callback URL for notifying you of asynchronous OCR results. HTTP and HTTPS URLs are supported. If you do not specify this parameter, you must query the results at regular intervals.
seed String No A random string used to generate a signature for the callback notification request. This parameter is required if you specify the callback parameter.
tasks JSON array Yes The list of OCR tasks. Each element in the JSON array is an OCR task structure that corresponds to one image. The array can contain up to 100 elements, so you can detect text in up to 100 images per request. For more information about the structure of each element, see task.
Table 1. task
Parameter Type Required Description
dataId String No The ID of the image to be moderated. Make sure that each ID is unique in a request.
url String Yes The URL of the image to be moderated.
interval Integer No The interval at which frames are captured. This parameter applies only to GIF images and long images. A GIF image can be treated as an array of frames. One frame is captured for moderation from every n frames, where n is the value of the interval parameter. The system captures frames from GIF images only when this parameter is specified. Long images can be in either portrait or horizontal mode.
  • A long portrait image has a height greater than 400 pixels and a height-to-width ratio greater than 2.5. The total number of frames is the height divided by the width, rounded up to the nearest integer.
  • A long horizontal image has a width greater than 400 pixels and a width-to-height ratio greater than 2.5. The total number of frames is the width divided by the height, rounded up to the nearest integer.
Note By default, only the first frame of a GIF or long image is moderated. To moderate more frames without moderating all of them, set the interval parameter to specify how often a frame is captured. This can help reduce costs. This parameter must be used together with the maxFrames parameter. For example, if interval is set to 2 and maxFrames is set to 100 for a GIF or long image, one out of every two frames is moderated and a maximum of 100 frames are moderated. The fee is calculated based on the actual number of moderated frames.
maxFrames Integer No The maximum number of frames to be captured. This parameter applies only to GIF images and long images. Default value: 1.

If the product of interval and maxFrames is smaller than the total number of frames in a GIF or long image, the system automatically changes the capture interval to the total number of frames divided by maxFrames, rounded up to the nearest integer. This way, the captured frames are spread across the whole image, which improves the overall moderation results.
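
The frame arithmetic for long images and the automatic interval adjustment can be sketched in Python as follows. The function names are hypothetical, and returning 1 for an image that does not meet the long image criteria is an assumption made for illustration.

```python
import math


def long_image_frames(width, height):
    """Total number of frames of a long image, per the rules described above."""
    if height > 400 and height / width > 2.5:
        return math.ceil(height / width)   # long portrait image
    if width > 400 and width / height > 2.5:
        return math.ceil(width / height)   # long horizontal image
    return 1  # assumption: any other image counts as a single frame


def effective_interval(interval, max_frames, total_frames):
    """Capture interval after the automatic adjustment described above."""
    if interval * max_frames < total_frames:
        return math.ceil(total_frames / max_frames)
    return interval
```

For example, a 300 × 1000 pixel image has a height-to-width ratio of about 3.33, so long_image_frames(300, 1000) returns 4. With interval set to 2 and maxFrames set to 3 on a 10-frame GIF image, the product 6 is smaller than 10, so effective_interval(2, 3, 10) returns 4.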

Callback notification

If you set the callback parameter, make sure that the HTTP or HTTPS URL that you specify meets the following requirements: It supports the POST method, uses UTF-8 to encode the transmitted data, and accepts the checksum and content parameters. When Content Moderation sends the moderation result to the callback URL, it sets the checksum and content parameters based on the following rules and format.
Note The callback is considered successful only if the callback URL returns the HTTP status code 200. If the callback fails, Content Moderation resends the moderation result up to 16 times and then stops. We recommend that you check the status of the callback URL.
Parameter Type Description
checksum String The Secure Hash Algorithm 256 (SHA-256) hash of the concatenated string <UID> + <Seed> + <Content>. UID is the ID of your Alibaba Cloud account. You can query the ID in the Alibaba Cloud console. To detect data tampering, compute the SHA-256 hash of the same concatenated string when your server receives a callback notification, and compare the result with the received checksum parameter.
Note UID cannot be the ID of a Resource Access Management (RAM) user.
content String The callback data as a JSON-formatted string. Parse the string to obtain the callback data in JSON format. The format of the content parameter is shown in the following example.
Example of the content parameter:
{
    "code": 200,
    "data": [
        {
            "code": 200,
            "dataId": "964719bf-30b1-4180-ba22-09e56d530bfb",
            "extras": {

            },
            "msg": "OK",
            "results": [
                {
                    "bankCardInfo": {
                        "bankCardNum": "6225768888888888"
                    },
                    "label": "ocr",
                    "rate": 99.91,
                    "scene": "ocr",
                    "suggestion": "review"
                }
            ],
            "taskId": "imguVUiCvA4NZ5jeaGJCS9IG-1pfBHc",
            "url": "https://aip.bdstatic.com/portal/dist/1531393832694/ai_images/technology/ocr-cards/bankcard/demo-card-1.png"
        }
    ],
    "msg": "success",
    "requestId": "1a2faf93-dd41-47d8-95ad-bdf6226540e4"
}
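
The checksum verification described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the three strings are assumed to be concatenated without separators and encoded in UTF-8, and the checksum is assumed to be a hexadecimal digest. The source does not state the digest encoding, so confirm it against a real callback before you rely on this code.

```python
import hashlib
import hmac
import json


def expected_checksum(uid, seed, content):
    """SHA-256 over UID + seed + content (assumed: no separators, hex digest)."""
    raw = (uid + seed + content).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def verify_callback(uid, seed, checksum, content):
    """Return the parsed callback data, or None if the checksum does not match."""
    expected = expected_checksum(uid, seed, content).encode("ascii")
    received = checksum.strip().lower().encode("utf-8")
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, received):
        return None
    return json.loads(content)
```

The uid argument must be the ID of the Alibaba Cloud account, not the ID of a RAM user, and seed must be the value that you passed in the seed parameter of the request.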

Response parameters

Parameter Type Required Description
code Integer Yes The HTTP status code returned for the OCR task.
msg String Yes The message returned for the OCR task.
dataId String No The ID of the moderated image, which you specify in the dataId parameter of the OCR request.
taskId String Yes The ID of the OCR task.
url String Yes The URL of the moderated image, which you specify in the url parameter of the OCR request.

Examples

Sample requests
{
    "scenes": [
        "ocr"
    ],
    "tasks": [
        {
            "dataId": "test4lNSMdggA0c56MMvfY1234-abcdpx",
            "url": "https://img.alicdn.com/tfs/TB1urBOQFXXXXbMXFXXXXXXXXXX-1442-257.png"
        }
    ]
}
Sample success responses
{
    "code": 200,
    "msg": "OK",
    "requestId": "92AD868A-F5D2-4AEA-96D4-E1273B8E074C",
    "data": [
        {
            "code": 200,
            "msg": "OK",
            "dataId": "test4lNSMdggA0c56MMvfY1234-abcdpx",
            "taskId": "aaa25f95-4892-4d6b-aca9-7939bc6e9baa-1486198766695",
            "url": "https://img.alicdn.com/tfs/TB1urBOQFXXXXbMXFXXXXXXXXXX-1442-257.png"
        }
    ]
}
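
The request body and the response shown above can be handled with a few lines of Python. The following sketch only builds request bodies and reads task IDs from a parsed response. It does not sign or send the HTTP request. The function names are hypothetical, and generating dataId values with uuid4 is an illustrative choice, not a requirement of the operation.

```python
import uuid

MAX_TASKS_PER_REQUEST = 100  # limit on the tasks array stated in this topic


def build_request_bodies(image_urls, callback=None, seed=None, biz_type=None):
    """Split image URLs into asyncscan request bodies of at most 100 tasks each."""
    if callback and not seed:
        raise ValueError("seed is required when callback is specified")
    bodies = []
    for start in range(0, len(image_urls), MAX_TASKS_PER_REQUEST):
        batch = image_urls[start:start + MAX_TASKS_PER_REQUEST]
        body = {
            "scenes": ["ocr"],
            # A fresh UUID keeps each dataId unique within the request.
            "tasks": [{"dataId": str(uuid.uuid4()), "url": url} for url in batch],
        }
        if biz_type:
            body["bizType"] = biz_type
        if callback:
            body["callback"] = callback
            body["seed"] = seed
        bodies.append(body)
    return bodies


def extract_task_ids(response):
    """Collect the taskId of every task that was accepted with code 200."""
    return [item["taskId"] for item in response.get("data", [])
            if item.get("code") == 200 and "taskId" in item]
```

Keep the returned task IDs. You need them to query the results if you do not configure a callback notification.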

Query asynchronous OCR results

Operation: /green/image/results

You can call this operation to query asynchronous OCR results. We recommend that you query the results at intervals of 30 seconds. Content Moderation retains the results for up to 1 hour. Results that are not queried within 1 hour are deleted.

Note This operation is free of charge.
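
A polling loop that follows the recommended interval and the retention period might look like the following Python sketch. The fetch_results argument is a hypothetical callable that sends the task IDs to /green/image/results and returns the parsed JSON response. Request signing and transport are out of scope here. Treating every per-task code other than 200 as not yet finished is a simplifying assumption, because this topic does not list the codes returned for tasks that are still running.

```python
import time


def poll_results(task_ids, fetch_results, interval_s=30, max_wait_s=3600,
                 sleep=time.sleep):
    """Poll until every task has a result or the retention period has passed.

    Returns (done, pending): done maps taskId to its result item, and
    pending is the set of task IDs that never returned code 200.
    """
    pending = set(task_ids)
    done = {}
    waited = 0
    while pending and waited <= max_wait_s:
        response = fetch_results(sorted(pending))
        for item in response.get("data", []):
            # Assumption: any code other than 200 means "try again later".
            if item.get("code") == 200 and item.get("taskId") in pending:
                done[item["taskId"]] = item
                pending.discard(item["taskId"])
        if pending:
            sleep(interval_s)
            waited += interval_s
    return done, pending
```

The sleep function is injectable so that the loop can be exercised in tests without waiting. In production code, you may also want to stop polling a task as soon as it returns an error code.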

Request parameters

Parameter Type Required Description
body JSON array Yes The list of IDs of the OCR tasks to query. Use the taskId values returned when you submitted the tasks. The array can contain up to 1,000 task IDs.
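
If you track more than 1,000 tasks, the task IDs must be split across several query requests. The following Python sketch serializes the IDs into JSON arrays of at most 1,000 elements each. The function name is hypothetical, and removing duplicate IDs is an illustrative choice.

```python
import json

MAX_IDS_PER_QUERY = 1000  # limit on the body array stated in this topic


def build_query_bodies(task_ids):
    """Return JSON array strings, each holding at most 1,000 unique task IDs."""
    unique_ids = list(dict.fromkeys(task_ids))  # drop duplicates, keep order
    return [
        json.dumps(unique_ids[start:start + MAX_IDS_PER_QUERY])
        for start in range(0, len(unique_ids), MAX_IDS_PER_QUERY)
    ]
```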

Response parameters

Parameter Type Required Description
code Integer Yes The HTTP status code returned for the OCR task.
msg String Yes The message returned for the OCR task.
dataId String No The ID of the moderated image, which you specify in the dataId parameter of the OCR request.
taskId String Yes The ID of the OCR task.
url String No The URL of the moderated image, which you specify in the url parameter of the OCR request.
results Array No The results of the OCR task. If the HTTP status code 200 is returned, which indicates a successful call, the array contains one or more elements. Each element is a structure. For more information about the structure of each element, see result.
Table 2. result
Parameter Type Required Description
scene String Yes The moderation scenario. Valid value: ocr.
label String Yes The category of the OCR results. Valid values:
  • normal: The image is normal and does not contain text.
  • ocr: The image contains text.
suggestion String Yes The machine-assisted moderation result of the moderated image. Valid values:
  • pass: The image is normal and does not require further actions.
  • review: The image requires human review.
rate Floating point Yes The probability that the moderated image falls into the detected category. You can ignore this parameter in the OCR scenario.
ocrLocations Array No The information about each piece of text detected in the moderated static image, including the text, the size of the text area, and the location of the text area. For more information about the structure, see ocrLocation.
ocrData Array No All text detected in the moderated static image, combined. In most cases, the combined text is stored in the first element of the array.
frames Array No The frames captured from the moderated GIF image and the text detected in each frame.
Table 3. ocrLocation
Parameter Type Required Description
text String Yes The single piece of text detected in the moderated image.
x Floating point Yes The x-coordinate of the upper-left corner of the text area, that is, its distance from the y-axis. The upper-left corner of the image is the coordinate origin.
y Floating point Yes The y-coordinate of the upper-left corner of the text area, that is, its distance from the x-axis. The upper-left corner of the image is the coordinate origin.
w Floating point Yes The width of the text area.
h Floating point Yes The height of the text area.
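
Because x and y identify the upper-left corner of the text area and w and h give its size, the opposite corner follows by addition. The following Python sketch converts an ocrLocation structure to a (left, top, right, bottom) tuple. The function name and the tuple layout are illustrative choices.

```python
def to_bounding_box(location):
    """Convert an ocrLocation dict to (left, top, right, bottom)."""
    left = location["x"]
    top = location["y"]
    # The origin is the upper-left corner of the image, so y grows downward.
    return (left, top, left + location["w"], top + location["h"])
```

For a text area with x = 31, y = 11, w = 362, and h = 19, the function returns (31, 11, 393, 30).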

Examples

Sample requests
[
    "aaa25f95-4892-4d6b-aca9-7939bc6e9baa-1486198766695"
]
Sample success responses
{
    "code": 200,
    "data": [
        {
            "code": 200,
            "dataId": "675876e4-6698-4f92-8bd9-19f5e20035fb",
            "extras": {

            },
            "msg": "OK",
            "results": [
                {
                    "label": "ocr",
                    "ocrData": [
                        "This topic describes how to call an operation to submit asynchronous image moderation tasks."
                    ],
                    "ocrLocations": [
                        {
                            "h": 19,
                            "text": "This topic describes how to call an operation to submit asynchronous image moderation tasks.",
                            "w": 362,
                            "x": 31,
                            "y": 11
                        }
                    ],
                    "rate": 99.91,
                    "scene": "ocr",
                    "suggestion": "review"
                }
            ],
            "taskId": "img7Iljku7br7u66Z29EsepH9-1pfBUQ",
            "url": "http://green-system.oss-cn-hangzhou.aliyuncs.com/green_demo_image/2018-08-02/1533193802331_123.jpg?Expires=1533280202&OSSAccessKeyId=xxxxxxxxx&Signature=xxxxxxxxxx"
        }
    ],
    "msg": "OK",
    "requestId": "992C7849-AA45-4055-8F82-8D44D64C15E3"
}
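
A response such as the one above can be reduced to the detected text per task. The following Python sketch assumes that the response has already been parsed into a dictionary. The function name is hypothetical. The sketch reads ocrData first and uses the text fields in ocrLocations only when ocrData is absent or empty. It ignores the frames field, because this topic does not describe the structure of that field.

```python
def extract_text(response):
    """Map each successful taskId to the list of text strings detected for it."""
    texts = {}
    for item in response.get("data", []):
        if item.get("code") != 200:
            continue  # skip tasks without a successful result
        pieces = []
        for result in item.get("results", []):
            if result.get("label") != "ocr":
                continue  # label "normal" means that no text was detected
            ocr_data = result.get("ocrData") or []
            if ocr_data:
                pieces.extend(ocr_data)
            else:
                pieces.extend(loc["text"] for loc in result.get("ocrLocations", []))
        texts[item["taskId"]] = pieces
    return texts
```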