Content Moderation: Submits synchronous image OCR tasks

Last Updated: Dec 13, 2024

Submits optical character recognition (OCR) tasks and returns the OCR results in real time. Call this operation to detect and extract text in images.

Description

Operation: /green/image/scan

This operation submits OCR tasks and returns the OCR results in the same response. For more information about how to construct an HTTP request, see Request structure. Alternatively, you can use an SDK that constructs the HTTP request for you. For more information, see SDK overview.

  • Billing method:

    You are charged for calling this operation. For more information about the billing methods, see

  • Response timeout:

    The maximum response time that is allowed for a synchronous moderation request is 6 seconds. If the moderation is not completed within 6 seconds, a timeout error is returned. If you do not require moderation results in real time, you can send asynchronous moderation requests. In most cases, we recommend that you send synchronous moderation requests because synchronous moderation operations are easier to call. We recommend that you set the timeout period to 6 seconds for calling synchronous moderation operations.

  • Returned results:

    In general, moderation results are returned within 1 second after you send a synchronous moderation request. The response time may increase when the system is processing a large number of requests, when the images are large, or when the images contain a large number of words. OCR speed is inversely related to the number of words in an image. If the images to be moderated contain a large number of words, we recommend that you send asynchronous moderation requests.

  • Limits on images:

    • The URLs of images must be HTTP or HTTPS URLs.

    • The images must be in PNG, JPG, JPEG, BMP, GIF, or WEBP format.

    • An image can be up to 20 MB in size. The limit for the image size is applicable to both synchronous and asynchronous moderation operations.

    • The duration for downloading an image is limited to 3 seconds. If an image fails to be downloaded within 3 seconds, a timeout error is returned.

    • We recommend that you submit images of at least 256 × 256 pixels. Smaller images may reduce moderation accuracy.

    • The response time of an operation for moderating images varies based on the duration for downloading these images. Make sure that you use a stable and reliable storage service to store the images to be moderated. We recommend that you use Object Storage Service (OSS) or Content Delivery Network (CDN).
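The image limits above can be checked on the client before a request is sent. The following Python sketch is illustrative only: the function name check_image and its parameters are hypothetical, the 20 MB limit is assumed to mean 20 × 1024 × 1024 bytes, and the format check relies on the file extension in the URL path, which is only a heuristic.

```python
from urllib.parse import urlparse

MAX_IMAGE_BYTES = 20 * 1024 * 1024  # assumption: "20 MB" means 20 MiB
ALLOWED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp", ".gif", ".webp")
MIN_RECOMMENDED_SIDE = 256  # recommended minimum, not a hard limit


def check_image(url, size_bytes=None, width=None, height=None):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL must be an HTTP or HTTPS URL")
    # Heuristic: infer the format from the extension in the URL path.
    if not parsed.path.lower().endswith(ALLOWED_EXTENSIONS):
        problems.append("image must be PNG, JPG, JPEG, BMP, GIF, or WEBP")
    if size_bytes is not None and size_bytes > MAX_IMAGE_BYTES:
        problems.append("image is larger than 20 MB")
    if width is not None and height is not None:
        if min(width, height) < MIN_RECOMMENDED_SIDE:
            problems.append("image is smaller than the recommended 256 x 256 pixels")
    return problems


print(check_image("https://aliyundoc.com/test_image_xxxx.png"))
```

The download time limit of 3 seconds cannot be verified this way, because it depends on how quickly the moderation service can fetch the image from your storage.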

QPS limits

You can call this operation up to 10 times per second per account. If the number of calls per second exceeds the limit, throttling is triggered. As a result, your business may be affected. We recommend that you take note of the limit when you call this operation.
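One way to stay under the limit is to throttle calls on the client. The following Python sketch shows a sliding-window limiter that allows at most 10 calls in any 1-second window. The class name SlidingWindowLimiter is hypothetical and is not part of any Content Moderation SDK. The limiter covers a single process only, so multiple processes that share one account need a shared limiter instead.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_calls calls in any window of `period` seconds."""

    def __init__(self, max_calls=10, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until a call is allowed, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the window.
            while self._calls and now - self._calls[0] >= self.period:
                self._calls.popleft()
            if len(self._calls) < self.max_calls:
                self._calls.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))


limiter = SlidingWindowLimiter(max_calls=10, period=1.0)
limiter.acquire()  # call this before each request to /green/image/scan
```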

Request parameters

Parameter

Type

Required

Example

Description

bizType

String

No

default

The business scenario. You can create a business scenario in the Content Moderation console. For more information, see Customize policies for machine-assisted moderation.

scenes

StringArray

Yes

["ocr"]

The moderation scenario. Set the value to ocr.

tasks

JSONArray

Yes

The list of objects that you want to moderate. The JSON array contains 1 to 100 elements, each of which is a task structure, so you can submit up to 100 moderation objects in one request. To submit 100 moderation objects at a time, you must first raise the relevant concurrency limit to a number greater than 100. For more information about the structure, see task.
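Because a request accepts at most 100 tasks, a longer list of images has to be split across several requests. The following Python sketch builds one request body per batch. The function name build_request_bodies and the img-<index> pattern for dataId are hypothetical. The sketch covers the JSON body only and leaves out request signing and the common request parameters.

```python
def build_request_bodies(urls, batch_size=100, biz_type=None):
    """Split image URLs into request bodies with at most batch_size tasks each."""
    if not 1 <= batch_size <= 100:
        raise ValueError("batch_size must be between 1 and 100")
    bodies = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        # dataId is optional; the global index keeps it unique within each request.
        tasks = [
            {"dataId": f"img-{start + offset}", "url": url}
            for offset, url in enumerate(batch)
        ]
        body = {"scenes": ["ocr"], "tasks": tasks}
        if biz_type is not None:
            body["bizType"] = biz_type
        bodies.append(body)
    return bodies


urls = [f"https://aliyundoc.com/test_image_{i}.png" for i in range(250)]
bodies = build_request_bodies(urls)
print([len(body["tasks"]) for body in bodies])  # [100, 100, 50]
```

A smaller batch_size may be the safer choice if your concurrency limit has not been raised.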

Table 1. task

Parameter

Type

Required

Example

Description

dataId

String

No

test_data_xxxx

The data ID. Make sure that each ID is unique in a request.

url

String

Yes

https://aliyundoc.com/test_image_xxxx.png

The HTTP or HTTPS URL that can be accessed over the Internet. The URL is up to 2,048 characters in length.

interval

Integer

No

2

The interval between two consecutively captured frames. This parameter applies only to GIF and long image moderation.

  • A GIF image can be regarded as an array of frames. One frame is captured for moderation from every n frames, where n is specified by the interval parameter. The system captures frames from GIF images only when this parameter is specified.

  • Long images can be in portrait or horizontal mode.

    • To moderate a long portrait image, you can calculate the total number of frames in the following way: divide the height by the width and round the result to the nearest integer. In a long portrait image, the height is greater than 400 pixels, and the ratio of height to width is greater than 2.5:1.

    • To moderate a long horizontal image, you can calculate the total number of frames in the following way: divide the width by the height and round the result to the nearest integer. In a long horizontal image, the width is greater than 400 pixels, and the ratio of width to height is greater than 2.5:1.

By default, only the first frame of a GIF image or a long image is moderated. You can use the interval parameter to specify the interval between two frames that the system consecutively captures. This helps reduce moderation costs.

Note

The interval and maxFrames parameters must be used together. For example, if the interval parameter is set to 2 and the maxFrames parameter is set to 100 for a GIF image or a long image, one out of every two frames is moderated, up to a maximum of 100 frames. The fee is calculated based on the actual number of moderated frames.

maxFrames

Integer

No

100

The maximum number of frames to capture. This parameter applies only to GIF and long image moderation. Default value: 1.

If the value of the interval parameter multiplied by that of the maxFrames parameter is smaller than the total number of frames in a GIF image or a long image, the interval for capturing frames is automatically changed to the integer rounded up from the result of dividing the total number of frames in the image by the value of the maxFrames parameter. This helps improve the overall moderation effects.
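The frame rules for the interval and maxFrames parameters can be worked through in code. The following Python sketch is illustrative and does not reproduce the service implementation. All function names are hypothetical, halves are assumed to round up when the frame count is rounded to the nearest integer, and the estimate of captured frames is an assumption derived from the descriptions above.

```python
import math


def long_image_frames(width, height):
    """Total frames of a still image per the long-image rule; 1 if it is not long."""
    if height > 400 and height / width > 2.5:
        ratio = height / width   # long portrait image
    elif width > 400 and width / height > 2.5:
        ratio = width / height   # long horizontal image
    else:
        return 1
    return math.floor(ratio + 0.5)  # assumption: halves round up


def effective_interval(total_frames, interval, max_frames):
    """Interval after the automatic adjustment described for maxFrames."""
    if interval * max_frames < total_frames:
        return math.ceil(total_frames / max_frames)
    return interval


def estimated_captured_frames(total_frames, interval, max_frames):
    """Assumption: one frame per effective interval, capped at max_frames."""
    step = effective_interval(total_frames, interval, max_frames)
    return min(max_frames, math.ceil(total_frames / step))


print(long_image_frames(400, 2000))              # 5
print(effective_interval(300, 2, 100))           # 3
print(estimated_captured_frames(150, 2, 100))    # 75
```

In the second call, 2 × 100 = 200 is smaller than the 300 frames in the image, so the interval is raised to 300 / 100 = 3.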

Response parameters

Parameter

Type

Example

Description

code

Integer

200

The returned HTTP status code.

msg

String

OK

The message that is returned for the request.

dataId

String

test_data_xxxx

The ID of the moderation object.

Note

If you set the dataId parameter in the moderation request, the value of the dataId request parameter is returned here.

taskId

String

img5A@k7a@B4q@6K@d9nfKgOs-1s****

The ID of the moderation task.

url

String

https://aliyundoc.com/test_image_xxxx.png

The URL of the moderation object, as specified in the url request parameter.

results

Array

The returned results. If the HTTP status code 200 is returned, the array in the returned results contains one or more elements. Each element is a structure. For more information about the structure of each element, see result.

Table 2. result

Parameter

Type

Example

Description

scene

String

ocr

The moderation scenario. The value is ocr.

label

String

ocr

The category of the moderation result. Valid values:

  • normal: The image does not contain text.

  • ocr: The image contains text.

suggestion

String

review

The recommended subsequent operation. Valid values:

  • pass: The image does not require further actions.

  • review: The image requires manual review.

rate

Float

99.91

The probability that the moderated image falls into the detected category. You can ignore this parameter in the OCR scenario.

ocrLocations

Array

The information about each text entry that is detected in the moderated static image, including the text, the size of the text area, and the location of the text area. For more information about the structure, see ocrLocation.

Note

If no text is detected in the moderated image, this parameter is not returned.

ocrData

Array

["hello, this is a test text."]

The combination of all text in the moderated static image. In general, the text combination is stored as the first element of the array.

Note

If no text is detected in the moderated image, this parameter is not returned.

frames

Array

xxx

The frames that are captured from the moderated animated image and the text that is detected in each frame.

Note

If no more than one frame is captured, this parameter is not returned.

Table 1. ocrLocation

Parameter

Type

Example

Description

text

String

hello

The single text entry that is detected in the moderated image.

x

Float

41

The distance between the upper-left corner of the text area and the y-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels.

y

Float

84

The distance between the upper-left corner of the text area and the x-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels.

w

Float

83

The width of the text area. Unit: pixels.

h

Float

26

The height of the text area. Unit: pixels.

Table 3. ocrDetailInfo

Table 4. wordsInfo

Examples

Sample requests

http(s)://[Endpoint]/green/image/scan
&<Common request parameters>
{
    "scenes": [
        "ocr"
    ],
    "tasks": [
        {
            "dataId": "test_data_xxxx",
            "url": "https://aliyundoc.com/test_image_xxxx.png"
        }
    ]
}

Sample success responses

{
    "code": 200,
    "data": [
        {
            "code": 200,
            "dataId": "test_data_xxxx",
            "extras": {

            },
            "msg": "OK",
            "results": [
                {
                    "label": "ocr",
                    "ocrData": [
                        "hello, this is a test text."
                    ],
                    "ocrLocations": [
                        {
                            "h": 26,
                            "text": "hello",
                            "w": 83,
                            "x": 41,
                            "y": 84
                        },
                        {
                            "h": 25,
                            "text": " this is a test text.",
                            "w": 95,
                            "x": 78,
                            "y": 114
                        }
                    ],
                    "rate": 99.91,
                    "scene": "ocr",
                    "suggestion": "review"
                }
            ],
            "taskId": "img5A@k7a@B4q@6K@d9nfKgOs-1s****",
            "url": "https://aliyundoc.com/test_image_xxxx.png"
        }
    ],
    "msg": "OK",
    "requestId": "C4AB08A9-AD75-4410-859B-0B9EF6DFC3C4"
}
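
Reading the response

A success response like the one above can be unpacked as follows. This Python sketch is illustrative: the function name extract_text and the shape of its return value are hypothetical, and the sample dictionary is a trimmed copy of the response above. Each text area is converted from x, y, w, and h to left, top, right, and bottom pixel coordinates.

```python
def extract_text(response):
    """Collect the detected text and text areas from a decoded scan response."""
    if response.get("code") != 200:
        raise RuntimeError(f"request failed: {response.get('code')} {response.get('msg')}")
    extracted = []
    for item in response.get("data", []):
        if item.get("code") != 200:
            continue  # this task failed; skip it
        for result in item.get("results", []):
            if result.get("scene") != "ocr":
                continue
            # ocrData and ocrLocations are omitted when no text is detected.
            texts = result.get("ocrData", [])
            boxes = [
                (loc["text"], loc["x"], loc["y"], loc["x"] + loc["w"], loc["y"] + loc["h"])
                for loc in result.get("ocrLocations", [])
            ]
            extracted.append({
                "dataId": item.get("dataId"),
                "label": result.get("label"),
                "text": texts[0] if texts else "",
                "boxes": boxes,
            })
    return extracted


sample = {
    "code": 200,
    "msg": "OK",
    "data": [{
        "code": 200,
        "dataId": "test_data_xxxx",
        "results": [{
            "scene": "ocr",
            "label": "ocr",
            "ocrData": ["hello, this is a test text."],
            "ocrLocations": [{"text": "hello", "x": 41, "y": 84, "w": 83, "h": 26}],
        }],
    }],
}

for entry in extract_text(sample):
    print(entry["dataId"], entry["text"], entry["boxes"])
```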