This topic describes the /green/webpage/scan operation that you can call to moderate web pages and obtain moderation results in real time. Web page moderation helps you detect image and text violations on a web page, and returns moderation results and the categories of moderation results. You can call this operation to moderate web pages that use HTTP or HTTPS URLs.

Operation description

Operation: /green/webpage/scan

You can call this operation to submit web page moderation tasks and obtain moderation results in real time. For more information about how to construct an HTTP request, see Request structure. Alternatively, you can use an SDK that constructs the HTTP request for you. For more information, see SDK overview.

Billing method:

You are charged for calling this operation. For more information about the billing method, see Content Moderation Pricing.

Request parameters

Parameter Type Required Example Description
bizType String No default The business scenario. You can create a business scenario in the Alibaba Cloud Content Moderation console. For more information, see Customize policies for machine-assisted moderation. You can also submit a ticket to ask Alibaba Cloud engineers to help you create a business scenario.
textScenes StringArray No ["antispam"] The moderation scenario of the text to be moderated on a web page. Set the value to antispam.
Note You must specify at least one of the textScenes and imageScenes parameters.
imageScenes StringArray No ["porn","ad"] The moderation scenario of the images to be moderated on a web page. Valid values:
  • porn: pornography detection
  • ad: ad violation detection
  • terrorism: terrorist content detection
  • live: undesirable scene detection
Note You must specify at least one of the textScenes and imageScenes parameters.
tasks JSONArray Yes The list of moderation tasks. The value is a JSON array that can contain one to five elements. Each element is a structure. For more information about the structure, see task.
returnHighlightHtml Boolean No false Specifies whether to highlight violations. Valid values:
  • true: highlights violations.
  • false: does not highlight violations. This is the default value.
Table 1. task
Parameter Type Required Example Description
dataId String No test4lNSMdggA0c56MMvfYoh4e-1mwxpx The ID of the moderation object.

The ID can contain letters, digits, underscores (_), hyphens (-), and periods (.) and can be up to 128 characters in length. This ID uniquely identifies your business data.

url String No http://www.test.html The URL of the web page. You can moderate web pages that use HTTP or HTTPS URLs.
Note You must specify one of the url and content parameters.
content String No <html>hello,world! </html> The HTML source of the web page, passed as plain text.
Note You must specify one of the url and content parameters.
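
The constraints in the two request tables can be checked before a request is sent. The following is a minimal sketch, not an official client: build_task and build_scan_request are hypothetical helper names, and the sketch assumes that "one of the url and content parameters" means exactly one per task. It only assembles the JSON body; signing and sending the request are covered by Request structure and are left out.

```python
import json
import re

# Allowed dataId characters and maximum length, as described in the task table.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,128}")


def build_task(url=None, content=None, data_id=None):
    """Build one element of the tasks array (hypothetical helper)."""
    # Assumption: exactly one of url and content is supplied per task.
    if (url is None) == (content is None):
        raise ValueError("specify exactly one of url and content")
    if data_id is not None and not DATA_ID_PATTERN.fullmatch(data_id):
        raise ValueError("dataId has invalid characters or is too long")
    task = {"url": url} if url is not None else {"content": content}
    if data_id is not None:
        task["dataId"] = data_id
    return task


def build_scan_request(tasks, text_scenes=None, image_scenes=None,
                       biz_type=None, return_highlight_html=False):
    """Assemble the JSON body for /green/webpage/scan (hypothetical helper)."""
    if not text_scenes and not image_scenes:
        raise ValueError("specify at least one of textScenes and imageScenes")
    if not 1 <= len(tasks) <= 5:
        raise ValueError("tasks must contain one to five elements")
    body = {"tasks": list(tasks)}
    if text_scenes:
        body["textScenes"] = list(text_scenes)
    if image_scenes:
        body["imageScenes"] = list(image_scenes)
    if biz_type is not None:
        body["bizType"] = biz_type
    if return_highlight_html:
        body["returnHighlightHtml"] = True
    return body


body = build_scan_request(
    [build_task(url="http://www.test.html",
                data_id="test4lNSMdggA0c56MMvfYoh4e-1mwxpx")],
    text_scenes=["antispam"],
    image_scenes=["porn"],
)
print(json.dumps(body, indent=4))
```

If the service accepts both url and content in one task, loosen the first check in build_task accordingly.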

Response parameters

Parameter Type Example Description
code Integer 200 The returned HTTP status code.

For more information, see Common response parameters.

msg String OK The message that is returned for the request.
taskId String wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T The ID of the moderation task.
dataId String test4lNSMdggA0c56MMvfYoh4e-1mwxpx The ID of the moderation object.
Note If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response.
suggestion String block The recommended subsequent operation for you to perform. Valid values:
  • pass: The moderation object does not require further actions.
  • review: The moderation object contains suspected violations and requires human review.
  • block: The moderation object contains violations. We recommend that you delete or block the object.
riskFrequency JSONObject { "porn":123, "terrorism":44} The categories of risky content detected on the moderated web page and how often each category was hit. The value is a set of key-value pairs. Each key is a category of moderation results, and each value is the number of moderation results in that category.

For more information about the sample categories of moderation results, see text labels and image labels.

textResults JSONArray The text moderation results.

This parameter is returned only if you specify the textScenes parameter. The value of the textResults parameter is a JSON array. For more information about the structure of each element in the array, see textResults.

imageResults JSONArray The image moderation results.

This parameter is returned only if you specify the imageScenes parameter. The value of the imageResults parameter is a JSON array. For more information about the structure of each element in the array, see imageResults.

highlightHtml String <html>xxx</html> The highlighted text in the HTML format.
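
As an illustration of how the suggestion and riskFrequency parameters might be consumed, the sketch below maps each suggestion value to a follow-up action and ranks the risk categories by frequency. The function name summarize_task and the action strings are hypothetical; only the three suggestion values and the key-value shape of riskFrequency come from the table above.

```python
def summarize_task(task_result):
    """Return (action, ranked_categories) for one task-level result."""
    # Hypothetical mapping from the documented suggestion values to actions.
    actions = {
        "pass": "publish",
        "review": "send to human review",
        "block": "take down",
    }
    suggestion = task_result.get("suggestion")
    # An unrecognized value is handled conservatively as a review case.
    action = actions.get(suggestion, "send to human review")
    frequency = task_result.get("riskFrequency") or {}
    # Most frequently hit categories first.
    ranked = sorted(frequency.items(), key=lambda kv: kv[1], reverse=True)
    return action, ranked


action, ranked = summarize_task(
    {"suggestion": "block", "riskFrequency": {"porn": 123, "terrorism": 44}}
)
print(action)   # take down
print(ranked)   # [('porn', 123), ('terrorism', 44)]
```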
Table 2. textResults
Parameter Type Example Description
code Integer 200 The returned HTTP status code.
msg String OK The message that is returned for the request.
dataId String test4lNSMdggA0c56MMvfYoh4e-1mwxpx The ID of the moderation object.
Note If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response.
taskId String wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0 The ID of the moderation task.
results JSONArray The moderation results. If the call is successful and the code is 200, the array contains one or more elements. Each element is a structure. For more information about the structure, see Table 3. result.
Table 3. result
Parameter Type Example Description
scene String antispam The moderation scenario of the moderated text on the web page, which is the same as the value of the textScenes parameter that you specify in the moderation request.
suggestion String block The recommended subsequent operation for you to perform. Valid values:
  • pass: The moderation object does not require further actions.
  • review: The moderation object contains suspected violations and requires human review.
  • block: The moderation object contains violations. We recommend that you delete or block the object.
label String politics The category of the moderation result for the moderated text. Valid values:
  • normal: normal
  • spam: junk content
  • ad: ad
  • politics: political content
  • terrorism: terrorist content
  • abuse: abuse
  • porn: pornographic content
  • flood: excessive junk content
  • contraband: prohibited content
  • meaningless: meaningless content
  • customized: custom content, such as a custom term
rate Float 99.91 The score of the confidence level. Valid values: 0 to 100. A greater value indicates a higher confidence level.
If a value of pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If a value of review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations.
Notice This score is for reference only. We strongly recommend that you do not use this score in your business. We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations.
details JSONArray The details of the risky content that the moderated text hits. A text entry can hit multiple pieces of risky content. For more information about the structure, see detail.
Table 4. detail
Parameter Type Example Description
label String politics The category of the risky content that the moderated text hits. Valid values:
  • spam: junk content
  • ad: ad
  • politics: political content
  • terrorism: terrorist content
  • abuse: abuse
  • porn: pornographic content
  • flood: excessive junk content
  • contraband: prohibited content
  • meaningless: meaningless content
  • customized: custom content, such as a custom term
contexts JSONArray The context information of the risky content that the moderated text hits. For more information about the structure, see context.
Table 5. context
Parameter Type Example Description
context String Part-time job The context of the risky content that the moderated text hits. If the text hits a term or text pattern in your custom text library, the term or text pattern is returned.
libName String Ad library 1 The name of the custom text library. This parameter is returned only if the moderated text hits a term or text pattern in the custom text library.
libCode String 12232 The code of the custom text library. This parameter is returned only if the moderated text hits a term or text pattern in the custom text library.
ruleType String content The behavior rule. This parameter is returned if the moderated text hits the behavior rule. Valid values:
  • user_id: the ID of the user.
  • ip: the IP address of the user.
  • content: the same text content.
  • similar_content: the similar text content.
  • imei: the International Mobile Equipment Identity (IMEI) of the device.
  • imsi: the International Mobile Subscriber Identity (IMSI) of the device.
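
Tables 2 through 5 describe a four-level nesting: textResults, results, details, and contexts. The sketch below flattens that nesting into one record per hit context. The name iter_text_hits is hypothetical, and the input is the textResults array from the sample success response in this topic; optional fields such as libName are read with a default because they are returned only for custom library hits.

```python
def iter_text_hits(text_results):
    """Yield one flat record per context hit in a textResults array."""
    for entry in text_results:
        for result in entry.get("results", []):
            for detail in result.get("details", []):
                for ctx in detail.get("contexts", []):
                    yield {
                        "taskId": entry.get("taskId"),
                        "scene": result.get("scene"),
                        "suggestion": result.get("suggestion"),
                        "label": detail.get("label"),
                        "context": ctx.get("context"),
                        # Present only when a custom text library is hit.
                        "libName": ctx.get("libName"),
                    }


# textResults taken from the sample success response in this topic.
sample_text_results = [
    {
        "msg": "OK",
        "code": 200,
        "results": [
            {
                "rate": 99.91,
                "suggestion": "block",
                "details": [
                    {
                        "contexts": [{"context": "xxxxx"}],
                        "label": "politics",
                    }
                ],
                "label": "politics",
                "scene": "antispam",
            }
        ],
        "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0",
    }
]

hits = list(iter_text_hits(sample_text_results))
for hit in hits:
    print(hit["label"], hit["context"], hit["suggestion"])
```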
Table 6. imageResults
Parameter Type Example Description
code Integer 200 The returned HTTP status code.

For more information, see Common response parameters.

msg String OK The returned message.
dataId String test4lNSMdggA0c56MMvfYoh4e-1mwxpx The ID of the moderation object.
Note If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response.
taskId String wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0 The ID of the moderation task.
url String http://xxxxx.jpg The URL of the moderation object.
results JSONArray The moderation results. If the call is successful and the code is 200, the array contains one or more elements. Each element is a structure. For more information about the structure, see Table 7. result.
Table 7. result
Parameter Type Example Description
scene String porn The moderation scenario of the moderated image, which you specify in the moderation request. Valid values:
  • porn: pornography detection
  • terrorism: terrorist content detection
  • ad: ad violation detection
  • live: undesirable scene detection
label String politics The category of the moderation result for the moderated image. Valid values vary based on the specified moderation scenario.
  • If the imageScenes parameter is set to porn, the valid values are:
    • normal: normal
    • sexy: sexy content
    • porn: pornographic content
  • If the imageScenes parameter is set to terrorism, the valid values are:
    • normal: normal
    • bloody: bloody content
    • explosion: explosion and smoke
    • outfit: special costume
    • logo: logo
    • weapon: weapon
    • politics: political content
    • violence: violence
    • crowd: crowd
    • parade: parade
    • carcrash: car accident
    • flag: flag
    • location: landmark
    • others: other specified content
  • If the imageScenes parameter is set to ad, the valid values are:
    • normal: normal
    • politics: political content in text
    • porn: pornographic content in text
    • abuse: abuse in text
    • terrorism: terrorist content in text
    • contraband: prohibited content in text
    • spam: junk content in text
    • npx: illegal ad
    • qrcode: QR code
    • programCode: mini program code
    • ad: other ads
    Note By default, only normal and ad can be returned. If you want to use other categories, submit a ticket.
  • If the imageScenes parameter is set to live, the valid values are:
    • normal: normal
    • meaningless: no content in the image
    • PIP: small picture
    • smoking: smoking content
    • drivelive: live broadcasting in a running vehicle
suggestion String block The recommended subsequent operation for you to perform. Valid values:
  • pass: The moderation object does not require further actions.
  • review: The moderation object contains suspected violations and requires human review.
  • block: The moderation object contains violations. We recommend that you delete or block the object.
rate Float 99.91 The score of the confidence level. Valid values: 0 to 100. A greater value indicates a higher confidence level.
If a value of pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If a value of review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations.
Notice This score is for reference only. We strongly recommend that you do not use this score in your business. We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations.
hintWordsInfo JSONArray [{"context":"Sensitive words"}] The terms that are hit by the ad or illegal text detected in the moderated image.
Note This parameter is applicable only to ad violation detection.
sfaceData JSONArray The information about the detected terrorist content in the moderated image. For more information about the structure, see Table 9.
Note This parameter is applicable only to terrorist content detection.
ocrData StringArray ["xxxxx", "yyyy"] The complete text that is detected in the moderated image.
Note By default, this parameter is not returned. If you want this parameter to be returned, submit a ticket.
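
The following sketch shows one way to collect the images that need attention from an imageResults array. It follows the Notice above and decides on suggestion rather than rate. The name flagged_images is hypothetical, the sample entry is made up from the example values in Tables 6 and 7, and skipping entries whose code is not 200 is an assumption about how failed entries should be treated.

```python
def flagged_images(image_results):
    """Return (url, scene, label, suggestion) for every non-pass result."""
    flagged = []
    for entry in image_results:
        # Assumption: entries with a non-200 code carry no usable results.
        if entry.get("code") != 200:
            continue
        for result in entry.get("results", []):
            if result.get("suggestion") in ("review", "block"):
                flagged.append((
                    entry.get("url"),
                    result.get("scene"),
                    result.get("label"),
                    result.get("suggestion"),
                ))
    return flagged


# Illustrative input assembled from the example values in the tables.
sample_image_results = [
    {
        "code": 200,
        "msg": "OK",
        "url": "http://xxxxx.jpg",
        "results": [
            {"scene": "porn", "label": "porn",
             "suggestion": "block", "rate": 99.91},
            {"scene": "ad", "label": "normal",
             "suggestion": "pass", "rate": 99.91},
        ],
    }
]

flagged = flagged_images(sample_image_results)
print(flagged)   # [('http://xxxxx.jpg', 'porn', 'porn', 'block')]
```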
Table 8. frame
Parameter Type Example Description
rate Float 99.91 The score of the confidence level. Valid values: 0 to 100. A higher confidence level indicates higher reliability of the moderation result. We recommend that you do not use this score in your business.
url String http://www.test.html The temporary access URL of the captured frame. The URL is valid for 5 minutes.
Table 9. sfaceData
Parameter Type Description
x Float The distance between the upper-left corner of the face area and the y-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels.
y Float The distance between the upper-left corner of the face area and the x-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels.
w Float The width of the face area. Unit: pixels.
h Float The height of the face area. Unit: pixels.
faces JSONArray The information about the recognized faces. Each element of the array contains the following fields:
  • name: the name of the recognized face. The value is a string.
  • rate: the score of the confidence level. Valid values: 0 to 100. A greater value indicates a higher confidence level. A higher confidence level indicates higher reliability of the facial recognition result. The value is a floating-point number.
  • id: the ID of the recognized face. The value is a string.
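
Because x and y locate the upper-left corner of the face area and w and h give its size, the bounding box follows by simple addition. The sketch below converts one sfaceData element to (left, top, right, bottom) pixel coordinates and picks the face entry with the highest rate. The function names and all sample values are made up for illustration.

```python
def face_box(sface):
    """Convert x, y, w, h to (left, top, right, bottom) in pixels."""
    left = sface["x"]   # horizontal offset from the image's left edge
    top = sface["y"]    # vertical offset from the image's top edge
    return (left, top, left + sface["w"], top + sface["h"])


def best_face(sface):
    """Return the faces entry with the highest rate, or None if empty."""
    faces = sface.get("faces") or []
    return max(faces, key=lambda face: face.get("rate", 0.0), default=None)


# Made-up element; field names follow Table 9.
example = {
    "x": 10.0, "y": 20.0, "w": 30.0, "h": 40.0,
    "faces": [
        {"name": "person-a", "rate": 91.5, "id": "id-1"},
        {"name": "person-b", "rate": 62.0, "id": "id-2"},
    ],
}

print(face_box(example))            # (10.0, 20.0, 40.0, 60.0)
print(best_face(example)["name"])   # person-a
```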

Examples

Sample requests
{
    "textScenes": [
        "antispam"
    ],
    "imageScenes": [
        "porn"
    ],
    "tasks": [
        {
            "dataId": "test4lNSMdggA0c56MMvfYoh4e-1mwxpx",
            "url": "http://www.test.html"
        }
    ]
}
Sample success responses
{
    "msg": "OK",
    "code": 200,
    "data": [
        {
            "msg": "OK",
            "code": 200,
            "textResults": [
                {
                    "msg": "OK",
                    "code": 200,
                    "results": [
                        {
                            "rate": 99.91,
                            "suggestion": "block",
                            "details": [
                                {
                                    "contexts": [
                                        {
                                            "context": "xxxxx",
                                            "positions": [
                                                {
                                                    "startPos": 242616,
                                                    "endPos": 242624
                                                }
                                            ]
                                        }
                                    ],
                                    "label": "politics"
                                }
                            ],
                            "label": "politics",
                            "scene": "antispam"
                        }
                    ],
                    "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0"
                }
            ],
            "riskFrequency": {
                "politics": 1
            },
            "suggestion": "block",
            "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T"
        }
    ],
    "requestId": "B8C1C6BF-0D0A-4317-967E-2DC738CDEAEA"
}
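
The sample response carries a code at the top level and another inside each element of data. The sketch below checks both levels before any result is used. It assumes that 200 indicates success at each level, as the sample shows; the authoritative list of codes is in Common response parameters. The name split_task_results is hypothetical, and the input is a trimmed copy of the sample above.

```python
def split_task_results(response):
    """Split the data array into successful and failed task results."""
    # Assumption: a top-level code other than 200 means the request failed.
    if response.get("code") != 200:
        raise RuntimeError(
            f"request failed: {response.get('code')} {response.get('msg')}"
        )
    succeeded, failed = [], []
    for item in response.get("data", []):
        if item.get("code") == 200:
            succeeded.append(item)
        else:
            failed.append(item)
    return succeeded, failed


# Trimmed copy of the sample success response.
sample_response = {
    "msg": "OK",
    "code": 200,
    "data": [
        {
            "msg": "OK",
            "code": 200,
            "riskFrequency": {"politics": 1},
            "suggestion": "block",
            "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T",
        }
    ],
    "requestId": "B8C1C6BF-0D0A-4317-967E-2DC738CDEAEA",
}

succeeded, failed = split_task_results(sample_response)
for item in succeeded:
    print(item["taskId"], item["suggestion"], item["riskFrequency"])
```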