This topic describes the /green/webpage/scan operation that you can call to moderate web pages and obtain moderation results in real time. Web page moderation helps you detect image and text violations on a web page, and returns moderation results and the categories of moderation results. You can call this operation to moderate web pages that use HTTP or HTTPS URLs.
Operation description
Operation: /green/webpage/scan
You can call this operation to submit web page moderation tasks and obtain moderation results in real time. For more information about how to construct an HTTP request, see Request structure. Alternatively, you can use an SDK instead of constructing the HTTP request yourself. For more information, see SDK overview.
Billing method:
You are charged for calling this operation. For more information about the billing method, see Content Moderation Pricing.
Request parameters
Parameter | Type | Required | Example | Description |
---|---|---|---|---|
bizType | String | No | default | The business scenario. You can create a business scenario in the Alibaba Cloud Content Moderation console. For more information, see Customize policies for machine-assisted moderation. You can also submit a ticket to ask Alibaba Cloud engineers to help you create a business scenario. |
textScenes | StringArray | No | ["antispam"] | The moderation scenario for the text on a web page. Set the value to antispam. Note: You must specify at least one of the textScenes and imageScenes parameters. |
imageScenes | StringArray | No | ["porn","ad"] | The moderation scenario of the images to be moderated on a web page. Valid values:
Note You must specify at least one of the textScenes and imageScenes parameters.
|
tasks | JSONArray | Yes | | The list of moderation tasks. The value is a JSON array that can contain one to five elements. Each element is a structure. For more information about the structure, see task. |
returnHighlightHtml | Boolean | No | false | Specifies whether to highlight violations. Valid values:
|
Parameter | Type | Required | Example | Description |
---|---|---|---|---|
dataId | String | No | test4lNSMdggA0c56MMvfYoh4e-1mwxpx | The ID of the moderation object. The ID can contain letters, digits, underscores (_), hyphens (-), and periods (.), and can be up to 128 characters in length. This ID uniquely identifies your business data. |
url | String | No | http://www.test.html | The URL of the web page. You can moderate web pages that use HTTP or HTTPS URLs. Note: You must specify one of the url and content parameters. |
content | String | No | <html>hello,world! </html> | The HTML source of the web page, passed as plain text. Note: You must specify one of the url and content parameters. |
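The constraints in the two request tables above can be checked on the client before a request is sent. The following Python sketch is illustrative only. The function name `build_scan_body` is hypothetical, the check treats "one of the url and content parameters" as "at least one", and it assumes that "letters" in the dataId rule means ASCII letters. It builds only the JSON body. Signing and sending the request are out of scope here (see Request structure).

```python
import json
import re

# dataId rule from the task table: letters, digits, "_", "-", ".", up to 128 characters.
# Assumption: "letters" means ASCII letters.
_DATA_ID_RE = re.compile(r"[A-Za-z0-9_.-]{1,128}")


def build_scan_body(tasks, text_scenes=None, image_scenes=None,
                    biz_type=None, return_highlight_html=None):
    """Build a request body for /green/webpage/scan and check documented limits."""
    if not text_scenes and not image_scenes:
        raise ValueError("specify at least one of textScenes and imageScenes")
    if not 1 <= len(tasks) <= 5:
        raise ValueError("tasks must contain one to five elements")
    for task in tasks:
        # Assumption: a task is acceptable when url or content is present.
        if not task.get("url") and not task.get("content"):
            raise ValueError("each task needs url or content")
        data_id = task.get("dataId")
        if data_id is not None and not _DATA_ID_RE.fullmatch(data_id):
            raise ValueError(f"invalid dataId: {data_id!r}")

    body = {"tasks": list(tasks)}
    if text_scenes:
        body["textScenes"] = list(text_scenes)
    if image_scenes:
        body["imageScenes"] = list(image_scenes)
    if biz_type is not None:
        body["bizType"] = biz_type
    if return_highlight_html is not None:
        body["returnHighlightHtml"] = bool(return_highlight_html)
    return body


if __name__ == "__main__":
    body = build_scan_body(
        tasks=[{"dataId": "test4lNSMdggA0c56MMvfYoh4e-1mwxpx",
                "url": "http://www.test.html"}],
        text_scenes=["antispam"],
        image_scenes=["porn"],
    )
    print(json.dumps(body, indent=2))
```

Failing early on the client avoids spending a billed call on a request that does not meet the documented parameter rules.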
Response parameters
Parameter | Type | Example | Description |
---|---|---|---|
code | Integer | 200 | The returned HTTP status code. For more information, see Common response parameters. |
msg | String | OK | The message that is returned for the request. |
taskId | String | wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T | The ID of the moderation task. |
dataId | String | test4lNSMdggA0c56MMvfYoh4e-1mwxpx | The ID of the moderation object. Note: If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response. |
suggestion | String | block | The recommended subsequent operation for you to perform. Valid values:
|
riskFrequency | JSONObject | { "porn":123, "terrorism":44} | The categories of risky content detected on the web page and how often each category was hit. The value consists of key-value pairs. Each key is a moderation result category, and each value is the number of moderation results in that category. For more information about the sample categories of moderation results, see text labels and image labels. |
textResults | JSONArray | | The text moderation results. This parameter is returned only if you specify the textScenes parameter. The value is a JSON array. For more information about the structure of each element in the array, see textResults. |
imageResults | JSONArray | | The image moderation results. This parameter is returned only if you specify the imageScenes parameter. The value is a JSON array. For more information about the structure of each element in the array, see imageResults. |
highlightHtml | String | <html>xxx</html> | The highlighted text in the HTML format. |
Parameter | Type | Example | Description |
---|---|---|---|
code | Integer | 200 | The returned HTTP status code. |
msg | String | OK | The message that is returned for the request. |
dataId | String | test4lNSMdggA0c56MMvfYoh4e-1mwxpx | The ID of the moderation object. Note: If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response. |
taskId | String | wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0 | The ID of the moderation task. |
results | JSONArray | | The moderation results. If the call is successful and HTTP status code 200 is returned, the array contains one or more elements. Each element is a structure. For more information about the structure, see result. |
Parameter | Type | Example | Description |
---|---|---|---|
scene | String | antispam | The moderation scenario of the moderated text on the web page, which is the same as the value of the textScenes parameter that you specify in the moderation request. |
suggestion | String | block | The recommended subsequent operation for you to perform. Valid values:
|
label | String | politics | The category of the moderation result for the moderated text. Valid values:
|
rate | Float | 99.91 | The confidence score. Valid values: 0 to 100. A greater value indicates a higher confidence level. If pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations. Notice: This score is for reference only. We strongly recommend that you do not use this score in your business. We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations. |
details | JSONArray | | The details of the risky content that the moderated text hits. A text entry can hit multiple pieces of risky content. For more information about the structure, see detail. |
Parameter | Type | Example | Description |
---|---|---|---|
label | String | politics | The category of the risky content that the moderated text hits. Valid values:
|
contexts | JSONArray | | The context information of the risky content that the moderated text hits. For more information about the structure, see context. |
Parameter | Type | Example | Description |
---|---|---|---|
context | String | Part-time job | The context of the risky content that the moderated text hits. If the text hits a term or text pattern in your custom text library, the term or text pattern is returned. |
libName | String | Ad library 1 | The name of the custom text library. This parameter is returned if the moderated text hits a term or text pattern in the custom text library. |
libCode | String | 12232 | The code of the custom text library. This parameter is returned if the moderated text hits a term or text pattern in the custom text library. |
ruleType | String | content | The behavior rule. This parameter is returned if the moderated text hits the behavior
rule. Valid values:
|
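The nesting described by the textResults, result, detail, and context tables (textResults, then results, then details, then contexts) can be flattened into one row per hit. The following Python sketch shows one way to do that. The function name `collect_text_hits` is hypothetical, and skipping entries whose code is not 200 is an assumption rather than documented behavior.

```python
def collect_text_hits(text_results):
    """Flatten textResults into one dict per hit context."""
    hits = []
    for item in text_results or []:
        # Assumption: only entries with code 200 carry usable results.
        if item.get("code") != 200:
            continue
        for result in item.get("results", []):
            for detail in result.get("details", []):
                for ctx in detail.get("contexts", []):
                    hits.append({
                        "taskId": item.get("taskId"),
                        "scene": result.get("scene"),
                        "suggestion": result.get("suggestion"),
                        "label": detail.get("label"),
                        "context": ctx.get("context"),
                        # libName and libCode are present only for custom library hits.
                        "libName": ctx.get("libName"),
                        "libCode": ctx.get("libCode"),
                    })
    return hits


if __name__ == "__main__":
    sample = [{
        "code": 200,
        "msg": "OK",
        "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0",
        "results": [{
            "scene": "antispam",
            "suggestion": "block",
            "label": "politics",
            "rate": 99.91,
            "details": [{
                "label": "politics",
                "contexts": [{"context": "xxxxx"}],
            }],
        }],
    }]
    for hit in collect_text_hits(sample):
        print(hit)
```

The sketch deliberately ignores rate, in line with the recommendation to rely on suggestion and label instead of the confidence score.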
Parameter | Type | Example | Description |
---|---|---|---|
code | Integer | 200 | The returned HTTP status code. For more information, see Common response parameters. |
msg | String | OK | The returned message. |
dataId | String | test4lNSMdggA0c56MMvfYoh4e-1mwxpx | The ID of the moderation object. Note: If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response. |
taskId | String | wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0 | The ID of the moderation task. |
url | String | http://xxxxx.jpg | The URL of the moderation object. |
results | JSONArray | | The moderation results. If the call is successful and HTTP status code 200 is returned, the array contains one or more elements. Each element is a structure. For more information about the structure, see result. |
Parameter | Type | Example | Description |
---|---|---|---|
scene | String | porn | The moderation scenario of the moderated image, which you specify in the moderation
request. Valid values:
|
label | String | politics | The category of the moderation result for the moderated image. Valid values vary based on the specified moderation scenario. |
suggestion | String | block | The recommended subsequent operation for you to perform. Valid values:
|
rate | Float | 99.91 | The confidence score. Valid values: 0 to 100. A greater value indicates a higher confidence level. If pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations. Notice: This score is for reference only. We strongly recommend that you do not use this score in your business. We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations. |
hintWordsInfo | JSONArray | [{"context":"Sensitive words"}] | The terms that are hit by the ad or illegal text detected in the moderated image. Note: This parameter is applicable only to ad violation detection. |
sfaceData | JSONArray | | The information about the detected terrorist content in the moderated image. For more information about the structure, see Table 9. Note: This parameter is applicable only to terrorist content detection. |
ocrData | StringArray | ["xxxxx", "yyyy"] | The complete text detected in the moderated image. Note: By default, this parameter is not returned. If you want this parameter to be returned, submit a ticket. |
Parameter | Type | Example | Description |
---|---|---|---|
rate | Float | 99.91 | The score of the confidence level. Valid values: 0 to 100. A higher confidence level indicates higher reliability of the moderation result. We recommend that you do not use this score in your business. |
url | String | http://www.test.html | The temporary access URL of the truncated frame. The URL is valid for 5 minutes. |
Parameter | Type | Description |
---|---|---|
x | Float | The distance between the upper-left corner of the face area and the y-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels. |
y | Float | The distance between the upper-left corner of the face area and the x-axis, with the upper-left corner of the image being the coordinate origin. Unit: pixels. |
w | Float | The width of the face area. Unit: pixels. |
h | Float | The height of the face area. Unit: pixels. |
faces | JSONArray | The information about the recognized face. The array contains the following parameters:
|
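The x, y, w, and h fields describe a face area by its upper-left corner and size, with the image's upper-left corner as the origin. The following Python sketch converts such an area into corner coordinates, which many image libraries expect for cropping or drawing. The function name `face_box_corners` is hypothetical. Because this section does not spell out how the fields are nested inside sfaceData, the function takes a single dict that already holds the four fields.

```python
def face_box_corners(face):
    """Return (left, top, right, bottom) in pixels for a face area.

    x is the horizontal offset of the upper-left corner (distance from the y-axis),
    y is the vertical offset (distance from the x-axis), w and h are width and height.
    """
    left = face["x"]
    top = face["y"]
    return (left, top, left + face["w"], top + face["h"])


if __name__ == "__main__":
    # Illustrative values only, not taken from a real response.
    print(face_box_corners({"x": 10.0, "y": 20.0, "w": 30.0, "h": 40.0}))
```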
Examples
Sample request
{
  "textScenes": [
    "antispam"
  ],
  "imageScenes": [
    "porn"
  ],
  "tasks": [
    {
      "dataId": "test4lNSMdggA0c56MMvfYoh4e-1mwxpx",
      "url": "http://www.test.html"
    }
  ]
}
Sample success response
{
  "msg": "OK",
  "code": 200,
  "data": [
    {
      "msg": "OK",
      "code": 200,
      "textResults": [
        {
          "msg": "OK",
          "code": 200,
          "results": [
            {
              "rate": 99.91,
              "suggestion": "block",
              "details": [
                {
                  "contexts": [
                    {
                      "context": "xxxxx",
                      "positions": [
                        {
                          "startPos": 242616,
                          "endPos": 242624
                        }
                      ]
                    }
                  ],
                  "label": "politics"
                }
              ],
              "label": "politics",
              "scene": "antispam"
            }
          ],
          "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T.txt-0"
        }
      ],
      "riskFrequency": {
        "politics": 1
      },
      "suggestion": "block",
      "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T"
    }
  ],
  "requestId": "B8C1C6BF-0D0A-4317-967E-2DC738CDEAEA"
}
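A response shaped like the sample above can be reduced to one decision per web page by reading the page-level suggestion and riskFrequency fields. The following Python sketch is an illustration under assumptions: the function name `page_actions` is hypothetical, a page is keyed by dataId when present and by taskId otherwise, and a non-200 code at either level is treated as an error. The data wrapper is taken from the sample response.

```python
import json


def page_actions(response):
    """Map each moderated page to its suggestion and riskFrequency."""
    if response.get("code") != 200:
        raise RuntimeError(f"request failed: {response.get('code')} {response.get('msg')}")
    actions = {}
    for page in response.get("data", []):
        # Assumption: fall back to taskId when dataId is not echoed back.
        key = page.get("dataId") or page.get("taskId")
        if page.get("code") != 200:
            actions[key] = {"suggestion": None, "error": page.get("msg")}
            continue
        actions[key] = {
            "suggestion": page.get("suggestion"),
            "riskFrequency": page.get("riskFrequency", {}),
        }
    return actions


if __name__ == "__main__":
    # Trimmed copy of the sample response; textResults is omitted for brevity.
    raw = """
    {
      "msg": "OK",
      "code": 200,
      "data": [
        {
          "msg": "OK",
          "code": 200,
          "riskFrequency": {"politics": 1},
          "suggestion": "block",
          "taskId": "wp5$7n$hD74qu4CrNWZlR7Sr-1ttC3T"
        }
      ],
      "requestId": "B8C1C6BF-0D0A-4317-967E-2DC738CDEAEA"
    }
    """
    for key, action in page_actions(json.loads(raw)).items():
        print(key, action)
```

The decision is based on suggestion rather than rate, which matches the notice in the result tables that the score is for reference only.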