Submits asynchronous video moderation tasks and retrieves results using the Content Moderation API. Supported scenarios include pornography detection, terrorism and political content detection, ad and text violation detection, undesirable scene detection, logo detection, and audio violation detection.
Submit an asynchronous video moderation task
Endpoint: POST /green/video/asyncscan
Submits one or more video moderation tasks for asynchronous processing. Results are not returned immediately — retrieve them by polling or by configuring a callback URL.
For HTTP request construction, see Request structure. To use a pre-built client, see SDK overview.
Billing
Charges apply per moderation scenario. For multi-scenario detection, fees accumulate — each scenario is charged by multiplying the number of moderated video frames by the unit price for that scenario. If audio detection is also enabled, an additional fee applies: video duration × unit price for audio violation detection.
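The billing rule above can be sketched as a small calculation. This is only an illustration: the function name, the prices, and the unit used for audio duration are assumptions, not published rates.

```python
def estimate_cost(frame_count, scene_unit_prices, audio_duration=0.0, audio_unit_price=0.0):
    """Estimate the fee for one video under the billing rule described above.

    scene_unit_prices maps each enabled scenario to a hypothetical per-frame price.
    audio_duration must use the same time unit that the audio price is quoted in.
    """
    # Each scenario is charged separately: moderated frames x unit price.
    video_fee = sum(frame_count * price for price in scene_unit_prices.values())
    # Audio detection adds: video duration x audio unit price.
    audio_fee = audio_duration * audio_unit_price
    return video_fee + audio_fee


# Hypothetical prices, for illustration only.
prices = {"porn": 0.001, "terrorism": 0.001}
total = estimate_cost(200, prices, audio_duration=3.0, audio_unit_price=0.01)
print(round(total, 4))
```

Because fees accumulate per scenario, adding a scenario to scenes increases the video portion of the fee in proportion to the number of moderated frames.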
Detection objects
Moderation supports video files and video streams:
Video files: Submit a sequence of video frame images, or provide a video URL.
Video streams: Provide a stream URL using a supported protocol.
Retrieve results
Asynchronous moderation results are not returned in the response. Use one of two methods to retrieve them:
Callback (recommended): Include the callback parameter in the request. AI Guardrails pushes results to your endpoint automatically when detection is complete.
Polling: Omit the callback parameter, then call /green/video/results periodically to retrieve results. Query at least 30 seconds after submitting the task. Results are retained for up to 4 hours.
Video requirements
Video files:
URL protocol: HTTP or HTTPS
Supported formats: AVI, FLV, MP4, MPG, ASF, WMV, MOV, WMA, RMVB, RM, FLASH, TS
Maximum file size: 200 MB. For files larger than 200 MB, segment the video before submitting. To increase the size limit, contact technical support via DingTalk Group (group number: 35573806).
Use a stable storage service. Object Storage Service (OSS) is recommended.
Video streams:
Supported protocols: RTMP, HLS, HTTP-FLV, RTSP
Maximum stream duration per task: 24 hours. The task ends automatically after 24 hours.
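A client-side pre-check for video files could look like the following sketch. It only mirrors the file requirements listed above. The function name is hypothetical, and inferring the format from the URL file extension and treating 1 MB as 1024 × 1024 bytes are assumptions.

```python
from urllib.parse import urlparse

# Values copied from the video file requirements above.
FILE_FORMATS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "wma", "rmvb", "rm", "flash", "ts"}
MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_video_file(url, size_bytes):
    """Return a list of problems; an empty list means the file passes these checks."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL protocol must be HTTP or HTTPS")
    # Assumption: the format can be inferred from the file extension in the URL path.
    name = parsed.path.rsplit("/", 1)[-1]
    extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
    if extension not in FILE_FORMATS:
        problems.append(f"unsupported or unknown format: {extension or 'none'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 200 MB; segment it before submitting")
    return problems


print(check_video_file("https://example.com/video/a.mp4", 50 * 1024 * 1024))
```

A check like this only catches obvious mismatches before a request is sent; the service remains the authority on what it accepts.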
Detection scenarios
| scenes value | Scenario | Result labels |
|---|---|---|
| porn | Pornography detection | Normal, Pornographic |
| terrorism | Terrorism and political content detection | Normal, Terrorism and political content |
| live | Undesirable scene detection | Normal, Undesirable scene (e.g., black screen or white screen) |
| logo | Logo detection | Normal, Logo |
| ad | Ad and text violation detection | Normal, Ad or text violation |
For audio detection, use the audioScenes parameter with value antispam. Audio detection is supported only by this asynchronous operation and requires the url parameter (not frames). Result labels: Normal, Spam, Ad, Political, Terrorism, Abuse, Pornographic, Flooding, Contraband, Custom.
The default language for audio detection is Chinese. To detect English audio content, contact your account manager.
QPS limits
| Limit | Value |
|---|---|
| Maximum calls per second | 50 |
| Maximum concurrent moderation tasks | 20 |
To increase the concurrent task limit, contact your business manager.
If you do not need real-time moderation, enable offline moderation. In offline moderation mode, the system starts processing within 24 hours after you submit the task.
Request parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
bizType | String | No | default | The business scenario. Create business scenarios in the Content Moderation console. For more information, see Customize policies for machine-assisted moderation. |
live | Boolean | No | false | Specifies whether the detection target is a live stream. Valid values: false (default) — detects a video-on-demand (VOD) file; true — detects a live stream. |
offline | Boolean | No | false | Specifies whether to use offline moderation mode. Valid values: false (default) — real-time mode, requests exceeding the concurrency limit are rejected; true — offline mode, tasks are queued and processed within 24 hours. Applies to video files only. |
scenes | StringArray | Yes | ["porn"] | The detection scenarios. Valid values: porn, terrorism, live, logo, ad. |
audioScenes | StringArray | No | ["antispam"] | The audio detection scenario. The only valid value is antispam. If omitted, only video frames are analyzed. Requires the url parameter in the task — not compatible with frames. |
callback | String | No | https://example.com/callback | The callback URL for receiving moderation results. Supports HTTP and HTTPS. The endpoint must support POST requests, UTF-8 encoding, and the checksum and content parameters. If omitted, poll /green/video/results for results. |
seed | String | No | abc**** | A random string used to generate a signature for callback notification requests. Up to 64 characters; can contain letters, digits, and underscores (_). Required when callback is set. |
cryptType | String | No | SHA256 | The encryption algorithm for callback notification content. Valid values: SHA256 (HMAC-SHA256, default); SM3 (HMAC-SM3, returns a lowercase hexadecimal string — for example, encrypting abc returns 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0). |
tasks | JSONArray | Yes | — | The detection tasks. Maximum 100 tasks per request. Each element is a task object — see the task parameters table below. To submit 100 tasks, increase the concurrent task limit above 100. |
Task parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
clientInfo | JSONObject | No | {"userId":"12023****","userNick":"Mike","userType":"others"} | The client information. For the structure, see the "Common request parameters" section in Common parameters. Takes priority over the global clientInfo parameter. |
dataId | String | No | videoId**** | The ID of the moderation object. Up to 128 characters; can contain letters, digits, underscores (_), hyphens (-), and periods (.). |
liveId | String | No | liveId**** | The ID of the live stream. Used to prevent duplicate detection tasks. If provided, AI Guardrails checks for an existing task by matching UID, bizType, and liveId. If a matching task is in progress, the existing taskId is returned and no new task is started. |
url | String | No | http://example.com/a.flv | The HTTP or HTTPS URL of the video. Up to 2,048 characters. Required when frames is not set. |
frames | JSONArray | No | — | The video frames to analyze. Each element is a frame object — see the frame parameters table below. Required when url is not set. |
framePrefix | String | No | http://example.com/video/ | The URL prefix for frame URLs. The full frame URL is constructed as framePrefix + frame.url. |
interval | Integer | No | 1 | The interval between frame captures. Unit: seconds. Valid values: 1–600. Default value: 1. |
maxFrames | Integer | No | 200 | The maximum number of frames to capture. Valid values: 5–3600. Default value: 200. Applies to video files (live=false) only — has no effect on live streams. With an OSS URL (oss://) and ApsaraVideo Media Processing (MPS) authorization, up to 20,000 frames can be captured at no additional cost. To increase the default limit, open a ticket via Support and Services. |
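
The constraints in the request and task parameter tables can be checked before a request is sent. The following is a minimal sketch under stated assumptions: build_asyncscan_body is a hypothetical helper, not part of any SDK, and it covers only the rules spelled out in the tables above.

```python
def build_asyncscan_body(scenes, tasks, audio_scenes=None, callback=None, seed=None, live=False):
    """Assemble a request body for /green/video/asyncscan after basic validation."""
    if not scenes:
        raise ValueError("scenes is required")
    if not 1 <= len(tasks) <= 100:
        raise ValueError("tasks must contain 1 to 100 items")
    if callback and not seed:
        raise ValueError("seed is required when callback is set")
    for task in tasks:
        has_url, has_frames = "url" in task, "frames" in task
        if not (has_url or has_frames):
            raise ValueError("each task needs url or frames")
        # Audio detection requires url and is not compatible with frames.
        if audio_scenes and (has_frames or not has_url):
            raise ValueError("audioScenes requires url and cannot be used with frames")
        if not 1 <= task.get("interval", 1) <= 600:
            raise ValueError("interval must be between 1 and 600 seconds")
        if not 5 <= task.get("maxFrames", 200) <= 3600:
            raise ValueError("maxFrames must be between 5 and 3600")
    body = {"scenes": list(scenes), "live": live, "tasks": list(tasks)}
    if audio_scenes:
        body["audioScenes"] = list(audio_scenes)
    if callback:
        body["callback"] = callback
        body["seed"] = seed
    return body


body = build_asyncscan_body(
    ["porn"],
    [{"dataId": "video-1", "url": "http://example.com/a.mp4", "interval": 1, "maxFrames": 200}],
    audio_scenes=["antispam"],
)
print(body)
```

Signing and sending the request is left to the HTTP layer or SDK described in Request structure and SDK overview.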
Frame parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
url | String | No | http://example.com/0B860000586C0A0300038A0460000 | The URL of the video frame. Combined with framePrefix to form the full URL: framePrefix + frame.url. |
offset | Integer | No | 10 | The timestamp of the frame relative to the start of the video. Unit: seconds. |
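
The framePrefix + frame.url rule can be illustrated with a short sketch. The helper name and the sample values are hypothetical.

```python
def full_frame_urls(task):
    """Return (offset, full URL) pairs for a task, applying framePrefix + frame.url."""
    prefix = task.get("framePrefix", "")
    return [(frame.get("offset"), prefix + frame["url"]) for frame in task.get("frames", [])]


task = {
    "framePrefix": "http://example.com/video/",
    "frames": [{"offset": 10, "url": "frame-0001.jpg"}, {"offset": 20, "url": "frame-0002.jpg"}],
}
for offset, url in full_frame_urls(task):
    print(offset, url)
```

The concatenation is plain string joining, so the prefix needs its own trailing slash when the frame URLs are relative file names.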
Callback notifications
When callback is set, AI Guardrails sends a POST request to the callback URL with two parameters:
`checksum`: A signature string in UID + Seed + Content format, generated using the algorithm specified by cryptType. UID must be your Alibaba Cloud account ID, not a RAM user ID. To verify the callback, generate the same string on your server and compare it to the received checksum. Obtain your account ID from the Alibaba Cloud Management Console.
`content`: A JSON-formatted string containing moderation results. The structure matches the success response from the result query operation.
AI Guardrails retries failed callback deliveries up to 16 times. After 16 attempts, the callback is abandoned. If callbacks are not received, check the status of the callback URL.
Return HTTP 200 from your server to confirm successful receipt.
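Callback verification could be sketched as follows for the default SHA256 setting. This sketch assumes the checksum is the lowercase hexadecimal SHA-256 digest of the UTF-8 string UID + Seed + Content. The text above also mentions HMAC-SHA256, so confirm the exact construction against a real callback before relying on it. SM3 is not covered, and the function names are hypothetical.

```python
import hashlib
import hmac


def expected_checksum(uid, seed, content):
    """Assumed construction: SHA-256 hex digest of uid + seed + content (UTF-8)."""
    return hashlib.sha256((uid + seed + content).encode("utf-8")).hexdigest()


def verify_callback(uid, seed, content, checksum):
    """Compare the received checksum with the locally generated one in constant time."""
    return hmac.compare_digest(expected_checksum(uid, seed, content), checksum)


# Illustrative values only.
uid, seed = "1234567890", "abc_123"
content = '{"code": 200, "taskId": "example-task"}'
received = expected_checksum(uid, seed, content)
print(verify_callback(uid, seed, content, received))
print(verify_callback(uid, seed, content + " ", received))
```

Use the raw content string exactly as received when computing the digest; re-serializing the JSON first can change whitespace or key order and break the comparison.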
Response parameters
| Name | Type | Example | Description |
|---|---|---|---|
taskId | String | taskId**** | The ID of the detection task. Use this ID to query results. |
dataId | String | videoId**** | The ID of the moderation object, echoed from the request. |
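
A submit response can be reduced to a dataId-to-taskId mapping for later queries. The sketch below assumes the response has already been parsed from JSON into a dict with the shape shown in the response example of this section. The helper name is hypothetical.

```python
def task_ids_by_data_id(response):
    """Map each dataId to its taskId from a parsed submit response."""
    if response.get("code") != 200:
        raise RuntimeError(f"submit failed: {response.get('code')} {response.get('msg')}")
    return {
        item.get("dataId"): item["taskId"]
        for item in response.get("data", [])
        if "taskId" in item
    }


response = {
    "code": 200,
    "msg": "OK",
    "requestId": "example-request",
    "data": [{"dataId": "video-1", "taskId": "task-1"}],
}
print(task_ids_by_data_id(response))
```

Items without a taskId are skipped, so comparing the mapping against the submitted dataId values shows which tasks were not accepted.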
Examples
Request: submit video frame images
POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>
{
"scenes": ["porn"],
"tasks": [
{
"dataId": "videoId****",
"frames": [
{"offset": 10, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460000"},
{"offset": 20, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460001"},
{"offset": 30, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460002"},
{"offset": 40, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460003"},
{"offset": 50, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460003"},
{"offset": 60, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A046000x"}
]
}
]
}
Request: submit a video file with audio detection
POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>
{
"scenes": ["porn"],
"audioScenes": ["antispam"],
"tasks": [
{
"dataId": "videoId****",
"url": "http://www.aliyundoc.com/a.mp4",
"interval": 1,
"maxFrames": 200
}
]
}
Request: submit a live video stream
POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>
{
"scenes": ["porn"],
"live": true,
"tasks": [
{
"dataId": "videoId****",
"url": "http://www.aliyundoc.com/a.flv",
"interval": 1,
"maxFrames": 200
}
]
}
Response
{
"code": 200,
"msg": "OK",
"requestId": "requestID****",
"data": [
{
"dataId": "videoId****",
"taskId": "taskId****"
}
]
}
Query asynchronous video moderation results
Endpoint: POST /green/video/results
Queries the results of one or more asynchronous video moderation tasks submitted via /green/video/asyncscan.
For HTTP request construction, see Request structure. To use a pre-built client, see SDK overview.
Billing
This operation is free of charge.
Query timing
Query at least 30 seconds after submitting an async moderation request. Results are retained for up to 4 hours; after that, they are deleted.
For live stream tasks, a response code of 280 means detection is in progress. A code of 200 means detection is complete. When a task is in progress, results reflect findings from the start of detection to the current time.
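The timing guidance above could be turned into a polling loop along these lines. The sketch assumes a caller-supplied query function that posts the task ID list to /green/video/results and returns the parsed JSON. The retry interval and attempt count are illustrative choices, not documented values.

```python
import time


def poll_results(task_ids, query, first_delay=30.0, interval=10.0, max_attempts=20, sleep=time.sleep):
    """Poll until every task reports code 200 or the attempts run out.

    Returns (finished, pending): finished maps taskId to its result item,
    pending lists the task IDs that never reached code 200.
    """
    pending = list(task_ids)
    finished = {}
    sleep(first_delay)  # wait at least 30 seconds after submitting
    for _ in range(max_attempts):
        response = query(pending)
        for item in response.get("data", []):
            # 200 means detection is complete; 280 (in progress) and any
            # other code leave the task in the pending list.
            if item.get("code") == 200:
                finished[item["taskId"]] = item
        pending = [task_id for task_id in pending if task_id not in finished]
        if not pending:
            break
        sleep(interval)
    return finished, pending
```

Passing sleep as a parameter keeps the loop testable without real delays. With the defaults shown, the loop gives up long before the 4-hour retention window closes.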
QPS limits
Maximum 50 calls per second per account.
Request parameters
| Name | Type | Required | Example | Description |
|---|---|---|---|---|
body | JSONArray | Yes | ["taskId**", "taskId**"] | The list of task IDs to query. Maximum 100 IDs per request. Obtain task IDs from the response of the submit operation. |
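
Because a query accepts at most 100 task IDs, a longer list has to be split across requests. A minimal sketch, with a hypothetical helper name:

```python
def batches(task_ids, size=100):
    """Yield consecutive slices of task_ids, each with at most `size` items."""
    for start in range(0, len(task_ids), size):
        yield task_ids[start:start + size]


ids = [f"task-{n}" for n in range(250)]
print([len(batch) for batch in batches(ids)])
```

Each batch is then sent as the JSON array body of a separate /green/video/results request, subject to the QPS limit above.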
Response parameters
| Name | Type | Example | Description |
|---|---|---|---|
code | Integer | 200 | The HTTP status code. For a list of status codes, see Common error codes. |
msg | String | OK | The response message. |
dataId | String | videoId**** | The ID of the moderation object, echoed from the submit request. |
taskId | String | taskId**** | The ID of the detection task. |
results | JSONArray | — | The video frame moderation results. Returned when code is 200. Each element is a result object — see the result fields table below. |
audioScanResults | JSONArray | — | The audio moderation results. Each element is an audio scan result object — see the audio scan result fields table below. |
Result fields
| Name | Type | Example | Description |
|---|---|---|---|
scene | String | porn | The detection scenario. Valid values: porn, terrorism, live, logo, ad. |
label | String | porn | The moderation result label. Labels vary by scenario — see the label values table below. |
sublabel | String | porn | The subcategory of the result. Returned for porn and terrorism scenarios only; not returned by default. |
suggestion | String | block | The recommended action. Valid values: pass — content is normal, no action needed; review — uncertain, manual review required; block — violation detected, delete the content or restrict access. |
rate | Float | 99.2 | The confidence score. Range: 0–100. A higher score indicates higher confidence. Use suggestion, label, and sublabel to determine whether content contains violations — do not rely on rate alone. |
frames | JSONArray | — | Information about video frames that contain violations. Each element is a frame object — see the frame result fields table below. |
hintWordsInfo | JSONArray | — | Information about risky keywords detected in the video. Returned only for the ad scenario. Each element is a hint words info object — see the hint words info fields table. |
logoData | JSONArray | — | Information about detected logos. Returned only for the logo scenario. Each element is a logo data object — see the logo data fields table. |
sfaceData | JSONArray | — | Information about detected faces related to political content. Returned only for the terrorism scenario. Each element is an sface data object — see the sface data fields table. |
Label values by scenario:
| Scenario | Labels |
|---|---|
porn | normal, porn |
terrorism | normal, terrorism |
live | normal, live |
logo | normal, logo |
ad | normal, ad |
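
One way to act on a task result is to take the most severe suggestion across all scenarios. The ordering pass < review < block and the "most severe wins" policy are assumptions made for this sketch, not documented behavior.

```python
# Assumed severity ordering for combining suggestions.
SEVERITY = {"pass": 0, "review": 1, "block": 2}


def overall_suggestion(task_result):
    """Return the most severe suggestion across video and audio results, or None."""
    entries = task_result.get("results", []) + task_result.get("audioScanResults", [])
    if not entries:
        return None
    return max((entry["suggestion"] for entry in entries), key=SEVERITY.__getitem__)


task_result = {
    "code": 200,
    "results": [
        {"scene": "porn", "label": "normal", "suggestion": "pass", "rate": 99.9},
        {"scene": "ad", "label": "ad", "suggestion": "review", "rate": 80.0},
    ],
}
print(overall_suggestion(task_result))
```

The decision uses suggestion rather than rate, in line with the guidance in the result fields table.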
Frame result fields
| Name | Type | Example | Description |
|---|---|---|---|
url | String | http://example.com/0B860000586C0A0 | The URL of the video frame. |
offset | Integer | 50 | The timestamp of the frame relative to the start of the video. Unit: seconds. |
label | String | porn | The moderation result label for this frame. Labels vary by scenario — see below. |
rate | Float | 99.1 | The confidence score for this frame. Range: 0–100. Use for reference only. |
Frame label values by scenario:
| Scenario | Frame labels |
|---|---|
porn | normal, sexy, porn |
terrorism | normal, bloody, explosion, outfit, logo, weapon, politics, violence, crowd, parade, carcrash, flag, location, drug, gamble, others |
ad | normal, politics, porn, abuse, terrorism, contraband, spam, npx, qrcode, programCode, ad |
live | normal, meaningless, PIP, smoking, drivelive, drug, gamble |
logo | normal, TV, trademark |
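
Frame-level results can be filtered down to the frames worth reviewing. The sketch below assumes a result object with the frames field described above and treats any label other than normal as flagged. The helper name is hypothetical.

```python
def flagged_frames(result):
    """Return frames whose label is not 'normal', ordered by offset in seconds."""
    frames = [frame for frame in result.get("frames", []) if frame.get("label") != "normal"]
    return sorted(frames, key=lambda frame: frame.get("offset", 0))


result = {
    "scene": "porn",
    "frames": [
        {"offset": 50, "label": "porn", "rate": 99.1, "url": "http://example.com/frame-50"},
        {"offset": 10, "label": "normal", "rate": 99.9, "url": "http://example.com/frame-10"},
        {"offset": 20, "label": "sexy", "rate": 91.0, "url": "http://example.com/frame-20"},
    ],
}
print([(frame["offset"], frame["label"]) for frame in flagged_frames(result)])
```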
Audio scan result fields
| Name | Type | Example | Description |
|---|---|---|---|
scene | String | antispam | The audio detection scenario. The only valid value is antispam. |
label | String | customized | The overall audio moderation result label. Valid values: normal, spam, ad, politics, terrorism, abuse, porn, flood, contraband, customized. |
suggestion | String | block | The recommended action. Valid values: pass, review, block. |
rate | Float | 99.91 | The confidence score. Range: 0–100. Use suggestion and label to determine violations — do not rely on rate alone. |
details | JSONArray | — | The sentence-level audio detection results. Each element corresponds to one sentence — see the detail fields table below. |
Detail fields
| Name | Type | Example | Description |
|---|---|---|---|
startTime | Integer | 24 | The start time of the sentence. Unit: seconds. |
endTime | Integer | 60 | The end time of the sentence. Unit: seconds. |
text | String | Computer | The transcribed text of the audio segment. |
label | String | normal | The moderation result label for this sentence. Valid values: normal, spam, ad, politics, terrorism, abuse, porn, flood, contraband, customized. |
keyword | String | Enable | The custom keyword that was matched, if any. |
libName | String | Manual | The name of the keyword library that contained the matched keyword. |
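
Sentence-level audio results can be reduced to the time ranges that were flagged. This is a sketch with a hypothetical helper name. The sample data mirrors the audio response example in this section.

```python
def flagged_segments(audio_result):
    """Return (startTime, endTime, label, text) for sentences not labeled 'normal'."""
    return [
        (detail["startTime"], detail["endTime"], detail["label"], detail.get("text", ""))
        for detail in audio_result.get("details", [])
        if detail.get("label") != "normal"
    ]


audio_result = {
    "scene": "antispam",
    "label": "customized",
    "suggestion": "block",
    "details": [
        {"startTime": 0, "endTime": 24, "text": "Computer", "label": "customized"},
        {"startTime": 24, "endTime": 60, "text": "Computer", "label": "normal"},
    ],
}
print(flagged_segments(audio_result))
```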
Logo data fields
| Name | Type | Example | Description |
|---|---|---|---|
type | String | TV | The type of the detected logo. Valid value: TV (station logo). |
name | String | ***TV | The name of the detected logo. |
x | Float | 140 | The distance from the left edge of the logo area to the y-axis. Origin: upper-left corner of the image. Unit: pixels. |
y | Float | 68 | The distance from the top edge of the logo area to the x-axis. Origin: upper-left corner of the image. Unit: pixels. |
w | Float | 106 | The width of the logo area. Unit: pixels. |
h | Float | 106 | The height of the logo area. Unit: pixels. |
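
The x, y, w, and h fields describe a rectangle measured from the upper-left corner of the image. Converting it to corner coordinates can be sketched as follows. The helper name is hypothetical.

```python
def logo_box(logo):
    """Convert x, y, w, h (pixels, origin at upper-left) to (left, top, right, bottom)."""
    left, top = logo["x"], logo["y"]
    return (left, top, left + logo["w"], top + logo["h"])


# Values taken from the example column of the table above.
print(logo_box({"type": "TV", "x": 140, "y": 68, "w": 106, "h": 106}))
```

The (left, top, right, bottom) form is the one most image libraries expect when cropping or drawing a bounding box.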
Sface data fields
| Name | Type | Example | Description |
|---|---|---|---|
x | Float | 444 | The distance from the left edge of the face area to the y-axis. Origin: upper-left corner of the image. |
y | Float | 174 | The distance from the top edge of the face area to the x-axis. Origin: upper-left corner of the image. |
w | Float | 467 | The width of the face area. |
h | Float | 467 | The height of the face area. |
smileRate | Float | 0 | The probability that the person is smiling. |
glasses | Boolean | false | Specifies whether the person is wearing glasses. |
faces | Array | — | Information about detected faces. Each element is a face object — see the face fields table below. |
Face fields
| Name | Type | Example | Description |
|---|---|---|---|
name | String | xxxx | The name of the matched person. |
rate | Float | 97.03 | The similarity score. |
id | String | AliFace_001**** | The face ID. |
Hit library info fields (hitLibInfo)
| Name | Type | Example | Description |
|---|---|---|---|
context | String | xxxx | The custom text content that was hit. |
libCode | String | 69751 | The code of the library that contains the custom text content that was hit. |
libName | String | Manual | The name of the library that contains the custom text content that was hit. |
Hint words info fields
| Name | Type | Example | Description |
|---|---|---|---|
context | String | xxxx | The risky keyword that was matched in the text. |
Examples
Request
POST http(s)://[Endpoint]/green/video/results
&<Common request parameters>
[
"taskId****",
"taskId****"
]
Response: video frames only
{
"code": 200,
"msg": "OK",
"requestId": "requestID****",
"data": [
{
"code": 200,
"msg": "OK",
"dataId": "videoId****",
"taskId": "taskId****",
"results": [
{
"label": "porn",
"rate": 99.2,
"scene": "porn",
"suggestion": "block"
}
]
}
]
}
Response: video frames and audio
{
"code": 200,
"msg": "OK",
"requestId": "requestID****",
"data": [
{
"code": 200,
"msg": "OK",
"dataId": "videoId****",
"taskId": "taskId****",
"results": [
{
"label": "porn",
"rate": 99.2,
"scene": "porn",
"suggestion": "block"
}
],
"audioScanResults": [
{
"scene": "antispam",
"label": "customized",
"suggestion": "block",
"rate": 99.91,
"details": [
{
"startTime": 0,
"endTime": 24,
"text": "Computer",
"label": "customized"
},
{
"startTime": 24,
"endTime": 60,
"text": "Computer",
"label": "normal"
}
]
}
]
}
]
}