AI Guardrails: Asynchronous detection

Last Updated: Mar 31, 2026

Submits asynchronous video moderation tasks and retrieves results using the Content Moderation API. Supported scenarios include pornography detection, terrorism and political content detection, ad and text violation detection, undesirable scene detection, logo detection, and audio violation detection.

Submit an asynchronous video moderation task

Endpoint: POST /green/video/asyncscan

Submits one or more video moderation tasks for asynchronous processing. Results are not returned immediately — retrieve them by polling or by configuring a callback URL.

For HTTP request construction, see Request structure. To use a pre-built client, see SDK overview.

Billing

You are charged per moderation scenario. When a task specifies multiple scenarios, the fees add up: each scenario costs the number of moderated video frames multiplied by the unit price for that scenario. If audio detection is also enabled, an additional fee applies: video duration × unit price for audio violation detection.
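
The billing arithmetic above can be sketched in Python as follows. The function name, the prices, and the duration unit are hypothetical placeholders for illustration, not published rates; check the pricing page for real values.

```python
def estimate_fee(frame_count, scene_unit_prices, audio_duration=0.0, audio_unit_price=0.0):
    """Estimate the moderation fee for one video.

    scene_unit_prices maps each requested scene to a per-frame price.
    All prices here are hypothetical; look up the real rates before relying on this.
    """
    # Each scenario is billed separately: frames x unit price for that scenario.
    video_fee = sum(frame_count * price for price in scene_unit_prices.values())
    # Audio detection adds duration x audio unit price
    # (the duration must use the same unit the audio price is quoted in).
    audio_fee = audio_duration * audio_unit_price
    return video_fee + audio_fee


# 200 frames checked for two scenarios, plus 10 duration units of audio.
fee = estimate_fee(200, {"porn": 2, "terrorism": 3}, audio_duration=10, audio_unit_price=5)
```

With these made-up prices the two scenarios contribute 400 and 600, and audio contributes 50.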

Detection objects

Moderation supports video files and video streams:

  • Video files: Submit a sequence of video frame images, or provide a video URL.

  • Video streams: Provide a stream URL using a supported protocol.

Retrieve results

Asynchronous moderation results are not returned in the response. Use one of two methods to retrieve them:

  • Callback (recommended): Include the callback parameter in the request. AI Guardrails pushes results to your endpoint automatically when detection is complete.

  • Polling: Omit the callback parameter, then call /green/video/results periodically to retrieve results. Query at least 30 seconds after submitting the task. Results are retained for up to 4 hours.

Video requirements

Video files:

  • URL protocol: HTTP or HTTPS

  • Supported formats: AVI, FLV, MP4, MPG, ASF, WMV, MOV, WMA, RMVB, RM, FLASH, TS

  • Maximum file size: 200 MB. For files larger than 200 MB, segment the video before submitting. To increase the size limit, contact technical support via DingTalk Group (group number: 35573806).

  • Use a stable storage service. Object Storage Service (OSS) is recommended.

Video streams:

  • Supported protocols: RTMP, HLS, HTTP-FLV, RTSP

  • Maximum stream duration per task: 24 hours. The task ends automatically after 24 hours.
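
A client can catch some of these requirement violations before submitting a file. The sketch below is a minimal pre-submit check in Python; the function name is hypothetical, judging the format by file extension is only a heuristic, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```python
import posixpath
from urllib.parse import urlparse

# Formats listed in the requirements above, lowercased for comparison.
FILE_FORMATS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "wma",
                "rmvb", "rm", "flash", "ts"}
MAX_FILE_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_video_file(url, size_bytes=None):
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme.lower() not in ("http", "https"):
        problems.append("URL protocol must be HTTP or HTTPS")
    # Heuristic: infer the format from the extension of the URL path.
    ext = posixpath.splitext(parsed.path)[1].lstrip(".").lower()
    if ext not in FILE_FORMATS:
        problems.append(f"unsupported or missing file extension: {ext!r}")
    if size_bytes is not None and size_bytes > MAX_FILE_BYTES:
        problems.append("file exceeds 200 MB; segment it before submitting")
    return problems
```

An empty result does not guarantee acceptance by the service; it only rules out the mistakes that are cheap to detect locally.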

Detection scenarios

| scenes value | Scenario | Result labels |
| --- | --- | --- |
| porn | Pornography detection | Normal, Pornographic |
| terrorism | Terrorism and political content detection | Normal, Terrorism and political content |
| live | Undesirable scene detection | Normal, Undesirable scene (e.g., black screen or white screen) |
| logo | Logo detection | Normal, Logo |
| ad | Ad and text violation detection | Normal, Ad or text violation |

For audio detection, use the audioScenes parameter with value antispam. Audio detection is supported only by this asynchronous operation and requires the url parameter (not frames). Result labels: Normal, Spam, Ad, Political, Terrorism, Abuse, Pornographic, Flooding, Contraband, Custom.

The default language for audio detection is Chinese. To detect English audio content, contact your account manager.

QPS limits

| Limit | Value |
| --- | --- |
| Maximum calls per second | 50 |
| Maximum concurrent moderation tasks | 20 |

To increase the concurrent task limit, contact your business manager.

If you do not need real-time moderation, enable offline moderation. In offline moderation mode, the system starts processing within 24 hours after you submit the task.
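
To stay under the calls-per-second limit, a client can throttle itself before each request. The following Python sketch is one possible sliding-window approach; the class name and its parameters are hypothetical, and it does not track the separate concurrent-task limit.

```python
import time
from collections import deque


class CallThrottle:
    """Client-side guard that keeps calls within a per-second budget."""

    def __init__(self, max_per_second=50, clock=time.monotonic, sleep=time.sleep):
        self.max_per_second = max_per_second
        self._clock = clock      # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._stamps = deque()   # times of the most recent calls

    def wait(self):
        """Block until one more call fits in the current one-second window."""
        now = self._clock()
        # Forget calls that happened a second or more ago.
        while self._stamps and now - self._stamps[0] >= 1.0:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_per_second:
            # The window is full: sleep until the oldest call leaves it.
            self._sleep(1.0 - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

Call `wait()` immediately before each API request; with the default budget of 50, the 51st call inside one second sleeps until the first call is a full second old.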

Request parameters

| Name | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| bizType | String | No | default | The business scenario. Create business scenarios in the Content Moderation console. For more information, see Customize policies for machine-assisted moderation. |
| live | Boolean | No | false | Specifies whether the detection target is a live stream. Valid values: false (default), for a video-on-demand (VOD) file; true, for a live stream. |
| offline | Boolean | No | false | Specifies whether to use offline moderation mode. Valid values: false (default), real-time mode, in which requests exceeding the concurrency limit are rejected; true, offline mode, in which tasks are queued and processed within 24 hours. Applies to video files only. |
| scenes | StringArray | Yes | ["porn"] | The detection scenarios. Valid values: porn, terrorism, live, logo, ad. |
| audioScenes | StringArray | No | ["antispam"] | The audio detection scenario. The only valid value is antispam. If omitted, only video frames are analyzed. Requires the url parameter in the task; not compatible with frames. |
| callback | String | No | https://example.com/callback | The callback URL for receiving moderation results. Supports HTTP and HTTPS. The endpoint must support POST requests, UTF-8 encoding, and the checksum and content parameters. If omitted, poll /green/video/results for results. |
| seed | String | No | abc**** | A random string used to generate a signature for callback notification requests. Up to 64 characters; can contain letters, digits, and underscores (_). Required when callback is set. |
| cryptType | String | No | SHA256 | The algorithm used to generate the signature for callback notification content. Valid values: SHA256 (HMAC-SHA256, default); SM3 (HMAC-SM3, returns a lowercase hexadecimal string; for example, the input abc yields 66c7f0f462eeedd9d1f2d46bdc10e4e24167c4875cf2f7a2297da02b8f4ba8e0). |
| tasks | JSONArray | Yes | | The detection tasks. Maximum 100 tasks per request. Each element is a task object; see the task parameters table below. To submit 100 tasks in one request, first increase the concurrent task limit above 100. |

Task parameters

| Name | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| clientInfo | JSONObject | No | {"userId":"12023****","userNick":"Mike","userType":"others"} | The client information. For the structure, see the "Common request parameters" section in Common parameters. Takes priority over the global clientInfo parameter. |
| dataId | String | No | videoId**** | The ID of the moderation object. Up to 128 characters; can contain letters, digits, underscores (_), hyphens (-), and periods (.). |
| liveId | String | No | liveId**** | The ID of the live stream. Used to prevent duplicate detection tasks. If provided, AI Guardrails checks for an existing task by matching UID, bizType, and liveId. If a matching task is in progress, the existing taskId is returned and no new task is started. |
| url | String | No | http://example.com/a.flv | The HTTP or HTTPS URL of the video. Up to 2,048 characters. Required when frames is not set. |
| frames | JSONArray | No | | The video frames to analyze. Each element is a frame object; see the frame parameters table below. Required when url is not set. |
| framePrefix | String | No | http://example.com/video/ | The URL prefix for frame URLs. The full frame URL is constructed as framePrefix + frame.url. |
| interval | Integer | No | 1 | The interval between frame captures. Unit: seconds. Valid values: 1–600. Default value: 1. |
| maxFrames | Integer | No | 200 | The maximum number of frames to capture. Valid values: 5–3600. Default value: 200. Applies to video files (live=false) only and has no effect on live streams. With an OSS URL (oss://) and ApsaraVideo Media Processing (MPS) authorization, up to 20,000 frames can be captured at no additional cost. To increase the default limit, open a ticket via Support and Services. |

Frame parameters

| Name | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| url | String | No | http://example.com/0B860000586C0A0300038A0460000 | The URL of the video frame. Combined with framePrefix to form the full URL: framePrefix + frame.url. |
| offset | Integer | No | 10 | The timestamp of the frame relative to the start of the video. Unit: seconds. |
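
Several of the parameter rules above interact: url or frames must be present, audio detection needs url, and seed is needed with callback. The Python sketch below assembles a request body and checks those rules locally. The function name and its keyword arguments are hypothetical; the range checks use the default documented limits and would need adjusting if your limits were raised.

```python
def build_asyncscan_body(scenes, tasks, audio_scenes=None, callback=None, seed=None, live=False):
    """Assemble and sanity-check a request body for /green/video/asyncscan."""
    if not scenes:
        raise ValueError("scenes is required")
    if not 1 <= len(tasks) <= 100:
        raise ValueError("tasks must contain 1 to 100 elements")
    if callback and not seed:
        raise ValueError("seed is required when callback is set")
    for task in tasks:
        has_url, has_frames = bool(task.get("url")), bool(task.get("frames"))
        if not (has_url or has_frames):
            raise ValueError("each task needs url or frames")
        if audio_scenes and not has_url:
            raise ValueError("audio detection requires url, not frames")
        # Default documented ranges; a raised maxFrames limit would need a wider bound.
        if not 1 <= task.get("interval", 1) <= 600:
            raise ValueError("interval must be 1-600 seconds")
        if not 5 <= task.get("maxFrames", 200) <= 3600:
            raise ValueError("maxFrames must be 5-3600")
    body = {"scenes": list(scenes), "live": live, "tasks": list(tasks)}
    if audio_scenes:
        body["audioScenes"] = list(audio_scenes)
    if callback:
        body["callback"] = callback
        body["seed"] = seed
    return body


body = build_asyncscan_body(
    ["porn"],
    [{"dataId": "video-1", "url": "http://www.aliyundoc.com/a.mp4", "interval": 1}],
    audio_scenes=["antispam"],
)
```

Serialize the returned dictionary with `json.dumps` and send it as the POST body together with the common request parameters.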

Callback notifications

When callback is set, AI Guardrails sends a POST request to the callback URL with two parameters:

  • `checksum`: A signature string in UID + Seed + Content format, generated using the algorithm specified by cryptType. UID must be your Alibaba Cloud account ID — not a RAM user ID. To verify the callback, generate the same string on your server and compare it to the received checksum. Obtain your account ID from the Alibaba Cloud Management Console.

  • `content`: A JSON-formatted string containing moderation results. The structure matches the success response from the result query operation.

AI Guardrails retries failed callback deliveries up to 16 times. After 16 attempts, the callback is abandoned. If callbacks are not received, check the status of the callback URL.

Return HTTP 200 from your server to confirm successful receipt.
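
Checksum verification on the receiving server can be sketched as below. This is a minimal Python example under a stated assumption: the checksum is taken to be the lowercase hexadecimal SHA-256 digest of the UTF-8 encoded string UID + Seed + Content. Confirm the exact construction against the official SDK sample before relying on it; the function name is hypothetical.

```python
import hashlib
import hmac


def verify_callback(uid, seed, content, checksum):
    """Return True if checksum matches the digest of uid + seed + content.

    Assumption: the signature is the hex SHA-256 digest of the UTF-8
    encoded concatenation. The service may construct it differently.
    """
    expected = hashlib.sha256((uid + seed + content).encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("ascii"), checksum.lower().encode("utf-8"))
```

Only parse `content` as JSON and act on it after the check succeeds, then respond with HTTP 200.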

Response parameters

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| taskId | String | taskId**** | The ID of the detection task. Use this ID to query results. |
| dataId | String | videoId**** | The ID of the moderation object, echoed from the request. |

Examples

Request — submit video frame images

POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>

{
    "scenes": ["porn"],
    "tasks": [
        {
            "dataId": "videoId****",
            "frames": [
                {"offset": 10, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460000"},
                {"offset": 20, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460001"},
                {"offset": 30, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460002"},
                {"offset": 40, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460003"},
                {"offset": 50, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A0460003"},
                {"offset": 60, "url": "http://www.aliyundoc.com/0B860000586C0A0300038A046000x"}
            ]
        }
    ]
}

Request — submit a video file with audio detection

POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>

{
    "scenes": ["porn"],
    "audioScenes": ["antispam"],
    "tasks": [
        {
            "dataId": "videoId****",
            "url": "http://www.aliyundoc.com/a.mp4",
            "interval": 1,
            "maxFrames": 200
        }
    ]
}

Request — submit a live video stream

POST http(s)://[Endpoint]/green/video/asyncscan
&<Common request parameters>

{
    "scenes": ["porn"],
    "live": true,
    "tasks": [
        {
            "dataId": "videoId****",
            "url": "http://www.aliyundoc.com/a.flv",
            "interval": 1,
            "maxFrames": 200
        }
    ]
}

Response

{
    "code": 200,
    "msg": "OK",
    "requestId": "requestID****",
    "data": [
        {
            "dataId": "videoId****",
            "taskId": "taskId****"
        }
    ]
}
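
Because results are fetched later by task ID, it helps to record which taskId belongs to which dataId as soon as the submit response arrives. A minimal Python sketch, with a hypothetical function name, assuming the response has the shape shown above:

```python
import json


def task_ids_by_data_id(response_text):
    """Map each dataId to its taskId from a submit response body."""
    payload = json.loads(response_text)
    if payload.get("code") != 200:
        raise RuntimeError(f"submit failed: {payload.get('code')} {payload.get('msg')}")
    # Skip entries without a taskId (for example, tasks that were not accepted).
    return {item.get("dataId"): item["taskId"]
            for item in payload.get("data", [])
            if "taskId" in item}


sample = '{"code": 200, "msg": "OK", "data": [{"dataId": "video-1", "taskId": "task-1"}]}'
mapping = task_ids_by_data_id(sample)
```

Here `mapping` associates `"video-1"` with `"task-1"`; the identifiers in the sample are made up.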

Query asynchronous video moderation results

Endpoint: POST /green/video/results

Queries the results of one or more asynchronous video moderation tasks submitted via /green/video/asyncscan.

For HTTP request construction, see Request structure. To use a pre-built client, see SDK overview.

Billing

This operation is free of charge.

Query timing

Query at least 30 seconds after submitting an async moderation request. Results are retained for up to 4 hours; after that, they are deleted.

For live stream tasks, a response code of 280 means detection is in progress. A code of 200 means detection is complete. When a task is in progress, results reflect findings from the start of detection to the current time.
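
The timing rules above translate into a simple polling loop. The Python sketch below takes a caller-supplied `fetch` function, which is hypothetical and stands in for whatever client code calls /green/video/results and returns the parsed `data` list. Treating any code other than 200 or 280 as a failure is an assumption, and the elapsed-time count ignores the time spent inside `fetch`.

```python
import time


def poll_results(fetch, task_ids, first_delay=30, interval=30, max_wait=4 * 3600, sleep=time.sleep):
    """Poll until every task reports code 200 or max_wait seconds have been slept.

    Returns (done, pending): finished results keyed by taskId, and the IDs still unfinished.
    Pass at most 100 task IDs, the per-request maximum.
    """
    pending, done, waited = set(task_ids), {}, 0
    sleep(first_delay)                      # wait at least 30 seconds after submitting
    waited += first_delay
    while pending:
        for item in fetch(sorted(pending)):
            if item.get("code") == 200:     # detection complete
                done[item["taskId"]] = item
                pending.discard(item["taskId"])
            elif item.get("code") != 280:   # 280 means detection is still in progress
                raise RuntimeError(f"task {item.get('taskId')} failed: {item.get('code')}")
        if not pending or waited >= max_wait:
            break                           # results are kept for up to 4 hours
        sleep(interval)
        waited += interval
    return done, pending
```

For live streams, a task that keeps returning 280 still carries partial results; a caller interested in those could inspect each in-progress item instead of skipping it.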

QPS limits

Maximum 50 calls per second per account.

Request parameters

| Name | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| body | JSONArray | Yes | ["taskId**", "taskId**"] | The list of task IDs to query. Maximum 100 IDs per request. Obtain task IDs from the response of the submit operation. |
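
When more than 100 tasks are outstanding, the IDs have to be split across several requests. A small Python helper for that, with a hypothetical name:

```python
def chunk_task_ids(task_ids, size=100):
    """Split task IDs into request bodies of at most `size` IDs each."""
    ids = list(task_ids)
    return [ids[i:i + size] for i in range(0, len(ids), size)]


batches = chunk_task_ids([f"task-{n}" for n in range(250)])
```

Each element of `batches` can be sent as the JSON array body of one query request.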

Response parameters

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| code | Integer | 200 | The HTTP status code. For a list of status codes, see Common error codes. |
| msg | String | OK | The response message. |
| dataId | String | videoId**** | The ID of the moderation object, echoed from the submit request. |
| taskId | String | taskId**** | The ID of the detection task. |
| results | JSONArray | | The video frame moderation results. Returned when code is 200. Each element is a result object; see the result fields table below. |
| audioScanResults | JSONArray | | The audio moderation results. Each element is an audio scan result object; see the audio scan result fields table below. |

Result fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| scene | String | porn | The detection scenario. Valid values: porn, terrorism, live, logo, ad. |
| label | String | porn | The moderation result label. Labels vary by scenario; see the label values table below. |
| sublabel | String | porn | The subcategory of the result. Returned for porn and terrorism scenarios only; not returned by default. |
| suggestion | String | block | The recommended action. Valid values: pass (content is normal, no action needed); review (uncertain, manual review required); block (violation detected, delete the content or restrict access). |
| rate | Float | 99.2 | The confidence score. Range: 0–100. A higher score indicates higher confidence. Use suggestion, label, and sublabel to determine whether content contains violations; do not rely on rate alone. |
| frames | JSONArray | | Information about video frames that contain violations. Each element is a frame object; see the frame result fields table below. |
| hintWordsInfo | JSONArray | | Information about risky keywords detected in the video. Returned only for the ad scenario. Each element is a hint words info object; see the hint words info fields table. |
| logoData | JSONArray | | Information about detected logos. Returned only for the logo scenario. Each element is a logo data object; see the logo data fields table. |
| sfaceData | JSONArray | | Information about detected faces related to political content. Returned only for the terrorism scenario. Each element is an sface data object; see the sface data fields table. |

Label values by scenario:

| Scenario | Labels |
| --- | --- |
| porn | normal, porn |
| terrorism | normal, terrorism |
| live | normal, live |
| logo | normal, logo |
| ad | normal, ad |
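
Since a task can return one result per scenario, a caller usually needs a single decision per video. One possible policy, sketched in Python, is to take the strictest suggestion across all scenario results. The function name is hypothetical, and treating an unrecognized suggestion as review is an assumption rather than documented behavior.

```python
SEVERITY = {"pass": 0, "review": 1, "block": 2}


def overall_suggestion(results):
    """Return the strictest suggestion found across all scenario results."""
    worst = "pass"
    for result in results:
        suggestion = result.get("suggestion", "pass")
        if suggestion not in SEVERITY:
            suggestion = "review"   # assumption: unknown values go to manual review
        if SEVERITY[suggestion] > SEVERITY[worst]:
            worst = suggestion
    return worst


decision = overall_suggestion([
    {"scene": "porn", "label": "normal", "suggestion": "pass"},
    {"scene": "ad", "label": "ad", "suggestion": "block"},
])
```

This policy relies on suggestion rather than rate, in line with the guidance in the result fields table.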

Frame result fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| url | String | http://example.com/0B860000586C0A0 | The URL of the video frame. |
| offset | Integer | 50 | The timestamp of the frame relative to the start of the video. Unit: seconds. |
| label | String | porn | The moderation result label for this frame. Labels vary by scenario; see below. |
| rate | Float | 99.1 | The confidence score for this frame. Range: 0–100. Use for reference only. |

Frame label values by scenario:

| Scenario | Frame labels |
| --- | --- |
| porn | normal, sexy, porn |
| terrorism | normal, bloody, explosion, outfit, logo, weapon, politics, violence, crowd, parade, carcrash, flag, location, drug, gamble, others |
| ad | normal, politics, porn, abuse, terrorism, contraband, spam, npx, qrcode, programCode, ad |
| live | normal, meaningless, PIP, smoking, drivelive, drug, gamble |
| logo | normal, TV, trademark |

Audio scan result fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| scene | String | antispam | The audio detection scenario. The only valid value is antispam. |
| label | String | customized | The overall audio moderation result label. Valid values: normal, spam, ad, politics, terrorism, abuse, porn, flood, contraband, customized. |
| suggestion | String | block | The recommended action. Valid values: pass, review, block. |
| rate | Float | 99.91 | The confidence score. Range: 0–100. Use suggestion and label to determine violations; do not rely on rate alone. |
| details | JSONArray | | The sentence-level audio detection results. Each element corresponds to one sentence; see the detail fields table below. |

Detail fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| startTime | Integer | 24 | The start time of the sentence. Unit: seconds. |
| endTime | Integer | 60 | The end time of the sentence. Unit: seconds. |
| text | String | Computer | The transcribed text of the audio segment. |
| label | String | normal | The moderation result label for this sentence. Valid values: normal, spam, ad, politics, terrorism, abuse, porn, flood, contraband, customized. |
| keyword | String | Enable | The custom keyword that was matched, if any. |
| libName | String | Manual | The name of the keyword library that contained the matched keyword. |
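
The sentence-level details make it possible to point reviewers at the exact audio segments that were flagged. A minimal Python sketch, with a hypothetical function name, that collects every sentence whose label is not normal:

```python
def flagged_sentences(audio_scan_results):
    """Collect (startTime, endTime, label, text) for sentences not labeled "normal"."""
    flagged = []
    for scan in audio_scan_results:
        for detail in scan.get("details", []):
            if detail.get("label") != "normal":
                flagged.append((detail.get("startTime"), detail.get("endTime"),
                                detail.get("label"), detail.get("text")))
    return flagged


segments = flagged_sentences([{
    "scene": "antispam",
    "label": "customized",
    "suggestion": "block",
    "details": [
        {"startTime": 0, "endTime": 24, "text": "Computer", "label": "customized"},
        {"startTime": 24, "endTime": 60, "text": "Computer", "label": "normal"},
    ],
}])
```

For this input only the first sentence (seconds 0 to 24) is returned.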

Logo data fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| type | String | TV | The type of the detected logo. Valid value: TV (station logo). |
| name | String | ***TV | The name of the detected logo. |
| x | Float | 140 | The distance from the left edge of the logo area to the y-axis. Origin: upper-left corner of the image. Unit: pixels. |
| y | Float | 68 | The distance from the top edge of the logo area to the x-axis. Origin: upper-left corner of the image. Unit: pixels. |
| w | Float | 106 | The width of the logo area. Unit: pixels. |
| h | Float | 106 | The height of the logo area. Unit: pixels. |
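
Many image libraries expect a box as corner coordinates rather than origin plus size. The Python sketch below converts the x, y, w, h values into (left, top, right, bottom), optionally clipping to the image bounds. The function name and the optional image-size parameters are hypothetical.

```python
def box_corners(x, y, w, h, image_width=None, image_height=None):
    """Convert an (x, y, w, h) area to (left, top, right, bottom) pixel corners.

    The origin is the upper-left corner of the image, so right = x + w
    and bottom = y + h. If the image size is given, the box is clipped to it.
    """
    left, top, right, bottom = x, y, x + w, y + h
    if image_width is not None:
        left, right = max(0, left), min(image_width, right)
    if image_height is not None:
        top, bottom = max(0, top), min(image_height, bottom)
    return (left, top, right, bottom)


corners = box_corners(140, 68, 106, 106)
```

With the example values from the table, the logo area spans from (140, 68) to (246, 174).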

Sface data fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| x | Float | 444 | The distance from the left edge of the face area to the y-axis. Origin: upper-left corner of the image. |
| y | Float | 174 | The distance from the top edge of the face area to the x-axis. Origin: upper-left corner of the image. |
| w | Float | 467 | The width of the face area. |
| h | Float | 467 | The height of the face area. |
| smileRate | Float | 0 | The probability that the person is smiling. |
| glasses | Boolean | false | Specifies whether the person is wearing glasses. |
| faces | Array | | Information about detected faces. Each element is a face object; see the face fields table below. |

Face fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| name | String | xxxx | The name of the matched person. |
| rate | Float | 97.03 | The similarity score. |
| id | String | AliFace_001**** | The face ID. |

Hit library info fields (hitLibInfo)

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| context | String | xxxx | The custom text content that was hit. |
| libCode | String | 69751 | The code of the library that contains the custom text content that was hit. |
| libName | String | Manual | The name of the library that contains the custom text content that was hit. |

Hint words info fields

| Name | Type | Example | Description |
| --- | --- | --- | --- |
| context | String | xxxx | The risky keyword that was matched in the text. |

Examples

Request

POST http(s)://[Endpoint]/green/video/results
&<Common request parameters>

[
    "taskId****",
    "taskId****"
]

Response — video frames only

{
    "code": 200,
    "msg": "OK",
    "requestId": "requestID****",
    "data": [
        {
            "code": 200,
            "msg": "OK",
            "dataId": "videoId****",
            "taskId": "taskId****",
            "results": [
                {
                    "label": "porn",
                    "rate": 99.2,
                    "scene": "porn",
                    "suggestion": "block"
                }
            ]
        }
    ]
}

Response — video frames and audio

{
    "code": 200,
    "msg": "OK",
    "requestId": "requestID****",
    "data": [
        {
            "code": 200,
            "msg": "OK",
            "dataId": "videoId****",
            "taskId": "taskId****",
            "results": [
                {
                    "label": "porn",
                    "rate": 99.2,
                    "scene": "porn",
                    "suggestion": "block"
                }
            ],
            "audioScanResults": [
                {
                    "scene": "antispam",
                    "label": "customized",
                    "suggestion": "block",
                    "rate": 99.91,
                    "details": [
                        {
                            "startTime": 0,
                            "endTime": 24,
                            "text": "Computer",
                            "label": "customized"
                        },
                        {
                            "startTime": 24,
                            "endTime": 60,
                            "text": "Computer",
                            "label": "normal"
                        }
                    ]
                }
            ]
        }
    ]
}