Video Moderation Version 2.0 scans ApsaraVideo VOD files and live streams for policy-violating content across both video frames and audio. The service returns risk labels with confidence scores so you can take moderation actions based on your platform's rules.
The service integrates Image Moderation Version 2.0 for frame analysis and Voice Moderation Version 2.0 for audio analysis, letting you reuse configurations already set up for those services.
Services
Video Moderation Version 2.0 provides four services depending on your content type and model preference.
| Service | Service ID | Description | Availability |
|---|---|---|---|
| Video file detection | videoDetection_global | Scans video files for violations in frames and audio | All regions |
| Video file detection (Large model edition) | videoDetectionByVL_global | Uses large model image moderation for frame analysis; 10 ingest endpoints by default | Singapore only |
| Live video stream moderation | liveStreamDetection_global | Scans live video streams for violations in frames and audio | All regions |
| Live video stream moderation (Large model edition) | liveStreamDetectionByVL_global | Uses large model image moderation for frame analysis; 10 ingest endpoints by default | Singapore only |
Large model edition services default to 10 ingest endpoints. Control the number of concurrent calls accordingly.
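As a rough illustration of the table and the note above, the following Python sketch looks up a service ID by content type and model preference, then caps the number of calls in flight. It assumes the default of 10 ingest endpoints means at most 10 concurrent calls. `pick_service`, `run_bounded`, and the `submit` callable are hypothetical helpers for illustration, not part of any SDK; `submit` stands in for whatever function sends your real moderation request.

```python
from concurrent.futures import ThreadPoolExecutor

# Service IDs copied from the table above, keyed by (content type, large model?).
SERVICE_IDS = {
    ("file", False): "videoDetection_global",
    ("file", True): "videoDetectionByVL_global",
    ("live", False): "liveStreamDetection_global",
    ("live", True): "liveStreamDetectionByVL_global",
}

# Default limit for large model edition services, per the table above.
LARGE_MODEL_LIMIT = 10


def pick_service(content_type: str, large_model: bool = False) -> str:
    """Return the service ID for 'file' or 'live' content."""
    try:
        return SERVICE_IDS[(content_type, large_model)]
    except KeyError:
        raise ValueError(f"unknown content type: {content_type!r}") from None


def run_bounded(submit, urls, max_concurrent=LARGE_MODEL_LIMIT):
    """Call submit(url) for every URL with at most max_concurrent calls in flight.

    `submit` is a placeholder for the function that sends the moderation
    request. Results are returned in the same order as `urls`.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(submit, urls))
```

A thread pool is used here only because it bounds concurrency with very little code; a semaphore or a queue-based worker would serve the same purpose.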
Upgrade from Video Moderation 1.0
If you are evaluating whether to upgrade, the table below summarizes the differences between versions.
| Item | Video Moderation Version 2.0 | Video Moderation 1.0 |
|---|---|---|
| Default ingest endpoints | 50 | 20 |
| Default QPS | 100 calls/second | 50 calls/second |
| Max video size | 500 MB | 200 MB |
| Frame detection scope | General-purpose baseline check (via Image Moderation Version 2.0) | Pornography, terrorism, undesirable scenes, logos, text and image violations |
| Audio detection scope | Multilingual audio and video media detection; multilingual social and entertainment live stream detection (via Voice Moderation Version 2.0) | — |
| Console features | Frame detection service settings, audio detection service settings, snapshot settings, result return settings | Check item settings only |
| Frame billing | Consistent with Image Moderation Version 2.0 pricing | 1.8× Image Moderation 1.0 pricing |
| Audio billing | 10% discount vs. Voice Moderation Version 2.0 | Consistent with Voice Moderation 1.0 |
Version 2.0 increases the default ingest endpoints from 20 to 50, raises the QPS limit from 50 to 100 calls/second, and supports video files up to 500 MB. Frame detection scope is consolidated into a single general-purpose baseline check that covers the content categories handled by Image Moderation Version 2.0.
Billing
Video Moderation Version 2.0 uses pay-as-you-go billing. You are not charged when the service is not called. Usage is metered and billed once every 24 hours.
Pricing
| Moderation type | Business scenarios | Unit price |
|---|---|---|
| Video frame detection (General-purpose edition): image_standard, video_image_standard | General-purpose baseline check (baselineCheck_global) | USD 0.60 per 1,000 calls |
| Video frame detection (Premium edition): image_advanced, video_image_advanced | Large and small model fusion image moderation service (postImageCheckByVL_global) | USD 1.20 per 1,000 calls |
| Video audio moderation (General-purpose edition): video_standard | Multilingual audio and video media detection (audio_multilingual_global); multilingual live video stream detection (stream_multilingual_global) | USD 8.10 per 1,000 minutes (USD 0.486/hour) |
Note: For frame detection, each call to a business scenario is counted as one transaction, and you are billed based on your actual usage. For example, 100 calls to the general-purpose baseline check cost USD 0.06, and 100 calls to the large and small model fusion image moderation service cost USD 0.12.
Frame detection and audio moderation are billed independently. If you enable audio moderation, the total cost is:
Total cost = (Number of snapshots × Frame unit price) + (Video duration × Audio unit price)
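The formula can be worked through in a few lines of Python. This is a minimal sketch, assuming each snapshot is billed as one call to a single frame detection business scenario and that audio is metered in minutes; the function name `estimate_cost` and the tier keys `standard` and `advanced` are illustrative, and the unit prices are the ones listed in the pricing table.

```python
from decimal import Decimal

# Unit prices in USD, taken from the pricing table.
FRAME_PRICE_PER_1000_CALLS = {
    "standard": Decimal("0.60"),  # general-purpose baseline check
    "advanced": Decimal("1.20"),  # large and small model fusion service
}
AUDIO_PRICE_PER_1000_MINUTES = Decimal("8.10")


def estimate_cost(snapshots, frame_tier="standard", audio_minutes=0):
    """Estimate the total cost in USD for one video.

    snapshots     -- number of frames submitted for detection
    frame_tier    -- "standard" or "advanced" (illustrative keys)
    audio_minutes -- audio duration in minutes; 0 if audio moderation is off
    """
    frame_cost = Decimal(snapshots) * FRAME_PRICE_PER_1000_CALLS[frame_tier] / 1000
    audio_cost = Decimal(str(audio_minutes)) * AUDIO_PRICE_PER_1000_MINUTES / 1000
    return frame_cost + audio_cost
```

For instance, a hypothetical 10-minute video sampled once per second yields 600 snapshots, so `estimate_cost(600, audio_minutes=10)` gives 0.36 for frames plus 0.081 for audio, or USD 0.441 in total. `Decimal` is used instead of `float` so that small per-call prices do not pick up binary rounding error.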
In the billing details, the moderationType field corresponds to the moderation types in the pricing table. To view your bill, go to Bill details.
Integrate Video Moderation Version 2.0
SDK and API
Integrate Video Moderation Version 2.0 through the SDK or by calling the API directly, whichever fits your workflow.
Console setup
Before making your first API call, configure your video moderation settings in the Content Moderation console. The console lets you:
Configure video frame and audio detection services
Set snapshot intervals and result return preferences
Define different moderation policies for different business scenarios
Review call results and monitor usage
Related resources
Image Moderation Version 2.0 service description
Voice Moderation Version 2.0 service description