AI Guardrails: Introduction and Billing for Video Moderation Premium Edition and 2.0

Last Updated: Mar 31, 2026

Video Moderation Version 2.0 scans ApsaraVideo VOD files and live streams for policy-violating content across both video frames and audio. The service returns risk labels with confidence scores so you can take moderation actions based on your platform's rules.

The service integrates Image Moderation Version 2.0 for frame analysis and Voice Moderation Version 2.0 for audio analysis, letting you reuse configurations already set up for those services.

Services

Video Moderation Version 2.0 provides four services depending on your content type and model preference.

| Service | Service ID | Description | Availability |
| --- | --- | --- | --- |
| Video file detection | videoDetection_global | Scans video files for violations in frames and audio | All regions |
| Video file detection (Large model edition) | videoDetectionByVL_global | Uses large model image moderation for frame analysis; 10 ingest endpoints by default | Singapore only |
| Live video stream moderation | liveStreamDetection_global | Scans live video streams for violations in frames and audio | All regions |
| Live video stream moderation (Large model edition) | liveStreamDetectionByVL_global | Uses large model image moderation for frame analysis; 10 ingest endpoints by default | Singapore only |

Large model edition services default to 10 ingest endpoints. Control the number of concurrent calls accordingly.
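
One way to keep client-side concurrency within the large model edition default is to route every call through a bounded semaphore. The sketch below maps the service IDs from the table above and caps in-flight calls at 10. The function submit_task is a hypothetical placeholder for whatever submission call your integration uses, not an SDK function, and the "file"/"live" keys are illustrative names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Service IDs from the table above, keyed by (content type, large-model flag).
SERVICE_IDS = {
    ("file", False): "videoDetection_global",
    ("file", True): "videoDetectionByVL_global",
    ("live", False): "liveStreamDetection_global",
    ("live", True): "liveStreamDetectionByVL_global",
}

LARGE_MODEL_CONCURRENCY = 10  # default stated for the large model edition


def pick_service(content_type: str, large_model: bool) -> str:
    """Return the service ID for a content type ("file" or "live")."""
    return SERVICE_IDS[(content_type, large_model)]


class ConcurrencyGate:
    """Caps the number of in-flight calls and records the observed peak."""

    def __init__(self, limit: int) -> None:
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0

    def run(self, fn, *args):
        with self._slots:  # blocks while `limit` calls are already running
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return fn(*args)
            finally:
                with self._lock:
                    self._active -= 1


def submit_task(service_id: str, url: str) -> dict:
    """Hypothetical stand-in for the real submission call."""
    return {"service": service_id, "url": url}


def submit_all(urls, content_type="file", large_model=True, workers=32):
    """Submit every URL while never exceeding the concurrency cap."""
    service_id = pick_service(content_type, large_model)
    gate = ConcurrencyGate(LARGE_MODEL_CONCURRENCY)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(
            pool.map(lambda u: gate.run(submit_task, service_id, u), urls)
        )
    return results, gate.peak
```

The thread pool is deliberately larger than the cap, so the semaphore, not the pool size, enforces the limit. If your account has a different limit, change LARGE_MODEL_CONCURRENCY accordingly.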

Upgrade from Video Moderation 1.0

If you are evaluating whether to upgrade, the table below summarizes the differences between versions.

| Item | Video Moderation Version 2.0 | Video Moderation 1.0 |
| --- | --- | --- |
| Default ingest endpoints | 50 | 20 |
| Default QPS | 100 calls/second | 50 calls/second |
| Max video size | 500 MB | 200 MB |
| Frame detection scope | General-purpose baseline check (via Image Moderation Version 2.0) | Pornography, terrorism, undesirable scenes, logos, text and image violations |
| Audio detection scope | Multilingual audio and video media detection; multilingual social and entertainment live stream detection (via Voice Moderation Version 2.0) | |
| Console features | Frame detection service settings, audio detection service settings, snapshot settings, result return settings | Check item settings only |
| Frame billing | Consistent with Image Moderation Version 2.0 pricing | 1.8× Image Moderation 1.0 pricing |
| Audio billing | 10% discount vs. Voice Moderation Version 2.0 | Consistent with Voice Moderation 1.0 |

Version 2.0 increases the default ingest endpoints from 20 to 50, raises the default QPS limit from 50 to 100 calls/second, and raises the maximum video file size from 200 MB to 500 MB. Frame detection scope is consolidated into a single general-purpose baseline check that covers the content categories handled by Image Moderation Version 2.0.
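
A client can check the Version 2.0 limits before submitting a file. The sketch below is illustrative only: it assumes 1 MB means 1,048,576 bytes, which this page does not specify, and the helper names are hypothetical.

```python
import os

# Version 2.0 defaults from the comparison above.
MAX_VIDEO_BYTES = 500 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
MAX_QPS = 100


def fits_size_limit(size_bytes: int, limit: int = MAX_VIDEO_BYTES) -> bool:
    """True when a non-empty file is within the maximum video size."""
    return 0 < size_bytes <= limit


def file_fits_size_limit(path: str) -> bool:
    """Check a local file on disk against the size limit."""
    return fits_size_limit(os.path.getsize(path))


def min_call_interval(qps: int = MAX_QPS) -> float:
    """Smallest spacing between calls, in seconds, that stays within the QPS limit."""
    return 1.0 / qps
```

Spacing calls at least min_call_interval() seconds apart is one simple way to stay under the default QPS limit from a single client.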

Billing

Video Moderation Version 2.0 uses pay-as-you-go billing. You are not charged when the service is not called. Usage is metered and billed once every 24 hours.

Pricing

| Moderation type | Business scenarios | Unit price |
| --- | --- | --- |
| Standard video image detection (image_standard, video_image_standard) | Common baseline detection: baselineCheck_global | USD 0.60 per 1,000 calls |
| Premium video image detection (image_advanced, video_image_advanced) | Hybrid large-small model image moderation service: postImageCheckByVL_global | USD 1.20 per 1,000 calls |
| Standard video voice moderation (video_standard) | Audio and video media multi-language detection: audio_multilingual_global; live stream multi-language detection: stream_multilingual_global | USD 8.10 per 1,000 minutes (equivalent to USD 0.486/hour) |

Note

For the two image detection tiers, each call to a business scenario is counted as one transaction, and you are billed based on your actual usage. For example, 100 calls to the common baseline detection service cost USD 0.06, and 100 calls to the hybrid large-small model image moderation service cost USD 0.12.

Frame detection and audio moderation are billed independently. If you enable audio moderation, the total cost is:

Total cost = (Number of snapshots × Frame unit price) + (Video duration × Audio unit price)

In the billing details, the moderationType field corresponds to the moderation types in the pricing table. To view your bill, go to Bill details.
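
The total cost formula can be sketched as a small estimator. The unit prices come from the pricing table. The function name and the "standard"/"premium" tier keys are illustrative, and how the service rounds audio duration for billing is not stated here, so treat the result as an estimate rather than a bill.

```python
from decimal import Decimal

# Unit prices from the pricing table, in USD.
FRAME_PRICE_PER_1000_CALLS = {
    "standard": Decimal("0.60"),  # baselineCheck_global
    "premium": Decimal("1.20"),   # postImageCheckByVL_global
}
AUDIO_PRICE_PER_1000_MINUTES = Decimal("8.10")  # video_standard


def estimate_cost(snapshots: int, tier: str = "standard", audio_minutes=0) -> Decimal:
    """Total = snapshots x frame unit price + duration x audio unit price."""
    frame_cost = Decimal(snapshots) * FRAME_PRICE_PER_1000_CALLS[tier] / 1000
    # Leave audio_minutes at 0 when audio moderation is not enabled.
    audio_cost = Decimal(str(audio_minutes)) * AUDIO_PRICE_PER_1000_MINUTES / 1000
    return frame_cost + audio_cost


# Example: a 60-minute video with 3,600 snapshots (an assumed one-per-second
# interval) on the standard tier, with audio moderation enabled.
print(estimate_cost(3600, "standard", audio_minutes=60))
```

Decimal is used instead of float so that per-call prices such as USD 0.0006 do not accumulate rounding error across large volumes.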

Integrate Video Moderation Version 2.0

SDK and API

Integrate Video Moderation Version 2.0 through the SDK or the API, whichever fits your workflow.

Console setup

Before making your first API call, configure your video moderation settings in the Content Moderation console. The console lets you:

• Configure video frame and audio detection services

• Set snapshot intervals and result return preferences

• Define different moderation policies for different business scenarios

• Review call results and monitor usage
