AI Guardrails: LLM image moderation

Last Updated: Mar 31, 2026

Image Moderation 2.0 uses a custom-trained Qwen large model combined with expert models to detect non-compliant content in images — including pornography, sexually suggestive material, politically sensitive content, violence, terrorism, contraband, religious content, spam, and other undesirable content. This topic describes how to set up and call the image moderation service.

The large model for image moderation is in active development. Contact your business manager with feedback or suggestions.

How it works

All requests use the ImageModeration API operation with the postImageCheckByVL_global service code. Submit an image by URL, OSS reference, or direct upload. The service runs inference in the Singapore region and returns a list of risk labels with confidence scores and an overall risk level.

For Virginia and Frankfurt regions, inference runs in Singapore, but data and logs are stored locally in the respective regions.

Service selection

| Service | Description | Supported regions | Use cases |
| --- | --- | --- | --- |
| Image Moderation for Large and Small Model Integration (postImageCheckByVL_global) | Combines a large model and expert models to provide more granular labels, such as pornography subcategories, specific behaviors, and specific objects. Offers a wider detection range and richer labels. Provides the best overall performance with low false positive and false negative rates. | Singapore | Social media, live streaming, gaming, e-commerce, and education businesses that require strict risk control and fine-grained policies. Businesses that need detailed risk labels. Highly recommended for new users with high performance requirements. |

Prerequisites

Before you begin, ensure that you have:

  • An active Image Moderation 2.0 subscription (pay-as-you-go)

  • An Alibaba Cloud account or a RAM user with the AliyunYundunGreenWebFullAccess policy

  • An AccessKey pair for authentication

Get started

Step 1: Activate the service

Go to the service activation page and activate Image Moderation 2.0. After activation, billing defaults to pay-as-you-go — you are only charged for API calls that return HTTP 200.

Step 2: Grant permissions to a RAM user

Create an AccessKey pair for your Alibaba Cloud account or a RAM user. The RAM user must have the AliyunYundunGreenWebFullAccess policy to call Content Moderation APIs.

  1. Log on to the RAM console using your Alibaba Cloud account or as a RAM administrator.

  2. Create a RAM user. For details, see Create a RAM user.

  3. Attach the AliyunYundunGreenWebFullAccess policy to the RAM user. For details, see Manage permissions for RAM users.

Step 3: Install the SDK

Follow the Image Moderation SDK and integration guide to install the SDK and configure your endpoint.

The service is available in three regions:

| Region | Public endpoint | VPC endpoint | Supported service |
| --- | --- | --- | --- |
| Asia Pacific SE 1 (Singapore) | green-cip.ap-southeast-1.aliyuncs.com | green-cip-vpc.ap-southeast-1.aliyuncs.com | postImageCheckByVL_global |
| US East 1 (Virginia) | green-cip.us-east-1.aliyuncs.com | green-cip-vpc.us-east-1.aliyuncs.com | |
| Europe Central 1 (Frankfurt) | green-cip.eu-central-1.aliyuncs.com | green-cip-vpc.eu-central-1.aliyuncs.com | |

Important

In the Virginia and Frankfurt regions, large model inference runs in the Singapore region. Data and logs are stored locally in the respective regions.
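
The region table can be captured as a small lookup so that the endpoint is chosen in one place. The following is a minimal sketch, not part of the official SDK: the helper name `endpoint_for` is hypothetical, and the region IDs used as keys are read off the hostnames listed above.

```python
# Hostnames are copied from the region table; the helper name and the
# region-ID keys (taken from the hostnames) are illustrative assumptions.
ENDPOINTS = {
    "ap-southeast-1": ("green-cip.ap-southeast-1.aliyuncs.com",
                       "green-cip-vpc.ap-southeast-1.aliyuncs.com"),
    "us-east-1": ("green-cip.us-east-1.aliyuncs.com",
                  "green-cip-vpc.us-east-1.aliyuncs.com"),
    "eu-central-1": ("green-cip.eu-central-1.aliyuncs.com",
                     "green-cip-vpc.eu-central-1.aliyuncs.com"),
}


def endpoint_for(region_id: str, use_vpc: bool = False) -> str:
    """Return the public or VPC endpoint for a region listed in the table."""
    if region_id not in ENDPOINTS:
        raise ValueError(f"unsupported region: {region_id!r}")
    public, vpc = ENDPOINTS[region_id]
    return vpc if use_vpc else public
```

For example, `endpoint_for("eu-central-1", use_vpc=True)` selects the Frankfurt VPC endpoint.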

Step 4: Configure detection rules (optional)

In the Content Moderation console, adjust detection rules: enable or disable detection categories, configure a custom image library, query detection records, and review usage data. For details, see Console guide.

API reference

API overview

  • Operation: ImageModeration

  • Service code: postImageCheckByVL_global

  • QPS limit: 50 calls/second per user. Exceeding this limit throttles requests. Contact your business manager for a quota increase.

  • Billing: Charged per successful request (HTTP 200). USD 1.20 per 1,000 calls, settled daily.

  • Image Moderation for Large and Small Model Integration:

    Combines large and expert models to detect a wide range of non-compliant content in images, such as pornography, suggestive material, politically sensitive content, violence, terrorism, prohibited items, religious content, spam, and other undesirable content. (Note: All large model inference is processed in the Singapore region.) For details on the detection categories, see Rules.

  • postImageCheckByVL_global: Image Moderation for Large and Small Model Integration
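
Because requests beyond 50 per second are throttled, a caller may want to pace requests on its own side. The sketch below is one possible approach, assuming a single-threaded client; the class `QpsLimiter` is hypothetical and is not provided by the service or its SDK. The limit applies per user, so several processes sharing one account would need to split the budget between them.

```python
import time


class QpsLimiter:
    """Spaces out calls so that at most max_qps calls start per second.

    Illustrative only: not thread-safe, and it knows nothing about other
    processes that share the same account-level quota.
    """

    def __init__(self, max_qps: int = 50, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's start time.
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

Call `limiter.wait()` immediately before each ImageModeration request. The injectable `clock` and `sleep` arguments exist only to make the class easy to test.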

Debug the API

Before you integrate, test the API with Alibaba Cloud OpenAPI. The tool generates sample code and SDK dependency information.

Important

API calls made through the online debugger count toward your billed usage.

Image requirements

| Constraint | Limit |
| --- | --- |
| Supported formats | PNG, JPG, JPEG, BMP, WEBP, TIFF, SVG, HEIC (longest edge < 8,192 px), GIF (first frame), ICO (last image) |
| Max file size | 20 MB |
| Max dimensions | 16,384 px (height or width); 250 million total pixels |
| Optimal resolution | At least 200 x 200 px (lower resolutions reduce accuracy) |
| Download timeout | 3 seconds |
| URL restrictions | Publicly accessible; max 2,048 characters; no Chinese characters; one URL per request |
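
Some of these limits can be checked locally before a request is sent, which avoids spending a call on an image that would be rejected. The sketch below is a rough pre-flight check under stated assumptions: the function names are hypothetical, the ASCII test is stricter than the documented "no Chinese characters" rule, the http(s) scheme check is an added assumption, and an edge of exactly 16,384 px is treated as allowed.

```python
from typing import List
from urllib.parse import urlparse

MAX_URL_CHARS = 2048
MAX_EDGE_PX = 16384
MAX_TOTAL_PIXELS = 250_000_000
MIN_RECOMMENDED_EDGE_PX = 200


def check_image_url(url: str) -> List[str]:
    """Return a list of problems found in the URL (empty if none)."""
    problems = []
    if len(url) > MAX_URL_CHARS:
        problems.append("URL is longer than 2,048 characters")
    if not url.isascii():
        # Assumption: rejecting every non-ASCII character is a safe superset
        # of the documented "no Chinese characters" restriction.
        problems.append("URL contains non-ASCII characters")
    if urlparse(url).scheme not in ("http", "https"):
        problems.append("URL is not http(s)")
    return problems


def check_dimensions(width: int, height: int) -> List[str]:
    """Return a list of problems or warnings for the given pixel size."""
    problems = []
    if max(width, height) > MAX_EDGE_PX:
        problems.append("an edge exceeds 16,384 px")
    if width * height > MAX_TOTAL_PIXELS:
        problems.append("image exceeds 250 million total pixels")
    if min(width, height) < MIN_RECOMMENDED_EDGE_PX:
        problems.append("warning: below 200 x 200 px, accuracy may drop")
    return problems
```

Public accessibility and the 3-second download timeout cannot be verified this way; they depend on how the service reaches the image host.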

Submit an image

Image Moderation 2.0 supports three ways to submit an image. Choose one per request:

| Method | Required parameters | Notes |
| --- | --- | --- |
| URL | imageUrl | URL must be publicly accessible |
| OSS authorization | ossBucketName, ossObjectName, ossRegionId | Grant AliyunCIPScanOSSRole on the Cloud Resource Access Authorization page |
| Local upload | Upload via SDK | File is deleted 30 minutes after upload; does not consume OSS storage. See the Image Moderation SDK guide for code examples. |
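
The "choose one per request" rule for the URL and OSS methods can be enforced when the parameters are assembled. The following sketch uses a hypothetical helper, `service_parameters`, that returns the fields for exactly one method and rejects mixed or incomplete input. Local upload is left out because it goes through the SDK.

```python
from typing import Dict, Optional


def service_parameters(image_url: Optional[str] = None,
                       oss_bucket: Optional[str] = None,
                       oss_object: Optional[str] = None,
                       oss_region: Optional[str] = None) -> Dict[str, str]:
    """Build the submission fields for either the URL or the OSS method."""
    oss_fields = (oss_bucket, oss_object, oss_region)
    if image_url and any(oss_fields):
        raise ValueError("choose one submission method per request")
    if image_url:
        return {"imageUrl": image_url}
    if all(oss_fields):
        return {
            "ossBucketName": oss_bucket,
            "ossObjectName": oss_object,
            "ossRegionId": oss_region,
        }
    raise ValueError("provide imageUrl, or all three OSS parameters")
```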

Request parameters

The request body is a JSON object. For required common request parameters, see the Integration guide.

Top-level parameters

| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| Service | String | Yes | postImageCheckByVL_global | The detection service. Valid value: postImageCheckByVL_global (Image Moderation for Large and Small Model Integration). |
| ServiceParameters | JSONString | Yes | | A JSON string containing the content detection parameters. |

ServiceParameters fields

| Parameter | Type | Required | Example | Description |
| --- | --- | --- | --- | --- |
| imageUrl | String | Conditional | https://img.alicdn.com/tfs/TB1U4r9AeH2gK0jSZJnXXaT1FXa-2880-480.png | The URL of the image to moderate. Required when submitting by URL. |
| ossBucketName | String | Conditional | bucket_01 | The name of the authorized Object Storage Service (OSS) bucket. Required when submitting by OSS. |
| ossObjectName | String | Conditional | 2022023/04/24/test.jpg | The object key of the image in the OSS bucket. Required when submitting by OSS. |
| ossRegionId | String | Conditional | cn-beijing | The region where the OSS bucket is located. Required when submitting by OSS. |
| dataId | String | No | img123**** | A unique identifier to associate the result with your business data. Accepts letters, digits, underscores (_), hyphens (-), and periods (.). Max 64 characters. |
| infoType | String | No | customImage | Additional information to return. Valid value: customImage (returns custom image library match details). |
| referer | String | No | www.aliyun.com | The Referer request header, used for hotlink protection. Max 256 characters. |

Request example

{
    "Service": "postImageCheckByVL_global",
    "ServiceParameters": {
        "imageUrl": "https://img.alicdn.com/tfs/TB1U4r9AeH2gK0jSZJnXXaT1FXa-2880-480.png",
        "dataId": "img0307****"
    }
}
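
The parameter table declares ServiceParameters as a JSONString, while the example above shows it expanded for readability. The sketch below assumes the caller serializes that field itself and validates dataId against the documented character set and length. `build_request` is a hypothetical helper, not an SDK function, and the masked `****` in the sample dataId values would not pass this check.

```python
import json
import re
from typing import Optional

SERVICE_CODE = "postImageCheckByVL_global"
# Letters, digits, underscores, hyphens, periods; at most 64 characters.
DATA_ID_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def build_request(image_url: str, data_id: Optional[str] = None) -> dict:
    """Return a request body with ServiceParameters serialized to a string."""
    params = {"imageUrl": image_url}
    if data_id is not None:
        if not DATA_ID_RE.fullmatch(data_id):
            raise ValueError("dataId has invalid characters or is too long")
        params["dataId"] = data_id
    return {
        "Service": SERVICE_CODE,
        "ServiceParameters": json.dumps(params),
    }
```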

Response parameters

Top-level response fields

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| RequestId | String | 70ED13B0-BC22-576D-9CCF-1CC12FEAC477 | The unique request ID generated by Alibaba Cloud for troubleshooting. |
| Code | Integer | 200 | The status code. |
| Msg | String | OK | The response message. |
| Data | Object | | The detection results. |

Data fields

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| RiskLevel | String | high | The overall risk level of the image, based on the highest-risk label. Valid values: high, medium, low, none. |
| DataId | String | img123****** | The data ID of the moderated image. Returned only if dataId was specified in the request. |
| Result | Array | | An array of detected risk labels. Each entry contains Label, Confidence, Description, and RiskLevel. |
| Ext | Object | | Supplementary information, including custom image library matches. |

Result fields

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| Label | String | violent_explosion | The risk label. A single image can match multiple labels. |
| Confidence | Float | 81.22 | The confidence score (0–100, two decimal places). A higher score indicates higher confidence. |
| Description | String | Fireworks content | A human-readable description of the label. Use Label, not Description, to determine what action to take, as this field may change. |
| RiskLevel | String | high | The risk level for this label, based on configured score thresholds. Valid values: high, medium, low, none. |

Ext fields

| Parameter | Type | Description |
| --- | --- | --- |
| CustomImage | JSONArray | Details about custom image library matches. Returned when a submitted image matches an entry in your custom image library. |

CustomImage fields

| Parameter | Type | Example | Description |
| --- | --- | --- | --- |
| LibId | String | lib0001 | The ID of the matched custom image library. |
| LibName | String | Custom Image Library A | The name of the matched custom image library. |
| ImageId | String | 20240307 | The ID of the matched image in the library. |

Response example

{
    "RequestId": "70ED13B0-BC22-576D-9CCF-1CC12FEAC477",
    "Code": 200,
    "Msg": "OK",
    "Data": {
        "RiskLevel": "high",
        "DataId": "img0307****",
        "Result": [
            {
                "Label": "violent_explosion",
                "Confidence": 92.40,
                "Description": "Fireworks content",
                "RiskLevel": "high"
            },
            {
                "Label": "violent_burning",
                "Confidence": 67.15,
                "Description": "Burning scenes",
                "RiskLevel": "medium"
            }
        ],
        "Ext": {}
    }
}
Request and response examples are formatted for readability. Actual API responses do not include line breaks or indentation.
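
A response shaped like the example above can be reduced to the overall risk level plus a ranked list of labels. The sketch below is illustrative: `summarize` is a hypothetical helper that works on the raw JSON text, and it tolerates a missing Confidence value because not every label carries a score.

```python
import json
from typing import List, Optional, Tuple


def summarize(response_text: str) -> Tuple[Optional[str], List[Tuple[str, Optional[float]]]]:
    """Return (overall RiskLevel, [(Label, Confidence), ...]) ranked by score."""
    resp = json.loads(response_text)
    if resp.get("Code") != 200:
        raise RuntimeError(f"moderation failed: {resp.get('Code')} {resp.get('Msg')}")
    data = resp.get("Data") or {}
    results = data.get("Result") or []
    # Highest confidence first; entries without a score sort last.
    ranked = sorted(results, key=lambda r: r.get("Confidence") or 0.0, reverse=True)
    return data.get("RiskLevel"), [(r["Label"], r.get("Confidence")) for r in ranked]
```

Decisions should key on Label and RiskLevel rather than Description, since the description text may change.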

Risk labels

The service returns risk labels grouped by category. Each label has a confidence score from 0 to 100; a higher score indicates higher confidence. Enable or disable individual labels in the Content Moderation console.

Understanding risk levels and handling results

Each label has an individual RiskLevel and Confidence score. The RiskLevel in the Data object reflects the highest risk across all returned labels.

Use RiskLevel to guide your moderation workflow:

| Risk level | Recommended action |
| --- | --- |
| high | Block or remove content immediately |
| medium | Route to manual review |
| low | Process only if your use case requires high recall; otherwise treat as no risk |
| none | No risk detected |
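
The table above translates naturally into a small dispatch function. The sketch below is one possible reading, not a prescribed policy: the action names are hypothetical, "process" for low-risk content is interpreted as sending it to manual review, and unknown risk levels fall back to manual review as a conservative assumption.

```python
# Action names are illustrative; map them to your own workflow.
ACTIONS = {
    "high": "block",
    "medium": "manual_review",
    "low": "allow",
    "none": "allow",
}


def action_for(risk_level: str, high_recall: bool = False) -> str:
    """Pick a moderation action for the overall RiskLevel of an image."""
    if risk_level == "low" and high_recall:
        # Assumption: high-recall platforms review low-risk hits manually.
        return "manual_review"
    # Assumption: anything unexpected is reviewed by a person.
    return ACTIONS.get(risk_level, "manual_review")
```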

Tuning confidence thresholds: The default risk score thresholds determine when a label is assigned high, medium, or low. Lowering a threshold increases recall (fewer missed violations) but also increases false positives. Raising a threshold improves precision but may miss some violations. Adjust thresholds per label in the Content Moderation console to match your platform's tolerance for false positives versus false negatives.

Store the risk labels and confidence scores returned by the system for a defined period. Use them to prioritize content for manual review, build annotation datasets, and apply tiered governance policies based on risk category.

Label reference

Labels are grouped into the categories below. The _tii suffix indicates that the label applies to text detected within the image (text-in-image), rather than the visual content itself.

Pornographic content (pornographic_*)

| Label | Description |
| --- | --- |
| pornographic_adultContent | Adult pornographic content |
| pornographic_cartoon | Pornographic cartoon content |
| pornographic_adultToys | Adult toys |
| pornographic_artwork | Pornographic artwork |
| pornographic_underage | Child pornography |
| pornographic_adultContent_tii | Pornographic text in the image |
| pornographic_suggestive_tii | Vulgar text in the image |
| pornographic_o_tii | LGBT-related text in the image |
| pornographic_organs_tii | Text describing sexual organs in the image |
| pornographic_adultToys_tii | Text about adult toys in the image |

Sexually suggestive content (sexual_*)

| Label | Description |
| --- | --- |
| sexual_suggestiveContent | Vulgar or sexually suggestive content |
| sexual_femaleUnderwear | Underwear or swimwear |
| sexual_cleavage | Female cleavage |
| sexual_maleTopless | Topless males |
| sexual_cartoon | Sexually suggestive cartoon content |
| sexual_femaleShoulder | Suggestive content featuring female shoulders |
| sexual_femaleLeg | Suggestive content featuring female legs |
| sexual_pregnancy | Maternity photos or breastfeeding |
| sexual_feet | Suggestive content featuring feet |
| sexual_kiss | Kissing |
| sexual_intimacy | Intimate behavior |
| sexual_intimacyCartoon | Intimate acts in cartoons or anime |

Politically sensitive content (political_*)

| Label | Description |
| --- | --- |
| political_historicalNihility | Content related to historical nihilism or sensitive historical events |
| political_historicalNihility_tii | Text related to historical nihilism |
| political_politicalFigure_1 | Current or former leaders |
| political_politicalFigure_2 | Family members of leaders |
| political_politicalFigure_3 | Provincial or municipal government officials |
| political_politicalFigure_4 | Foreign leaders or their family members |
| political_politicalFigure_name_tii | Names of leaders in image text |
| political_prohibitedPerson_1 | Disgraced national-level officials |
| political_prohibitedPerson_2 | Disgraced provincial- or municipal-level officials |
| political_prohibitedPerson_tii | Names of disgraced officials in image text |
| political_taintedCelebrity | Public figures involved in scandals or major negative events |
| political_taintedCelebrity_tii | Names of scandal-involved celebrities in image text |
| political_CNFlag | The national flag of China |
| political_CNMap | A map of China |
| political_logo | Logos of banned media outlets |
| political_outfit | Military or police uniforms, or combat attire |
| political_badge | National or party emblems |
| political_racism_tii | Special expressions in image text. See the Content Moderation console for details. |

Violence and terrorism (violent_*)

| Label | Description |
| --- | --- |
| violent_explosion | Fireworks or explosions |
| violent_armedForces | Terrorist organizations |
| violent_burning | Burning scenes |
| violent_weapon | Military equipment |
| violent_crowding | Crowd gatherings |
| violent_gun | Guns |
| violent_knives | Knives |
| violent_horrific | Horrific content |
| violent_nazi | Nazi-related content |
| violent_bloody | Bloody content |
| violent_extremistGroups_tii | Text related to terrorist organizations |
| violent_extremistIncident_tii | Text related to terrorist incidents |
| violence_weapons_tii | Text describing firearms, ammunition, or weapons |
| violent_ACU | Combat uniforms |

Contraband (contraband_*)

| Label | Description |
| --- | --- |
| contraband_drug | Illegal drugs or medication |
| contraband_drug_tii | Text describing illegal drugs |
| contraband_gamble | Gambling-related items |
| contraband_gamble_tii | Text describing gambling activities |
| contraband_certificate_tii | Spam or ads for fake certificates or cash-out services in image text |

Religious content (religion_*)

| Label | Description |
| --- | --- |
| religion_flag | Religious flags or symbols |
| religion_clothing | Specific attire or symbols. See the Content Moderation console for details. |
| religion_logo | |
| religion_taboo1_tii | |
| religion_taboo2_tii | |

Flags

| Label | Description |
| --- | --- |
| flag_country | Flag-related content |

Spam and promotional content (pt_*)

| Label | Description |
| --- | --- |
| pt_logotoSocialNetwork | Watermarks from social media platforms |
| QR code | QR codes |
| pt_logo | Logos |
| pt_toDirectContact_tii | Contact information used for spam in image text |
| pt_custom_01 | Custom label 01 |
| pt_custom_02 | Custom label 02 |

Inappropriate behavior (inappropriate_*)

| Label | Description |
| --- | --- |
| inappropriate_smoking | Smoking-related content |
| inappropriate_drinking | Drinking-related content |
| inappropriate_tattoo | Tattoos |
| inappropriate_middleFinger | Middle finger gesture |
| inappropriate_foodWasting | Food waste |

Profanity (profanity_*)

| Label | Description |
| --- | --- |
| profanity_oral_tii | Profanity or vulgar slang in image text |
| profanity_offensive_tii | Severely abusive language in image text |

Custom image library labels

Configure a custom image library for any risk category in the console. When a submitted image is highly similar to an image in your library, the system returns the corresponding label with a _lib suffix (for example, violent_explosion_lib). The Confidence score reflects the degree of similarity.
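
Since the _tii and _lib suffixes carry meaning, it can help to split a label into its parts before routing it. The sketch below assumes the naming pattern described in this section (category prefix before the first underscore, optional _tii, optional trailing _lib); `parse_label` is a hypothetical helper, and labels that do not follow the pattern may yield a category that is not one of the documented groups.

```python
def parse_label(label: str) -> dict:
    """Split a risk label into base label, category, and suffix flags."""
    is_lib = label.endswith("_lib")
    base = label[:-len("_lib")] if is_lib else label
    return {
        "base": base,                           # label without the _lib suffix
        "category": base.split("_", 1)[0],      # text before the first underscore
        "text_in_image": base.endswith("_tii"),  # matched text inside the image
        "custom_library": is_lib,               # matched a custom image library
    }
```

For instance, `parse_label("violent_explosion_lib")` reports the base label `violent_explosion`, the category `violent`, and a custom library match.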

No-risk labels

| Label | Confidence score | Description |
| --- | --- | --- |
| nonLabel | Not returned | No threats detected, or all detection categories are disabled. |
| nonLabel_lib | 0–100 | The image is highly similar to an exempted image in your custom library. |

Status codes

Requests are billed only for status code 200.

| Code | Description |
| --- | --- |
| 200 | The request succeeded. |
| 400 | A required request parameter is empty. |
| 401 | A request parameter is invalid. |
| 402 | A request parameter exceeds the allowed length. Correct the length and retry. |
| 403 | The request exceeds the QPS limit. Reduce request concurrency and retry. |
| 404 | Failed to download the image. Check the image URL or retry. |
| 405 | Image download timed out. Verify the image is accessible and retry. |
| 406 | The image file is too large. Resize the image and retry. |
| 407 | The image format is not supported. Use a supported format and retry. |
| 408 | The account lacks permission. Verify that the service is activated, the account has no overdue payments, and the RAM user has the required policy. |
| 500 | A system error occurred. |
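
The status codes fall into two groups: those worth retrying as-is after a pause, and those that need a change to the request or the account first. The sketch below shows one way to encode that split. The grouping is an interpretation of the descriptions above, retrying on 500 is an assumption, and the function names and backoff values are hypothetical.

```python
from typing import List

# Codes whose descriptions suggest a plain retry may succeed.
# Including 500 is an assumption; the table does not say to retry it.
RETRYABLE_CODES = {403, 404, 405, 500}


def classify(code: int) -> str:
    """Return 'ok', 'retry', or 'do_not_retry' for a response status code."""
    if code == 200:
        return "ok"
    if code in RETRYABLE_CODES:
        return "retry"
    # 400-402, 406-408: fix the parameters, image, or account first.
    return "do_not_retry"


def backoff_delays(attempts: int, base: float = 0.5) -> List[float]:
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(attempts)]
```

With the defaults, `backoff_delays(3)` yields waits of 0.5, 1.0, and 2.0 seconds between attempts.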

Billing

Image Moderation 2.0 uses pay-as-you-go billing. Fees are settled daily based on actual usage. Requests that return non-200 status codes are not charged.

| Billing category | Included service | Unit price |
| --- | --- | --- |
| Image Moderation advanced (image_advanced) | postImageCheckByVL_global | USD 1.20 per 1,000 calls |

Example: 100 calls to the postImageCheckByVL_global service cost USD 0.12.
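
The arithmetic behind this example can be reproduced for any call volume. The sketch below is a rough estimate only, using the listed unit price and counting successful (HTTP 200) calls. `estimated_cost` is a hypothetical helper, and actual bills may differ because of rounding or pricing changes.

```python
from decimal import Decimal

PRICE_PER_1000_CALLS = Decimal("1.20")  # USD, from the billing table


def estimated_cost(successful_calls: int) -> Decimal:
    """Estimate the charge in USD for a number of successful calls."""
    if successful_calls < 0:
        raise ValueError("successful_calls must not be negative")
    return PRICE_PER_1000_CALLS * successful_calls / 1000
```

`estimated_cost(100)` gives 0.12, which matches the example above.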

In bill details, the moderationType field identifies the moderation type. View your bill details.

What's next