Content Moderation: LLM-based text moderation service

Last Updated: Dec 23, 2025

The large language model (LLM)-based text moderation service efficiently and accurately identifies non-compliant content. Compared with traditional text moderation solutions, it provides more powerful language understanding and analysis capabilities, and can accurately identify the complex and subtle non-compliant content that traditional models miss.

Important

This solution is rapidly evolving. If you have any feedback or suggestions, contact your business manager.

1. Service description

The following describes the service that Content Moderation Enhanced Edition provides for LLM-based text moderation.

  • Service name: UGC Text Moderation (LLM)

  • Service: ugc_moderation_byllm_global

  • Content detection: This LLM-based text moderation service is designed for UGC scenarios. It can efficiently and accurately detect various types of non-compliant text content. For a detailed list of check items, see the Content Moderation console.

  • Use cases: This service is recommended for all types of text moderation in UGC scenarios.

2. Billing

The LLM-based text moderation service supports the pay-as-you-go billing method.

Pay-as-you-go

After you activate the Text Moderation service, the default billing method is pay-as-you-go. Fees are settled daily based on your actual usage, and you are not charged if you do not call the service.

  • Moderation type: LLM-based text moderation (text_advanced)

  • Supported business scenario (service): UGC Text Moderation (LLM), ugc_moderation_byllm_global

  • Unit price: USD 0.6 per 1,000 calls

Note

Each call to one of the preceding services is counted as one billable item, and you are charged based on the actual number of calls. For example, if you call the UGC Text Moderation (LLM) service 100 times, you are charged USD 0.06.

Note

For pay-as-you-go Content Moderation 2.0, bills are generated once every 24 hours. In the billing details, the moderationType field corresponds to the Review Type field. You can view the billing details in the console.

3. Risk labels

Label meanings

The LLM-based text moderation service supports over 30 sub-labels across 6 categories, each with a confidence score. If content poses multiple types of risks, the service can return multiple sub-labels. For every label, the confidence score (confidence) ranges from 0 to 100, and a higher score indicates a higher confidence level. The following list describes the label values (label) and their meanings.

  • pornographic_adult: Suspected pornographic content

  • sexual_terms: Suspected sexual health content

  • sexual_suggestive: Suspected vulgar content

  • sexual_orientation: Suspected content related to sexual orientation

  • regional_cn: Suspected politically sensitive content related to mainland China

  • regional_illegal: Suspected illegal political content

  • regional_controversial: Suspected political controversy

  • regional_racism: Suspected racism

  • violent_extremist: Suspected extremist organization

  • violent_incidents: Suspected extremist content

  • violent_weapons: Suspected weapons and ammunition

  • violence_unscList: United Nations sanctions list

  • contraband_drug: Suspected drug-related content

  • contraband_gambling: Suspected gambling-related content

  • inappropriate_ethics: Suspected unethical content

  • inappropriate_profanity: Suspected offensive or abusive content

  • inappropriate_oral: Suspected vulgar language

  • inappropriate_religion: Suspected religious blasphemy

  • pt_to_contact: Suspected contact information for advertising

  • pt_to_sites: Suspected redirection to external sites

  • customized: Hit a custom keyword list

Manage labels

You can enable or disable each risk label in the console. For some risk labels, you can configure more granular detection scopes. For more information, see the Content Moderation console.

  1. In the navigation pane on the left, choose Automated Moderation V2.0 > Text Moderation > Rule Configuration.

  2. On the Rule Management tab, find the LLM moderation service (ugc_moderation_byllm_global) and click Manage Detection Rules in the Actions column.

    1. Select the detection type to adjust, such as inappropriate content detection.

    2. Click Edit to enter edit mode and modify the detection status.

    3. Click Save. The new detection scope takes about 2 to 5 minutes to take effect in the production environment.

4. Integration guide

Step 1: Activate the service

Visit Activate Service to activate the Text Moderation V2.0 service.

Step 2: Grant permissions to a RAM user

Before you integrate an SDK or call an API, you must grant permissions to a RAM user. When you call an Alibaba Cloud API, you must use an AccessKey pair to complete identity verification. You can create an AccessKey pair for your Alibaba Cloud account or a RAM user. For more information, see Obtain an AccessKey pair.

Procedure

  1. Log on to the RAM console as a RAM administrator.

  2. Create a RAM user.

    For more information, see Create a RAM user.

  3. Grant the AliyunYundunGreenWebFullAccess system policy to the RAM user.

    For more information, see Grant permissions to a RAM user.

    After completing the preceding operations, you can call the Content Moderation API as the RAM user.

Step 3: Install and integrate an SDK

For the SDKs for the Text Moderation Enhanced Edition V2.0 PLUS service and the integration guide, see SDKs and integration guide for Text Moderation Enhanced Edition V2.0 PLUS.
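If you work in Python, the call flow typically looks like the following minimal sketch. It assumes the alibabacloud_green20220302 package (the SDK line that includes the TextModerationPlus operation) and an AccessKey pair stored in the standard ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET environment variables; treat the package and method names as assumptions and follow the linked integration guide for the authoritative setup.

# Minimal sketch: call TextModerationPlus through the Python SDK.
# Assumption: the SDK is installed with `pip install alibabacloud_green20220302`.
import json
import os

from alibabacloud_green20220302.client import Client
from alibabacloud_green20220302 import models
from alibabacloud_tea_openapi import models as open_api_models

# Read the RAM user's AccessKey pair from environment variables
# instead of hard-coding it in source code.
config = open_api_models.Config(
    access_key_id=os.environ["ALIBABA_CLOUD_ACCESS_KEY_ID"],
    access_key_secret=os.environ["ALIBABA_CLOUD_ACCESS_KEY_SECRET"],
    # Singapore public endpoint from the table in the API description below.
    endpoint="green-cip.ap-southeast-1.aliyuncs.com",
)
client = Client(config)

request = models.TextModerationPlusRequest(
    service="ugc_moderation_byllm_global",
    # ServiceParameters must be passed as a JSON string.
    service_parameters=json.dumps({
        "content": "text to moderate",
        "dataId": "text0424****",
    }),
)
response = client.text_moderation_plus(request)
print(response.body)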

5. API description

Usage notes

You can call this operation to create a text content detection task. For more information about how to construct an HTTP request, see Request structure. You can also directly use a pre-constructed HTTP request. For more information, see Integration guide.

You can run this operation in OpenAPI Explorer without calculating signatures. After a successful call, OpenAPI Explorer automatically generates sample SDK code.

  • API operation: TextModerationPlus

  • Supported regions and endpoints:

    Singapore: public endpoint https://green-cip.ap-southeast-1.aliyuncs.com, VPC endpoint https://green-cip-vpc.ap-southeast-1.aliyuncs.com

  • Billing information: This is a paid operation. You are charged only for requests that return an HTTP status code of 200; requests that return other codes are free of charge. For more information about billing methods, see Billing.

QPS limits

The queries per second (QPS) limit for a single user is 20. If your call rate exceeds this limit, throttling is triggered, which may affect your business. If you require a higher QPS limit, contact your business manager.
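To stay under this limit, you can pace requests on the client side. The following is a minimal, generic Python sketch (our own helper, not part of any SDK) of a thread-safe limiter that spaces out calls so that at most 20 are issued per second:

import threading
import time

class SimpleRateLimiter:
    """Block callers so that at most max_qps acquisitions happen per second."""

    def __init__(self, max_qps):
        self.min_interval = 1.0 / max_qps
        self.lock = threading.Lock()
        self.next_slot = 0.0  # monotonic time of the next free call slot

    def acquire(self):
        with self.lock:
            now = time.monotonic()
            wait = self.next_slot - now
            self.next_slot = max(now, self.next_slot) + self.min_interval
        if wait > 0:
            time.sleep(wait)

limiter = SimpleRateLimiter(max_qps=20)
# Call limiter.acquire() immediately before each TextModerationPlus request.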

Request parameters

  • Service (String, required). Example: ugc_moderation_byllm_global. The moderation service to call. Valid value: ugc_moderation_byllm_global (UGC Text Moderation (LLM)).

  • ServiceParameters (JSONString, required). The parameter set required by the moderation service. The value is a JSON string. For a description of each parameter, see Table 1. ServiceParameters.

Table 1. ServiceParameters

  • content (String, required). Example: Moderation content. The text content to be moderated. The content can be up to 2,000 characters in length.

  • dataId (String, optional). Example: text0424****. The data ID of the moderation object. The ID can be up to 64 characters in length and can contain uppercase and lowercase letters, digits, underscores (_), hyphens (-), and periods (.). You can use it to uniquely identify your business data.
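Because content is capped at 2,000 characters and dataId at 64 characters from a restricted character set, it can help to validate inputs before serializing them. The following Python sketch is our own illustration; the helper name and error handling are not part of the API:

import json
import re

# dataId: up to 64 characters; letters, digits, underscores, hyphens, periods.
DATA_ID_PATTERN = re.compile(r"^[A-Za-z0-9_.\-]{1,64}$")

def build_service_parameters(content, data_id=None):
    """Return the ServiceParameters JSON string, enforcing the documented limits."""
    if not content or len(content) > 2000:
        raise ValueError("content must be 1 to 2,000 characters")
    params = {"content": content}
    if data_id is not None:
        if not DATA_ID_PATTERN.match(data_id):
            raise ValueError("dataId must be up to 64 characters from [A-Za-z0-9_.-]")
        params["dataId"] = data_id
    # ensure_ascii=False keeps non-ASCII text readable in the payload.
    return json.dumps(params, ensure_ascii=False)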

Response parameters

  • Code (Integer). Example: 200. The status code. For more information, see Code description.

  • Data (JSONObject). Example: {"Result":[...]}. The moderation result data. For more information, see Table 2. Data.

  • Message (String). Example: OK. The response message for the request.

  • RequestId (String). Example: AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****. The request ID.

Table 2. Data

  • Result (JSONArray). The results, such as risk labels and confidence scores. For more information, see Table 3. Result.

  • RiskLevel (String). Example: high. The risk level, which is returned based on the configured risk scores. Valid values:

    • high: high risk. If a custom keyword list is hit, the risk level is high by default.

    • medium: medium risk

    • low: low risk

    • none: no risk detected

    Note: We recommend that you handle high-risk content directly and perform a manual review of medium-risk content. Handle low-risk content only when high recall is required; otherwise, treat it the same as content with no detected risk (see the sketch after this table). You can configure risk scores in the Content Moderation console.

  • DataId (String). Example: text0424****. The data ID of the moderation object. If you specified the dataId parameter in the request, the same dataId is returned here.
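As a concrete illustration of the note above, the following Python sketch maps the documented RiskLevel values to actions. The action names and the treatment of low-risk content are our own choices, not part of the API:

def handle_risk_level(data, high_recall=False):
    """Map the RiskLevel value in the response Data object to an action."""
    risk_level = data.get("RiskLevel", "none")
    if risk_level == "high":
        return "block"          # handle high-risk content directly
    if risk_level == "medium":
        return "manual_review"  # route medium-risk content to human review
    if risk_level == "low" and high_recall:
        return "manual_review"  # act on low risk only when high recall is required
    return "pass"               # low risk (normal recall) and none pass through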

Table 3. Result

  • Label (String). Example: political_xxx. The label returned after text content detection. Multiple labels and scores may be returned. For more information about supported labels, see Risk labels.

  • Confidence (Float). Example: 81.22. The confidence score. Valid values: 0 to 100, accurate to two decimal places. Some labels do not have confidence scores.

  • RiskWords (String). Example: AA,BB,CC. The detected sensitive words. Multiple words are separated by commas. Some labels do not return sensitive words.

  • CustomizedHit (JSONArray). Example: [{"LibName":"...","Keywords":"..."}]. When a custom keyword list is hit, the Label is customized, and the name of the custom keyword list and the matched keywords are returned. For more information, see Table 4. CustomizedHit.

  • Description (String). Example: Suspected pornographic content. The description of the Label field.

    Important: This field explains the Label field and may be subject to change. We recommend that you handle moderation results based on the Label field instead of this field.

Table 4. CustomizedHit

  • LibName (String). Example: Custom library 1. The name of the custom keyword list.

  • Keywords (String). Example: Custom word 1,Custom word 2. The custom keywords. Multiple keywords are separated by commas.
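When the Label of a Result entry is customized, you can unpack the hit details from CustomizedHit. A minimal Python sketch, assuming only the field names documented above:

def extract_custom_hits(result_entry):
    """Return (list name, keywords) pairs from a Result entry's CustomizedHit."""
    hits = []
    for hit in result_entry.get("CustomizedHit", []):
        lib_name = hit.get("LibName", "")
        # Keywords is a comma-separated string; split it into a list.
        keywords = [w for w in hit.get("Keywords", "").split(",") if w]
        hits.append((lib_name, keywords))
    return hits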

Examples

Request example:

{
    "Service": "aigc_moderation_byllm_global",
    "ServiceParameters": {
        "content": "testing content",
        "dataId": "text0424****"
    }
}

Response example:

  • Hit a system policy:

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Label": "political_entity",
                "Description":"Suspected political entity",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            },
            {
                "Label": "political_figure",
                "Description":"Suspected political figure",
                "Confidence": 100.0,
                "RiskWords": "Word A,Word B,Word C"
            }
        ],
        "RiskLevel": "high",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}
  • Hit a custom keyword list:

{
    "Code": 200,
    "Data": {
        "Result": [
            {
                "Description": "Hit a custom keyword list",
                "CustomizedHit": [
                     {
                        "LibName": "Custom keyword list name 1",
                        "KeyWords": "Custom keyword"
                     }
                ],
                "Confidence": 100,
                "Label": "customized"
             }
        ],
        "RiskLevel": "high",
        "DataId": "text0424****"
    },
    "Message": "OK",
    "RequestId": "AAAAAA-BBBB-CCCCC-DDDD-EEEEEEEE****"
}

Code description

  • 200 (OK): The request was successful.

  • 400 (BAD_REQUEST): The request is invalid. This may be caused by incorrect request parameters. Check the request parameters carefully.

  • 408 (PERMISSION_DENY): Your account may not be authorized, may have an overdue payment, may not be activated, or may be suspended.

  • 500 (GENERAL_ERROR): An error occurred, possibly a temporary server-side error. We recommend that you retry the request. If the error persists, contact us through Online Service.

  • 581 (TIMEOUT): A timeout occurred. We recommend that you retry the request. If the error persists, contact us through Online Service.

  • 588 (EXCEED_QUOTA): The request frequency exceeds the quota.
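Because the list above recommends retrying on GENERAL_ERROR (500) and TIMEOUT (581), a simple retry wrapper with exponential backoff can be useful. This is a minimal sketch; the attempt count, delays, and the assumption that the SDK response exposes the Code field as body.code are our own:

import time

RETRYABLE_CODES = {500, 581}  # GENERAL_ERROR and TIMEOUT

def call_with_retry(call, max_attempts=3, base_delay=1.0):
    """Invoke call() and retry retryable status codes with exponential backoff."""
    response = None
    for attempt in range(max_attempts):
        response = call()
        # Assumption: the response body carries the documented Code field.
        if response.body.code not in RETRYABLE_CODES:
            return response
        if attempt < max_attempts - 1:
            time.sleep(base_delay * (2 ** attempt))
    return response

# Example usage with the SDK sketch from the integration guide:
# response = call_with_retry(lambda: client.text_moderation_plus(request))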