Alibaba Cloud Content Moderation provides an operation for synchronous audio moderation that you call over HTTP or HTTPS. The operation converts short speech clips into text entries in real time and returns the moderation results together with their categories, which helps you review content more efficiently. This topic describes the /green/voice/syncscan operation, which you can call to moderate audio and obtain moderation results in real time.

Operation description

Operation: /green/voice/syncscan

You can call this operation to submit audio moderation tasks and obtain moderation results in real time. For more information about how to construct an HTTP request, see Request structure. Instead of constructing the request yourself, you can use an HTTP request that is already built into an SDK. For more information, see SDK overview.

Note By default, this operation is used to moderate Chinese audio. The audio can contain only a few English words. If you need to moderate English audio, submit a ticket.
  • Billing method:

    You are charged for calling this operation. For more information about the billing method, see Content Moderation Pricing.

  • Limits on audio files:
    • The size of an audio file cannot exceed 20 MB.
    • The duration of an audio file cannot exceed 1 minute.
    • Audio files must be in the MP3, WAV, AAC, WMA, OGG, M4A, or M3U8 format.
    • Audio in video files must be in the AVI, FLV, MP4, MPG, ASF, WMV, MOV, RMVB, or RM format.
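
The limits above can be checked on the client before a task is submitted. The following Python sketch is illustrative only: the function name check_audio_limits is hypothetical, the caller is assumed to already know the file size and duration, and 20 MB is assumed to mean 20 × 1024 × 1024 bytes, which this topic does not state.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Limits taken from the list above. Reading "MB" as 1024 * 1024 bytes is an assumption.
MAX_SIZE_BYTES = 20 * 1024 * 1024
MAX_DURATION_SECONDS = 60
AUDIO_FORMATS = {"mp3", "wav", "aac", "wma", "ogg", "m4a", "m3u8"}
VIDEO_FORMATS = {"avi", "flv", "mp4", "mpg", "asf", "wmv", "mov", "rmvb", "rm"}


def check_audio_limits(url, size_bytes, duration_seconds):
    """Return a list of limit violations. An empty list means no violation was found."""
    problems = []
    # Derive the format from the file extension in the URL path.
    ext = PurePosixPath(urlparse(url).path).suffix.lstrip(".").lower()
    if ext not in AUDIO_FORMATS | VIDEO_FORMATS:
        problems.append(f"unsupported format: {ext or 'unknown'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 20 MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 1 minute")
    return problems
```

The check relies on the file extension only, so it cannot detect a file whose extension does not match its actual encoding.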

Request parameters

Parameter Type Required Example Description
bizType String No default The business scenario. You can create a business scenario in the Alibaba Cloud Content Moderation console. For more information, see Customize policies for machine-assisted moderation. You can also submit a ticket to ask Alibaba Cloud engineers to help you create a business scenario.
scenes StringArray Yes antispam The moderation scenario. Set the value to antispam.
tasks JSONArray Yes The list of moderation tasks. The value is a JSON array that contains 1 to 100 elements, each of which is a structure. The maximum number of elements is subject to a concurrency limit. For more information about the structure of each element, see task.
Table 1. task
Parameter Type Required Example Description
clientInfo JSONObject No {"userId":"120234234","userNick":"Mike","userType":"others"} The information about the client. For more information, see the "Common request parameters" section of the Common parameters topic.
The server determines whether to use the global clientInfo parameter or the clientInfo parameter that is described in this table.
Note The clientInfo parameter in this table takes priority over the global one.
dataId String No abc_123 The ID of the moderation object.

The ID can contain letters, digits, underscores (_), hyphens (-), and periods (.) and can be up to 128 characters in length. This ID uniquely identifies your business data.

url String Yes http://xxxxx.com/test.mp3 The download URL of the audio to be moderated. Set this parameter to an HTTP or HTTPS URL that is accessible from the Internet.
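
As a minimal sketch of how the request parameters and the task structure fit together, the following Python code builds a request body and validates the documented constraints. The helper names build_task and build_request_body are hypothetical, and the code does not sign or send the request. For those steps, see Request structure.

```python
import json
import re

# dataId: letters, digits, underscores, hyphens, and periods, up to 128 characters.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,128}")
MAX_TASKS = 100


def build_task(url, data_id=None):
    """Build one element of the tasks array."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must be an HTTP or HTTPS URL")
    task = {"url": url}
    if data_id is not None:
        if not DATA_ID_PATTERN.fullmatch(data_id):
            raise ValueError("dataId contains invalid characters or is too long")
        task["dataId"] = data_id
    return task


def build_request_body(tasks, biz_type=None):
    """Build the request body. scenes is always set to antispam."""
    if not 1 <= len(tasks) <= MAX_TASKS:
        raise ValueError("tasks must contain 1 to 100 elements")
    body = {"scenes": ["antispam"], "tasks": tasks}
    if biz_type is not None:
        body["bizType"] = biz_type
    return body


if __name__ == "__main__":
    body = build_request_body([build_task("http://xxxxx.com/test.mp3", "abcd-123")])
    print(json.dumps(body, indent=4))
```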

Response parameters

Parameter Type Example Description
code Integer 200 The returned HTTP status code.

For more information, see Common response parameters.

msg String OK The message that is returned for the request.
dataId String abc_123 The ID of the moderation object.
Note If you set the dataId parameter in the moderation request, the dataId parameter is returned in the response.
taskId String vc_f_1OsjIYTukH@4@AXkIQ9xxx-1ov52Y The ID of the moderation task.
url String http://xxxxx.com/test.mp3 The URL of the moderation object.
results JSONArray The moderation results. If the call is successful and HTTP status code 200 is returned, the array contains one or more elements, each of which is a structure. For more information about the structure, see result.
Table 2. result
Parameter Type Example Description
scene String antispam The moderation scenario, which you specify in the moderation request. The value is fixed to antispam.
label String customized The category of the moderation result. Valid values:
  • normal: normal
  • spam: junk content
  • ad: ad
  • politics: political content
  • terrorism: terrorist content
  • abuse: abuse
  • porn: pornographic content
  • flood: excessive junk content
  • contraband: prohibited content
  • meaningless: meaningless content
  • customized: custom content, such as a custom term
suggestion String block The recommended follow-up action. Valid values:
  • pass: The moderation object does not require further actions.
  • review: The moderation object contains suspected violations and requires human review.
  • block: The moderation object contains violations. We recommend that you delete or block the object.
rate Float 99.91 The score of the confidence level. Valid values: 0 to 100. A greater value indicates a higher confidence level.
If a value of pass is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content is normal. If a value of review or block is returned for the suggestion parameter, a higher confidence level indicates a higher probability that the content contains violations.
Notice This score is for reference only. We strongly recommend that you do not use this score in your business. We recommend that you use the values that are returned for the suggestion, label, and sublabel parameters to determine whether the content contains violations. The sublabel parameter is returned by specific operations.
details JSONArray The details about the text in the moderated audio. The value is a JSON array that contains one or more elements. Each element corresponds to a text entry. For more information about the structure of each element, see detail.
Table 3. detail
Parameter Type Example Description
startTime Integer 0 The start time of the text entry. Unit: seconds.
endTime Integer 4065 The end time of the text entry. Unit: seconds.
text String Disgusting The content of the text entry that is converted from the audio.
label String politics The category of the moderation result. Valid values:
  • normal: normal
  • spam: junk content
  • ad: ad
  • politics: political content
  • terrorism: terrorist content
  • abuse: abuse
  • porn: pornographic content
  • flood: excessive junk content
  • contraband: prohibited content
  • meaningless: meaningless content
  • customized: custom content, such as a custom term
persons JSONArray [{"name":"Celebrity A"}] The result of speaker recognition. If the voiceprint of a celebrity is detected, this parameter is returned.
The array contains the following parameter:
  • name: the name of the detected celebrity. The value is a string.
Note By default, this parameter is not returned. If you want this parameter to be returned, submit a ticket.
keyword String Disgusting The custom term that the text entry matches.
libName String test The name of the custom text library that contains the custom term matched by the text entry.
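
The following Python sketch shows one possible way to act on an element of the results array. It follows the recommendation above to rely on the suggestion and label parameters instead of the rate score. The function name decide_action and the action names in ACTIONS are hypothetical and should be replaced with whatever your business logic requires.

```python
# Hypothetical mapping from the suggestion value to an application-side action.
ACTIONS = {
    "pass": "publish",
    "review": "queue_for_human_review",
    "block": "reject",
}


def decide_action(result):
    """Return (action, flagged_details) for one element of the results array."""
    # Send unknown or missing suggestion values to human review instead of
    # silently publishing the content.
    action = ACTIONS.get(result.get("suggestion"), "queue_for_human_review")
    # Keep only the text entries whose label is something other than normal.
    flagged = [
        detail
        for detail in result.get("details", [])
        if detail.get("label") not in (None, "normal")
    ]
    return action, flagged
```

The rate value is deliberately ignored here. The flagged text entries can be shown to a human reviewer when the action is queue_for_human_review.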

Examples

Sample requests
{
    "scenes":[
        "antispam"
    ],
    "tasks":[
        {
            "dataId":"abcd-123",
            "url":"http://xxxxx.com/test.mp3"
        }
    ]
}
Sample responses
{
    "msg":"OK",
    "code":200,
    "data":[
        {
            "code":200,
            "dataId":"abcd-123",
            "results":[
                {
                    "rate":99.91,
                    "suggestion":"block",
                    "details":[
                        {
                            "libName":"test",
                            "startTime":0,
                            "endTime":4065,
                            "label":"customized",
                            "text":"Disgusting",
                            "keyword":"Disgusting"
                        },
                        {
                            "startTime":4430,
                            "endTime":10065,
                            "label":"normal",
                            "persons": [
                                {
                                    "name":"Celebrity A"
                                }
                            ],
                            "text":"Test content"
                        },
                        {
                            "libName":"Audio test",
                            "startTime":11670,
                            "endTime":14685,
                            "label":"customized",
                            "text":"Ultra-low discount, big sale",
                            "keyword":"Sale"
                        },
                        {
                            "startTime":14685,
                            "endTime":16065,
                            "label":"ad",
                            "text":"WeChat 12345"
                        }
                    ],
                    "label":"customized"
                }
            ],
            "taskId":"vc_f_1OsjIYTukH@4@AXkIQ9xxx-1ov52Y"
        }
    ],
    "requestId":"5A7A6198-6960-4DDC-B67E-58A111A4B20F"
}
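
A response shaped like the sample above can be traversed as follows. This Python sketch assumes the envelope shown in the sample, that is, a top-level code, a data array with one entry per task, and a per-task code. The function name custom_term_hits is hypothetical. The function collects every custom term match together with the library that contains the term.

```python
def custom_term_hits(response):
    """Return (dataId, keyword, libName) tuples for all custom term matches."""
    if response.get("code") != 200:
        raise RuntimeError(f"request failed: {response.get('code')} {response.get('msg')}")
    hits = []
    for task in response.get("data", []):
        # Skip tasks that failed individually. Their code is not 200.
        if task.get("code") != 200:
            continue
        for result in task.get("results", []):
            for detail in result.get("details", []):
                # keyword is present only when the text entry matches a custom term.
                if "keyword" in detail:
                    hits.append((task.get("dataId"), detail["keyword"], detail.get("libName")))
    return hits
```

For the sample response, this function would return the matches for the terms Disgusting and Sale, each paired with the dataId abcd-123 and its library name.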