Intelligent Media Services: SubmitAudioProduceJob

Last Updated: Mar 30, 2026

Converts the provided text content into a high-quality audio file.

RAM authorization

The table below describes the authorization required to call this API. You can define it in a Resource Access Management (RAM) policy. The table's columns are detailed below:

  • Action: The action that you can specify in the Action element of a RAM policy statement to grant permission to perform the operation.

  • API: The API that you can call to perform the action.

  • Access level: The predefined level of access granted for each API. Valid values: create, list, get, update, and delete.

  • Resource type: The type of the resource that supports authorization to perform the action. It indicates if the action supports resource-level permission. The specified resource must be compatible with the action. Otherwise, the policy will be ineffective.

    • For APIs with resource-level permissions, required resource types are marked with an asterisk (*). Specify the corresponding Alibaba Cloud Resource Name (ARN) in the Resource element of the policy.

    • For APIs without resource-level permissions, it is shown as All Resources. Use an asterisk (*) in the Resource element of the policy.

  • Condition key: The condition keys defined by the service. The key allows for granular control, applying to either actions alone or actions associated with specific resources. In addition to service-specific condition keys, Alibaba Cloud provides a set of common condition keys applicable across all RAM-supported services.

  • Dependent action: The dependent actions required to run the action. To complete the action, the RAM user or the RAM role must have the permissions to perform all dependent actions.

| Action | Access level | Resource type | Condition key | Dependent action |
|---|---|---|---|---|
| ice:SubmitAudioProduceJob | | *All Resources (Resource element: *) | None | None |
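
As a rough illustration of how the columns above map onto a policy, the following sketch builds a custom policy document that allows this one action. It assumes the usual RAM policy layout (Version, Statement, Effect, Action, Resource). Because the API is listed with All Resources, the Resource element is an asterisk. The helper name build_policy is hypothetical and is not part of any SDK.

```python
import json


def build_policy(action: str = "ice:SubmitAudioProduceJob") -> dict:
    """Return a RAM policy document (as a dict) that allows a single action."""
    return {
        "Version": "1",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": action,
                # The API has no resource-level permissions, so use "*".
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_policy(), indent=2)
print(policy_json)
```

The printed JSON is the text you would paste into a custom policy. Attaching it to a RAM user or role is a separate step that is not shown here.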

Request parameters

| Parameter | Type | Required | Description | Example |
|---|---|---|---|---|
| EditingConfig | string | Yes | The audio editing configurations. Fields: voice (the voice type); customizedVoice (the ID of the personalized human voice); format (the format of the output file; valid values: PCM, WAV, and MP3); volume (the volume; default value: 50; valid values: 0 to 100); speech_rate (the speech rate; default value: 0; valid values: -500 to 500); pitch_rate (the pitch; default value: 0; valid values: -500 to 500). Note: If you specify both voice and customizedVoice, customizedVoice takes precedence over voice. | {"voice":"Siqi","format":"MP3","volume":50} |
| OutputConfig | string | Yes | The output audio configurations. For example, to save the output audio to http://my_bucket.oss-cn-shanghai.aliyuncs.com/target_audio.mp3, set this parameter to { "bucket": "my_bucket", "object": "target_audio" }. | { "bucket": "my_bucket", "object": "target_audio" } |
| InputConfig | string | Yes | The text content. A maximum of 2,000 characters is supported. Speech Synthesis Markup Language (SSML) is supported. | Test text |
| Title | string | No | The job title. If you do not specify this parameter, the system generates a title based on the current date. The job title can be up to 128 bytes in length. The value must be encoded in UTF-8. | Task title. If not provided, a default title is generated based on the date. The title can be up to 128 bytes in length and must be encoded in UTF-8. |
| Description | string | No | The job description. The job description can be up to 1,024 bytes in length. The value must be encoded in UTF-8. | Task description. The description can be up to 1,024 bytes in length and must be encoded in UTF-8. |
| UserData | string | No | The user-defined data in the JSON format, which can be up to 512 bytes in length. You can specify a custom callback URL. For more information, see Configure a callback upon editing completion. | {"NotifyAddress":"http://xx.xx.xxx"} or {"NotifyAddress":"https://xx.xx.xxx"} or {"NotifyAddress":"ice-callback-demo"} |
| Overwrite | boolean | No | Specifies whether to overwrite the existing Object Storage Service (OSS) object. | true |
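
Because EditingConfig, OutputConfig, and UserData are typed as string but carry JSON, it can help to assemble them from plain dictionaries and check the documented limits before sending anything. The sketch below does only that. The function names build_request and check_bytes are hypothetical, the 2,000-character limit is assumed to be counted with Python's len(), and how the resulting dictionary is sent (SDK client or signed HTTP request) is deliberately left out.

```python
import json

VALID_FORMATS = {"PCM", "WAV", "MP3"}


def check_bytes(name, value, limit):
    """Raise ValueError if value exceeds limit bytes when UTF-8 encoded."""
    if len(value.encode("utf-8")) > limit:
        raise ValueError(f"{name} must be at most {limit} bytes in UTF-8")


def build_request(text, bucket, obj, voice="Siqi", fmt="MP3", volume=50,
                  speech_rate=0, pitch_rate=0, title=None, description=None,
                  user_data=None, overwrite=None):
    """Check the documented limits and return the request parameters as a dict."""
    if len(text) > 2000:
        raise ValueError("InputConfig supports at most 2,000 characters")
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {sorted(VALID_FORMATS)}")
    if not 0 <= volume <= 100:
        raise ValueError("volume must be between 0 and 100")
    for name, rate in (("speech_rate", speech_rate), ("pitch_rate", pitch_rate)):
        if not -500 <= rate <= 500:
            raise ValueError(f"{name} must be between -500 and 500")

    editing = {"voice": voice, "format": fmt, "volume": volume,
               "speech_rate": speech_rate, "pitch_rate": pitch_rate}
    params = {
        "InputConfig": text,
        # These parameters are typed as string, so encode the dicts as JSON.
        "EditingConfig": json.dumps(editing),
        "OutputConfig": json.dumps({"bucket": bucket, "object": obj}),
    }
    if title is not None:
        check_bytes("Title", title, 128)
        params["Title"] = title
    if description is not None:
        check_bytes("Description", description, 1024)
        params["Description"] = description
    if user_data is not None:
        encoded = json.dumps(user_data, ensure_ascii=False)
        check_bytes("UserData", encoded, 512)
        params["UserData"] = encoded
    if overwrite is not None:
        params["Overwrite"] = overwrite
    return params


params = build_request("Test text", "my_bucket", "target_audio",
                       user_data={"NotifyAddress": "ice-callback-demo"})
print(json.dumps(params, indent=2))
```

Optional parameters are left out of the dictionary unless they are set, so the service defaults described in the table still apply.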

Response elements

| Element | Type | Description | Example |
|---|---|---|---|
| | object | The response parameters. | |
| RequestId | string | The request ID. | ******11-DB8D-4A9A-875B-275798****** |
| JobId | string | The job ID. | ****20b48fb04483915d4f2cd8ac**** |
| State | string | The job state. Valid values: Created, Executing, Finished, and Failed. | Created |
| MediaId | string | The ID of the media asset. | ****2bcbfcfa30fccb36f72dca22**** |

You can call the GetSmartHandleJob operation to query the execution details of an intelligent audio production job. The following example shows the result returned by the GetSmartHandleJob operation for a successful job.

{
  "RequestId": "******2D-443C-5043-B0E4-867070******",
  "JobId": "******042d5e4db6866f6289d1******",
  "State": "Finished",
  "SmartJobInfo": {
    "Title": "default_title_2022-01-21T06:15:07Z",
    "JobType": "TextToSpeech",
    "CreateTime": "2022-01-21T06:15:07Z",
    "ModifiedTime": "2022-01-21T06:15:07Z",
    "InputConfig": {
      "InputFile": "Talking about Guo Degang, he is now incredibly popular. Although ticket prices are high, his shows often sell out instantly. In addition, he frequently appears on various comedy programs, where he judges performances by new comedians."
    },
    "EditingConfig": "{\"format\":\"MP3\",\"pitch_rate\":0,\"sample_rate\":16000,\"speech_rate\":0,\"voice\":\"Siqi\",\"volume\":50}",
    "OutputConfig": {
      "Bucket": "your-bucket",
      "Object": "your-audio"
    }
  },
  "JobResult": {
    "MediaId": "******bf47c94e82b3b2014361******",
    "AiResult": "[{\"content\":\"Talking about\",\"from\":0.0,\"to\":0.846},{\"content\":\"he is now incredibly popular\",\"from\":0.846,\"to\":3.386},{\"content\":\"Although ticket prices are high\",\"from\":3.386,\"to\":4.402},{\"content\":\"his shows often sell out instantly\",\"from\":4.402,\"to\":6.265},{\"content\":\"In addition, he frequently appears on various comedy programs, where he judges performances by new comedians\",\"from\":6.265,\"to\":10.33}]"
  }
}
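
In the result shown above, EditingConfig and AiResult are JSON documents stored as strings inside the outer JSON, so they need a second decoding pass. The following sketch works on a trimmed copy of that sample. Reading from and to as start and end times in seconds is an assumption based on the sample values, not something this page states.

```python
import json

# Trimmed copy of the GetSmartHandleJob result shown above.
response_text = r'''
{
  "State": "Finished",
  "SmartJobInfo": {
    "EditingConfig": "{\"format\":\"MP3\",\"voice\":\"Siqi\",\"volume\":50}"
  },
  "JobResult": {
    "AiResult": "[{\"content\":\"Talking about\",\"from\":0.0,\"to\":0.846},{\"content\":\"he is now incredibly popular\",\"from\":0.846,\"to\":3.386}]"
  }
}
'''

response = json.loads(response_text)

# Second pass: these two fields are JSON encoded as strings.
editing_config = json.loads(response["SmartJobInfo"]["EditingConfig"])
segments = json.loads(response["JobResult"]["AiResult"])

for seg in segments:
    # Assumption: "from" and "to" are offsets in seconds.
    duration = seg["to"] - seg["from"]
    print(f'{seg["from"]:.3f}-{seg["to"]:.3f} ({duration:.3f}): {seg["content"]}')
```

OutputConfig, by contrast, is returned as a nested object in this sample and needs no second pass.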

Examples

Success response

JSON format

{
  "RequestId": "******11-DB8D-4A9A-875B-275798******",
  "JobId": "****20b48fb04483915d4f2cd8ac****",
  "State": "Created",
  "MediaId": "****2bcbfcfa30fccb36f72dca22****"
}
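
The success response reports State as Created, so the audio is typically not ready when the call returns. One way to wait for the result is to poll until the job reaches Finished or Failed, as in the sketch below. Treating those two states as terminal is an assumption. fetch_job is a hypothetical callable that you would implement as a wrapper around GetSmartHandleJob, and fake_fetch only simulates it so that the example runs on its own.

```python
import time
from typing import Callable

# Assumption: these two states are terminal.
TERMINAL_STATES = {"Finished", "Failed"}


def wait_for_job(job_id: str, fetch_job: Callable[[str], dict],
                 interval: float = 2.0, max_attempts: int = 30) -> dict:
    """Call fetch_job(job_id) repeatedly until State is Finished or Failed."""
    for _ in range(max_attempts):
        result = fetch_job(job_id)
        if result.get("State") in TERMINAL_STATES:
            return result
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} attempts")


# Stand-in for a real GetSmartHandleJob call, for demonstration only.
_states = iter(["Created", "Executing", "Finished"])


def fake_fetch(job_id: str) -> dict:
    return {"JobId": job_id, "State": next(_states)}


final = wait_for_job("****20b48fb04483915d4f2cd8ac****", fake_fetch, interval=0)
print(final["State"])
```

If you set a callback address in UserData, a callback can replace polling altogether.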

Error codes

See Error Codes for a complete list.

Release notes

See Release Notes for a complete list.