Intelligent Media Services: Access TTS models

Last Updated: Apr 10, 2025

Real-time workflows let you connect text-to-speech (TTS) models that follow the required input and output specifications.

Access self-developed TTS models

To add a self-developed TTS model to a workflow, implement an HTTP streaming service that is reachable over the Internet, and wrap your TTS model so that it follows the input and output specifications described in this topic.

  1. Configure the following parameters for the TTS node in the console.

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Request URL | String | Yes | The HTTPS URL of the self-developed model. | https://www.abc.com |
| Token | String | No | The authorization token. | AUJH-pfnTNMPBm6iWXcJAcWsrscb5KYaLitQhHBLKrI |
| Sample rate | Integer | Yes | Unit: Hz. Only the value of 48000 is supported. | 48000 |

Note

Only mono audio data in the S16LE format is supported. If your data is in a different format, you must resample it before sending it through the interface.
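The format requirement in the note above can be illustrated with a minimal sketch. It assumes that your model produces floating-point samples in the range [-1.0, 1.0], possibly as interleaved multi-channel audio; that assumption and the helper names `downmix_to_mono` and `float_to_s16le` are hypothetical and not part of the service specification. The sketch does not change the sample rate, so converting to 48000 Hz would still need a proper resampler.

```python
import struct
from typing import List, Sequence


def downmix_to_mono(interleaved: Sequence[float], channels: int) -> List[float]:
    """Average each frame of interleaved samples into one mono sample."""
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    return [
        sum(interleaved[i:i + channels]) / channels
        for i in range(0, len(interleaved), channels)
    ]


def float_to_s16le(samples: Sequence[float]) -> bytes:
    """Convert float samples (assumed in [-1.0, 1.0]) to signed 16-bit little-endian PCM."""
    ints = []
    for sample in samples:
        clamped = max(-1.0, min(1.0, sample))  # avoid integer overflow on out-of-range input
        ints.append(int(round(clamped * 32767)))
    # "<" selects little-endian byte order, "h" a signed 16-bit integer.
    return struct.pack(f"<{len(ints)}h", *ints)
```

Each mono sample occupies 2 bytes, so one second of audio at 48000 Hz is 96000 bytes.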

  2. When the real-time workflow runs, it assembles the input parameters into a POST request and sends the request to the HTTPS URL that you configured for the self-developed TTS model. The following table describes the input parameters.

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| Text | String | Yes | The text to synthesize. | Hello |
| VoiceId | String | No | The voice. | yourVoiceId |
| SampleRate | Integer | Yes | The sample rate. Unit: Hz. | 48000 |
| Token | String | No | The authorization token. | AUJH-pfnTNMPBm6iWXcJAcWsrscb5KYaLitQhHBLKrI |
| ExtendData | String | Yes | The customized TTS extension data, including the instance ID and the user-defined business data that you specify when you start the intelligent agent. It contains the fields listed in the following rows. | `{'InstanceId':'68e00b6640e*****3e943332fee7','ChannelId':'123','SentenceId':'3','UserData':'{"aaaa":"bbbb"}'}` |
| • InstanceId | String | Yes | The ID of the instance. | 68e00b6640e*****3e943332fee7 |
| • ChannelId | String | Yes | The ID of the channel. | 123 |
| • SentenceId | Integer | Yes | The ID of the Q&A session. Note: For a single user inquiry, the intelligent agent uses the same SentenceId for all of its responses. | 3 |
| • Emotion | String | No | The speech emotion. Valid values: neutral, happy, sad. Note: If this parameter is not provided, the synthesized speech does not contain any emotional attributes. | happy |
| • UserData | String | No | The custom business data passed when the instance is launched. | `{"aaaa":"bbbb"}` |
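The input parameters above can be checked on the server side before any audio is generated. The following is a minimal sketch, not the service's own validation logic. It assumes that the request body has already been decoded into a dictionary and that ExtendData is a JSON string; because the documented example uses single quotes, the sketch also falls back to parsing a Python-style literal. The function names `parse_extend_data` and `validate_request` are hypothetical.

```python
import ast
import hmac
import json
from typing import Any, Dict, Optional


def parse_extend_data(raw: str) -> Dict[str, Any]:
    """Decode the ExtendData string into a dict (JSON first, Python-style literal as fallback)."""
    if not raw:
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        try:
            # Assumption: single-quoted payloads, as in the documented example, may occur.
            parsed = ast.literal_eval(raw)
        except (ValueError, SyntaxError) as exc:
            raise ValueError("ExtendData could not be decoded") from exc
    if not isinstance(parsed, dict):
        raise ValueError("ExtendData must decode to an object")
    # UserData is itself a string; decode it when it holds JSON, otherwise keep it as-is.
    user_data = parsed.get("UserData")
    if isinstance(user_data, str):
        try:
            parsed["UserData"] = json.loads(user_data)
        except json.JSONDecodeError:
            pass
    return parsed


def validate_request(body: Dict[str, Any], expected_token: Optional[str]) -> Dict[str, Any]:
    """Check required fields and the token, then return the decoded ExtendData."""
    for field in ("Text", "SampleRate", "ExtendData"):
        if field not in body:
            raise ValueError(f"missing required field: {field}")
    if expected_token is not None:
        supplied = body.get("Token") or ""
        # compare_digest avoids leaking the token through timing differences.
        if not hmac.compare_digest(supplied.encode(), expected_token.encode()):
            raise PermissionError("invalid token")
    return parse_extend_data(body["ExtendData"])
```

Passing `expected_token=None` skips the token check, which matches the Token parameter being optional.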

Note

Your service must return the generated audio data, in the requested voice and at the requested sample rate, as an HTTP streaming response. The system then pushes the audio data to subsequent nodes in real time.
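When streaming mono S16LE audio, it can help to write chunks whose size is a whole number of samples, so that no 16-bit sample is split across two writes. The sketch below shows one way to size such chunks. The 20 ms chunk duration is an illustrative assumption and not a value from the specification, and the names `chunk_size_bytes` and `iter_pcm_chunks` are hypothetical.

```python
from typing import Iterator

BYTES_PER_SAMPLE = 2  # mono S16LE: one signed 16-bit sample per frame


def chunk_size_bytes(sample_rate: int, chunk_ms: int) -> int:
    """Return the number of bytes that hold chunk_ms milliseconds of mono S16LE audio."""
    return sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE


def iter_pcm_chunks(pcm: bytes, sample_rate: int = 48000, chunk_ms: int = 20) -> Iterator[bytes]:
    """Yield sample-aligned slices of a PCM buffer; the last slice may be shorter."""
    size = chunk_size_bytes(sample_rate, chunk_ms)
    if size <= 0:
        raise ValueError("chunk duration is too short for this sample rate")
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]
```

At 48000 Hz, a 20 ms chunk is 960 samples, or 1920 bytes.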

Custom TTS server

Python

The following sample code shows how to implement a custom TTS server:

from aiohttp import web


async def stream_audio(request):
    data = await request.json()
    text = data.get('Text', "")
    token = data.get('Token', None)
    sample_rate = data.get('SampleRate', 48000)
    extend_data = data.get('ExtendData', "")
    print(f"text:{text}, token:{token}, sample_rate:{sample_rate}, extend_data:{extend_data}")
    # Check whether the token is valid.

    response = web.StreamResponse(
        status=200,
        reason='OK',
        headers={'Content-Type': 'audio/mpeg'}
    )

    # Start the response.
    await response.prepare(request)

    # generate_tts_data is an async generator that yields chunks of audio data.
    async for chunk in generate_tts_data(text, sample_rate):
        await response.write(chunk)

    # Signal the end of the stream, then return the response.
    await response.write_eof()

    return response


async def generate_tts_data(text: str, sample_rate: int):
    # Call the TTS service to generate the audio data of the corresponding sample rate.
    # The following sample code shows how to read audio data from the file.
    file_path = '/your_dir/sample.pcm'
    with open(file_path, 'rb') as f:
        while True:
            chunk = f.read(4096) # Read 4 KB of data each time.
            if not chunk:
                break
            yield chunk

app = web.Application()
app.add_routes([web.post('/stream-audio', stream_audio)])

if __name__ == '__main__':
    web.run_app(app)
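To try the sample server locally, a small client can send a request with the documented parameters and read the streamed bytes. This is a minimal sketch that uses only the Python standard library. It assumes a JSON request body, as read by the sample server, and the function names `build_tts_request` and `stream_audio_bytes` are hypothetical.

```python
import json
import urllib.request
from typing import Iterator, Optional


def build_tts_request(url: str, text: str, sample_rate: int = 48000,
                      token: Optional[str] = None, voice_id: Optional[str] = None,
                      extend_data: str = "{}") -> urllib.request.Request:
    """Build a POST request whose JSON body carries the documented input parameters."""
    body = {"Text": text, "SampleRate": sample_rate, "ExtendData": extend_data}
    if token is not None:
        body["Token"] = token
    if voice_id is not None:
        body["VoiceId"] = voice_id
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stream_audio_bytes(req: urllib.request.Request, chunk_size: int = 4096) -> Iterator[bytes]:
    """Send the request and yield the response body in chunks as it arrives."""
    with urllib.request.urlopen(req) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            yield chunk
```

With the sample server running on the default aiohttp port, `build_tts_request("http://localhost:8080/stream-audio", "Hello")` creates the request, and iterating over `stream_audio_bytes(req)` yields the PCM data that the server streams back.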
