Python SDK 2.0

Last Updated: Oct 25, 2019


Download and installation


  • The SDK only supports Python 3.4 and later.
  • Ensure that the Python packaging tool setuptools is installed. If it is not, run the following command to install it:

    pip install setuptools
  1. Download the Python SDK.
  2. Install the Python SDK. Run the following commands from the SDK directory.

    # Create an egg file.
    python setup.py bdist_egg
    # Install the egg file.
    python setup.py install

Note: The pip and python commands above refer to the Python 3 versions of these tools.
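
The prerequisites listed above can be checked from Python before you install the SDK. The following is a minimal sketch and is not part of the SDK; the function name check_prerequisites is hypothetical.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 4)  # minimum version stated in the prerequisites


def check_prerequisites(version_info=sys.version_info):
    """Return a list of problems; an empty list means both prerequisites are met."""
    problems = []
    major_minor = tuple(version_info[:2])
    if major_minor < MIN_PYTHON:
        problems.append(
            "Python 3.4 or later is required, found %d.%d" % major_minor)
    # find_spec returns None when the package cannot be located.
    if importlib.util.find_spec("setuptools") is None:
        problems.append("setuptools is missing; run: pip install setuptools")
    return problems
```

Calling check_prerequisites() with no arguments inspects the running interpreter. Print the returned list to see what still needs to be installed.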

Key objects

  1. NlsClient: the speech processing client, which acts as a factory for all speech processing objects. Create a single NlsClient instance and share it globally. This object is thread-safe.
  2. SpeechSynthesizer: the speech synthesis object. Use this object to set request parameters and send a request. This object is not thread-safe.
    • start: the method used to connect the client to the server. The ping_interval parameter, which has a default value, sets the interval between pings sent to the server. The ping_timeout parameter sets how long the client waits for the pong message. The value of ping_interval must be larger than the value of ping_timeout.
    • wait_completed: the method used to wait until the server completes synthesis or the synthesis times out.
    • close: the method used to close the network connection to the server.
  3. SpeechSynthesizerCallback: the callback object. Subclass it and override its methods to handle synthesis results and errors.
    • on_binary_data_received: the callback that is fired when the client receives synthesized audio data.
    • on_completed: the callback that is fired when the client receives the message that synthesis is complete.
    • on_task_failed: the callback that is fired when the client receives an error message.
    • on_channel_closed: the callback that is fired when the client is notified that the network connection is closed.
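
The constraint on the start parameters can be enforced before a connection is attempted. The following is a minimal sketch; the helper name ping_options is hypothetical, and it assumes that start accepts ping_interval and ping_timeout as keyword arguments.

```python
def ping_options(ping_interval, ping_timeout):
    """Validate the keep-alive settings and return them as keyword arguments."""
    # The only rule checked here is the one stated for start:
    # ping_interval must be larger than ping_timeout.
    if ping_interval <= ping_timeout:
        raise ValueError(
            "ping_interval (%s) must be larger than ping_timeout (%s)"
            % (ping_interval, ping_timeout))
    return {"ping_interval": ping_interval, "ping_timeout": ping_timeout}


# Hypothetical usage with a synthesizer created elsewhere
# (keyword names are an assumption, see above):
#     ret = synthesizer.start(**ping_options(10, 5))
```

Failing early with a ValueError makes a misconfigured pair of values visible at the call site instead of at connection time.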

Notes on SDK calls

  1. You can globally create an NlsClient object and reuse it if necessary.
  2. The SpeechSynthesizer object cannot be reused. You must create a SpeechSynthesizer object for each speech synthesis task. For example, to process N audio files, you must create N SpeechSynthesizer objects to complete N speech synthesis tasks.
  3. Each SpeechSynthesizerCallback object corresponds to exactly one SpeechSynthesizer object. Do not share a SpeechSynthesizerCallback object among multiple SpeechSynthesizer objects. Otherwise, you may be unable to tell the speech synthesis tasks apart.
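
The three rules above can be sketched as one shared client with a new callback and a new synthesizer for every task. This is an illustrative sketch, not the SDK's own helper: synthesize_all, make_callback, and configure are hypothetical names, and the only SDK calls used are the ones that appear in this document (create_synthesizer, start, wait_completed, close).

```python
from concurrent.futures import ThreadPoolExecutor


def synthesize_all(client, url, jobs, make_callback, configure, max_workers=4):
    """Run one synthesis task per (text, audio_name) pair in jobs.

    client        -- a shared NlsClient (thread-safe, reused by all tasks)
    url           -- service URL passed to create_synthesizer
    make_callback -- hypothetical factory: audio_name -> new callback object
    configure     -- hypothetical hook: (synthesizer, text) sets request parameters
    Returns one status per job: the negative value from start(), or 0.
    """

    def run_one(job):
        text, audio_name = job
        # A new callback and a new synthesizer for every task; neither is reused.
        callback = make_callback(audio_name)
        synthesizer = client.create_synthesizer(callback, url)
        try:
            configure(synthesizer, text)
            ret = synthesizer.start()
            if ret < 0:
                return ret
            synthesizer.wait_completed()
            return 0
        finally:
            synthesizer.close()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, jobs))
```

Processing N texts this way creates exactly N synthesizers and N callbacks, while the NlsClient is created once by the caller.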

Sample code

Note 1: The demo uses the default Internet access URL built into the SDK to access the speech synthesis service. To access this service over the internal network from an ECS instance located in China (Shanghai), pass the URL for internal access when you create the SpeechSynthesizer object from the NlsClient object.

    synthesizer = client.create_synthesizer(callback, "ws://")

Note 2: In the demo, the synthesized audio is stored in a file. To play the audio in real time, we recommend stream playback: play the audio data as it is received instead of waiting for the whole file. This reduces the latency.


    # -*- coding: utf-8 -*-
    import threading

    import ali_speech
    from ali_speech.callbacks import SpeechSynthesizerCallback
    from ali_speech.constant import TTSFormat
    from ali_speech.constant import TTSSampleRate


    class MyCallback(SpeechSynthesizerCallback):
        # The name parameter is used to specify the file for saving the audio.
        def __init__(self, name):
            self._name = name
            self._fout = open(name, 'wb')

        def on_binary_data_received(self, raw):
            print('MyCallback.on_binary_data_received: %s' % len(raw))
            self._fout.write(raw)

        def on_completed(self, message):
            print('MyCallback.OnRecognitionCompleted: %s' % message)
            self._fout.close()

        def on_task_failed(self, message):
            print('MyCallback.OnRecognitionTaskFailed-task_id:%s, status_text:%s' % (
                message['header']['task_id'], message['header']['status_text']))
            self._fout.close()

        def on_channel_closed(self):
            print('MyCallback.OnRecognitionChannelClosed')


    def process(client, appkey, token, text, audio_name):
        # Create one callback and one synthesizer for each synthesis task.
        callback = MyCallback(audio_name)
        # The second argument is the WebSocket URL of the service; replace "ws://" with the full URL.
        synthesizer = client.create_synthesizer(callback, "ws://")
        synthesizer.set_appkey(appkey)
        synthesizer.set_token(token)
        synthesizer.set_voice('xiaoyun')
        synthesizer.set_text(text)
        synthesizer.set_format(TTSFormat.WAV)
        synthesizer.set_sample_rate(TTSSampleRate.SAMPLE_RATE_16K)
        synthesizer.set_volume(50)
        synthesizer.set_speech_rate(0)
        synthesizer.set_pitch_rate(0)
        try:
            ret = synthesizer.start()
            if ret < 0:
                return ret
            synthesizer.wait_completed()
        except Exception as e:
            print(e)
        finally:
            synthesizer.close()


    def process_multithread(client, appkey, token, number):
        thread_list = []
        for i in range(0, number):
            text = "This is the synthesis of thread " + str(i) + "."
            audio_name = "sy_audio_" + str(i) + ".wav"
            thread = threading.Thread(target=process, args=(client, appkey, token, text, audio_name))
            thread_list.append(thread)
            thread.start()
        for thread in thread_list:
            thread.join()


    if __name__ == "__main__":
        # The NlsClient object is created once and can be shared by all threads.
        client = ali_speech.NlsClient()
        # Specify the logging level: DEBUG, INFO, WARNING, or ERROR.
        client.set_log_level('INFO')
        appkey = 'Your appkey'
        token = 'Your token'
        text = "Today is Monday. It is a fine day."
        audio_name = 'sy_audio.wav'
        process(client, appkey, token, text, audio_name)
        # The code for multithreading.
        # process_multithread(client, appkey, token, 2)
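
Note 2 recommends stream playback instead of saving the audio to a file. One way to sketch this is a callback that puts each audio chunk on a queue while a separate thread consumes the queue. This is an assumption-laden illustration, not the SDK's playback API: StreamingCallback, play_stream, start_player, and the play_chunk function are hypothetical, and play_chunk stands in for whatever writes bytes to your audio output.

```python
import queue
import threading

try:
    from ali_speech.callbacks import SpeechSynthesizerCallback
except ImportError:
    # Fallback so the sketch also runs where the SDK is not installed.
    SpeechSynthesizerCallback = object

_END = None  # sentinel that marks the end of the audio stream


class StreamingCallback(SpeechSynthesizerCallback):
    """Hands each audio chunk to a queue instead of writing it to a file."""

    def __init__(self):
        self.chunks = queue.Queue()

    def on_binary_data_received(self, raw):
        self.chunks.put(raw)

    def on_completed(self, message):
        self.chunks.put(_END)

    def on_task_failed(self, message):
        self.chunks.put(_END)

    def on_channel_closed(self):
        # End the stream here as well, in case no completion message arrived.
        self.chunks.put(_END)


def play_stream(chunks, play_chunk):
    """Pass chunks to play_chunk in arrival order until the sentinel is seen."""
    while True:
        chunk = chunks.get()
        if chunk is _END:
            break
        play_chunk(chunk)


def start_player(callback, play_chunk):
    """Start the consumer thread; call this before synthesizer.start()."""
    player = threading.Thread(target=play_stream, args=(callback.chunks, play_chunk))
    player.start()
    return player
```

Playback can begin as soon as the first chunk arrives, because the consumer thread does not wait for on_completed. The consumer stops at the first sentinel, so extra sentinels from later callbacks are harmless.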