Intelligent Speech Interaction: SDK for Android

Last updated: Nov 08, 2022

The speech synthesis service provides an SDK for Android. This topic describes how to download and install the SDK and provides sample code that shows how to use it.

Note

We recommend that you use the latest SDK for Android. The SDK version described in this topic is no longer updated. For more information, see NUI SDK for Android.

Prerequisites

  • You understand how the SDK works. For more information, see Overview.

  • A project is created in the Intelligent Speech Interaction console and the appkey of the project is obtained. For more information, see Create a project.

  • An access token that is used to call Intelligent Speech Interaction services is obtained. For more information, see Obtain an access token.

Download and installation

  1. Download the SDK for Android and sample code.

  2. Decompress the downloaded ZIP package. In the app/libs directory, you can find an .aar file, which is the SDK package.

  3. Open the demo project in Android Studio.

    The SpeechSynthesizerActivity.java file contains the sample code for the speech synthesis service.

Key objects

  • NlsClient: the speech processing client. You can use this client to process short sentence recognition, real-time speech recognition, and speech synthesis tasks. This object is thread-safe, so you need to create only one NlsClient object for the entire application.

  • SpeechSynthesizer: the speech synthesis object, which represents a speech synthesis request.

  • SpeechSynthesizerCallback: the object that defines the speech synthesis callbacks. The callbacks are fired when synthesized audio data is returned or an error occurs. Implement these callbacks to add your own processing logic.
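
The following is a minimal sketch of such a callback implementation, assuming that SpeechSynthesizerCallback is implemented the way the demo project implements it. The class name MySynthesizerCallback is chosen only for illustration, and only the two callbacks used in the sample code later in this topic are shown; the SDK may define additional callbacks that you must also implement.

    // A minimal sketch, not the complete demo class. The method names follow
    // the sample code in this topic; implement any additional callbacks that
    // SpeechSynthesizerCallback defines in your SDK version.
    public class MySynthesizerCallback implements SpeechSynthesizerCallback {
        private static final String TAG = "MySynthesizerCallback";

        // Fired each time a block of synthesized audio data is returned.
        @Override
        public void OnBinaryReceived(byte[] data, int code) {
            Log.d(TAG, "received " + data.length + " bytes, code " + code);
            // Write the data to your player, for example an AudioTrack instance.
        }

        // Fired with timestamp information when setEnableSubtitle(true) is set.
        @Override
        public void onMetaInfo(String message, int code) {
            Log.d(TAG, "onMetaInfo " + message + ": " + code);
        }
    }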

Call procedure

  1. Create an NlsClient instance.

  2. Define callbacks of the SpeechSynthesizerCallback object to process synthesis results and errors based on your business needs.

  3. Call the NlsClient.createSynthesizerRequest() method to create a SpeechSynthesizer instance.

  4. Set the parameters of the SpeechSynthesizer instance.

    The parameters include the access token, the appkey, the text to synthesize, the speaker (voice), and the speech rate.

  5. Call the SpeechSynthesizer.start() method to connect to the server.

  6. Obtain and play the synthesized speech or process an error by using a callback.

  7. Call the SpeechSynthesizer.stop() method to stop the speech synthesis.

    Note

    To initiate a new request, repeat Step 3 to Step 7.

  8. Call the NlsClient.release() method to release the NlsClient instance.
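
The following sketch ties these steps together. It is a sketch under assumptions rather than the demo code: the no-argument NlsClient constructor and the MySynthesizerCallback class (sketched in the Key objects section) are assumptions, and the placeholder token and appkey values must be replaced with your own. See the demo project for the exact initialization.

    // Step 1: create one global NlsClient instance (constructor call is an assumption).
    NlsClient client = new NlsClient();
    // Step 2: your SpeechSynthesizerCallback implementation.
    SpeechSynthesizerCallback callback = new MySynthesizerCallback();
    // Step 3: create a speech synthesis request.
    SpeechSynthesizer synthesizer = client.createSynthesizerRequest(callback);
    // Step 4: set the parameters. "yourToken" and "yourAppkey" are placeholders.
    synthesizer.setToken("yourToken");
    synthesizer.setAppkey("yourAppkey");
    synthesizer.setText("Welcome to Intelligent Speech Interaction.");
    // Step 5: connect to the server and start the synthesis.
    synthesizer.start();
    // Step 6: the synthesized audio and any errors arrive in the callbacks.
    // Step 7: stop the request after the synthesis is finished or is no longer needed.
    synthesizer.stop();
    // Repeat Step 3 to Step 7 for each new request.
    // Step 8: release the client when your app no longer uses speech services.
    client.release();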

Proguard configuration

If you obfuscate your code, add the following rule to the proguard-rules.pro file:

-keep class com.alibaba.idst.util.* { *; }

Sample code

  • Create a speech synthesis request.

    // Create a SpeechSynthesizer object.
    speechSynthesizer = client.createSynthesizerRequest(callback);
    speechSynthesizer.setToken("");
    speechSynthesizer.setAppkey("");
    // Set the audio encoding format. AudioTrack can play the synthesized audio data only in the pulse-code modulation (PCM) format.
    speechSynthesizer.setFormat(SpeechSynthesizer.FORMAT_PCM);
    // The final speech synthesis effect varies with different values of the following parameters.
    // Set the audio sampling rate.
    speechSynthesizer.setSampleRate(SpeechSynthesizer.SAMPLE_RATE_16K);
    // Set the speaker.
    speechSynthesizer.setVoice(SpeechSynthesizer.VOICE_XIAOGANG);
    // Set the speech synthesis method.
    speechSynthesizer.setMethod(SpeechSynthesizer.METHOD_RUS);
    // Set the speed.
    speechSynthesizer.setSpeechRate(100);
    // Specify whether to return the timestamp information for each word in the synthesized speech.
    speechSynthesizer.setEnableSubtitle(true);
    // Set the text from which you want to synthesize speech.
    speechSynthesizer.setText("Welcome to Intelligent Speech Interaction.");
    // Start the speech synthesis request.
    speechSynthesizer.start();
  • Obtain and play the synthesized speech. The initialization of the audioTrack player used here is sketched in the last item of this list.

    // Callback that is fired when synthesized audio data is returned. Write the audio data to the player.
    @Override
    public void OnBinaryReceived(byte[] data, int code)
    {
        Log.d(TAG, "binary received length: " + data.length);
        if (!playing) {
            playing = true;
            audioTrack.play();
        }
        audioTrack.write(data, 0, data.length);
    }
  • Return the timestamp information of the synthesized speech.

    // The onMetaInfo callback returns the timestamp information if you call speechSynthesizer.setEnableSubtitle(true).
    @Override
    public void onMetaInfo(String message, int code) {
        Log.d(TAG,"onMetaInfo " + message + ": " + String.valueOf(code));
    }
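
  • Initialize the player for PCM playback.

    The audioTrack object used in the playback sample is not created by the SDK. The following is a minimal sketch of how it could be initialized with the Android AudioTrack API, assuming 16 kHz, 16-bit, mono PCM output to match SAMPLE_RATE_16K and FORMAT_PCM. The field names audioTrack and playing and the method initAudioTrack are illustrative, not part of the SDK.

    // A minimal sketch: a streaming AudioTrack for 16 kHz, 16-bit, mono PCM.
    // Uses android.media.AudioTrack, android.media.AudioManager, and android.media.AudioFormat.
    private static final int SAMPLE_RATE = 16000;
    private AudioTrack audioTrack;
    private boolean playing = false;

    private void initAudioTrack() {
        // Query the smallest buffer that AudioTrack accepts for this configuration.
        int bufferSize = AudioTrack.getMinBufferSize(
                SAMPLE_RATE,
                AudioFormat.CHANNEL_OUT_MONO,
                AudioFormat.ENCODING_PCM_16BIT);
        // Create a streaming track so that audio can be written as it is received.
        audioTrack = new AudioTrack(
                AudioManager.STREAM_MUSIC,
                SAMPLE_RATE,
                AudioFormat.CHANNEL_OUT_MONO,
                AudioFormat.ENCODING_PCM_16BIT,
                bufferSize,
                AudioTrack.MODE_STREAM);
    }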