Intelligent Speech Interaction: NUI SDK for Android

Last Updated: Mar 03, 2023

The speech synthesis service provides a Natural User Interaction (NUI) SDK for Android. This topic describes how to download the NUI SDK for Android, lists the key methods in the SDK, and provides sample code that shows how to use the SDK.

Prerequisites

  • You understand how the SDK works. For more information, see Overview.

  • The appkey of your project is obtained. For more information, see Create a project.

  • A token used to access the service is obtained. For more information, see Obtain a Token.

Download and install the SDK

  1. Download the NUI SDK for Android and sample code.

  2. Decompress the downloaded package to obtain the demo project. The SDK is provided as an AAR package in the app/libs directory.

  3. Open the demo project in Android Studio.

    The sample code for the speech synthesis service is stored in the TtsBasicActivity.java file.

Key methods

  • tts_initialize: initializes the SDK.

    /**
     * Initialize the SDK. The SDK uses a singleton pattern. To initialize the SDK again, you must first release the SDK. Do not call the SDK on the user interface (UI) thread. Otherwise, the process may be blocked.
     * @param callback: the event listener callback. For more information, see the following callback methods.
     * @param ticket: the parameters used in the initialization. For more information, see Overview.
     * @param level: the log level to use. The smaller the parameter value is, the more logs are recorded.
     * @param save_log: specifies whether to store logs in files. The debug_path parameter specifies the directory where log files are stored.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int tts_initialize(INativeTtsCallback callback,
                                           String ticket,
                                           final Constants.LogLevel level,
                                           boolean save_log);

    INativeTtsCallback supports the following callback methods.

    • onTtsEventCallback: reports events that occur during speech synthesis.

      /**
       * Report an event that occurs during speech synthesis.
       * @param event: the event that occurred. You can view possible events in the following table.
       * @param task_id: the ID of the speech synthesis task.
       * @param ret_code: the returned error code. This parameter is valid for the TTS_EVENT_ERROR event.
       */
      void onTtsEventCallback(TtsEvent event, String task_id, int ret_code);

      The following table lists the possible events in the SDK.

      Name               Description
      TTS_EVENT_START    Starts the speech synthesis task and starts the playback of the synthesized speech.
      TTS_EVENT_END      Ends the playback of the synthesized speech.
      TTS_EVENT_CANCEL   Cancels the speech synthesis task.
      TTS_EVENT_PAUSE    Pauses the speech synthesis task.
      TTS_EVENT_RESUME   Resumes the speech synthesis task.
      TTS_EVENT_ERROR    Returns an error during speech synthesis.

    • onTtsDataCallback: provides the synthesized speech data.

      /**
       * Provide the synthesized speech data.
       * @param text: (reserved) the text to be processed. This parameter is valid only for local speech synthesis tasks.
       * @param word_idx: the position of the word in the text to be synthesized. This parameter is valid only for local speech synthesis tasks.
       * @param data: the synthesized speech data to be sent to the player.
       */
      void onTtsDataCallback(byte[] text, int word_idx, byte[] data);

  • setparamTts: sets the request parameters.

    /**
     * Set the request parameters in the format of key-value pairs.
     * @param param: the parameter name. For more information, see Overview.
     * @param value: the parameter value. For more information, see Overview.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int setparamTts(String param, String value);

  • getparamTts: obtains the parameter configuration.

    /**
     * Obtain the parameter configuration.
     * @param param: the parameter name. For more information, see Overview.
     * @return: the returned parameter value.
     */
    public String getparamTts(String param);

  • startTts: starts the speech synthesis task.

    /**
     * Start the speech synthesis task.
     * @param priority: the priority of the speech synthesis task. Set this parameter to 1.
     * @param taskid: the ID of the speech synthesis task. You can set this parameter to a 32-byte universally unique identifier (UUID), or leave this parameter empty. If this parameter is left empty, the SDK generates a task ID.
     * @param text: the source text to be processed.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int startTts(String priority, String taskid, String text);

  • cancelTts: cancels the speech synthesis task.

    /**
     * Cancel the speech synthesis task.
     * @param taskid: the ID of the task that you want to cancel. If this parameter is left empty, all the speech synthesis tasks are canceled.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int cancelTts(String taskid);

  • pauseTts: pauses the speech synthesis task.

    /**
     * Pause the speech synthesis task.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int pauseTts();

  • resumeTts: resumes the speech synthesis task.

    /**
     * Resume the speech synthesis task.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int resumeTts();

  • tts_release: releases the SDK.

    /**
     * Release the SDK.
     * @return: the returned error code. For more information, see Error codes.
     */
    public synchronized int tts_release();
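
Most of the methods above return an error code, so it can help to check each code before moving on to the next call. The following is a minimal sketch of such a check, not part of the SDK. It assumes that 0 indicates success, which should be confirmed against the Error codes topic. The class name TtsResult and its methods are hypothetical.

```java
// Hypothetical helper for illustration. It is NOT part of the NUI SDK.
final class TtsResult {
    // Assumption: 0 means success. Confirm this value in the Error codes topic.
    static final int ASSUMED_SUCCESS = 0;

    private TtsResult() {}

    /** Returns true if the code equals the assumed success value. */
    static boolean isSuccess(int code) {
        return code == ASSUMED_SUCCESS;
    }

    /** Throws if a call reported a code other than the assumed success value. */
    static void requireSuccess(String operation, int code) {
        if (!isSuccess(code)) {
            throw new IllegalStateException(operation + " failed with error code " + code);
        }
    }
}
```

For example, the return value of startTts could be passed to TtsResult.requireSuccess("startTts", ret) so that a failed start is detected before any playback logic runs.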

Procedure

  1. Initialize the SDK and playback components.

  2. Set request parameters based on your business requirements.

  3. Call the startTts method to start the speech synthesis task.

  4. Write the synthesized speech data that is returned in the callback to the player and start the playback. We recommend that you use stream playback.

  5. Receive the callback that indicates that the speech synthesis task is complete.
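
The order of the steps above can be sketched with stand-in types. Everything in this sketch (ProcedureSketch, FakeEngine, Listener, and the parameter key) is hypothetical and only imitates the shape of the SDK calls. The fake engine "synthesizes" text by echoing its bytes in small chunks, so the example runs without the SDK or an audio device.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Stand-in types for illustration only. They are NOT the NUI SDK classes.
final class ProcedureSketch {
    enum Event { START, END }

    interface Listener {
        void onEvent(Event event);
        void onData(byte[] data);
    }

    /** Fake engine: echoes the UTF-8 bytes of the text in chunks of up to 4 bytes. */
    static final class FakeEngine {
        private Listener listener;
        private final Map<String, String> params = new HashMap<>();

        int initialize(Listener l) { listener = l; return 0; }

        int setParam(String key, String value) { params.put(key, value); return 0; }

        int start(String text) {
            if (listener == null) return -1; // not initialized
            byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
            listener.onEvent(Event.START);
            for (int off = 0; off < bytes.length; off += 4) {
                int len = Math.min(4, bytes.length - off);
                byte[] chunk = new byte[len];
                System.arraycopy(bytes, off, chunk, 0, len);
                listener.onData(chunk);
            }
            listener.onEvent(Event.END);
            return 0;
        }
    }

    /** Runs steps 1 to 5 and returns everything the stand-in player received. */
    static byte[] run(String text) {
        ByteArrayOutputStream player = new ByteArrayOutputStream(); // stand-in player
        boolean[] finished = {false};
        FakeEngine engine = new FakeEngine();

        // Step 1: initialize the engine and the playback component.
        engine.initialize(new Listener() {
            @Override public void onEvent(Event event) {
                // Step 5: completion arrives as an event.
                if (event == Event.END) finished[0] = true;
            }
            @Override public void onData(byte[] data) {
                // Step 4: stream each chunk to the player as it arrives.
                player.write(data, 0, data.length);
            }
        });
        // Step 2: set request parameters (this key and value are placeholders).
        engine.setParam("example_key", "example_value");
        // Step 3: start the task.
        engine.start(text);

        if (!finished[0]) throw new IllegalStateException("task did not finish");
        return player.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(run("hello world"), StandardCharsets.UTF_8));
    }
}
```

In the real SDK the callbacks arrive asynchronously, so the completion check would live in the event callback rather than directly after the start call.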

Sample code

Initialize the SDK

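// Copy the resource files that the SDK requires from the app assets.
CommonUtils.copyAssetsData(this);
// Implement the INativeTtsCallback methods in the anonymous class. For more information, see Handle callbacks.
int ret = NativeNui.GetInstance().tts_initialize(new INativeTtsCallback() { /* callback methods */ }, genTicket(path), Constants.LogLevel.LOG_LEVEL_VERBOSE, true);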
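
Because the SDK should not be called on the UI thread, one option is to route SDK calls through a dedicated worker thread. The following is a minimal sketch under that assumption. The class TtsWorker and the thread name are hypothetical, and the IntSupplier stands in for any SDK call that returns an error code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntSupplier;

// Hypothetical helper for illustration. It is NOT part of the NUI SDK.
final class TtsWorker {
    // A single worker thread runs the submitted calls one at a time, in submission order.
    private static final ExecutorService WORKER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "tts-worker");
        t.setDaemon(true); // do not keep the process alive because of this thread
        return t;
    });

    private TtsWorker() {}

    /** Runs the call on the worker thread and exposes its return code as a Future. */
    static Future<Integer> submit(IntSupplier sdkCall) {
        Callable<Integer> task = sdkCall::getAsInt;
        return WORKER.submit(task);
    }
}
```

With this helper, the initialization call could be wrapped as TtsWorker.submit(() -> NativeNui.GetInstance().tts_initialize(...)). On Android, the result would typically be posted back to the UI thread rather than awaited with Future.get() there, because waiting on the UI thread would block it again.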

The genTicket method generates a JSON string that contains information about the resource directory and the user:

private String genTicket(String workpath) {
    String str = "";
    try {
        // Obtain the user information and add the resource directory.
        JSONObject object = Auth.getAliYunTicket();
        object.put("workspace", workpath);
        str = object.toString();
    } catch (JSONException e) {
        e.printStackTrace();
    }
    Log.i(TAG, "UserContext:" + str);
    return str;
}
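
To make the shape of the ticket concrete, the following sketch builds a similar JSON string without the Android org.json classes. Only the workspace key comes from the code above. The app_key and token key names are assumptions based on the prerequisites and should be checked against the Overview topic. The class TicketSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch for illustration. It is NOT the Auth class of the demo project.
final class TicketSketch {
    private TicketSketch() {}

    /** Escapes backslashes and double quotes. Control characters are not handled in this sketch. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Serializes string fields as a flat JSON object, preserving insertion order. */
    static String toJson(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    static String genTicket(String appKey, String token, String workPath) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("app_key", appKey);     // assumed key name
        fields.put("token", token);        // assumed key name
        fields.put("workspace", workPath); // key used by genTicket in this topic
        return toJson(fields);
    }
}
```

In an Android project, JSONObject is the more natural choice, as in the demo code. The hand-written serializer is used here only so that the sketch runs on a plain JVM.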

Start the speech synthesis task

NativeNui.GetInstance().startTts("1", "", ttsText);
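
The call above passes an empty task ID, so the SDK generates one. If you prefer to supply your own ID, for example to correlate callbacks with requests, the following sketch generates one. It assumes that the "32-byte UUID" mentioned for the taskid parameter means the 32 hexadecimal characters of a UUID without hyphens. The class TaskIds is hypothetical.

```java
import java.util.UUID;

// Hypothetical helper for illustration. It is NOT part of the NUI SDK.
final class TaskIds {
    private TaskIds() {}

    /** Returns a random UUID as 32 lowercase hexadecimal characters (hyphens removed). */
    static String newTaskId() {
        return UUID.randomUUID().toString().replace("-", "");
    }
}
```

The generated value could then be passed as the second argument of startTts and compared with the task_id reported in onTtsEventCallback.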

Handle callbacks

  • Implement the onTtsEventCallback method to control the player based on the status of the speech synthesis task.

    public void onTtsEventCallback(INativeTtsCallback.TtsEvent event, String task_id, int ret_code) {
        Log.i(TAG, "tts event:" + event);
        if (event == INativeTtsCallback.TtsEvent.TTS_EVENT_START) {
            // Synthesis started: start the player so that incoming data is played.
            mAudioTrack.play();
            Log.i(TAG, "start play");
        } else if (event == INativeTtsCallback.TtsEvent.TTS_EVENT_END) {
            Log.i(TAG, "play end");
        } else if (event == INativeTtsCallback.TtsEvent.TTS_EVENT_PAUSE) {
            mAudioTrack.pause();
            Log.i(TAG, "play pause");
        } else if (event == INativeTtsCallback.TtsEvent.TTS_EVENT_RESUME) {
            mAudioTrack.play();
        }
    }

  • Implement the onTtsDataCallback method to write the synthesized speech data to the player for playback.

    public void onTtsDataCallback(byte[] text, int word_idx, byte[] data) {
        // Skip empty chunks and pass the audio data to the player.
        if (data.length > 0) {
            mAudioTrack.setAudioData(data);
        }
    }
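
For stream playback, a common approach is to keep the data callback short and let a separate playback thread consume the audio. The following sketch shows one way to do that with a blocking queue. It is an assumption about how a player could be structured, not the implementation used by the demo project. The class StreamingBuffer is hypothetical, and a ByteArrayOutputStream stands in for the audio sink.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical buffer for illustration. It is NOT part of the NUI SDK or the demo project.
final class StreamingBuffer {
    // Sentinel that marks the end of the stream. It is compared by identity.
    private static final byte[] END = new byte[0];
    private final BlockingQueue<byte[]> queue = new LinkedBlockingQueue<>();

    /** Called from the data callback: enqueues a copy of the chunk and returns immediately. */
    void offer(byte[] data) {
        if (data != null && data.length > 0) {
            queue.add(data.clone());
        }
    }

    /** Called when the end event arrives. */
    void finish() {
        queue.add(END);
    }

    /** Runs on the playback thread: writes chunks to the sink until the sentinel is taken. */
    void drainTo(ByteArrayOutputStream sink) throws InterruptedException {
        while (true) {
            byte[] chunk = queue.take();
            if (chunk == END) {
                return;
            }
            sink.write(chunk, 0, chunk.length);
        }
    }
}
```

On Android, the sink could be replaced by a player such as android.media.AudioTrack, whose write(byte[], int, int) method accepts the same arguments. The chunks are copied on entry because this sketch does not assume that the SDK leaves the callback buffer untouched after the callback returns.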

Complete the speech synthesis task

NativeNui.GetInstance().cancelTts("");