
Android SDK 2.0

Last Updated: Nov 25, 2019


Download and installation

  1. Download the Android SDK and sample code.
  2. Decompress the downloaded package and find the demo project in the nls-sdk-android directory. In the app/libs directory, you can find an .aar file, which is the SDK package.
  3. Open the demo project in Android Studio. The sample code for short sentence recognition contains the following two activities:
    • SpeechRecognizerActivity demonstrates how to import an audio file to the SDK for recognition.
    • SpeechRecognizerWithRecorderActivity demonstrates how to use the SDK to record and recognize audio data. We recommend that you use this method.
  4. Before you run the sample code, create a project in the Intelligent Speech Interaction console to obtain the project appkey, and obtain a service token. For more information, see the relevant documentation.
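Step 2 places the SDK's .aar file in the demo's app/libs directory. If you reference the .aar from your own app module instead, a typical Gradle setup looks like the following sketch (the .aar file name here is illustrative; use the actual file name from app/libs):

```groovy
// app/build.gradle
repositories {
    // Resolve .aar files placed in the module's libs directory.
    flatDir {
        dirs 'libs'
    }
}

dependencies {
    // The name is illustrative; match the actual .aar file in app/libs.
    implementation(name: 'nls-sdk-android', ext: 'aar')
}
```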

Key objects

  • NlsClient: the speech processing client, which is equivalent to a factory for all speech processing classes. You can globally create an NlsClient instance.
  • SpeechRecognizer: the short sentence recognition object, representing a speech recognition request. You can record audio data or obtain audio data from an audio file, and send the audio data to the SDK.
  • SpeechRecognizerWithRecorder: the short sentence recognition object, representing a speech recognition request. This object extends SpeechRecognizer with a built-in recording capability. We recommend that you use this object because it is easier to use.
  • SpeechRecognizerCallback: the callback object for speech recognition. Its callbacks are fired when recognition results are returned or errors occur. You can fill these callbacks with your own logic, as shown in the demo.
  • SpeechRecognizerWithRecorderCallback: the callback object for speech recognition with recording. This object extends SpeechRecognizerCallback with callbacks for the recorded audio data and the recording volume.

Usage (taking SpeechRecognizer as an example)

  1. Create an NlsClient instance.
  2. Define callbacks of the SpeechRecognizerCallback object to process recognition results and errors as required.
  3. Call the NlsClient.createRecognizerRequest() method to create a SpeechRecognizer instance.
  4. Set SpeechRecognizer parameters, including accessToken and appkey.
  5. Call the SpeechRecognizer.start() method to connect to the server.
  6. Collect audio data and call the SpeechRecognizer.sendAudio() method to send the audio data to the server.
  7. Process the recognition result or error using a callback.
  8. Call the SpeechRecognizer.stop() method to stop recognition.
  9. To initiate a new request, repeat steps 3 through 8. A SpeechRecognizer object cannot be reused; you must create a new one for each request.
  10. Call the NlsClient.release() method to release the client instance.
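Put together, the steps above can be sketched as the following fragment. This is a sketch only: error handling and threading are omitted, the token and appkey placeholders must be filled in, and it assumes only the classes and methods named in this document.

```java
// Step 1: create one client for the whole app.
NlsClient client = new NlsClient();

// Step 2: callbacks that hold your own handling logic.
SpeechRecognizerCallback callback = new MyCallback();

// Steps 3 and 4: create a request and set its parameters.
SpeechRecognizer recognizer = client.createRecognizerRequest(callback);
recognizer.setToken("<your-token>");
recognizer.setAppkey("<your-appkey>");

// Step 5: connect to the server.
int code = recognizer.start();

// Step 6: collect audio and send it in a loop, for example:
// recognizer.sendAudio(bytes, bytes.length);

// Step 8: stop this request. To recognize again, create a new
// SpeechRecognizer; the old one cannot be reused.
recognizer.stop();

// Step 10: release the client when recognition is no longer needed.
client.release();
```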

ProGuard configuration

If you obfuscate your code, add the following rule to the proguard-rules.pro file:

  -keep class com.alibaba.idst.util.*{*;}

Sample code

Create a recognition request.

  // Create a SpeechRecognizerCallback object.
  SpeechRecognizerCallback callback = new MyCallback();
  // Create a recognition request.
  speechRecognizer = client.createRecognizerRequest(callback);
  speechRecognizer.setUrl("wss://nls-gateway-ap-southeast-1.aliyuncs.com/ws/v1");
  // Obtain a dynamic token. For more information, see https://www.alibabacloud.com/help/doc-detail/72153.htm.
  speechRecognizer.setToken("");
  // Obtain an appkey in the Intelligent Speech Interaction console (https://nls-portal.console.aliyun.com/).
  speechRecognizer.setAppkey("");
  // Ask the server to return intermediate results. For more information about parameters, see the official documentation.
  speechRecognizer.enableIntermediateResult(true);
  // Start the speech recognition.
  int code = speechRecognizer.start();

Collect and send audio data to the recognition server. You can collect the audio data from a file or other sources. If you use the SpeechRecognizerWithRecorder object, you can skip this step because the SDK automatically processes and sends the audio data to the server.

  ByteBuffer buf = ByteBuffer.allocateDirect(SAMPLES_PER_FRAME);
  byte[] bytes = new byte[SAMPLES_PER_FRAME];
  while (sending) {
      buf.clear();
      // Collect audio data.
      int readBytes = mAudioRecorder.read(buf, SAMPLES_PER_FRAME);
      if (readBytes > 0 && sending) {
          // Copy only the bytes that were actually read.
          buf.get(bytes, 0, readBytes);
          // Send the audio data to the recognition server.
          int code = recognizer.sendAudio(bytes, readBytes);
          if (code < 0) {
              Toast.makeText(SpeechRecognizerActivity.this, "Failed to send audio data!", Toast.LENGTH_LONG).show();
              break;
          }
      }
  }
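When the audio comes from a file rather than the recorder, the raw PCM bytes can be split into the same fixed-size frames before each chunk is passed to sendAudio(). A minimal, self-contained sketch follows; FileAudioFramer is a hypothetical helper, not part of the SDK, and the frame size of 640 bytes (20 ms of 16 kHz 16-bit mono PCM) is an assumption — match your project's SAMPLES_PER_FRAME.

```java
import java.util.ArrayList;
import java.util.List;

public class FileAudioFramer {
    // Assumed frame size: 20 ms of 16 kHz 16-bit mono PCM.
    static final int SAMPLES_PER_FRAME = 640;

    // Split raw PCM bytes (e.g. read from a file) into frames of at most
    // SAMPLES_PER_FRAME bytes, the shape expected by sendAudio(bytes, length).
    static List<byte[]> frames(byte[] pcm) {
        List<byte[]> out = new ArrayList<>();
        for (int off = 0; off < pcm.length; off += SAMPLES_PER_FRAME) {
            int len = Math.min(SAMPLES_PER_FRAME, pcm.length - off);
            byte[] frame = new byte[len];
            System.arraycopy(pcm, off, frame, 0, len);
            out.add(frame);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] pcm = new byte[1500]; // stand-in for audio loaded from a file
        List<byte[]> frames = frames(pcm);
        // Each frame would be passed to recognizer.sendAudio(frame, frame.length).
        System.out.println(frames.size()); // prints 3: frames of 640, 640, and 220 bytes
    }
}
```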

Handle the callback.

  // Called when recognition completes; obtain the final recognition result.
  @Override
  public void onRecognizedCompleted(final String msg, int code) {
      Log.d(TAG, "OnRecognizedCompleted " + msg + ": " + String.valueOf(code));
  }

  // Called with intermediate results. This callback is fired only when intermediate results are enabled.
  @Override
  public void onRecognizedResultChanged(final String msg, int code) {
      Log.d(TAG, "OnRecognizedResultChanged " + msg + ": " + String.valueOf(code));
  }