iOS SDK 2.0

Last Updated: Jun 02, 2020

Real-time speech recognition

Compared with short sentence recognition, real-time speech recognition recognizes uninterrupted, long-form speech, such as conference talks and interviews. Real-time speech recognition detects the beginning and end of each sentence, and returns messages that indicate the sentence beginning, the sentence end, and the recognition result.

SDK download

  • Download the iOS SDK 2.0
  • Use Xcode 9.3 to open the NlsDemo.xcodeproj demo project. If you use another version of Xcode, you may need to import the demo project manually.
  • The iOS SDK in the demo (NlsDemo.framework or AliyunNlsSdk.framework) is built for the x86_64 simulator architecture and the ARMv7 and ARM64 device architectures.
  • The App Store does not accept binaries that contain the x86_64 simulator architecture. Therefore, the iOS SDK also provides a device-only framework, which you can build at NlsDemo-iOS/ReleaseFramework-iphoneOS/AliyunNlsSdk.framework.
  • AliyunNlsSdk is a dynamic library. Therefore, you must add NlsDemo.framework or AliyunNlsSdk.framework as an embedded binary when you import it into your project.

SDK usage

To import the iOS SDK to your project, add target files as embedded binaries.

  1. Import the header files AliyunNlsClientAdaptor.h, NlsSpeechTranscriberRequest.h, and TranscriberRequestParam.h from the NlsSdk directory.
  2. Implement the NlsSpeechTranscriberDelegate protocol, which defines the callback methods of the NlsSpeechTranscriberRequest object.
  3. Create an NlsClient object with AliyunNlsClientAdaptor. You can create this object globally and reuse it as needed.
  4. Call the createTranscriberRequest method of the NlsClient object to create the NlsSpeechTranscriberRequest object. The NlsSpeechTranscriberRequest object cannot be reused. You need to create it for each request.
  5. Create a TranscriberRequestParam object and set recognition parameters, such as the access token and appkey. For more information about the parameters, see the sample code below.
  6. Call the setTranscriberParams method of the NlsSpeechTranscriberRequest object to pass the parameters set in step 5.
  7. Call the start method or stop method of the NlsSpeechTranscriberRequest object to start or stop recognition.
  8. Call the sendAudio:(NSData *)audioData length:(int)len method of the NlsSpeechTranscriberRequest object to send audio data to the server.
  9. If the server generates a recognition result, the delegate callback implemented in step 2 is fired, returning the recognition result as text. The whole flow is condensed in the sketch after this list.
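
Condensed, steps 1 through 9 map to the following minimal sketch. The class and method names are taken from the steps above and from the sample code below; the token, appkey, and audioData values are placeholders, and error handling is omitted.

#import "AliyunNlsClientAdaptor.h"
#import "NlsSpeechTranscriberRequest.h"
#import "TranscriberRequestParam.h"

// Step 3: create one client; it can be created globally and reused.
NlsClientAdaptor *client = [[NlsClientAdaptor alloc] init];

// Step 4: create one request per recognition session; requests are not reusable.
NlsSpeechTranscriberRequest *request = [client createTranscriberRequest];
request.delegate = self; // Step 2: self implements NlsSpeechTranscriberDelegate.

// Steps 5 and 6: set recognition parameters and pass them to the request.
TranscriberRequestParam *param = [[TranscriberRequestParam alloc] init];
[param setToken:@"<your token>"];     // placeholder
[param setAppkey:@"<your appkey>"];   // placeholder
[request setTranscriberParams:param];

// Steps 7 and 8: start recognition, then send audio data to the server.
[request start];
[request sendAudio:audioData length:(int)audioData.length]; // call repeatedly
[request stop];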

Key objects

  • AliyunNlsClientAdaptor: the speech processing client, which is equivalent to a factory for all speech processing classes. You can globally create an AliyunNlsClientAdaptor instance. This object is thread-safe.
  • NlsSpeechTranscriberRequest: the request object of speech recognition. You can use this object to recognize speech. This object is thread-safe.
  • TranscriberRequestParam: parameters for speech recognition.
  • NlsSpeechTranscriberDelegate: the delegate that defines the callback functions of speech recognition. These callbacks are fired when recognition results are returned or errors occur. A typical firing order is sketched below.
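
For a typical utterance, the callbacks arrive in roughly the following order. This sketch is inferred from the callback descriptions in the sample code below; the exact sequence and timing depend on the service and the audio.

// Typical NlsSpeechTranscriberDelegate callback order for one sentence (sketch):
// OnTranscriptionStarted         -> the recognition session is ready
// OnSentenceBegin                -> the server detected the beginning of a sentence
// OnTranscriptionResultChanged   -> intermediate results; fires repeatedly when
//                                   EnableIntermediateResult is set to true
// OnSentenceEnd                  -> the final result for the sentence
// OnTranscriptionCompleted       -> recognition is completed after stop is called
// OnTaskFailed / OnChannelClosed -> fired on request failure or disconnection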

Sample code

#import <Foundation/Foundation.h>
#import "Transcriber.h"

@interface Transcriber () <NlsSpeechTranscriberDelegate, NlsVoiceRecorderDelegate> {
    IBOutlet UITextView *textViewTranscriber;
    IBOutlet UISwitch *switchTranscriber;
    Boolean transcriberStarted;
}
@end

@implementation Transcriber

- (void)viewDidLoad {
    [super viewDidLoad];
    // 1. Initialize global parameters.
    // 1.1 Initialize the client and set transcriberStarted to false.
    _nlsClient = [[NlsClientAdaptor alloc] init];
    transcriberStarted = false;
    // 1.2 Initialize the recorder.
    _voiceRecorder = [[NlsVoiceRecorder alloc] init];
    _voiceRecorder.delegate = self;
    // 1.3 Initialize the recognition parameter object TranscriberRequestParam.
    _transRequestParam = [[TranscriberRequestParam alloc] init];
    // 1.4 Specify the logging level.
    [_nlsClient setLog:NULL logLevel:1];
}

- (IBAction)startTranscriber {
    // 2. Create the request object and start recognition.
    if (_transcriberRequest != NULL) {
        [_transcriberRequest releaseRequest];
        _transcriberRequest = NULL;
    }
    // 2.1 Create the request object and set the NlsSpeechTranscriberDelegate object.
    _transcriberRequest = [_nlsClient createTranscriberRequest];
    _transcriberRequest.delegate = self;
    // 2.2 Set request parameters for the TranscriberRequestParam object.
    [_transRequestParam setServiceUrl:@"wss://nls-gateway-ap-southeast-1.aliyuncs.com/ws/v1"];
    [_transRequestParam setFormat:@"opu"];
    [_transRequestParam setEnableIntermediateResult:true];
    // Obtain a dynamic token. For more information, see https://www.alibabacloud.com/help/doc-detail/72153.htm.
    // Use your Alibaba Cloud account to generate the AccessKey ID and AccessKey secret
    // in the Alibaba Cloud console (https://ak-console.aliyun.com/).
    [_transRequestParam setToken:@""];
    // Obtain an appkey in the Intelligent Speech Interaction console (https://nls-portal.console.aliyun.com/).
    [_transRequestParam setAppkey:@""];
    // 2.3 Pass request parameters.
    [_transcriberRequest setTranscriberParams:_transRequestParam];
    // 2.4 Start recording and recognition, and set transcriberStarted to true.
    [_voiceRecorder start];
    [_transcriberRequest start];
    transcriberStarted = true;
    // 2.5 Update the UI.
    dispatch_async(dispatch_get_main_queue(), ^{
        // The code for UI update.
        [self->switchTranscriber setOn:true];
        self->textViewTranscriber.text = @"Start recognizing!";
    });
}

- (IBAction)stopTranscriber {
    // 3. Stop recording and recognition, and release the recognition request.
    [_voiceRecorder stop:true];
    [_transcriberRequest stop];
    transcriberStarted = false;
    _transcriberRequest = NULL;
    dispatch_async(dispatch_get_main_queue(), ^{
        // The code for UI update.
        [self->switchTranscriber setOn:false];
    });
}

/**
 * 4. Callbacks of the NlsSpeechTranscriberDelegate object.
 */
// 4.1 The recognition callback that is fired when the request fails.
- (void)OnTaskFailed:(NlsDelegateEvent)event statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
    NSLog(@"OnTaskFailed, statusCode is: %@ error message: %@", statusCode, eMsg);
}

// 4.2 The recognition callback that is fired when the server is disconnected.
- (void)OnChannelClosed:(NlsDelegateEvent)event statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
    NSLog(@"OnChannelClosed, statusCode is: %@", statusCode);
}

// 4.3 The recognition callback that is fired when real-time speech recognition starts.
- (void)OnTranscriptionStarted:(NlsDelegateEvent)event statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
}

// 4.4 The recognition callback that is fired when the server detects the beginning of a sentence.
- (void)OnSentenceBegin:(NlsDelegateEvent)event result:(NSString *)result statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
    dispatch_async(dispatch_get_main_queue(), ^{
        // The code for UI update.
        self->textViewTranscriber.text = result;
        NSLog(@"%@", result);
    });
}

// 4.5 The recognition callback that is fired when the server detects the end of a sentence.
- (void)OnSentenceEnd:(NlsDelegateEvent)event result:(NSString *)result statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
    dispatch_async(dispatch_get_main_queue(), ^{
        // The code for UI update.
        self->textViewTranscriber.text = result;
        NSLog(@"%@", result);
    });
}

// 4.6 The recognition callback that is fired when an intermediate result for a sentence is returned.
- (void)OnTranscriptionResultChanged:(NlsDelegateEvent)event result:(NSString *)result statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
    dispatch_async(dispatch_get_main_queue(), ^{
        // The code for UI update.
        self->textViewTranscriber.text = result;
        NSLog(@"%@", result);
    });
}

// 4.7 The recognition callback that is fired when the recognition is completed.
- (void)OnTranscriptionCompleted:(NlsDelegateEvent)event statusCode:(NSString *)statusCode errorMessage:(NSString *)eMsg {
}

/**
 * 5. Recording callbacks.
 */
- (void)recorderDidStart {
    NSLog(@"Did start recorder!");
}

- (void)recorderDidStop {
    NSLog(@"Did stop recorder!");
}

- (void)voiceDidFail:(NSError *)error {
    NSLog(@"Recorder error!");
}

// 5.1 The recording data callback. Send the data returned by the recording thread to the server.
- (void)voiceRecorded:(NSData *)frame {
    if (_transcriberRequest != nil && transcriberStarted) {
        if ([_transcriberRequest sendAudio:frame length:(int)frame.length] == -1) {
            NSLog(@"Connection closed, stopping transcriberRequest!");
            [self stopTranscriber];
        }
    }
}

@end
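
Because the sample records from the microphone through NlsVoiceRecorder, the host app must declare microphone usage by adding the NSMicrophoneUsageDescription key to its Info.plist (required since iOS 10). You can also request permission explicitly before starting the recorder; the following sketch uses the standard AVFoundation API, which is not part of the SDK.

#import <AVFoundation/AVFoundation.h>

// Ask for microphone access before calling [_voiceRecorder start].
// iOS shows the system permission prompt on the first request.
[[AVAudioSession sharedInstance] requestRecordPermission:^(BOOL granted) {
    if (!granted) {
        NSLog(@"Microphone permission denied; recognition cannot start.");
    }
}];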