RESTful API 2.0

Last Updated: Dec 16, 2019

Features

The short sentence recognition RESTful API allows you to use the POST method to upload an audio file that is no longer than 1 minute. The server returns the recognition result in JSON format in a response. You must ensure that the connection is not interrupted before the recognition result is returned.

  • Supports the following audio coding formats: pulse-code modulation (PCM) (uncompressed PCM or WAV files) and Opus. The audio must be 16-bit mono.
  • Supports the following audio sampling rates: 8,000 Hz and 16,000 Hz.
  • Allows you to specify whether to add punctuation marks during post-processing, and whether to convert Chinese numerals to Arabic numerals.
  • Allows you to configure hotwords and custom models in the console.
  • Recognizes multiple languages. You can specify the language to be recognized by selecting a model when you modify a project in the Intelligent Speech Interaction console.
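
The audio constraints listed above can be checked on the client before a file is uploaded. The following Python sketch is only an illustration, not part of the service API: the helper names check_wav and make_wav are hypothetical, and the sketch assumes that the input is a WAV file that the standard wave module can read.

```python
import io
import wave

# Limits taken from the feature list above.
ALLOWED_SAMPLE_RATES = (8000, 16000)
MAX_DURATION_SECONDS = 60.0


def check_wav(source):
    """Return a list of problems found in a WAV file (empty if none).

    `source` is a file path or a binary file-like object.
    check_wav is a hypothetical helper, not part of the service API.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append("audio must be mono")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16-bit
            problems.append("audio must be 16-bit")
        rate = wav.getframerate()
        if rate not in ALLOWED_SAMPLE_RATES:
            problems.append("unsupported sampling rate: %d Hz" % rate)
        if rate and wav.getnframes() / float(rate) > MAX_DURATION_SECONDS:
            problems.append("audio is longer than 1 minute")
    return problems


def make_wav(channels, sample_width, rate, seconds):
    """Build an in-memory silent WAV file, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * sample_width * rate * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav(make_wav(1, 2, 16000, 1)))   # a valid 1-second file
    print(check_wav(make_wav(2, 2, 44100, 1)))   # stereo, 44.1 kHz
```

A check of this kind only catches format problems early. The server still decides whether the sampling rate matches the model selected for the project.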

Interaction process

The client sends an HTTP REST POST request with audio data to the server, and the server returns an HTTP response with the recognition result.

Note: The server adds the task_id field to every response to indicate the ID of the recognition task. Record the value of this field. If an error occurs, you can open a ticket and submit the task ID and the error message.

Endpoint

| Access type | Description | URL | Host |
| --- | --- | --- | --- |
| External access from the Internet | This endpoint allows you to access the short sentence recognition service from any host over the Internet. | http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr | nls-gateway-ap-southeast-1.aliyuncs.com |

This topic uses access over the Internet as an example. To use an ECS instance located in China (Shanghai) to access the short sentence recognition service over an internal network, replace the URL and host for Internet access with those for internal access.

Audio file upload

The following example shows an HTTP request sent to the short sentence recognition RESTful API:

    POST /stream/v1/asr?appkey=23****f5&format=pcm&sample_rate=16000&enable_punctuation_prediction=true&enable_inverse_text_normalization=true HTTP/1.1
    X-NLS-Token: 450372e4279******bcc2b3c793
    Content-type: application/octet-stream
    Content-Length: 94616
    Host: nls-gateway-ap-southeast-1.aliyuncs.com

    [audio data]

A complete request of the short sentence recognition RESTful API must contain the following elements:

HTTP request line

The HTTP request line contains the URL and request parameters.

URL

| Protocol | URL | Method |
| --- | --- | --- |
| HTTP/1.1 | http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr | POST |

Request parameters

| Parameter | Type | Description |
| --- | --- | --- |
| appkey | String | Required. The appkey of the application. For more information about how to obtain the appkey, see Create a project. |
| format | String | Optional. The audio coding format. Valid values: pcm and opus. Default value: pcm. |
| sample_rate | Integer | Optional. The audio sampling rate, in Hz. Valid values: 16000 and 8000. Default value: 16000. |
| enable_punctuation_prediction | Boolean | Optional. Specifies whether to add punctuation marks during post-processing. Valid values: true and false. Default value: false. |
| enable_inverse_text_normalization | Boolean | Optional. Specifies whether to enable inverse text normalization (ITN) during post-processing. Valid values: true and false. Default value: false. |
| enable_voice_detection | Boolean | Optional. Specifies whether to enable voice detection. Valid values: true and false. Default value: false. Note: If voice detection is enabled, the server detects whether the uploaded audio file includes any silent fragment. If the server detects a silent fragment, it removes the silent fragment and the subsequent content without recognizing them. The recognition result varies depending on the model. |

The complete request URL composed of the preceding URL and request parameters is as follows:

    http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr?appkey=Yu1******uncS&format=pcm&sample_rate=16000&enable_punctuation_prediction=true&enable_inverse_text_normalization=true&enable_voice_detection=true
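
As a minimal sketch, the request URL can also be assembled programmatically from the parameters in the table above. The function name build_request_url and its keyword arguments are illustrative assumptions. Only the parameter names and the endpoint come from this topic.

```python
from urllib.parse import urlencode

# Endpoint for Internet access, as listed in the Endpoint section.
BASE_URL = "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr"


def build_request_url(appkey, audio_format="pcm", sample_rate=16000,
                      punctuation=False, itn=False, voice_detection=False):
    """Build the request URL (hypothetical helper, not part of the API)."""
    params = [
        ("appkey", appkey),
        ("format", audio_format),
        ("sample_rate", str(sample_rate)),
    ]
    # The optional Boolean parameters default to false on the server,
    # so they are appended only when enabled, as lowercase "true".
    optional_flags = (
        ("enable_punctuation_prediction", punctuation),
        ("enable_inverse_text_normalization", itn),
        ("enable_voice_detection", voice_detection),
    )
    for name, enabled in optional_flags:
        if enabled:
            params.append((name, "true"))
    # urlencode percent-encodes values that are not URL-safe.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_request_url("your-appkey", punctuation=True, itn=True))
```

Using urlencode instead of string concatenation keeps the URL valid if a parameter value ever contains a character that requires escaping.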

HTTP request header

The HTTP request header consists of keyword/value pairs. Each keyword/value pair occupies a line. The keyword and value in each pair are separated with a colon (:). The following describes the parameters required for an HTTP request.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| X-NLS-Token | String | Required | The service authentication token. For more information about how to obtain the token, see Obtain a token. |
| Content-type | String | Required | The content type. Fixed value: application/octet-stream, which indicates that the data in the HTTP request body is a binary stream. |
| Content-Length | Long | Required | The length of the data in the HTTP request body, that is, the length of the audio file. |
| Host | String | Required | The server domain name of the HTTP request. Fixed value: nls-gateway-ap-southeast-1.aliyuncs.com. |

HTTP request body

The HTTP request body contains binary audio data. Therefore, you must set the Content-Type parameter in the HTTP request header to application/octet-stream.
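
The header and body rules above can be sketched in Python as follows. The helper name build_headers is hypothetical. Many HTTP client libraries set Content-Length and Host automatically, so setting them explicitly as shown here is an assumption made for illustration only.

```python
# Fixed host value from the request header table above.
HOST = "nls-gateway-ap-southeast-1.aliyuncs.com"


def build_headers(token, audio_data):
    """Return the request headers for a binary audio body.

    build_headers is a hypothetical helper, not part of the API.
    `audio_data` is the raw content of the audio file as bytes.
    """
    return {
        "X-NLS-Token": token,
        # The body is a binary stream, so the content type is fixed.
        "Content-Type": "application/octet-stream",
        # The length of the body in bytes, that is, the audio file size.
        "Content-Length": str(len(audio_data)),
        "Host": HOST,
    }


if __name__ == "__main__":
    fake_audio = b"\x00" * 94616  # placeholder bytes, not real audio
    for name, value in build_headers("your-token", fake_audio).items():
        print(name + ": " + value)
```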

Response

After the client sends an HTTP request with audio data, the server returns a response with the recognition result in JSON format.

Sample success response

    {
        "task_id": "cf7b0c5339244ee29cd4e43fb97fd52e",
        "result": "Weather in Beijing.",
        "status": 20000000,
        "message": "SUCCESS"
    }

Sample error response

The following example indicates an authentication token error:

    {
        "task_id": "8bae3613dfc54ebfa811a17d8a7a9ae7",
        "result": "",
        "status": 40000001,
        "message": "Gateway:ACCESS_DENIED:The token 'c0c1e860f3*******de8091c68a' is invalid!"
    }

Response parameters

| Parameter | Type | Description |
| --- | --- | --- |
| task_id | String | The 32-character task ID. Record this value for troubleshooting. |
| result | String | The speech recognition result. |
| status | Integer | The service status code. |
| message | String | The service status description. |
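
The response fields above can be read into a small typed structure. The following Python sketch is one possible approach. The names RecognitionResponse and parse_response are hypothetical, and the sample strings are the success and error responses shown earlier in this topic.

```python
import json
from typing import NamedTuple


class RecognitionResponse(NamedTuple):
    """The four fields described in the response parameter table."""
    task_id: str
    result: str
    status: int
    message: str


def parse_response(body):
    """Parse the JSON response body (str or bytes) returned by the server.

    parse_response is a hypothetical helper, not part of the API.
    Raises ValueError if the body is not valid JSON.
    """
    data = json.loads(body)
    return RecognitionResponse(
        task_id=data.get("task_id", ""),
        result=data.get("result", ""),
        status=int(data["status"]),
        message=data.get("message", ""),
    )


SAMPLE_SUCCESS = (
    '{"task_id": "cf7b0c5339244ee29cd4e43fb97fd52e", '
    '"result": "Weather in Beijing.", "status": 20000000, "message": "SUCCESS"}'
)
SAMPLE_ERROR = (
    '{"task_id": "8bae3613dfc54ebfa811a17d8a7a9ae7", "result": "", '
    '"status": 40000001, "message": "Gateway:ACCESS_DENIED"}'
)

if __name__ == "__main__":
    response = parse_response(SAMPLE_SUCCESS)
    print(response.task_id, response.status, response.result)
```

Keeping task_id in the parsed structure makes it easy to log the ID with every failure, which is the value requested when you open a ticket.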

Service status codes

The status code 20000000 indicates that the request is successful. A status code starting with 4 indicates a client error, and a status code starting with 5 indicates a server error.

| Service status code | Service status description | Solution |
| --- | --- | --- |
| 20000000 | The request is successful. | |
| 40000000 | The error message returned because a client error has occurred. This is the default client error code. | Resolve the error according to the error message or open a ticket. |
| 40000001 | The error message returned because the client fails authentication. | Check whether the token used by the client is correct and valid. |
| 40000002 | The error message returned because the message is invalid. | Check whether the message sent by the client meets relevant requirements. |
| 40000003 | The error message returned because the parameter value is invalid. | Check whether the parameter value is correct. |
| 40000004 | The error message returned because the idle status of the client times out. | Check whether the client has not sent any data to the server for a long time. |
| 40000005 | The error message returned because the number of requests exceeds the upper limit. | Check whether the number of concurrent connections or the queries per second (QPS) exceeds the upper limit. |
| 41010101 | The error message returned because the sampling rate is not supported. | Check whether the sampling rate parameter (8000 or 16000) set in the code matches the model (8k or 16k) that corresponds to the appkey in the console. |
| 50000000 | The error message returned because a server error has occurred. This is the default server error code. | If the error code is occasionally returned, ignore it. If the error code is returned multiple times, open a ticket. |
| 50000001 | The error message returned because an internal gRPC call error has occurred. | If the error code is occasionally returned, ignore it. If the error code is returned multiple times, open a ticket. |
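
The rule for reading status codes (20000000 for success, a leading 4 for client errors, a leading 5 for server errors) can be expressed as a short Python sketch. The function name classify_status and the returned labels are illustrative assumptions, not values defined by the service.

```python
SUCCESS_STATUS = 20000000


def classify_status(status):
    """Map a service status code to a coarse category.

    classify_status and its labels are hypothetical, for illustration only.
    """
    if status == SUCCESS_STATUS:
        return "success"
    leading_digit = str(status)[0]
    if leading_digit == "4":
        # Client side: check the token, the parameters, and the request rate.
        return "client_error"
    if leading_digit == "5":
        # Server side: occasional errors can be ignored, repeated ones reported.
        return "server_error"
    return "unknown"


if __name__ == "__main__":
    for code in (20000000, 40000001, 41010101, 50000001):
        print(code, classify_status(code))
```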

Quick test

Download the nls-sample-16k.wav audio file.

You can run a cURL command to quickly test the short sentence recognition RESTful API.

    curl -X POST -H "X-NLS-Token: ${token}" -H "Content-Type: application/octet-stream" "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr?appkey=${appkey}" --data-binary @${audio_file}

Example:

    curl -X POST -H "X-NLS-Token: 4a036*******531cf" -H "Content-Type: application/octet-stream" "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr?appkey=tt4******3P2u" --data-binary @./nls-sample-16k.wav

Audio file description: The nls-sample-16k.wav audio file is intended for the universal model. If you use another audio file for the test, specify the audio coding format and sampling rate, and select an applicable model in the Intelligent Speech Interaction console. For more information about model settings, see Manage projects.

Java demo

Dependencies:

    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>3.9.1</version>
    </dependency>
    <!-- http://mvnrepository.com/artifact/com.alibaba/fastjson -->
    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>fastjson</artifactId>
        <version>1.2.42</version>
    </dependency>

Request and response:

    import com.alibaba.fastjson.JSONPath;
    import com.alibaba.nls.client.example.utils.HttpUtil;
    import java.util.HashMap;

    public class SpeechRecognizerRESTfulDemo {
        private String accessToken;
        private String appkey;

        public SpeechRecognizerRESTfulDemo(String appkey, String token) {
            this.appkey = appkey;
            this.accessToken = token;
        }

        public void process(String fileName, String format, int sampleRate,
                            boolean enablePunctuationPrediction,
                            boolean enableInverseTextNormalization,
                            boolean enableVoiceDetection) {
            /**
             * Set the HTTP REST POST request.
             * 1. Use HTTP.
             * 2. Set the domain name for the speech recognition service to nls-gateway-ap-southeast-1.aliyuncs.com.
             * 3. Set the request path of the speech recognition API to /stream/v1/asr.
             * 4. Set the request parameters appkey (required), format, and sample_rate.
             * 5. Set the optional request parameters: enable_punctuation_prediction, enable_inverse_text_normalization, and enable_voice_detection.
             */
            String url = "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr";
            String request = url;
            request = request + "?appkey=" + appkey;
            request = request + "&format=" + format;
            request = request + "&sample_rate=" + sampleRate;
            if (enablePunctuationPrediction) {
                request = request + "&enable_punctuation_prediction=" + true;
            }
            if (enableInverseTextNormalization) {
                request = request + "&enable_inverse_text_normalization=" + true;
            }
            if (enableVoiceDetection) {
                request = request + "&enable_voice_detection=" + true;
            }
            System.out.println("Request: " + request);

            /**
             * Set the HTTP request header.
             * 1. Authentication parameters
             * 2. Content-Type: application/octet-stream
             */
            HashMap<String, String> headers = new HashMap<String, String>();
            headers.put("X-NLS-Token", this.accessToken);
            headers.put("Content-Type", "application/octet-stream");

            /**
             * Send the HTTP POST request and process the response returned by the server.
             */
            String response = HttpUtil.sendPostFile(request, headers, fileName);
            if (response != null) {
                System.out.println("Response: " + response);
                String result = JSONPath.read(response, "result").toString();
                System.out.println("Recognition result: " + result);
            } else {
                System.err.println("Recognition failed!");
            }
        }

        public static void main(String[] args) {
            if (args.length < 2) {
                System.err.println("SpeechRecognizerRESTfulDemo needs params: <token> <app-key>");
                System.exit(-1);
            }
            String token = args[0];
            String appkey = args[1];
            SpeechRecognizerRESTfulDemo demo = new SpeechRecognizerRESTfulDemo(appkey, token);
            String fileName = SpeechRecognizerRESTfulDemo.class.getClassLoader().getResource("./nls-sample-16k.wav").getPath();
            String format = "pcm";
            int sampleRate = 16000;
            boolean enablePunctuationPrediction = true;
            boolean enableInverseTextNormalization = true;
            boolean enableVoiceDetection = false;
            demo.process(fileName, format, sampleRate, enablePunctuationPrediction, enableInverseTextNormalization, enableVoiceDetection);
        }
    }

HttpUtil class:

    import okhttp3.*;
    import java.io.File;
    import java.io.IOException;
    import java.net.SocketTimeoutException;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.TimeUnit;

    public class HttpUtil {
        // Execute the request with connect, read, and write timeouts and return the response body.
        private static String getResponseWithTimeout(Request q) {
            String ret = null;
            OkHttpClient.Builder httpBuilder = new OkHttpClient.Builder();
            OkHttpClient client = httpBuilder.connectTimeout(10, TimeUnit.SECONDS)
                    .readTimeout(60, TimeUnit.SECONDS)
                    .writeTimeout(60, TimeUnit.SECONDS)
                    .build();
            try {
                Response s = client.newCall(q).execute();
                ret = s.body().string();
                s.close();
            } catch (SocketTimeoutException e) {
                ret = null;
                System.err.println("get result timeout");
            } catch (IOException e) {
                System.err.println("get result error " + e.getMessage());
            }
            return ret;
        }

        // Send the content of a file as the body of a POST request.
        public static String sendPostFile(String url, HashMap<String, String> headers, String fileName) {
            RequestBody body;
            File file = new File(fileName);
            if (!file.isFile()) {
                System.err.println("The filePath is not a file: " + fileName);
                return null;
            } else {
                body = RequestBody.create(MediaType.parse("application/octet-stream"), file);
            }
            Headers.Builder hb = new Headers.Builder();
            if (headers != null && !headers.isEmpty()) {
                for (Map.Entry<String, String> entry : headers.entrySet()) {
                    hb.add(entry.getKey(), entry.getValue());
                }
            }
            Request request = new Request.Builder()
                    .url(url)
                    .headers(hb.build())
                    .post(body)
                    .build();
            return getResponseWithTimeout(request);
        }

        // Send a byte array as the body of a POST request.
        public static String sendPostData(String url, HashMap<String, String> headers, byte[] data) {
            RequestBody body;
            if (data.length == 0) {
                System.err.println("The send data is empty.");
                return null;
            } else {
                body = RequestBody.create(MediaType.parse("application/octet-stream"), data);
            }
            Headers.Builder hb = new Headers.Builder();
            if (headers != null && !headers.isEmpty()) {
                for (Map.Entry<String, String> entry : headers.entrySet()) {
                    hb.add(entry.getKey(), entry.getValue());
                }
            }
            Request request = new Request.Builder()
                    .url(url)
                    .headers(hb.build())
                    .post(body)
                    .build();
            return getResponseWithTimeout(request);
        }
    }

C++ demo

The C++ demo uses the third-party library cURL to process HTTP requests and responses. You can download the cURL library and demo.

Directory description:

  • CMakeLists.txt: the CMakeList file of the demo project.
  • demo: the demo directory, which contains restfulAsrDemo.cpp, the demo of the short sentence recognition RESTful API.
  • include: the header file directory, which contains curl, the header file directory of the cURL library.
  • lib: the library, which contains the dynamic library cURL 7.60. Depending on the operating system, you can select the version for Linux (runtime environment: glibc 2.5 or later, and GCC 4 or GCC 5), or the version for Windows (runtime environment: Visual Studio 2013 or Visual Studio 2015).
  • readme.txt: the description.
  • release.log: the release notes.
  • version: the version number.
  • build.sh: the demo compilation script.

Note:

  1. In Linux, the minimum runtime environment requirements are as follows: glibc 2.5 or later, and GCC 4 or GCC 5.
  2. In Windows, you need to build your own demo project.
  3. The downloaded C++ demo package contains the test audio file sample.pcm.

Compilation and running:

  1. Check whether you have installed CMake 2.4 or later on your local host.
  2. cd path/to/sdk/lib
  3. tar -zxvpf linux.tar.gz
  4. cd path/to/sdk
  5. Run the ./build.sh command to compile the demo.
  6. After compilation, go to the demo directory and run the ./restfulAsrDemo command.

If the operating system does not support CMake, you can run the following commands to manually compile the demo:

  1. cd path/to/sdk/lib
  2. tar -zxvpf linux.tar.gz
  3. cd path/to/sdk/demo
  4. g++ -o restfulAsrDemo restfulAsrDemo.cpp -I path/to/sdk/include -L path/to/sdk/lib/linux -lssl -lcrypto -lcurl -D_GLIBCXX_USE_CXX11_ABI=0
  5. export LD_LIBRARY_PATH=path/to/sdk/lib/linux/
  6. ./restfulAsrDemo your-token your-appkey

In Windows, you need to build your own demo project.

Sample code:

    #include <iostream>
    #include <string>
    #include <fstream>
    #include <sstream>
    #include <cstring>
    #include "curl/curl.h"
    #ifdef _WIN32
    #include <windows.h>
    #endif
    using namespace std;

    #ifdef _WIN32
    // Convert a UTF-8 string to the system ANSI code page for console output.
    string UTF8ToGBK(const string& strUTF8) {
        int len = MultiByteToWideChar(CP_UTF8, 0, strUTF8.c_str(), -1, NULL, 0);
        unsigned short * wszGBK = new unsigned short[len + 1];
        memset(wszGBK, 0, len * 2 + 2);
        MultiByteToWideChar(CP_UTF8, 0, (char*)strUTF8.c_str(), -1, (wchar_t*)wszGBK, len);
        len = WideCharToMultiByte(CP_ACP, 0, (wchar_t*)wszGBK, -1, NULL, 0, NULL, NULL);
        char *szGBK = new char[len + 1];
        memset(szGBK, 0, len + 1);
        WideCharToMultiByte(CP_ACP, 0, (wchar_t*)wszGBK, -1, szGBK, len, NULL, NULL);
        string strTemp(szGBK);
        delete[] szGBK;
        delete[] wszGBK;
        return strTemp;
    }
    #endif

    /**
     * Set the response callback function for the HTTP request sent through the short sentence recognition RESTful API.
     * The recognition result is a JSON string.
     */
    size_t responseCallback(void* ptr, size_t size, size_t nmemb, void* userData) {
        string* srResult = (string*)userData;
        size_t len = size * nmemb;
        char *pBuf = (char*)ptr;
        string response = string(pBuf, pBuf + len);
    #ifdef _WIN32
        response = UTF8ToGBK(response);
    #endif
        cout << "current result: " << response << endl;
        *srResult += response;
        cout << "total result: " << *srResult << endl;
        return len;
    }

    int sendAsrRequest(const char* request, const char* token, const char* fileName, string* srResult) {
        CURL* curl = NULL;
        CURLcode res;

        /**
         * Read the audio file in binary mode.
         */
        ifstream fs;
        fs.open(fileName, ios::in | ios::binary);
        if (!fs.is_open()) {
            cerr << "The audio file does not exist!" << endl;
            return -1;
        }
        stringstream buffer;
        buffer << fs.rdbuf();
        string audioData(buffer.str());

        curl = curl_easy_init();
        if (curl == NULL) {
            return -1;
        }

        /**
         * Set the HTTP request line.
         */
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "POST");
        curl_easy_setopt(curl, CURLOPT_URL, request);

        /**
         * Set the HTTP request header.
         */
        struct curl_slist* headers = NULL;
        // token
        string X_NLS_Token = "X-NLS-Token:";
        X_NLS_Token += token;
        headers = curl_slist_append(headers, X_NLS_Token.c_str());
        // Content-Type
        headers = curl_slist_append(headers, "Content-Type:application/octet-stream");
        // Content-Length
        string content_Length = "Content-Length:";
        ostringstream oss;
        oss << content_Length << audioData.length();
        content_Length = oss.str();
        headers = curl_slist_append(headers, content_Length.c_str());
        curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);

        /**
         * Set the HTTP request body.
         * CURLOPT_POSTFIELDSIZE expects a long, so the length is cast explicitly.
         */
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, audioData.c_str());
        curl_easy_setopt(curl, CURLOPT_POSTFIELDSIZE, (long)audioData.length());

        /**
         * Set the response callback function for the HTTP request.
         */
        curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, responseCallback);
        curl_easy_setopt(curl, CURLOPT_WRITEDATA, srResult);

        /**
         * Send the HTTP request.
         */
        res = curl_easy_perform(curl);

        // Release the resource.
        curl_slist_free_all(headers);
        curl_easy_cleanup(curl);

        if (res != CURLE_OK) {
            cerr << "curl_easy_perform failed: " << curl_easy_strerror(res) << endl;
            return -1;
        }
        return 0;
    }

    int process(const char* request, const char* token, const char* fileName) {
        // The demo is initialized only once.
        curl_global_init(CURL_GLOBAL_ALL);
        string srResult = "";
        int ret = sendAsrRequest(request, token, fileName, &srResult);
        curl_global_cleanup();
        return ret;
    }

    int main(int argc, char* argv[]) {
        if (argc < 3) {
            cerr << "params is not valid. Usage: ./restfulAsrDemo your_token your_appkey" << endl;
            return -1;
        }
        string token = argv[1];
        string appKey = argv[2];
        string url = "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr";
        string format = "pcm";
        int sampleRate = 16000;
        bool enablePunctuationPrediction = true;
        bool enableInverseTextNormalization = true;
        bool enableVoiceDetection = false;
        string fileName = "sample.pcm";

        /**
         * Set the RESTful request parameters.
         */
        ostringstream oss;
        oss << url;
        oss << "?appkey=" << appKey;
        oss << "&format=" << format;
        oss << "&sample_rate=" << sampleRate;
        if (enablePunctuationPrediction) {
            oss << "&enable_punctuation_prediction=" << "true";
        }
        if (enableInverseTextNormalization) {
            oss << "&enable_inverse_text_normalization=" << "true";
        }
        if (enableVoiceDetection) {
            oss << "&enable_voice_detection=" << "true";
        }
        string request = oss.str();
        cout << "request: " << request << endl;

        process(request.c_str(), token.c_str(), fileName.c_str());
        return 0;
    }

Python demo

Note: Use the httplib module for Python 2.x, and use the http.client module for Python 3.x.

    # -*- coding: UTF-8 -*-
    # Import the httplib module for Python 2.x.
    # import httplib
    # Import the http.client module for Python 3.x.
    import http.client
    import json

    def process(request, token, audioFile):
        # Read the audio file.
        with open(audioFile, mode='rb') as f:
            audioContent = f.read()

        host = 'nls-gateway-ap-southeast-1.aliyuncs.com'

        # Set the HTTP request header.
        httpHeaders = {
            'X-NLS-Token': token,
            'Content-type': 'application/octet-stream',
            'Content-Length': len(audioContent)
        }

        # Use the httplib module for Python 2.x.
        # conn = httplib.HTTPConnection(host)
        # Use the http.client module for Python 3.x.
        conn = http.client.HTTPConnection(host)
        conn.request(method='POST', url=request, body=audioContent, headers=httpHeaders)

        response = conn.getresponse()
        print('Response status and response reason:')
        print(response.status, response.reason)

        body = response.read()
        try:
            print('Recognition response:')
            body = json.loads(body)
            print(body)
            status = body['status']
            if status == 20000000:
                result = body['result']
                print('Recognition result: ' + result)
            else:
                print('Recognition failed!')
        except ValueError:
            print('The response is not a JSON string.')
        conn.close()

    appKey = 'Your appkey'
    token = 'Your service authentication token'

    # Set the service request address.
    url = 'http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr'
    # Set the audio file.
    audioFile = '/path/to/nls-sample-16k.wav'
    format = 'pcm'
    sampleRate = 16000
    enablePunctuationPrediction = True
    enableInverseTextNormalization = True
    enableVoiceDetection = False

    # Set RESTful request parameters.
    request = url + '?appkey=' + appKey
    request = request + '&format=' + format
    request = request + '&sample_rate=' + str(sampleRate)
    if enablePunctuationPrediction:
        request = request + '&enable_punctuation_prediction=' + 'true'
    if enableInverseTextNormalization:
        request = request + '&enable_inverse_text_normalization=' + 'true'
    if enableVoiceDetection:
        request = request + '&enable_voice_detection=' + 'true'
    print('Request: ' + request)

    process(request, token, audioFile)

PHP demo

Note: The PHP demo uses cURL functions. Ensure that you have installed PHP 4.0.2 or later and the cURL extension.

    <?php
    function process($token, $request, $audioFile) {
        /**
         * Read the audio file.
         */
        $audioContent = file_get_contents($audioFile);
        if ($audioContent === FALSE) {
            print "The audio file does not exist!\n";
            return;
        }

        $curl = curl_init();
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl, CURLOPT_TIMEOUT, 120);

        /**
         * Set the HTTP request line.
         */
        curl_setopt($curl, CURLOPT_URL, $request);
        curl_setopt($curl, CURLOPT_POST, TRUE);

        /**
         * Set the HTTP request header.
         */
        $contentType = "application/octet-stream";
        $contentLength = strlen($audioContent);
        $headers = array(
            "X-NLS-Token:" . $token,
            "Content-type:" . $contentType,
            "Content-Length:" . strval($contentLength)
        );
        curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);

        /**
         * Set the HTTP request body.
         */
        curl_setopt($curl, CURLOPT_POSTFIELDS, $audioContent);
        curl_setopt($curl, CURLOPT_NOBODY, FALSE);

        /**
         * Send the HTTP request.
         */
        $returnData = curl_exec($curl);
        curl_close($curl);
        if ($returnData === FALSE) {
            print "curl_exec failed!\n";
            return;
        }
        print $returnData . "\n";

        $resultArr = json_decode($returnData, true);
        $status = $resultArr["status"];
        if ($status == 20000000) {
            $result = $resultArr["result"];
            print "Recognition result: " . $result . "\n";
        } else {
            print "Recognition failed.\n";
        }
    }

    $appkey = "Your appkey";
    $token = "Your token";
    $url = "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr";
    $audioFile = "/path/to/nls-sample-16k.wav";
    $format = "pcm";
    $sampleRate = 16000;
    $enablePunctuationPrediction = TRUE;
    $enableInverseTextNormalization = TRUE;
    $enableVoiceDetection = FALSE;

    /**
     * Set the RESTful request parameters.
     */
    $request = $url;
    $request = $request . "?appkey=" . $appkey;
    $request = $request . "&format=" . $format;
    $request = $request . "&sample_rate=" . strval($sampleRate);
    if ($enablePunctuationPrediction) {
        $request = $request . "&enable_punctuation_prediction=" . "true";
    }
    if ($enableInverseTextNormalization) {
        $request = $request . "&enable_inverse_text_normalization=" . "true";
    }
    if ($enableVoiceDetection) {
        $request = $request . "&enable_voice_detection=" . "true";
    }
    print "Request: " . $request . "\n";

    process($token, $request, $audioFile);
    ?>

Node.js demo

Note: To install the request dependency, run the following command in the directory of your demo file:

    npm install request --save

Sample code:

    const request = require('request');
    const fs = require('fs');

    // Process the response returned by the server.
    function callback(error, response, body) {
        if (error != null) {
            console.log(error);
        } else {
            console.log('Recognition response:');
            console.log(body);
            if (response.statusCode == 200) {
                body = JSON.parse(body);
                if (body.status == 20000000) {
                    console.log('result: ' + body.result);
                    console.log('Recognition succeeded!');
                } else {
                    console.log('Recognition failed!');
                }
            } else {
                console.log('Recognition failed, http code: ' + response.statusCode);
            }
        }
    }

    function process(requestUrl, token, audioFile) {
        /**
         * Read the audio file.
         */
        var audioContent = null;
        try {
            audioContent = fs.readFileSync(audioFile);
        } catch (error) {
            if (error.code == 'ENOENT') {
                console.log('The audio file does not exist!');
            } else {
                console.log(error);
            }
            return;
        }

        /**
         * Set the HTTP request header.
         */
        var httpHeaders = {
            'X-NLS-Token': token,
            'Content-type': 'application/octet-stream',
            'Content-Length': audioContent.length
        };

        var options = {
            url: requestUrl,
            method: 'POST',
            headers: httpHeaders,
            body: audioContent
        };
        request(options, callback);
    }

    var appkey = 'Your appkey';
    var token = 'Your service authentication token';
    var url = 'http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr';
    var audioFile = '/path/to/nls-sample-16k.wav';
    var format = 'pcm';
    var sampleRate = '16000';
    var enablePunctuationPrediction = true;
    var enableInverseTextNormalization = true;
    var enableVoiceDetection = false;

    /**
     * Set the RESTful request parameters.
     */
    var requestUrl = url;
    requestUrl = requestUrl + '?appkey=' + appkey;
    requestUrl = requestUrl + '&format=' + format;
    requestUrl = requestUrl + '&sample_rate=' + sampleRate;
    if (enablePunctuationPrediction) {
        requestUrl = requestUrl + '&enable_punctuation_prediction=' + 'true';
    }
    if (enableInverseTextNormalization) {
        requestUrl = requestUrl + '&enable_inverse_text_normalization=' + 'true';
    }
    if (enableVoiceDetection) {
        requestUrl = requestUrl + '&enable_voice_detection=' + 'true';
    }

    process(requestUrl, token, audioFile);

.NET demo

Note: The demo depends on System.Net.Http and Newtonsoft.Json.Linq.

using System;
using System.IO;
using System.Net.Http;
using Newtonsoft.Json.Linq;

namespace RESTfulAPI
{
    class SpeechRecognizerRESTfulDemo
    {
        private string token;
        private string appkey;

        public SpeechRecognizerRESTfulDemo(string appkey, string token)
        {
            this.appkey = appkey;
            this.token = token;
        }

        public void process(string fileName, string format, int sampleRate,
                            bool enablePunctuationPrediction,
                            bool enableInverseTextNormalization,
                            bool enableVoiceDetection)
        {
            /**
             * Set the HTTP REST POST request.
             * 1. Use HTTP.
             * 2. Set the domain name for the speech recognition service to nls-gateway-ap-southeast-1.aliyuncs.com.
             * 3. Set the request path of the speech recognition API to /stream/v1/asr.
             * 4. Set the required request parameters: appkey, format, and sample_rate.
             * 5. Set the optional request parameters: enable_punctuation_prediction, enable_inverse_text_normalization, and enable_voice_detection.
             */
            string url = "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr";
            url = url + "?appkey=" + appkey;
            url = url + "&format=" + format;
            url = url + "&sample_rate=" + sampleRate;
            // Append the literal string "true": concatenating a C# bool would produce "True".
            if (enablePunctuationPrediction)
            {
                url = url + "&enable_punctuation_prediction=true";
            }
            if (enableInverseTextNormalization)
            {
                url = url + "&enable_inverse_text_normalization=true";
            }
            if (enableVoiceDetection)
            {
                url = url + "&enable_voice_detection=true";
            }
            System.Console.WriteLine("URL: " + url);

            HttpClient client = new HttpClient();
            /**
             * Set the HTTP request header.
             * Set the authentication parameters.
             */
            client.DefaultRequestHeaders.Add("X-NLS-Token", token);

            if (!File.Exists(fileName))
            {
                System.Console.WriteLine("The audio file does not exist.");
                return;
            }
            byte[] audioData = File.ReadAllBytes(fileName);

            /**
             * Set the HTTP request body.
             * The content of the HTTP request body is the binary data of the audio file.
             * Content-Type: application/octet-stream
             */
            ByteArrayContent content = new ByteArrayContent(audioData);
            content.Headers.Add("Content-Type", "application/octet-stream");

            /**
             * Send the HTTP POST request and process the response returned by the server.
             */
            HttpResponseMessage response = client.PostAsync(url, content).Result;
            string responseBodyAsText = response.Content.ReadAsStringAsync().Result;
            System.Console.WriteLine("Response: " + responseBodyAsText);
            if (response.IsSuccessStatusCode)
            {
                JObject obj = JObject.Parse(responseBodyAsText);
                string result = obj["result"].ToString();
                System.Console.WriteLine("Recognition result: " + result);
            }
            else
            {
                System.Console.WriteLine("Response status code and reason phrase: " +
                    response.StatusCode + " " + response.ReasonPhrase);
                System.Console.WriteLine("Recognition failed!");
            }
        }

        static void Main(string[] args)
        {
            if (args.Length < 2)
            {
                System.Console.WriteLine("SpeechRecognizerRESTfulDemo requires the parameters: <token> <app-key>");
                return;
            }
            string token = args[0];
            string appkey = args[1];
            SpeechRecognizerRESTfulDemo demo = new SpeechRecognizerRESTfulDemo(appkey, token);

            string fileName = "nls-sample-16k.wav";
            string format = "pcm";
            int sampleRate = 16000;
            bool enablePunctuationPrediction = true;
            bool enableInverseTextNormalization = true;
            bool enableVoiceDetection = false;
            demo.process(fileName, format, sampleRate,
                enablePunctuationPrediction, enableInverseTextNormalization, enableVoiceDetection);
        }
    }
}

Go demo

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io/ioutil"
	"net/http"
	"strconv"
)

func process(appkey string, token string, fileName string, format string, sampleRate int,
	enablePunctuationPrediction bool, enableInverseTextNormalization bool, enableVoiceDetection bool) {
	/**
	 * Set the HTTP REST POST request.
	 * 1. Use HTTP.
	 * 2. Set the domain name for the speech recognition service to nls-gateway-ap-southeast-1.aliyuncs.com.
	 * 3. Set the request path of the speech recognition API to /stream/v1/asr.
	 * 4. Set the required request parameters: appkey, format, and sample_rate.
	 * 5. Set the optional request parameters: enable_punctuation_prediction, enable_inverse_text_normalization, and enable_voice_detection.
	 */
	url := "http://nls-gateway-ap-southeast-1.aliyuncs.com/stream/v1/asr"
	url = url + "?appkey=" + appkey
	url = url + "&format=" + format
	url = url + "&sample_rate=" + strconv.Itoa(sampleRate)
	if enablePunctuationPrediction {
		url = url + "&enable_punctuation_prediction=" + "true"
	}
	if enableInverseTextNormalization {
		url = url + "&enable_inverse_text_normalization=" + "true"
	}
	if enableVoiceDetection {
		url = url + "&enable_voice_detection=" + "true"
	}
	fmt.Println(url)

	/**
	 * Read the audio file and use the audio data as the HTTP request body.
	 */
	audioData, err := ioutil.ReadFile(fileName)
	if err != nil {
		panic(err)
	}
	request, err := http.NewRequest("POST", url, bytes.NewBuffer(audioData))
	if err != nil {
		panic(err)
	}

	/**
	 * Set the HTTP request header.
	 * 1. Authentication parameters
	 * 2. Content-Type: application/octet-stream
	 */
	request.Header.Add("X-NLS-Token", token)
	request.Header.Add("Content-Type", "application/octet-stream")

	/**
	 * Send the HTTP POST request and process the response returned by the server.
	 */
	client := &http.Client{}
	response, err := client.Do(request)
	if err != nil {
		panic(err)
	}
	defer response.Body.Close()

	body, err := ioutil.ReadAll(response.Body)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	statusCode := response.StatusCode
	if statusCode == http.StatusOK {
		var resultMap map[string]interface{}
		err = json.Unmarshal(body, &resultMap)
		if err != nil {
			panic(err)
		}
		// Use a checked type assertion so that a missing or non-string "result" does not panic.
		result, ok := resultMap["result"].(string)
		if !ok {
			fmt.Println("Recognition failed. The response has no string \"result\" field.")
			return
		}
		fmt.Println("Recognition result: " + result)
	} else {
		fmt.Println("Recognition failed. HTTP StatusCode: " + strconv.Itoa(statusCode))
	}
}

func main() {
	appkey := "Your appkey"
	token := "Your token"
	fileName := "nls-sample-16k.wav"
	format := "pcm"
	sampleRate := 16000
	enablePunctuationPrediction := true
	enableInverseTextNormalization := true
	enableVoiceDetection := false
	process(appkey, token, fileName, format, sampleRate, enablePunctuationPrediction, enableInverseTextNormalization, enableVoiceDetection)
}
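
The demos above build the request URL by string concatenation and read the result field directly from the response. As an optional variation, the following sketch shows one way to escape the query parameters with the Go standard library (net/url) and to decode the response into a struct. The helper names buildRequestURL and parseRecognitionResponse, the recognitionResponse type, and the sample response body are hypothetical and used only for illustration. The struct covers only the task_id and result fields mentioned in this topic.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// recognitionResponse (hypothetical type) covers only the task_id and result
// fields mentioned in this topic; other fields in the JSON are ignored.
type recognitionResponse struct {
	TaskID string `json:"task_id"`
	Result string `json:"result"`
}

// buildRequestURL (hypothetical helper) assembles the request URL with
// escaped query parameters. Optional switches are added only when enabled.
func buildRequestURL(appkey, format string, sampleRate int,
	punctuation, inverseTextNormalization, voiceDetection bool) string {
	query := url.Values{}
	query.Set("appkey", appkey)
	query.Set("format", format)
	query.Set("sample_rate", strconv.Itoa(sampleRate))
	if punctuation {
		query.Set("enable_punctuation_prediction", "true")
	}
	if inverseTextNormalization {
		query.Set("enable_inverse_text_normalization", "true")
	}
	if voiceDetection {
		query.Set("enable_voice_detection", "true")
	}
	endpoint := url.URL{
		Scheme:   "http",
		Host:     "nls-gateway-ap-southeast-1.aliyuncs.com",
		Path:     "/stream/v1/asr",
		RawQuery: query.Encode(), // Encode sorts the keys and escapes the values.
	}
	return endpoint.String()
}

// parseRecognitionResponse (hypothetical helper) decodes a response body and
// returns an error instead of panicking when the JSON is malformed.
func parseRecognitionResponse(body []byte) (recognitionResponse, error) {
	var parsed recognitionResponse
	if err := json.Unmarshal(body, &parsed); err != nil {
		return recognitionResponse{}, fmt.Errorf("decode response: %w", err)
	}
	return parsed, nil
}

func main() {
	fmt.Println(buildRequestURL("Your appkey", "pcm", 16000, true, true, false))

	// Made-up sample body for illustration only, not real server output.
	sample := []byte(`{"task_id":"example-task-id","result":"hello"}`)
	parsed, err := parseRecognitionResponse(sample)
	if err != nil {
		fmt.Println("Recognition failed:", err)
		return
	}
	fmt.Println("task_id:", parsed.TaskID, "result:", parsed.Result)
}
```

Keeping the task_id next to the result makes it easier to record the task ID for troubleshooting, as the note in the Interaction process section recommends.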