Usage notes

Last Updated: Oct 13, 2020

This topic lists the key methods of the NUI SDK, describes the procedure for using the SDK, and provides sample code for your reference.

Prerequisites

  • You understand how the NUI SDK works. For more information, see Overview.

  • A project is created in the Intelligent Speech Interaction console. For more information, see Create a project.

  • The AccessKey ID and AccessKey secret of your Alibaba Cloud account are obtained. For more information, see Activate Intelligent Speech Interaction.

Download the NUI SDK

  1. Log on to the Intelligent Speech Interaction console.

  2. In the left-side navigation pane, click All Projects.

  3. On the All Projects page, find the created project and click Project Settings in the Actions column.

  4. On the Project Settings page of the project, click the Device-oriented Solution tab.

  5. Click Download SDK to download the NUI SDK.

Key methods

  • asr_engine_init

    void *asr_engine_init(void);

    Description: initializes the engine.

    Response: returns the engine handle, which is passed to subsequent methods as the handle parameter.

  • asr_engine_start

    int asr_engine_start(void *handle);

    Description: starts the engine. The engine keeps running until the asr_engine_stop method is called.

    Parameter: handle. It is the parameter returned by the asr_engine_init method.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that the specified parameter is invalid.

  • asr_engine_feed_data

    int asr_engine_feed_data(void *handle, char *data, int data_size);

    Description: sends audio data to the server engine.

    Parameters:

    • handle: the parameter returned by the asr_engine_init method.

    • data: the audio data to be sent.

    • data_size: the size of the audio data to be sent, in bytes. The data size is not limited.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that a specified parameter is invalid.

  • asr_engine_stop

    int asr_engine_stop(void *handle);

    Description: stops the engine.

    Parameter: handle. It is the parameter returned by the asr_engine_init method.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that the specified parameter is invalid.

  • asr_engine_finalize

    int asr_engine_finalize(void *handle);

    Description: releases the engine.

    Parameter: handle. It is the parameter returned by the asr_engine_init method.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that the specified parameter is invalid.

  • asr_engine_get_property

    int asr_engine_get_property(ASR_PROPERTY_TYPE property_type, char* value, int length);

    Description: queries the configuration of the engine.

    Parameters:

    • value: the buffer that receives the value of the property.

    • length: the length of the buffer that value points to.

    • property_type: the property to be queried. The following table describes the available properties of the engine. For more information about this parameter, see the header file of the NUI SDK.

      | Property | Description |
      | --- | --- |
      | ASRPropertyAEC | Specifies whether the engine supports acoustic echo cancellation (AEC). |
      | ASRPropertyInitOnce | Specifies whether the engine supports one-time initialization. |
      | ASRPropertyNeedAuth | Specifies whether the engine requires authentication. |
      | ASRPropertySupportFFVP | Specifies whether the engine supports returning processed audio data from the client. |
      | ASRPropertySupportVAD | Specifies whether the engine supports returning audio data processed in voice activity detection (VAD) mode. |
      | ASRPropertyNeedAuthEveryTime | Specifies whether the engine requires online authentication each time the service is called. |
      | ASRPropertyMixPcmFormat | The sequence of the recording voice channels and reference voice channels in multi-channel pulse-code modulation (PCM) audio data. A value of 0 indicates that the reference voice channel is processed before the recording voice channels. A value of 1 indicates that the recording voice channels are processed before the reference voice channel. |
      | ASRPropertyDynamicWuws | The specified wake-up words. |
      | ASRPropertyErrorCode | The error code returned for the initialization. |

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that a specified parameter is invalid.

  • asr_engine_set_params

    ASR_ERROR_CODE asr_engine_set_params(ASR_PARAM_TYPE param_type, unsigned long value_size, void *value);

    Description: sets the request parameters for the engine.

    Parameters:

    • value_size: the length of value.

    • value: the setting of the request parameter.

    • param_type: the request parameter to be set. The following table describes the available request parameters of the engine. For more information about this parameter, see the header file of the NUI SDK.

      | Parameter | Description |
      | --- | --- |
      | ASR_PARAM_EVENT_CB | The callback to be invoked when an event occurs. This parameter is valid only before the asr_engine_init method is called. |
      | ASR_PARAM_ENABLE_AEC | Specifies whether to enable AEC for the engine. By default, the NUI SDK enables AEC. This feature cannot be disabled. |
      | ASR_PARAM_MIC_NUM | The number of microphones. The parameter value cannot be changed when the engine is running. |
      | ASR_PARAM_REF_NUM | The number of reference voice channels. The parameter value cannot be changed when the engine is running. |
      | ASR_PARAM_ENABLE_FFVP | Reserved. Specifies whether the engine supports returning processed audio data from the client. |
      | ASR_PARAM_ENABLE_VAD | Specifies whether the engine supports returning audio data processed in VAD mode. By default, the engine returns such data. The parameter value cannot be changed. |
      | ASR_PARAM_ENABLE_DEBUG | Specifies whether to enable debugging. A value of 1 specifies that debugging is enabled. A value of 0 specifies that debugging is disabled. The ASR_PARAM_SET_DEBUG_PATH parameter takes effect only when this parameter is set to 1. |
      | ASR_PARAM_SET_DEBUG_PATH | The directory where audio files generated during debugging are stored. |
      | ASR_PARAM_SET_PRODUCT_ID | The ID of the product. |
      | ASR_PARAM_SET_SDK_CODE | Reserved. The code of the NUI SDK. |
      | ASR_PARAM_SET_NUI_APPKEY | The appkey of the project. This parameter is valid only before the asr_engine_init method is called. |
      | ASR_PARAM_SET_AK_ID | The AccessKey ID of your Alibaba Cloud account. This parameter is valid only before the asr_engine_init method is called. |
      | ASR_PARAM_SET_AK_KEY | The AccessKey secret of your Alibaba Cloud account. This parameter is valid only before the asr_engine_init method is called. |
      | ASR_PARAM_DIRECT_IP | Reserved. |
      | ASR_PARAM_SET_WORKSPACE | The working directory from which the engine reads the configuration file. This parameter is valid only before the asr_engine_init method is called. |
      | ASR_PARAM_SET_MODE | The working mode of the engine. A value of 0 indicates the keyword spotting (KWS) mode. A value of 1 indicates the VAD mode, in which speech recognition is started without wake-up. A value of 2 indicates the command word mode. |
      | ASR_PARAM_SET_WUWS | The shortcut word. You can specify only one shortcut word each time you call the method. A null value specifies that all shortcut words are cleared. |
      | ASR_PARAM_ENABLE_WUW_SAVE | Reserved. Specifies whether to save the audio data that is used to wake up the client on the cloud server. |
      | ASR_PARAM_SET_API_KEY | The API key used to identify your device. This parameter is required for initialization. |
      | ASR_PARAM_SET_DEVICE_ID | The ID of your device. This parameter is required for initialization. |
      | ASR_PARAM_SET_GAIN | Reserved. The automatic gain of the client. |
      | ASR_PARAM_ENABLE_ASR | Specifies whether to enable real-time speech recognition. A value of 1 specifies that real-time speech recognition is enabled. A value of 0 specifies that real-time speech recognition is disabled. |
      | ASR_PARAM_ENABLE_DIALOG | Reserved. Specifies whether to enable the internal dialog feature. |
      | ASR_PARAM_ENABLE_THIRDPARTY_CONTENT_SUPPORT | Reserved. Specifies whether to support a third-party cloud service provider. |
      | ASR_PARAM_ENABLE_CMD | Reserved. Specifies whether to enable the command mode. |

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that a specified parameter is invalid.

  • asr_engine_vad_read

    int asr_engine_vad_read(void *handle, char *data, int data_size);

    Description: obtains audio data processed in VAD mode. This method is valid only after an ASR_EVENT_VAD_START event occurs, and it does not block. To obtain the complete audio data processed in VAD mode, call this method repeatedly until an ASR_EVENT_VAD_END event occurs.

    Parameters:

    • handle: the parameter returned by the asr_engine_init method.

    • data: the buffer used to store the audio data processed in VAD mode.

    • data_size: the size of the buffer, in bytes.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that a specified parameter is invalid.

  • asr_engine_interactive

    int asr_engine_interactive(void *handle);

    Description: resets the engine to force it to enter the INTERACTIVE state.

    Parameter: handle. It is the parameter returned by the asr_engine_init method.

    Response: returns ERROR_CODE, which indicates the result of the call. A value of 0 indicates that the call was successful. A value of 1 indicates that the call failed. A value of 2 indicates that the specified parameter is invalid.
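Most of the methods above report their result through the same three ERROR_CODE values. The following is a minimal sketch of a helper that turns those values into log-friendly text. The names nui_error_to_string and nui_check are hypothetical and are not part of the NUI SDK. Only the numeric values 0, 1, and 2 come from this topic.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of the NUI SDK: maps the ERROR_CODE values
 * documented in this topic to short descriptions. */
const char *nui_error_to_string(int code) {
    switch (code) {
    case 0:  return "success";
    case 1:  return "call failed";
    case 2:  return "invalid parameter";
    default: return "unknown error code";
    }
}

/* Hypothetical helper: logs a failed call to stderr.
 * Returns 1 if the call succeeded (code 0) and 0 otherwise. */
int nui_check(const char *what, int code) {
    assert(what != NULL);
    if (code != 0) {
        fprintf(stderr, "%s: %s (%d)\n", what, nui_error_to_string(code), code);
    }
    return code == 0;
}
```

A call site could then read, for example, if (!nui_check("asr_engine_start", asr_engine_start(handle))) { ... }, which keeps the error handling for each method on one line.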

Callback events

The callback has the following signature:

void (*asr_event_callback)(ASR_EVENT_TYPE event_type, int cmd, const char *result);

The following table describes the events that can be returned in callbacks.

| Event | Description |
| --- | --- |
| ASR_EVENT_AUTH_SUCCESS | Indicates that the authentication is successful. It is a reserved event. You can check the authentication result based on the response of the asr_engine_init method. |
| ASR_EVENT_AUTH_FAIL | Indicates that the authentication failed. It is a reserved event. You can check the authentication result based on the response of the asr_engine_init method. |
| ASR_EVENT_AWAKE_SUCCESS | Indicates that the client is woken up. |
| ASR_EVENT_AWAKE_CMD | Occurs along with ASR_EVENT_AWAKE_SUCCESS. You do not need to process this event. |
| ASR_EVENT_VAD_START | Indicates that the beginning of human voice is detected. |
| ASR_EVENT_VAD_END | Indicates that the end of human voice is detected. |
| ASR_EVENT_VAD_TIMEOUT | Indicates that the service request times out because no human voice is detected for 10 consecutive seconds in VAD mode or after the client is woken up. |
| ASR_EVENT_PARTIAL_ASR_RESULT | Indicates that an intermediate recognition result is generated. |
| ASR_EVENT_FINAL_ASR_RESULT | Indicates that the final recognition result is generated. |
| ASR_EVENT_DIALOG | Indicates that the dialog result is generated. |
| ASR_EVENT_ERROR | Indicates that an error has occurred and the engine is reset to the STOP state. |
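The asr_engine_vad_read method is valid only between an ASR_EVENT_VAD_START event and the following ASR_EVENT_VAD_END event. The following is a minimal sketch of a state tracker that an application could update from its callback to know when reading is allowed. The demo_event_type enumeration is a stand-in so that the sketch compiles on its own. The real enumerators are defined in the header file of the NUI SDK. Closing the segment on a timeout or an error is an assumption, not documented behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the SDK's ASR_EVENT_TYPE. The real enumerators and their
 * values are defined in the NUI SDK header file. */
typedef enum {
    DEMO_EVENT_VAD_START,
    DEMO_EVENT_VAD_END,
    DEMO_EVENT_VAD_TIMEOUT,
    DEMO_EVENT_ERROR,
    DEMO_EVENT_OTHER
} demo_event_type;

/* Hypothetical application-side state, not part of the NUI SDK. */
typedef struct {
    int segment_open; /* 1 between VAD_START and VAD_END, otherwise 0 */
} vad_tracker;

void vad_tracker_init(vad_tracker *t) {
    assert(t != NULL);
    t->segment_open = 0;
}

/* Updates the tracker for one event. Returns 1 while a VAD segment is open,
 * which is when calling asr_engine_vad_read is valid, and 0 otherwise. */
int vad_tracker_on_event(vad_tracker *t, demo_event_type ev) {
    assert(t != NULL);
    switch (ev) {
    case DEMO_EVENT_VAD_START:
        t->segment_open = 1;
        break;
    case DEMO_EVENT_VAD_END:
    case DEMO_EVENT_VAD_TIMEOUT: /* assumption: a timeout also ends the segment */
    case DEMO_EVENT_ERROR:       /* assumption: an error also ends the segment */
        t->segment_open = 0;
        break;
    default:
        break; /* other events do not change the VAD state */
    }
    return t->segment_open;
}
```

A reader thread could poll segment_open and call asr_engine_vad_read only while it is 1. If the tracker is shared between threads, access to it needs synchronization, which is omitted here.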

The cmd parameter of the callback identifies the command word that woke up the client. It is meaningful only when command words have been specified by using the ASR_PARAM_SET_WUWS parameter.

When you use ASR_PARAM_SET_WUWS to specify a command word, the value_size argument and the value argument are stored as a key-value pair. In the asr_event_callback method, cmd is set to that value_size so that you can tell which wake-up word was used. If the primary wake-up word woke up the client, cmd is 0. For this reason, do not set value_size to 0 when you specify a command word.
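Because cmd carries only the value_size that was passed with ASR_PARAM_SET_WUWS, the application has to remember which word each value belongs to. The following is a minimal sketch of such an application-side registry. All names in it (wuw_registry, wuw_registry_add, wuw_registry_lookup, WUW_PRIMARY) are hypothetical and are not part of the NUI SDK. Rejecting negative keys is an assumption. This topic only rules out 0.

```c
#include <assert.h>
#include <stddef.h>

#define WUW_MAX_WORDS 8

/* Marker returned for cmd == 0, which this topic reserves for the primary
 * wake-up word. */
const char *const WUW_PRIMARY = "(primary wake-up word)";

typedef struct {
    int key;          /* the value_size passed with ASR_PARAM_SET_WUWS */
    const char *word; /* the word passed as value */
} wuw_entry;

typedef struct {
    wuw_entry entries[WUW_MAX_WORDS];
    int count;
} wuw_registry;

void wuw_registry_init(wuw_registry *r) {
    assert(r != NULL);
    r->count = 0;
}

/* Remembers a key/word pair. Returns 0 on success, or -1 if the key is not
 * positive, the key is already in use, or the registry is full. */
int wuw_registry_add(wuw_registry *r, int key, const char *word) {
    int i;
    assert(r != NULL);
    if (key <= 0 || word == NULL || r->count >= WUW_MAX_WORDS) {
        return -1;
    }
    for (i = 0; i < r->count; i++) {
        if (r->entries[i].key == key) {
            return -1;
        }
    }
    r->entries[r->count].key = key;
    r->entries[r->count].word = word;
    r->count++;
    /* A real application would also register the word with the engine here,
     * for example:
     * asr_engine_set_params(ASR_PARAM_SET_WUWS, key, (void *)word); */
    return 0;
}

/* Maps the cmd value received in the callback back to the word.
 * Returns WUW_PRIMARY for 0 and NULL for an unknown key. */
const char *wuw_registry_lookup(const wuw_registry *r, int cmd) {
    int i;
    assert(r != NULL);
    if (cmd == 0) {
        return WUW_PRIMARY;
    }
    for (i = 0; i < r->count; i++) {
        if (r->entries[i].key == cmd) {
            return r->entries[i].word;
        }
    }
    return NULL;
}
```

In the callback, wuw_registry_lookup(&registry, cmd) would then return the word that triggered the wake-up.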

Procedure

  1. Specify the ID and API key of your device. The values are used as the unique identifiers of your device.

  2. Specify the following information used for authentication:

    1. The appkey of your project

    2. The AccessKey ID of your Alibaba Cloud account

    3. The AccessKey secret of your Alibaba Cloud account

  3. Specify a model and the resource path. You can specify a relative path.

  4. Set the callback method.

  5. Call the asr_engine_init method to initialize the engine.

Sample code

#include <pthread.h>
#include "asr_api.h"
static pthread_t recording_thread;
// The SDK handle returned by asr_engine_init.
static void* handler;
// Handle the event that occurs. For more information about events, see the documentation provided in the NUI SDK package.
// We recommend that you do not call a time-consuming method as the callback method. You can use a non-blocking queue to transfer events to a thread that is not processing the callback method.
void event_callback(ASR_EVENT_TYPE event_type, int cmd, const char* result) {
}
void *recording_fn(void* param) {
    // You can specify the size of the data to be sent each time based on the capabilities of the client and hardware in use.
    char data[512] = {0};
    int data_size = sizeof(data);
    // You must specify the end of the recording based on the business scenario. In the following sample code, a simple infinite loop is used.
    while (1) {
        // In this example, data stands for 512 bytes of recorded audio. Fill the buffer from your recording device before each call.
        // Send the audio data to the NUI SDK.
        asr_engine_feed_data(handler, data, data_size);
    }
    return NULL;
}
int main() {
    // Define the device identifiers and the authentication information.
    const char* CUSTOM_API_KEY = "the API key of your device";
    const char* CUSTOM_DEVICE_ID = "the ID of your device";
    const char* TARGET_NUI_APPKEY = "the appkey of your project that is created in the Intelligent Speech Interaction console";
    const char* TARGET_ACCESS_ID = "the AccessKey ID of your Alibaba Cloud account";
    const char* TARGET_ACCESS_SECRET = "the AccessKey secret of your Alibaba Cloud account";
    // Specify the API key and ID of your device.
    // Set the CUSTOM_API_KEY and CUSTOM_DEVICE_ID parameters as needed.
    // The value 32 indicates the length of the value of the third parameter. Specify a value as needed.
    asr_engine_set_params(ASR_PARAM_SET_API_KEY, 32, (void *)CUSTOM_API_KEY);
    asr_engine_set_params(ASR_PARAM_SET_DEVICE_ID, 32, (void *)CUSTOM_DEVICE_ID);
    // Specify the appkey of your project, and the AccessKey ID and AccessKey secret of your Alibaba Cloud account.
    // You can obtain the project appkey, AccessKey ID, and AccessKey secret in the Alibaba Cloud Management Console.
    asr_engine_set_params(ASR_PARAM_SET_NUI_APPKEY, 32, (void *)TARGET_NUI_APPKEY);
    asr_engine_set_params(ASR_PARAM_SET_AK_ID, 32, (void *)TARGET_ACCESS_ID);
    asr_engine_set_params(ASR_PARAM_SET_AK_KEY, 32, (void *)TARGET_ACCESS_SECRET);
    // Specify whether to enable real-time speech recognition based on your business needs.
    const char* TARGET_ENABLE_ASR = "1";
    // TARGET_ENABLE_ASR is a string. A value of 0 specifies that real-time speech recognition is disabled. Any other value specifies that real-time speech recognition is enabled.
    asr_engine_set_params(ASR_PARAM_ENABLE_ASR, 32, (void *)TARGET_ENABLE_ASR);
    // Specify the resource path.
    // The TARGET_WORKSPACE parameter specifies the absolute path or relative path of the resources.
    const char* TARGET_WORKSPACE = "the absolute path or relative path of the resources";
    asr_engine_set_params(ASR_PARAM_SET_WORKSPACE, 32, (void *)TARGET_WORKSPACE);
    // Configure the callback method.
    asr_engine_set_params(ASR_PARAM_EVENT_CB, sizeof(event_callback), (void *)event_callback);
    // Initialize the engine.
    handler = asr_engine_init();
    if (handler == NULL) {
        // The initialization failed. Query the error code to troubleshoot the failure.
        int error = 0;
        asr_engine_get_property(ASRPropertyErrorCode, (char *)&error, sizeof(error));
        //...
        return -1;
    }
    // Start the engine.
    asr_engine_start(handler);    
    // Start the recording thread.
    pthread_create(&recording_thread, NULL, recording_fn, NULL);    
    // The recording thread keeps running until the NUI SDK is released.
    pthread_join(recording_thread, NULL);    
    // Stop the engine.
    asr_engine_stop(handler);    
    // Release the engine.
    asr_engine_finalize(handler);    
    return 0;
}
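The comment above event_callback recommends handing events to another thread instead of doing time-consuming work in the callback. The following is a minimal sketch of one way to do that: a fixed-size ring buffer protected by a mutex, where a push never waits for free space and instead drops the event if the queue is full. The names event_queue, evq_item, evq_try_push, and evq_try_pop are hypothetical and are not part of the NUI SDK. Copying the result string is based on the assumption that the buffer passed by the SDK may not remain valid after the callback returns.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

#define EVQ_CAPACITY 16
#define EVQ_RESULT_MAX 256

typedef struct {
    int event_type;              /* the ASR_EVENT_TYPE value, stored as int */
    int cmd;
    char result[EVQ_RESULT_MAX]; /* copy of the result string, truncated if longer */
} evq_item;

typedef struct {
    evq_item items[EVQ_CAPACITY];
    int head;  /* index of the oldest item */
    int count; /* number of queued items */
    pthread_mutex_t lock;
} event_queue;

void evq_init(event_queue *q) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* Called from the SDK callback. Returns 1 if the event was queued, or 0 if
 * the queue was full and the event was dropped. */
int evq_try_push(event_queue *q, int event_type, int cmd, const char *result) {
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->count < EVQ_CAPACITY) {
        evq_item *slot = &q->items[(q->head + q->count) % EVQ_CAPACITY];
        slot->event_type = event_type;
        slot->cmd = cmd;
        snprintf(slot->result, sizeof(slot->result), "%s", result ? result : "");
        q->count++;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Called from the worker thread. Returns 1 and fills *out if an event was
 * available, or 0 if the queue was empty. */
int evq_try_pop(event_queue *q, evq_item *out) {
    int ok = 0;
    assert(out != NULL);
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *out = q->items[q->head];
        q->head = (q->head + 1) % EVQ_CAPACITY;
        q->count--;
        ok = 1;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}
```

With this in place, event_callback could consist of a single evq_try_push call, and a separate worker thread could call evq_try_pop in a loop and do the actual processing. The critical sections are short, so the callback returns quickly even under contention.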