This guide explains how to build and deploy Real-time Conversational AI applications on your smart hardware.
Overview
Real-time Conversational AI provides a comprehensive solution for integrating high-quality, low-latency AI agents into smart hardware such as wearables, companion robots, and home appliances. This solution enables developers to quickly validate and deploy Conversational AI, making "AI-by-default" a reality. Common applications include AI toys, educational hardware, companion devices, wearable personal assistants, and home voice assistants.
Before you begin
Two chipsets are supported: JieLi AC791 and Espressif ESP32-S3. To request support for another chipset, submit a ticket.
Activate the Real-time Conversational AI service and create an AI agent. For details, see Quick start for audio/video calls.
Apply for a smart hardware license. For more information, contact your business manager.
Procedure
Espressif ESP32-S3
Workflow
Download source code
Download the source code from the GitHub open-source project.
Configure the environment
To set up the ESP-ADF development environment, see the Espressif Development Board Guide.
Run the demo
Set up the development environment by following the official Espressif documentation. This demo requires 3A audio processing. To prevent an "AUDIO_THREAD: Not found right xTaskCreateRestrictedPinnedToCore" error, apply the following patch to the esp-idf directory.
cd esp-adf/esp-idf # The esp-idf directory
git apply ../idf_patches/idf_v5.4_freertos.patch
Clone the demo repository to your local machine. Open the project's Kconfig.projbuild configuration file and enter your Wi-Fi credentials, agent information, and license.
menu "Example Configuration"

config WIFI_SSID
    string "WiFi SSID"
    default "xxx"
    help
        SSID (network name) for the example to connect to.

config WIFI_PASSWORD
    string "WiFi Password"
    default "xxx"
    help
        WiFi password (WPA or WPA2) for the example to use.
        Can be left blank if the network has no security set.

config AUDIO_PLAY_VOLUME
    int "Audio play volume"
    default 90

config AUDIO_RECORDER_AEC_ENABLE
    bool
    default y

config RTC_APP_ID
    string "ARTC AppID associated with the agent (Warning: For development and testing only)"
    default "xxx"

config RTC_APP_KEY
    string "ARTC AppKey associated with the agent (Warning: For development and testing only)"
    default "xxx"

config RTC_USER_ID
    string "User ID (A unique ID is recommended for each device)"
    default "123"

config VOICE_AGENT_ID
    string "Voice agent ID (Create an agent in the console in advance)"
    default "xxx"

config VOICE_AGENT_REGION
    string "Region of the voice agent"
    default "cn-xxx"

config LICENSE_PRODUCT_ID
    string "Enter the license product ID. Contact your business manager to obtain it."
    default "xxx"

config LICENSE_AUTH_CODE
    string "Enter the license authorization code. Contact your business manager to obtain it."
    default "xxx"

config LICENSE_DEVICE_ID
    string "Device serial number"
    default "xxx"

endmenu
Important
The RTC_APP_KEY is used to generate a token locally. This method can only be used for development and testing. For production releases, do not embed the RTC_APP_KEY in your application. Instead, generate the token on your server and send it to the device. For instructions on generating a token, see Generate an ARTC authentication token. For smart hardware, the token does not need to be Base64 encoded. Send the raw JSON result directly to the device.
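ESP-IDF exposes each Kconfig option to C code as a CONFIG_-prefixed macro in the generated sdkconfig.h. Because every default above is a placeholder, a startup check can catch a forgotten setting before the device tries to connect. The following is a minimal sketch of such a check, not part of the demo: the helper names is_placeholder and count_unset_settings are hypothetical, and the fallback #define lines exist only so the sketch compiles outside an ESP-IDF project.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* In an ESP-IDF build these macros come from the generated sdkconfig.h.
 * The fallbacks below exist only so this sketch compiles on its own. */
#ifndef CONFIG_WIFI_SSID
#define CONFIG_WIFI_SSID "xxx"
#endif
#ifndef CONFIG_RTC_APP_ID
#define CONFIG_RTC_APP_ID "xxx"
#endif
#ifndef CONFIG_VOICE_AGENT_ID
#define CONFIG_VOICE_AGENT_ID "xxx"
#endif
#ifndef CONFIG_VOICE_AGENT_REGION
#define CONFIG_VOICE_AGENT_REGION "cn-xxx"
#endif
#ifndef CONFIG_LICENSE_PRODUCT_ID
#define CONFIG_LICENSE_PRODUCT_ID "xxx"
#endif

/* Hypothetical helper: returns 1 if the value is missing, empty, or still
 * equal to one of the placeholder defaults from Kconfig.projbuild. */
static int is_placeholder(const char *value)
{
    if (value == NULL || value[0] == '\0') {
        return 1;
    }
    return strcmp(value, "xxx") == 0 || strcmp(value, "cn-xxx") == 0;
}

struct setting {
    const char *name;
    const char *value;
};

/* Settings that must be filled in. The Wi-Fi password is left out on
 * purpose because an open network legitimately has a blank password. */
static const struct setting required_settings[] = {
    { "WIFI_SSID",          CONFIG_WIFI_SSID },
    { "RTC_APP_ID",         CONFIG_RTC_APP_ID },
    { "VOICE_AGENT_ID",     CONFIG_VOICE_AGENT_ID },
    { "VOICE_AGENT_REGION", CONFIG_VOICE_AGENT_REGION },
    { "LICENSE_PRODUCT_ID", CONFIG_LICENSE_PRODUCT_ID },
};

/* Hypothetical helper: prints each unset setting and returns how many
 * were found. */
static int count_unset_settings(const struct setting *settings, size_t count)
{
    int unset = 0;
    for (size_t i = 0; i < count; i++) {
        if (is_placeholder(settings[i].value)) {
            printf("Setting %s still has its placeholder value\n",
                   settings[i].name);
            unset++;
        }
    }
    return unset;
}
```

A caller could run count_unset_settings(required_settings, sizeof(required_settings) / sizeof(required_settings[0])) early in app_main() and stop before the Wi-Fi step if the result is nonzero.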
To build and flash the demo project, see the ESP-IDF Programming Guide.
Run the demo. The Monitoring Window displays logs for initialization, key event listener setup, and Wi-Fi connection. The message "Main: [ 5 ] Initialize finish, listen to events" indicates that initialization is complete. You can now press the Play button (the top-left button) to start a call.
I (1309) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (1315) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (1322) main_task: Started on CPU0
I (1325) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1333) main_task: Calling app_main()
I (1336) Main: [ 1 ] Initialize start
I (1355) Main: [1.1] Initialize peripherals
I (1356) Main: [ 2 ] Start and wait for Wi-Fi network
W (1368) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
W (1402) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1410) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1418) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (4825) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (5894) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (7426) Main: [2.1] Initializing SNTP
I (9042) Main: [ 3 ] Start codec chip
W (9042) i2c_bus_v2: I2C master handle is NULL, will create new one
W (9082) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (9094) Main: [ 4 ] Set up event listener
I (9095) Main: [ 5 ] Initialize finish, listen to events
Error description
The certificate expired error shown below is typically caused by an unstable Wi-Fi connection, which prevents a successful NTP time synchronization. As a result, the SDK defaults to the epoch time for certificate validation, causing the check to fail.
[1970-01-01 00:00:35.715][ERROR] [license.c:276][hermes_sdk] license response code is 400, dataSize: 321
[1970-01-01 00:00:35.717][ERROR] [license.c:300][hermes_sdk] license refresh fail with code=-3, svcCode=InvalidTimeStamp. Expired, regeustId=c401D243-E831-5AD9-A756-283A94D314FF
[1970-01-01 00:00:35.730][ERROR] [license.c:375][hermes_sdk] parse license response failed with result: -3
[1970-01-01 00:00:35.739][ERROR] [artc _aicall.c:79][hermes_sdk] raise error event code: -10001 msg: action return: -2147481342
To resolve this, ensure your device has a stable Wi-Fi connection, for example, by connecting to a different hotspot. This demo uses simplified Wi-Fi connection logic for demonstration purposes. For production environments, implement a robust mechanism that handles disconnections and re-synchronizes the time after reconnecting.
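One building block for such a mechanism is a guard that refuses to start the license check while the clock still reads an epoch-era time, combined with a capped backoff between synchronization retries. The following is a minimal sketch in standard C under stated assumptions: the 2024-01-01 threshold, the 1-second base delay, the 30-second cap, and the function names are all illustrative choices, not values taken from the SDK.

```c
#include <stdbool.h>
#include <time.h>

/* Assumed threshold: 2024-01-01 00:00:00 UTC. Any clock value earlier than
 * this is treated as "not yet synchronized". */
#define MIN_VALID_EPOCH ((time_t)1704067200)

/* Hypothetical helper: true once the system clock holds a plausible
 * wall-clock time rather than a value counted up from the epoch. */
static bool clock_is_synchronized(time_t now)
{
    return now >= MIN_VALID_EPOCH;
}

/* Hypothetical helper: delay before the next time-sync retry.
 * Doubles from 1 s for each failed attempt and is capped at 30 s. */
static unsigned retry_delay_ms(unsigned attempt)
{
    const unsigned base_ms = 1000;
    const unsigned cap_ms = 30000;
    unsigned delay = base_ms;

    while (attempt-- > 0 && delay < cap_ms) {
        delay *= 2;
    }
    return delay < cap_ms ? delay : cap_ms;
}
```

In the failing log above, the clock reads 35 seconds after the epoch, so clock_is_synchronized((time_t)35) is false and the call would be deferred. After each Wi-Fi reconnect, the application would request a new time synchronization through the platform's SNTP API, wait retry_delay_ms(attempt) between attempts, and start the call only once clock_is_synchronized(time(NULL)) is true.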
JieLi AC791
Workflow
Download source code
Download the source code from the GitHub open-source project.
Configure the environment
To set up the development environment for the JieLi AC791 series development board, see the JieLi Development Environment Installation Guide.
Run the demo
Clone the JieLi SDK to your local machine.
Clone this demo's source code into the apps/demo directory of the JieLi SDK. Rename the cloned directory to demo_artc_aicall.
- FW-AC79_AIOT_SDK
  - apps
    - demo
      - demo_artc_aicall (Note: The folder name uses underscores as separators.)
In the demo_artc_aicall directory, open the project file board/wl82/AC791N_DEMO_ARTC_AICALL.cbp using the Code::Blocks IDE.
Open the artc_device_helper.c file and update the Wi-Fi SSID and password.
// Set the Wi-Fi SSID and password for network connection.
#define WIFI_STA_SSID "ssid"
#define WIFI_STA_PWD "pwd"
Open the artc_aicall_demo.c file and update the license and agent information.
// License information for the agent call
#define LICENSE_PRODUCT_ID "xxx" // License product ID
#define LICENSE_AUTH_CODE "xxx" // License authorization code
#define LICENSE_DEVICE_ID "xxx" // Unique device ID
// User ID
#define USER_ID "xxxx"
// Agent ID
#define VOICE_AGENT_ID "xxxx"
// Region where the agent is located
#define AGENT_REGION "xxxx"
// ARTC AppID associated with the agent
#define RTC_APP_ID "xxxx"
// ARTC AppKey associated with the agent
#define RTC_APP_KEY "xxxx"
Important
The RTC_APP_KEY is used to generate a token locally. This method can only be used for development and testing. For production releases, do not embed the RTC_APP_KEY in your application. Instead, generate the token on your server and send it to the device. For instructions on generating a token, see Generate an ARTC authentication token. For smart hardware, the token does not need to be Base64 encoded. Send the raw JSON result directly to the device.
Build the demo project. For build instructions, see the official JieLi documentation.
Run the demo. After powering on the device, use the following keys to interact with the agent.
- `K1`: Start the agent call
- `K2`: Hang up the call
- `K3`: Interrupt the agent
- `K4`: Stop sending audio to the agent
- `K5`: Resume sending audio to the agent
To view real-time logs, connect the device's serial port to your computer and use a serial monitoring tool.
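The key assignments above amount to a small dispatch table from key number to call action. The following sketch shows one way to express that mapping in C. The enum and function names are hypothetical and do not come from the demo source or the JieLi SDK. The sketch also assumes that keys K1 through K5 reach the application as the integers 1 through 5.

```c
#include <stddef.h>

/* Hypothetical action identifiers, one per key listed above. */
enum call_action {
    ACTION_NONE = 0,
    ACTION_START_CALL,       /* K1: start the agent call */
    ACTION_HANG_UP,          /* K2: hang up the call */
    ACTION_INTERRUPT_AGENT,  /* K3: interrupt the agent */
    ACTION_STOP_AUDIO,       /* K4: stop sending audio to the agent */
    ACTION_RESUME_AUDIO      /* K5: resume sending audio to the agent */
};

/* Hypothetical helper: maps a key number (assumed 1..5 for K1..K5) to the
 * action it triggers. Unknown keys map to ACTION_NONE. */
static enum call_action action_for_key(int key)
{
    switch (key) {
    case 1: return ACTION_START_CALL;
    case 2: return ACTION_HANG_UP;
    case 3: return ACTION_INTERRUPT_AGENT;
    case 4: return ACTION_STOP_AUDIO;
    case 5: return ACTION_RESUME_AUDIO;
    default: return ACTION_NONE;
    }
}

/* Hypothetical helper: readable name for logging over the serial port. */
static const char *action_name(enum call_action action)
{
    switch (action) {
    case ACTION_START_CALL:      return "start call";
    case ACTION_HANG_UP:         return "hang up";
    case ACTION_INTERRUPT_AGENT: return "interrupt agent";
    case ACTION_STOP_AUDIO:      return "stop audio";
    case ACTION_RESUME_AUDIO:    return "resume audio";
    default:                     return "none";
    }
}
```

Keeping the mapping in one function makes it easier to remap keys for a different board without touching the call logic.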