Integrate with smart hardware - Intelligent Media Services

This guide explains how to build and deploy Real-time Conversational AI applications on your smart hardware.

Overview

Real-time Conversational AI provides a comprehensive solution for integrating high-quality, low-latency AI agents into smart hardware such as wearables, companion robots, and home appliances. This solution enables developers to quickly validate and deploy Conversational AI, making "AI-by-default" a reality. Common applications include AI toys, educational hardware, companion devices, wearable personal assistants, and home voice assistants.

Before you begin

Two chipsets are currently supported: JieLi AC791 and Espressif ESP32-S3. To request support for another chipset, submit a ticket.
Activate the Real-time Conversational AI service and create an AI agent. For details, see Quick start for audio/video calls.
Apply for a smart hardware license. For more information, contact your business manager.

Procedure

Espressif ESP32-S3

Workflow

Download source code

Download the source code from the GitHub open-source project.

Configure the environment

To set up the ESP-ADF development environment, see the Espressif Development Board Guide.

Run the demo

Set up the development environment by following the official Espressif documentation. This demo requires 3A audio processing (acoustic echo cancellation, noise suppression, and automatic gain control). To prevent an "AUDIO_THREAD: Not found right xTaskCreateRestrictedPinnedToCore" error, apply the following patch from within the esp-idf directory.
```
cd esp-adf/esp-idf   # The esp-idf directory
git apply ../idf_patches/idf_v5.4_freertos.patch
```

Clone the demo repository to your local machine. Open the project's Kconfig.projbuild configuration file and enter your Wi-Fi credentials, agent information, and license.

```
menu "Example Configuration"

config WIFI_SSID
    string "WiFi SSID"
    default "xxx"
    help
        SSID (network name) for the example to connect to.

config WIFI_PASSWORD
    string "WiFi Password"
    default "xxx"
    help
        WiFi password (WPA or WPA2) for the example to use.

        Can be left blank if the network has no security set.

config AUDIO_PLAY_VOLUME
    int "Audio play volume"
    default 90

config AUDIO_RECORDER_AEC_ENABLE
    bool
    default y

config RTC_APP_ID
    string "ARTC AppID associated with the agent (Warning: For development and testing only)"
    default "xxx"

config RTC_APP_KEY
    string "ARTC AppKey associated with the agent (Warning: For development and testing only)"
    default "xxx"

config RTC_USER_ID
    string "User ID (A unique ID is recommended for each device)"
    default "123"

config VOICE_AGENT_ID
    string "Voice agent ID (Create an agent in the console in advance)"
    default "xxx"

config VOICE_AGENT_REGION
    string "Region of the voice agent"
    default "cn-xxx"

config LICENSE_PRODUCT_ID
    string "Enter the license product ID. Contact your business manager to obtain it."
    default "xxx"

config LICENSE_AUTH_CODE
    string "Enter the license authorization code. Contact your business manager to obtain it."
    default "xxx"

config LICENSE_DEVICE_ID
    string "Device serial number"
    default "xxx"

endmenu
```
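
In ESP-IDF, each `config NAME` entry is exposed to C code as a `CONFIG_NAME` macro through the generated sdkconfig.h header. Because the credentials above default to the placeholder "xxx", a start-up check can catch a value that was never filled in before the device tries to connect. The following is a minimal sketch and not part of the demo: the helper names `config_value_is_placeholder`, `count_unset_config_values`, and `count_unset_demo_config` are hypothetical, and the `#ifndef` fallbacks exist only so that the snippet compiles outside an ESP-IDF project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Fallbacks so the sketch builds outside ESP-IDF. In a real project these
 * macros come from the generated sdkconfig.h. */
#ifndef CONFIG_WIFI_SSID
#define CONFIG_WIFI_SSID "xxx"
#endif
#ifndef CONFIG_VOICE_AGENT_ID
#define CONFIG_VOICE_AGENT_ID "xxx"
#endif
#ifndef CONFIG_LICENSE_AUTH_CODE
#define CONFIG_LICENSE_AUTH_CODE "xxx"
#endif

/* Hypothetical helper: true if the value is NULL, empty, or still "xxx". */
bool config_value_is_placeholder(const char *value)
{
    return value == NULL || value[0] == '\0' || strcmp(value, "xxx") == 0;
}

/* Hypothetical helper: counts how many of the n values are still unset. */
size_t count_unset_config_values(const char *const values[], size_t n)
{
    assert(values != NULL || n == 0);
    size_t unset = 0;
    for (size_t i = 0; i < n; i++) {
        if (config_value_is_placeholder(values[i])) {
            unset++;
        }
    }
    return unset;
}

/* Hypothetical helper: checks a few of the settings the demo requires.
 * WIFI_PASSWORD is left out because it may legitimately be blank. */
size_t count_unset_demo_config(void)
{
    const char *const required[] = {
        CONFIG_WIFI_SSID,
        CONFIG_VOICE_AGENT_ID,
        CONFIG_LICENSE_AUTH_CODE,
    };
    return count_unset_config_values(required,
                                     sizeof required / sizeof required[0]);
}
```

An application could call `count_unset_demo_config()` at the start of `app_main()` and log an error instead of connecting when the result is nonzero.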

Important

The RTC_APP_KEY is used to generate a token locally. This method can only be used for development and testing. For production releases, do not embed the RTC_APP_KEY in your application. Instead, generate the token on your server and send it to the device. For instructions on generating a token, see Generate an ARTC authentication token. For smart hardware, the token does not need to be Base64 encoded. Send the raw JSON result directly to the device.

To build and flash the demo project, see the ESP-IDF Programming Guide.

Run the demo. The monitor window displays logs for initialization, key event listener setup, and Wi-Fi connection. The message "Main: [ 5 ] Initialize finish, listen to events" indicates that initialization is complete. You can then press the Play button (the top-left button) to start a call.

```
I (1309) sleep_gpio: Configure to isolate all GPIO pins in sleep state
I (1315) sleep_gpio: Enable automatic switching of GPIO sleep configuration
I (1322) main_task: Started on CPU0
I (1325) esp_psram: Reserving pool of 32K of internal memory for DMA/internal allocations
I (1333) main_task: Calling app_main()
I (1336) Main: [ 1 ] Initialize start
I (1355) Main: [1.1] Initialize peripherals
I (1356) Main: [ 2 ] Start and wait for Wi-Fi network
W (1368) wifi:Password length matches WPA2 standards, authmode threshold changes from OPEN to WPA2
W (1402) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1410) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:43
W (1418) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (4825) PERIPH_WIFI: Wi-Fi disconnected from SSID xxx, auto-reconnect enabled, reconnect after 1000 ms
W (5894) PERIPH_WIFI: WiFi Event cb, Unhandle event_base:WIFI_EVENT, event_id:4
I (7426) Main: [2.1] Initializing SNTP
I (9042) Main: [ 3 ] Start codec chip
W (9042) i2c_bus_v2: I2C master handle is NULL, will create new one
W (9082) ES7210: Enable TDM mode. ES7210_SDP_INTERFACE2_REG12: 2
I (9094) Main: [ 4 ] Set up event listener
I (9095) Main: [ 5 ] Initialize finish, listen to events
```

Error description

The certificate expired error shown below is typically caused by an unstable Wi-Fi connection that prevents NTP time synchronization from completing. As a result, the SDK falls back to the epoch time (note the 1970-01-01 timestamps in the log) when validating the certificate, and the check fails.

```
[1970-01-01 00:00:35.715][ERROR] [license.c:276][hermes_sdk] license response code is 400, dataSize: 321
[1970-01-01 00:00:35.717][ERROR] [license.c:300][hermes_sdk] license refresh fail with code=-3, svcCode=InvalidTimeStamp. Expired, regeustId=c401D243-E831-5AD9-A756-283A94D314FF
[1970-01-01 00:00:35.730][ERROR] [license.c:375][hermes_sdk] parse license response failed with result: -3
[1970-01-01 00:00:35.739][ERROR] [artc _aicall.c:79][hermes_sdk] raise error event code: -10001 msg: action return: -2147481342
```

To resolve this, ensure your device has a stable Wi-Fi connection, for example, by connecting to a different hotspot. This demo uses simplified Wi-Fi connection logic for demonstration purposes. For production environments, implement a robust mechanism that handles disconnections and re-synchronizes the time after reconnecting.
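
One way to guard against this failure is to confirm that the system clock has left the epoch before starting a call. The following is a minimal sketch in portable C, not the demo's implementation. The function names, the callback types, and the 2020-01-01 threshold are all assumptions made for illustration. On a device, the delay callback would typically wrap the RTOS delay function, and the check would run again after each Wi-Fi reconnect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Assumption: any time before 2020-01-01 00:00:00 UTC (Unix time
 * 1577836800) means the clock was never set by NTP. */
#define CLOCK_SANITY_THRESHOLD ((time_t)1577836800)

/* Hypothetical callback types supplied by the application. */
typedef time_t (*read_clock_fn)(void);
typedef void (*delay_fn)(unsigned int ms);

/* True if the given time is at or after the sanity threshold. */
bool clock_looks_synced(time_t now)
{
    return now >= CLOCK_SANITY_THRESHOLD;
}

/* Default clock reader based on the C standard library. */
time_t read_system_clock(void)
{
    return time(NULL);
}

/* Polls the clock up to max_attempts times, calling delay(delay_ms) after
 * each failed check. Returns true as soon as the clock looks synchronized,
 * or false if every attempt fails. delay may be NULL to skip waiting. */
bool wait_for_time_sync(read_clock_fn read_clock, delay_fn delay,
                        unsigned int max_attempts, unsigned int delay_ms)
{
    assert(read_clock != NULL);
    for (unsigned int i = 0; i < max_attempts; i++) {
        if (clock_looks_synced(read_clock())) {
            return true;
        }
        if (delay != NULL) {
            delay(delay_ms);
        }
    }
    return false;
}
```

If `wait_for_time_sync` returns false, the application can postpone the call and retry the time synchronization instead of sending a request that carries an epoch timestamp.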

JieLi AC791

Workflow

Download source code

Download the source code from the GitHub open-source project.

Configure the environment

To set up the development environment for the JieLi AC791 series development board, see the JieLi Development Environment Installation Guide.

Run the demo

Clone the JieLi SDK to your local machine.
Clone this demo's source code into the apps/demo directory of the JieLi SDK. Rename the cloned directory to demo_artc_aicall.
```
- FW-AC79_AIOT_SDK
  - apps
    - demo
      - demo_artc_aicall (Note: The folder name uses underscores as separators.)
```
In the demo_artc_aicall directory, open the project file board/wl82/AC791N_DEMO_ARTC_AICALL.cbp in the Code::Blocks IDE.

Open the artc_device_helper.c file and update the Wi-Fi SSID and password.

```
// Set the Wi-Fi SSID and password for network connection.
#define WIFI_STA_SSID "ssid"
#define WIFI_STA_PWD  "pwd"
```

Open the artc_aicall_demo.c file and update the license and agent information.
```
// License information for the agent call
#define LICENSE_PRODUCT_ID "xxx"  // License product ID
#define LICENSE_AUTH_CODE "xxx"   // License authorization code
#define LICENSE_DEVICE_ID "xxx"   // Unique device ID

// User ID
#define USER_ID "xxxx"               
// Agent ID
#define VOICE_AGENT_ID "xxxx"
// Region where the agent is located
#define AGENT_REGION "xxxx"
// ARTC AppID associated with the agent
#define RTC_APP_ID "xxxx"
// ARTC AppKey associated with the agent
#define RTC_APP_KEY "xxxx"
```

Important

The RTC_APP_KEY is used to generate a token locally. This method can only be used for development and testing. For production releases, do not embed the RTC_APP_KEY in your application. Instead, generate the token on your server and send it to the device. For instructions on generating a token, see Generate an ARTC authentication token. For smart hardware, the token does not need to be Base64 encoded. Send the raw JSON result directly to the device.

Build the demo project. For build instructions, see the official JieLi documentation.

Run the demo. After powering on the device, use the following keys to interact with the agent.

    - `K1`: Start the agent call
    - `K2`: Hang up the call
    - `K3`: Interrupt the agent
    - `K4`: Stop sending audio to the agent
    - `K5`: Resume sending audio to the agent
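
The key assignments above can be expressed as a small lookup, which is useful when porting the demo to a board with a different key layout. The following is an illustrative sketch only: the enum, the function names, and the use of the integers 1 to 5 for `K1` to `K5` are assumptions, not identifiers from the JieLi SDK or the demo source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action set mirroring the five keys listed above. */
enum call_action {
    ACTION_NONE = 0,
    ACTION_START_CALL,   /* K1 */
    ACTION_HANG_UP,      /* K2 */
    ACTION_INTERRUPT,    /* K3 */
    ACTION_STOP_AUDIO,   /* K4 */
    ACTION_RESUME_AUDIO, /* K5 */
    ACTION_COUNT
};

/* Maps a key number (1 for K1 ... 5 for K5) to an action.
 * Any other key number maps to ACTION_NONE. */
enum call_action key_to_action(int key)
{
    switch (key) {
    case 1: return ACTION_START_CALL;
    case 2: return ACTION_HANG_UP;
    case 3: return ACTION_INTERRUPT;
    case 4: return ACTION_STOP_AUDIO;
    case 5: return ACTION_RESUME_AUDIO;
    default: return ACTION_NONE;
    }
}

/* Returns a human-readable label for logging. */
const char *action_name(enum call_action action)
{
    static const char *const names[ACTION_COUNT] = {
        "none",
        "start the agent call",
        "hang up the call",
        "interrupt the agent",
        "stop sending audio",
        "resume sending audio",
    };
    assert((int)action >= 0 && (int)action < (int)ACTION_COUNT);
    return names[action];
}
```

Keeping the mapping in one function means a different key layout only requires changing the `switch` statement.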

To view real-time logs, connect the device's serial port to your computer and use a serial monitoring tool.