This topic describes the billing details for Real-time Conversational AI.
Purchase guide
To use the Alibaba Cloud Real-time Conversational AI service, you must meet the following requirements:
Ensure that you have activated the Real-time Conversational AI feature. If it is not activated, go to Activate Service to activate it. If the service is already active, you can use it directly.
Note: If you see the message "The quantity you are purchasing exceeds the available limit. Please select a new quantity!", the service is already active.
Product pricing
Standard pricing for the AI agent service
If the agent's input and output are audio-only, billing is based on the audio specification. The standard audio pricing covers three services: Speech-to-Text (STT), which includes Automatic Speech Recognition (ASR) and acoustic modeling; Text-to-Speech (TTS); and agent runtime.
If the agent's input or output includes video, billing is based on the video specification. The standard video pricing covers the same three services: STT, which includes ASR and acoustic modeling; TTS; and agent runtime. Digital humans are excluded.
The standard pricing model uses bundled billing. The fees for all three services (STT, TTS, and agent runtime) are charged in full.
Specification/Region | The Chinese mainland (USD/minute) | Singapore (USD/minute) |
Audio | 0.014 | 0.028 |
Video | 0.0502 | 0.1003 |
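The choice between the audio and video specification can be sketched as a small lookup. This is a minimal illustration, not an official API: the function names and the region keys ("chinese_mainland", "singapore") are hypothetical, and the rates are copied from the table above.

```python
from decimal import Decimal

# Per-minute rates in USD, copied from the pricing table.
# The dictionary keys are hypothetical labels chosen for this sketch.
STANDARD_RATES = {
    ("audio", "chinese_mainland"): Decimal("0.014"),
    ("audio", "singapore"): Decimal("0.028"),
    ("video", "chinese_mainland"): Decimal("0.0502"),
    ("video", "singapore"): Decimal("0.1003"),
}


def pick_specification(input_has_video: bool, output_has_video: bool) -> str:
    """Return "video" if either direction carries video, otherwise "audio"."""
    return "video" if (input_has_video or output_has_video) else "audio"


def standard_rate(input_has_video: bool, output_has_video: bool, region: str) -> Decimal:
    """Look up the bundled per-minute rate for a session."""
    spec = pick_specification(input_has_video, output_has_video)
    return STANDARD_RATES[(spec, region)]
```

For example, a session with audio input and video output in Singapore would be looked up with standard_rate(False, True, "singapore") and billed at the video rate.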
Pay-per-feature billing model
ApsaraVideo Real-time Communication service pricing
ApsaraVideo Real-time Communication provides calling capabilities and is billed based on call duration. The price is the same in all regions. For more information about pricing, see ApsaraVideo Real-time Communication fees.
Digital human (optional)
Real-time Conversational AI lets you integrate digital human nodes. It currently supports FaceUnity and Lingjing digital humans.
FaceUnity: Go to the official FaceUnity website and contact their customer service for billing information.
Lingjing: To activate and use this service, submit a ticket.
Large language model (optional)
If you choose the system's preset large language model, this service is currently free of charge.
If you use an external large language model, you will incur corresponding LLM fees. For specific billing details, see the billing documentation for that product.
Billing rules
Total Real-time Conversational AI fee = AI agent service fee + ApsaraVideo Real-time Communication service fee
Fee for each item = Unit price of each service × Billable duration
Billing cycle: Fees are settled on an hourly basis, within 30 minutes after an AI agent session ends. Any duration less than one minute is rounded up to one minute.
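The rules above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the service's actual billing logic: the function names are hypothetical, and the sketch assumes that rounding is applied per session and that any partial minute is rounded up to the next whole minute. The text only states that a duration of less than one minute is rounded up to one minute.

```python
from decimal import Decimal


def billable_minutes(duration_seconds: int) -> int:
    """Round a session duration up to whole minutes.

    Assumption: partial minutes are rounded up per session.
    """
    if duration_seconds <= 0:
        return 0
    return (duration_seconds + 59) // 60  # integer ceiling of seconds / 60


def item_fee(unit_price_per_minute: Decimal, minutes: int) -> Decimal:
    """Fee for each item = unit price of the service x billable duration."""
    return unit_price_per_minute * minutes


def total_fee(agent_fee: Decimal, artc_fee: Decimal) -> Decimal:
    """Total fee = AI agent service fee + ApsaraVideo Real-time Communication fee."""
    return agent_fee + artc_fee
```

Decimal is used instead of float so that small per-minute prices add up without binary rounding error.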
Billing example
User A has 10 audio-only calls with an AI agent in the Chinese mainland region. Each call lasts 2 minutes. The fees for each module are calculated as follows:
AI agent service fee: The billable duration is 20 minutes (10 calls × 2 minutes). The fee is USD 0.28 (20 minutes × USD 0.014/minute).
ApsaraVideo Real-time Communication (ARTC) service fee: Because the calls are bidirectional, the billable duration is 40 minutes (10 calls × 2 minutes × 2). The fee is USD 0.0344 (40 minutes × USD 0.00086/minute).
Total fee: USD 0.28 + USD 0.0344 = USD 0.3144.
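The arithmetic in this example can be reproduced with a short script. The constant and function names are hypothetical. The rates and the factor of 2 for bidirectional ARTC calls are taken from the example itself.

```python
from decimal import Decimal

CALLS = 10
MINUTES_PER_CALL = 2
AGENT_AUDIO_RATE = Decimal("0.014")  # USD/minute, audio, Chinese mainland
ARTC_RATE = Decimal("0.00086")       # USD/minute, as used in the example
ARTC_DIRECTIONS = 2                  # the example counts both directions of each call


def example_fees():
    """Return (agent_fee, artc_fee, total) in USD for User A's calls."""
    agent_minutes = CALLS * MINUTES_PER_CALL          # 20 minutes
    artc_minutes = agent_minutes * ARTC_DIRECTIONS    # 40 minutes
    agent_fee = AGENT_AUDIO_RATE * agent_minutes
    artc_fee = ARTC_RATE * artc_minutes
    return agent_fee, artc_fee, agent_fee + artc_fee


agent_fee, artc_fee, total = example_fees()
# The three values equal USD 0.28, USD 0.0344, and USD 0.3144 respectively.
```

Changing CALLS, MINUTES_PER_CALL, or the rate constants lets you estimate fees for other call volumes, subject to the rounding rule described in the billing rules.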