If you are using Intelligent Speech Interaction for the first time, read Quick Start first to complete the preparations and try out the services.
After you complete the quick start procedures, we recommend that you read the following topics in sequence to learn more about Intelligent Speech Interaction.
|Concepts||Introduces the terms and concepts related to Intelligent Speech Interaction.|
|Console User Guide: Manage projects||Demonstrates how to create your own speech recognition project and set project parameters in the console.|
|Obtain a token||Describes how to obtain an access token. You must obtain an access token before you call Intelligent Speech Interaction services. Each token has a validity period, so check that the token has not expired before you use it.|
|Call Intelligent Speech Interaction services||Short sentence recognition, real-time speech recognition, speech synthesis, and recording file recognition|
|Use the speech recognition customization platform||Introduces the customization platform that you can use to improve speech recognition results.|
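
The note about token validity in the table above can be handled with a small cache that requests a new token shortly before the current one expires. The following is a minimal sketch, not the service's API: `TokenCache`, `fetch`, and `refresh_margin_s` are hypothetical names, and `fetch` stands in for whatever call you use to obtain a token together with its expiry time.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        refresh_margin_s: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        # `fetch` is a hypothetical callable that returns
        # (token, expiry time as Unix seconds). It is an assumption,
        # not a function provided by Intelligent Speech Interaction.
        self._fetch = fetch
        self._margin = refresh_margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        # Refresh when nothing is cached or the cached token is about to expire.
        if self._token is None or self._clock() >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token
```

Calling `cache.get()` before each request keeps the expiry check in one place instead of scattering it across call sites.
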
|Service||Timeliness||Feature||Scenario||Audio coding format||Call method||Free quota||Purchase|
|Short sentence recognition||Real-time recognition||Recognizes short speech that lasts no longer than 1 minute.||Scenarios such as voice search in apps, customer service hotlines, chat conversations, and voice command control||Pulse-code modulation (PCM, as uncompressed PCM or WAV files) and Opus||Java SDK, C++ SDK, Android SDK, and iOS SDK||A maximum of two concurrent call requests||Separate resource package|
|Real-time speech recognition||Real-time recognition||Recognizes long-running speech data streams.||Uninterrupted speech recognition scenarios such as conference speeches and live streaming||PCM (uncompressed PCM or WAV files)||Java SDK, C++ SDK, Android SDK, and iOS SDK||A maximum of two concurrent call requests||Separate resource package|
|Speech synthesis||Real-time synthesis||Converts text that contains a maximum of 300 UTF-8 encoded characters to speech.||Scenarios that require text-to-speech||PCM, WAV, and MP3||Java SDK, C++ SDK, Android SDK, and iOS SDK||A maximum of two concurrent call requests||Separate resource package|
|Recording file recognition||Recognition within 24 hours||Recognizes a recording file that has a maximum size of 512 MB.||Scenarios that do not require real-time recognition||Single-track and dual-track WAV and MP3||Java SDK, C++ SDK, Go SDK, .NET SDK, Node.js SDK, PHP SDK, and Python SDK||Recognition of up to 2 hours of recording files per calendar day||Separate resource package|
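
Because the table above limits speech synthesis to 300 characters of text, longer text has to be split before it is sent. The following is a minimal sketch of one way to do that. `split_for_synthesis` is a hypothetical helper, and it assumes that the limit is counted in Unicode characters as Python counts them, which the table does not confirm.

```python
from typing import List

MAX_CHARS = 300  # per-request text limit stated in the table above


def split_for_synthesis(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, breaking at spaces when possible."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks: List[str] = []
    rest = text.strip()
    while len(rest) > limit:
        # Prefer the last space that keeps the chunk within the limit;
        # fall back to a hard cut when the chunk contains no space.
        cut = rest.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    if rest:
        chunks.append(rest)
    return chunks
```

Each returned chunk can then be submitted as a separate synthesis request.
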
- All recognition services of Intelligent Speech Interaction except recording file recognition support only mono speech data.
- Intelligent Speech Interaction supports only 16-bit audio files sampled at 8 kHz or 16 kHz.
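
The audio constraints in the notes above can be checked locally before a WAV file is submitted. The following is a minimal sketch that uses Python's standard `wave` module. `check_wav` and `allow_stereo` are hypothetical names, and the check covers only WAV input, not raw PCM, Opus, or MP3.

```python
import wave
from typing import List

ALLOWED_SAMPLE_RATES = (8000, 16000)  # Hz, from the notes above
REQUIRED_SAMPLE_WIDTH = 2             # bytes per sample, i.e. 16-bit audio


def check_wav(path: str, allow_stereo: bool = False) -> List[str]:
    """Return a list of problems found in a WAV file; the list is empty if none are found."""
    problems: List[str] = []
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != REQUIRED_SAMPLE_WIDTH:
            problems.append(f"sample width is {wav.getsampwidth() * 8}-bit, expected 16-bit")
        if wav.getframerate() not in ALLOWED_SAMPLE_RATES:
            problems.append(f"sample rate is {wav.getframerate()} Hz, expected 8000 or 16000")
        # Only recording file recognition accepts dual-track audio.
        max_channels = 2 if allow_stereo else 1
        if wav.getnchannels() > max_channels:
            problems.append(f"{wav.getnchannels()} channels, expected at most {max_channels}")
    return problems
```

Pass `allow_stereo=True` only for files intended for recording file recognition, because the other recognition services accept mono data only.
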