Intelligent Speech Interaction is suitable for various scenarios, including intelligent Q&A, intelligent quality inspection, real-time subtitling for speeches, and transcription of audio recordings. It has been successfully applied in many industries, such as finance, insurance, e-commerce, and smart home. It lets you use a self-learning platform to improve speech recognition accuracy, and provides a comprehensive management console and easy-to-use SDKs. You are welcome to activate Intelligent Speech Interaction.
High Recognition Accuracy
Alibaba Cloud is the first cloud service provider in China to use word-level LC-BLSTM and DFSMN-CTC models. Compared with the traditional CTC method in the industry, these models reduce the error rate by 20%, greatly improving the accuracy of speech recognition.
Ultra-high Decoding Speed
Alibaba Cloud is the first cloud service provider in China to use the low frame rate (LFR) decoding technology. This technology increases the decoding speed by more than three times without compromising recognition accuracy, greatly shortening response time and improving user experience.
Novel Self-learning Platform
Intelligent Speech Interaction is the first system in the industry that provides a self-learning platform. It allows you to specify hotwords and upload business-related data to build custom models for better recognition accuracy.
Extensive Industry Coverage
Currently, Intelligent Speech Interaction has customers in a wide variety of industries, such as finance, insurance, e-commerce, and smart home. It is ideal for various scenarios, including intelligent Q&A, intelligent quality inspection, real-time subtitling for speeches, and voice assistants.
Products and Services
Recording File Recognition
Converts audio from files uploaded by users into text within 24 hours. Applicable to scenarios that are not time-sensitive, such as call center quality assurance, transcription of court trials from recordings, summarization of meeting minutes, and medical record filing.
Real-time Speech Recognition
Converts audio streams into text in real time. Intelligent segmentation is used to identify when sentences start and end. Real-time Speech Recognition is ideal for scenarios with high requirements for real-time response, such as real-time transcription for live videos, meetings, and court trials.
Short Sentence Recognition
Converts short audio clips (less than 1 minute) into text. Applicable to real-time scenarios, such as voice search, voice command control, and short voice messages. Short Sentence Recognition can be integrated into various applications, smart home appliances, and smart assistants.
Speech Synthesis
Converts text to natural speech. Speech Synthesis provides a variety of voices and allows you to adjust the speed, intonation, and volume. It is ideal for scenarios such as intelligent customer service, speech interaction, audiobooks, and broadcasting.
Self-learning Platform
Allows you to upload business-related data to improve recognition accuracy in specific use cases. Currently, you can upload only text to customize language models. In the future, Self-learning Platform will allow you to upload audio data to customize acoustic models.
Upgraded Support For You
1-on-1 Presale Consultation, 24/7 Technical Support, Faster Response, and More Free Tickets.