Alibaba Cloud provides basic speech recognition models, including universal models and domain-specific models, such as the 8 kHz model for customer service quality control and the e-commerce model. If you have accumulated sufficient historical data in your own domain, you can use that data to customize and optimize your own model.
On the Alibaba Cloud speech recognition customization platform, you can create a model and upload a training corpus in the GUI. Training the model on this corpus can effectively improve the speech recognition rate in specific scenarios. The customization platform is especially well suited to improving the recognition of proper nouns and high-frequency words that appear in your text.
For more information about how to customize and train a model in the console, see Manage custom models.
To customize and train a model on the customization platform, upload a UTF-8 encoded TXT file of up to 20 MB as the training corpus. The training corpus can consist of annotated text derived from your historical data or a collection of specialized terms used in specific scenarios. The corpus must meet the following requirements:
- Each sentence must occupy its own line in the training corpus.
- The key content to be recognized, such as proper nouns, names, and places, must be repeated in the training corpus to increase its recognition weight.
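The requirements above can be checked locally before you upload a file. The following Python snippet is a minimal sketch, not part of any Alibaba Cloud SDK: the function names `check_corpus_bytes` and `term_counts` are hypothetical, and it assumes that "20 MB" means 20 × 1024 × 1024 bytes, which the platform may interpret differently.

```python
# Assumption: "20 MB" is treated as 20 * 1024 * 1024 bytes.
MAX_CORPUS_BYTES = 20 * 1024 * 1024


def check_corpus_bytes(data):
    """Return a list of problems found in the raw bytes of a corpus file."""
    problems = []
    if len(data) > MAX_CORPUS_BYTES:
        problems.append("corpus is larger than 20 MB")
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("corpus is not valid UTF-8")
        return problems
    # The corpus holds one sentence per line, so at least one non-blank
    # line is needed for the file to be useful.
    if not any(line.strip() for line in text.splitlines()):
        problems.append("corpus contains no sentences")
    return problems


def term_counts(text, terms):
    """Count how often each key term occurs, to confirm that it is repeated."""
    return {term: text.count(term) for term in terms}
```

For a file on disk, you could pass `pathlib.Path("corpus.txt").read_bytes()` to `check_corpus_bytes` (the file name is only an example). A key term that `term_counts` reports as occurring once or not at all is a candidate for adding more sentences that contain it.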
A Redology group is going to hold a seminar. To record the speeches that guests make at the seminar, the host chooses Alibaba Cloud Intelligent Speech Interaction for speech transcription. Developers register an Alibaba Cloud account and activate Intelligent Speech Interaction. To improve the speech recognition rate, they train and optimize a custom model on the customization platform as follows:
- Select a model: Developers decide to use a universal model.
- Collect a training corpus: The topic of the seminar is Dream of the Red Chamber. Developers split the original text of the novel at punctuation marks and store each sentence as a separate line in the training corpus.
- Train the model: Developers upload the training corpus and train the model on it on the customization platform. The trained model can then recognize terms from Dream of the Red Chamber, such as Jia Baoyu, more accurately.
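The corpus collection step in this example can be sketched in Python as follows. This is an illustration only: the function names `split_sentences` and `to_corpus` are hypothetical, and the set of punctuation marks treated as sentence boundaries is an assumption, because the example does not say which marks the developers split on. The sketch also drops the punctuation marks themselves, which is a further assumption.

```python
import re

# Assumption: these Chinese and ASCII marks end a sentence. Splitting on "."
# also breaks decimals and abbreviations, so adjust the set for your text.
SENTENCE_BOUNDARY = re.compile(r"[。！？；!?;.]+")


def split_sentences(text):
    """Split raw text at punctuation marks into a list of sentences."""
    sentences = []
    for part in SENTENCE_BOUNDARY.split(text):
        # Collapse internal line breaks and runs of spaces so that each
        # sentence fits on exactly one line.
        cleaned = " ".join(part.split())
        if cleaned:
            sentences.append(cleaned)
    return sentences


def to_corpus(text):
    """Return corpus text with one sentence per line."""
    return "\n".join(split_sentences(text)) + "\n"
```

To produce an uploadable file, you could write the result of `to_corpus` to a TXT file opened with `encoding="utf-8"`. Because names such as Jia Baoyu occur many times in the novel, a corpus built this way repeats the key terms without further effort.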