A Create Dataset for EasyASR Models component converts audio data in the WAV format and text data into TFRecord files. You can then use the TFRecord files as pre-processed data to train or evaluate Automatic Speech Recognition (ASR) and speech classification models. This topic describes how to set the parameters of a Create Dataset for EasyASR Models component and provides an example of how to use the component.
This algorithm component is available only in Machine Learning Studio 2.0.
A Create Dataset for EasyASR Models component converts audio data in the WAV format, together with text data that contains the labeling results, into TFRecord files and stores the TFRecord files in an Object Storage Service (OSS) bucket. You can use this component to prepare data for training or evaluating ASR and speech classification models.
You can find the Create Dataset for EasyASR Models component in the Data Preprocessing subfolder in the Audio Algorithm folder of the component library.
Configure the component in the PAI console
- Input port
The input port of a Create Dataset for EasyASR Models component must be connected to a Read File Data component. You must set the OSS Data Path parameter of the Read File Data component to the OSS path of the source CSV file.
- Component parameters
Tab: Parameter Settings
- Output Path (required: Yes; default: N/A): The OSS path of the output TFRecord files. Example:
Tab: Model Tuning
- Running Mode (required: No; default: MaxCompute): The computing engine that is used to run the algorithm component. You can select a computing engine based on your business requirements. The following computing engines are supported:
  - MaxCompute: Use the MaxCompute instance that is associated with your AI workspace as the computing engine. For information about how to add a computing engine, see Create a workspace. For information about the billing rules, see Billing of Machine Learning Studio or Machine Learning Designer.
  - DLC: Use the DLC instance that is associated with your AI workspace as the computing engine. For information about how to add a computing engine, see Create a workspace. For information about the billing rules, see Billing of DLC.
- Number of Workers (required: No; default: 1): The number of workers that are used for data conversion.
- CPU Machine Type (required: No; default: N/A): The type of the computing instance. This parameter is required only if you set the Running Mode parameter to DLC.
- Output port
You can connect the output port of a Create Dataset for EasyASR Models component to an ASR Model Training or EasyASR Speech Classification Training component.
- Prepare a CSV file that contains audio data and text data.
The audio used to train an ASR or speech classification model must be split in advance. We recommend that you split the audio into segments of about 10 to 12 seconds and store the segments in an OSS bucket. Each segment must be mono audio with a sampling rate of 16,000 Hz. In this example, a speech recognition model is trained. The paths of the audio segments and the labeling results are stored in a CSV file, with each path and its text separated by a comma (,). The header line of the CSV file specifies the column names. You can enter wav_filename,transcript as the header line. This way, the first column stores the OSS paths of the WAV files, and the second column stores the labeling results. The words of the text are separated with spaces, and all punctuation marks are replaced with semicolons (;). If a word does not exist in the vocabulary, the word must be replaced with an asterisk (*).
You can download the vocabulary based on the model that you use. For more information, see Use EasyASR for speech recognition.
- Build the experiment shown in the following figure. You must set the OSS Data Path parameter of the Read File Data component to the OSS path of the CSV file. For more information about how to set other parameters, see Component parameters in this topic.
- View the output TFRecord files in the specified OSS path.
After the experiment is run, you can view the output TFRecord files in the OSS path that you specify for the Output Path parameter of the Create Dataset for EasyASR Models component. The following figure shows sample output results.
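The source CSV file described in the first step can be prepared with a short script. The following Python sketch is illustrative only and is not part of the component. It checks that a local copy of a WAV segment is mono with a sampling rate of 16,000 Hz, formats a transcript according to the rules above, and writes the wav_filename,transcript file. The function names, the set of characters treated as punctuation, the choice to emit each semicolon as a separate space-delimited token, and the case-sensitive vocabulary lookup are all assumptions. Adjust them to match the vocabulary of the model that you use.

```python
import csv
import io
import re
import wave

# Assumption: these characters count as punctuation; extend the set as needed.
PUNCTUATION = re.compile(r"[,.!?:;]")


def check_wav(path_or_file):
    """Return (meets_format, duration_seconds) for a WAV file.

    meets_format is True only for mono audio sampled at 16,000 Hz.
    """
    with wave.open(path_or_file, "rb") as wav:
        meets_format = wav.getnchannels() == 1 and wav.getframerate() == 16000
        duration = wav.getnframes() / wav.getframerate()
    return meets_format, duration


def normalize_transcript(text, vocabulary):
    """Format a transcript: space-separated words, ';' for punctuation,
    '*' for words that are not in the vocabulary.

    Assumption: each punctuation mark becomes its own ';' token.
    """
    spaced = PUNCTUATION.sub(" ; ", text)
    tokens = []
    for token in spaced.split():
        if token == ";" or token in vocabulary:
            tokens.append(token)
        else:
            tokens.append("*")  # out-of-vocabulary word
    return " ".join(tokens)


def write_dataset_csv(rows, out):
    """Write (oss_wav_path, transcript) pairs under the expected header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["wav_filename", "transcript"])
    for oss_path, transcript in rows:
        writer.writerow([oss_path, transcript])
```

In this sketch, you would call check_wav on each segment before you upload it, pass each labeling result through normalize_transcript with the downloaded vocabulary loaded as a set, and then pass the resulting pairs to write_dataset_csv together with an open text file. The segment length of about 10 to 12 seconds is a recommendation, so the sketch reports the duration instead of rejecting segments.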