Alibaba Cloud Model Studio: Recording guide

Last Updated: Oct 15, 2025

High-quality recording data is crucial for model training. This document describes how to create high-quality recordings by considering the recording environment, devices, and process.

Important

This document applies only to the China (Beijing) region. To use the models, you must use an API key from the China (Beijing) region.

Device

You can use devices such as mobile phones, digital voice recorders, or professional audio recorders.

Environment

Environment selection

When you select a recording environment, the main goals are to reduce noise and reverberation. We recommend recording in a small room of less than 10 square meters, ideally one fitted with sound-absorbing materials. You can also treat the room with low-cost, sound-absorbing cotton. This turns the planar reflection of sound waves into diffuse reflection, which reduces reverberation and improves recording quality.

Noise control

  • Outdoor noise: Close doors and windows to mitigate noise.

  • Indoor noise: Common sources of indoor noise include air conditioners, fans (including computer fans), fluorescent light ballasts, and human voices. To identify and eliminate these noise sources, you can record the ambient sound with a mobile phone and listen to the recording at a high volume.
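Listening at high volume can be complemented with a rough numeric check. The following is a minimal sketch, not part of Model Studio, that assumes the ambient recording has been exported as a 16-bit mono PCM WAV file. The function names `rms_dbfs` and `read_mono_16bit` and the file names in the usage note are hypothetical and chosen only for illustration.

```python
import math
import sys
import wave
from array import array


def rms_dbfs(samples, full_scale=32768):
    """Return the RMS level of 16-bit PCM samples in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(math.sqrt(mean_square) / full_scale)


def read_mono_16bit(path):
    """Load a 16-bit mono PCM WAV file and return (sample_rate, samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono PCM (assumption of this sketch)")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return rate, samples
```

For example, record a few seconds of room sound with the fan on and again with it off, then compare `rms_dbfs(read_mono_16bit("fan_on.wav")[1])` with `rms_dbfs(read_mono_16bit("fan_off.wav")[1])`. A lower (more negative) value indicates a quieter room.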

Reverberation control

Reverberation is the auditory effect produced when sound reflects, refracts, diffuses, and gradually attenuates in a space. When sound waves reflect off smooth surfaces such as walls and glass, the sound can become muddy.

Avoid recording in an empty room. Instead, use a location with sound-absorbing furnishings or an irregular layout to reduce reverberation. Office areas and conference rooms typically have high reverberation and are not recommended as recording environments.

Instructions

A typical bedroom is a common and ideal recording environment. When you record, consider the following:

  • Keep the mobile phone about 10 cm from your mouth. Recording too close causes plosives and breath noise, and recording too far away picks up more of the room than the voice.

  • Close doors and windows to reduce outdoor noise.

  • Turn off the air conditioner or fan to reduce indoor noise interference.

  • Draw the curtains to reduce sound reflection from the glass.

  • Open cabinet doors and use items such as clothes or bed sheets to cover cabinet and desk surfaces. This reduces sound reflection from smooth surfaces and improves recording quality.

Scripts

  • In the script, avoid short sentences with only a few words. When you read, maintain fluency and avoid frequent or unnecessary pauses, especially pauses lasting 5 seconds or more. Long pauses can degrade the cloning result and may cause cloning to fail.

  • We recommend that you familiarize yourself with the script before recording to determine the persona and performance style. Read with emotion and avoid a mechanical delivery to ensure the cloning meets your expectations.

  • There are no special restrictions on the script content. You can use content that is similar to the content you plan to synthesize.

  • If the scenario involves a mix of Chinese and English, you only need to record the part of the script that you can read. After cloning, the model can speak in both Chinese and English automatically.

  • Do not read scripts that contain sensitive words, because they cause the cloning to fail.
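The advice above about avoiding pauses of 5 seconds or more can be checked before a recording is submitted. The following is a minimal sketch that assumes the audio is already loaded as a sequence of 16-bit PCM sample values. The function name `find_long_pauses`, the amplitude threshold of 500, and the 20 ms window are illustrative assumptions, not values defined by Model Studio, and the threshold should be tuned to the noise floor of the room.

```python
def find_long_pauses(samples, sample_rate, silence_threshold=500,
                     min_pause_s=5.0, window_s=0.02):
    """Return (start_s, end_s) pairs for silent stretches of min_pause_s or longer.

    A window counts as silent when every sample in it has an absolute
    value below silence_threshold (an assumed, tunable value).
    """
    window = max(1, int(sample_rate * window_s))
    total = len(samples)
    pauses = []
    run_start = None  # sample index where the current silent run began

    for start in range(0, total, window):
        chunk = samples[start:start + window]
        silent = max(abs(s) for s in chunk) < silence_threshold
        if silent and run_start is None:
            run_start = start
        elif not silent and run_start is not None:
            if (start - run_start) / sample_rate >= min_pause_s:
                pauses.append((run_start / sample_rate, start / sample_rate))
            run_start = None

    # Handle a silent run that lasts until the end of the recording.
    if run_start is not None and (total - run_start) / sample_rate >= min_pause_s:
        pauses.append((run_start / sample_rate, total / sample_rate))
    return pauses
```

An empty result means no pause reached the 5-second limit. Any reported span is a candidate for re-recording or trimming.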