
Updated on 2025-08-25 GMT+08:00


Creating a Voice Modeling Task (with In-House Models)

You can view the preset voices of MetaStudio on the Video Production or Livestreams page. If the preset voices cannot meet your requirements, you can use an in-house model to customize a voice.

Constraints

Only enterprise users can customize voices on MetaStudio.

Preparations

Before creating a voice modeling task, review the steps in Procedure.

If you select Script Upload, record an audio file in advance by referring to the recording guide on the voice modeling page.

Video

Watch this video to learn how to train your voice model and create a lifelike voice for your virtual avatar.

Procedure

Log in to the MetaStudio console and go to the Overview page.
Click Go to MetaStudio Console to go to the MetaStudio console.

Click the Voice Modeling card to go to the voice modeling page.

On the page displayed, the area on the left is for voice modeling, and the area on the right shows the voice modeling process.
Figure 1 Customizing a voice

Under the Huawei Models tab, configure voice modeling parameters.

For details, see Table 1.

**Table 1** GUI operations
Parameter	Description
Voice modeling	Record a complete, uninterrupted WAV or MP3 audio file of 10 to 30 minutes (15 minutes recommended), with a pause of 2 to 3 seconds between sentences. The remaining voice modeling quota is displayed.
Voice Settings	Enter a voice name. Example: emotion_joyful_healing
Produce Voice	The voice modeling method is Script Upload. Follow the audio recording guide on the page to record a WAV or MP3 audio file. You can upload the WAV or MP3 file directly: it does not need to be compressed, and no TXT files need to be included. If no preset script is used, the voice tag only indicates the voice application scenario.
Voice Gender	Gender of the voice. Specify this parameter to improve voice model precision. Options: Male, Female.
Input Language	Language of the uploaded script. Options: Chinese, English. Note: This setting is only an identifier and does not influence the training result.
Voice Field	Field to which the voice applies. Selecting a field helps you find the desired voice quickly. Each field has preset text in a matching style. After voice training is completed, choose Assets > My Models > Voices to preview how the voice reads the text for the selected field. The script for each field tag is preset in MetaStudio, as shown in Script Examples (Advanced Edition). When using a preset script, you must select the corresponding tag.
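Before uploading, you may want to confirm locally that a recording meets the format and length requirements in Table 1. The following is a minimal sketch, not part of MetaStudio: the function names `wav_duration_seconds` and `check_recording` are hypothetical, and it assumes an uncompressed PCM WAV file that Python's standard `wave` module can read. MP3 length is not checked because the standard library has no MP3 decoder, and the 2-to-3-second pauses between sentences are not checked either.

```python
import wave
from pathlib import Path

# Limits taken from Table 1: 10 to 30 minutes, WAV or MP3.
MIN_SECONDS = 10 * 60
MAX_SECONDS = 30 * 60
ALLOWED_SUFFIXES = {".wav", ".mp3"}


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_recording(path):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration only; MetaStudio performs its own
    review after the task is submitted.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    problems = []
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {path.suffix or '(none)'}")
        return problems
    if suffix == ".wav":
        duration = wav_duration_seconds(path)
        if not MIN_SECONDS <= duration <= MAX_SECONDS:
            problems.append(
                f"duration {duration / 60:.1f} min is outside 10 to 30 min"
            )
    # MP3 files pass the format check only; their duration is not measured here.
    return problems
```

For example, `check_recording("my_voice.wav")` returns an empty list when the file is a WAV recording between 10 and 30 minutes long, and a list of human-readable problems otherwise. The file name here is only a placeholder.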

Check the box for authorizing the voice use and click Submit.

The Information dialog box appears, notifying you of the remaining voice modeling quota and indicating that one resource will be consumed this time.
After confirming the information, click Submit.

After the voice modeling task is submitted, the message Production task submitted appears, as shown in Figure 2.

Model review and modeling take about seven working days.
Figure 2 Production task submitted
You can click View Production Tasks to view the review progress of the voice modeling task.

When the status changes to Reviewed, algorithm training starts automatically. If multiple algorithm training tasks are pending, your task may be queued and delayed.