Updated on 2025-08-25 GMT+08:00

Creating a Voice Modeling Task (with In-House Models)

You can view the preset voices of MetaStudio on the Video Production or Livestreams page. If the preset voices cannot meet your requirements, you can use an in-house model to customize a voice.

Constraints

Only enterprise users can customize voices on MetaStudio.

Preparations

Before creating a voice modeling task, read the Procedure section and complete the following preparation:

  • If you select Script Upload, record an audio file in advance by referring to the recording guide on the voice modeling page.
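
The recording requirements listed in Table 1 (a WAV or MP3 file of 10 to 30 minutes, with a pause of 2 to 3 seconds between sentences) can be checked locally before upload. The following is a minimal sketch, not a MetaStudio tool: the function names are hypothetical, it handles only 16-bit PCM WAV files (not MP3), and the silence threshold and window size are illustrative assumptions. The review performed by MetaStudio may apply different criteria.

```python
import array
import sys
import wave

# Limits taken from Table 1 of this guide.
MIN_DURATION_S = 10 * 60   # 10 minutes
MAX_DURATION_S = 30 * 60   # 30 minutes
MIN_PAUSE_S = 2.0          # lower end of the 2 to 3 second pause


def wav_duration_seconds(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_in_range(seconds):
    """True if the duration falls inside the 10 to 30 minute window."""
    return MIN_DURATION_S <= seconds <= MAX_DURATION_S


def find_pauses(path, window_s=0.1, quiet_peak=500, min_pause_s=MIN_PAUSE_S):
    """Return (start_s, length_s) for each long quiet stretch between sounds.

    Assumption: 16-bit PCM WAV. window_s and quiet_peak are illustrative
    defaults, not values published by MetaStudio.
    """
    pauses = []
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        window_frames = max(1, int(rate * window_s))
        position = 0         # frames consumed so far
        quiet_start = None   # frame index where the current quiet run began
        heard_sound = False  # ignore silence before the first sentence
        while True:
            chunk = array.array("h", wav.readframes(window_frames))
            if not chunk:
                break
            if sys.byteorder == "big":
                chunk.byteswap()  # WAV samples are stored little-endian
            # A window counts as quiet when its peak amplitude is small.
            is_quiet = max(max(chunk), -min(chunk)) < quiet_peak
            if not is_quiet:
                if quiet_start is not None:
                    length = (position - quiet_start) / rate
                    if length >= min_pause_s:
                        pauses.append((quiet_start / rate, length))
                    quiet_start = None
                heard_sound = True
            elif heard_sound and quiet_start is None:
                quiet_start = position
            position += len(chunk) // channels
    return pauses


if __name__ == "__main__" and len(sys.argv) > 1:
    recording = sys.argv[1]
    seconds = wav_duration_seconds(recording)
    print(f"duration: {seconds / 60:.1f} min, in range: {duration_in_range(seconds)}")
    for start, length in find_pauses(recording):
        print(f"pause of {length:.1f} s at {start:.1f} s")
```

Run the script with the path of the recording as its only argument. Silence before the first sound and after the last one is not reported, because only pauses between sentences matter here. If the recording was made in a noisy room, quiet_peak will likely need to be raised.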

Video

Watch this video to learn how to train your voice model and create a lifelike voice for your virtual avatar.

Procedure

  1. Log in to the MetaStudio console and go to the Overview page. Click Go to MetaStudio Console, and then click the Voice Modeling card to go to the voice modeling page.

    On the page displayed, the area on the left is for voice modeling, and the area on the right shows the voice modeling process.
    Figure 1 Customizing a voice

  2. Under the Huawei Models tab, configure voice modeling parameters.

    For details, see Table 1.

    Table 1 GUI operations

    | Parameter | Description |
    | --- | --- |
    | Voice modeling | Voice modeling requires a complete, uninterrupted WAV or MP3 recording of 10 to 30 minutes (recommended: 15 minutes). Ensure that there is a pause of 2 to 3 seconds between sentences in the recording. The remaining voice modeling quota is displayed in this area. |
    | Voice Settings | Enter a voice name. Example: emotion_joyful_healing |
    | Produce Voice | The voice modeling method is Script Upload. Follow the audio recording guide on the page to record a WAV or MP3 audio file. You can upload the WAV or MP3 file directly, without compressing it or including TXT files. If no preset script is used, the voice tag only indicates the application scenario of the voice. |
    | Voice Gender | Gender of the voice. Specify this parameter to improve voice model precision. Options: Male, Female. |
    | Input Language | Language of the uploaded script. Options: Chinese, English. Note: This setting is only an identifier and does not influence the training result. |
    | Voice Field | Field to which the voice applies. You can select a field to quickly find the desired voice. Each field has preset text in a matching style. After voice training is completed, choose Assets > My Models > Voices to preview how the voice reads text in the selected field. A script for each field tag is preset in MetaStudio, as shown in Script Examples (Advanced Edition). When using a preset script, you must select the corresponding tag. |

  3. Select the check box to authorize use of the voice, and click Submit.

    The Information dialog box appears, notifying you of the remaining voice modeling quota and indicating that one resource will be consumed this time.

  4. After confirming the information, click Submit.

    After the voice modeling task is submitted, the message Production task submitted appears, as shown in Figure 2.

    Model review and modeling take about seven working days.

    Figure 2 Production task submitted

  5. Click View Production Tasks to view the review progress of the voice modeling task.

    When the status changes to Reviewed, algorithm training starts automatically. If multiple algorithm training tasks are pending, the task may be queued and delayed.