Updated on 2025-09-12 GMT+08:00

Request for Starting TTS

Function

After establishing a WebSocket connection with the TTS engine, the client sends a start request to initiate speech synthesis. Each connection handles only one synthesis request at a time, so a client that sends multiple synthesis requests must re-establish the WebSocket connection for each one.
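The one-request-per-connection rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's SDK: `open_connection` is a hypothetical factory standing in for whatever WebSocket client you use, and the connection object is assumed to expose `send`, `recv`, and `close`. This section does not describe how the end of a synthesis is signalled, so the `is_final` callback is also an assumption that the caller must supply.

```python
import json
from typing import Callable, Iterable, List, Protocol


class Connection(Protocol):
    """Hypothetical minimal interface of a WebSocket client connection."""

    def send(self, message: str) -> None: ...
    def recv(self) -> object: ...
    def close(self) -> None: ...


def synthesize_each(
    texts: Iterable[str],
    open_connection: Callable[[], Connection],
    is_final: Callable[[object], bool],
    voice: str = "english_dh_female",
) -> List[List[object]]:
    """Run one synthesis per text, opening a fresh connection for each."""
    results = []
    for text in texts:
        conn = open_connection()  # new connection for every request
        try:
            request = {
                "command": "START",
                "text": text,
                "config": {"property": voice},
            }
            conn.send(json.dumps(request))
            messages = []
            while True:
                message = conn.recv()
                messages.append(message)
                if is_final(message):  # assumption: caller knows the end marker
                    break
            results.append(messages)
        finally:
            conn.close()  # the connection is not reused for the next text
    return results
```

Passing the connection factory in as a parameter keeps the sketch independent of any particular WebSocket library.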

Request Parameters

Table 1 Parameter descriptions

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| command | String | Yes | Set it to START to initiate the synthesis request. |
| text | String | Yes | Text to be synthesized, which can contain up to 10,000 characters. |
| config | Object | No | Configuration information. For details, see Table 2. |
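As an illustration of the parameters in Table 1, the sketch below serializes a start request and checks the text length before sending. The helper name `build_start_request` is hypothetical. Two assumptions are made here: an empty text is rejected, and characters are counted as Python string length, which may differ from how the service counts them.

```python
import json
from typing import Optional

MAX_TEXT_CHARS = 10_000  # limit stated in Table 1


def build_start_request(text: str, config: Optional[dict] = None) -> str:
    """Serialize a start request; raise ValueError if a Table 1 limit is violated."""
    if not text:
        # Assumption: a mandatory text must also be non-empty.
        raise ValueError("text is mandatory and must not be empty")
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    request = {"command": "START", "text": text}
    if config is not None:  # config is optional per Table 1
        request["config"] = config
    # ensure_ascii=False keeps non-ASCII text (for example Arabic) readable.
    return json.dumps(request, ensure_ascii=False)
```

Calling `build_start_request("Nice to meet you.", {"property": "english_dh_female"})` returns a JSON string that can be sent as the first message on a new connection.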

Table 2 config data structure

| Parameter | Type | Mandatory | Description |
| --- | --- | --- | --- |
| audio_format | String | No | Audio format. Option: pcm. Default value: pcm. |
| sample_rate | String | No | Sampling rate in Hz. Options: 16000 and 8000. Default value: 8000. |
| property | String | Yes | Speaker to use. For details, see Table 3. For speakers with high-quality pronunciation, you are charged per 50 characters. |
| subtitle | String | No | Whether to generate timestamp information. Leave this parameter blank if not used. Options: word_level (text-level timestamps) and phoneme_level (phoneme-level timestamps). |
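The constraints in Table 2 can be checked on the client before a request is sent. The following is a minimal sketch; the function name `check_config` and its messages are hypothetical, and it assumes that an omitted or empty `subtitle` both mean "no timestamps".

```python
from typing import List

AUDIO_FORMATS = {"pcm"}
SAMPLE_RATES = {"8000", "16000"}  # strings, as Table 2 types sample_rate as String
SUBTITLE_LEVELS = {"word_level", "phoneme_level"}


def check_config(config: dict) -> List[str]:
    """Return a list of problems found in a config object (empty if none)."""
    problems = []
    if not config.get("property"):
        problems.append("property is mandatory")
    # Missing optional fields fall back to the defaults listed in Table 2.
    if config.get("audio_format", "pcm") not in AUDIO_FORMATS:
        problems.append("audio_format must be pcm")
    if config.get("sample_rate", "8000") not in SAMPLE_RATES:
        problems.append("sample_rate must be the string 8000 or 16000")
    subtitle = config.get("subtitle")
    if subtitle and subtitle not in SUBTITLE_LEVELS:
        problems.append("subtitle must be word_level or phoneme_level")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every issue at once. Note that an integer `16000` is flagged, because the table specifies a string.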

Table 3 Values of property for speakers with high-quality pronunciation

| Speaker | property Value | Type | Language | Sampling Rate (Hz) | Audio Format |
| --- | --- | --- | --- | --- | --- |
| Ahmed | arabic_dh_male | Virtual human | Arabic | 8000/16000 | pcm |
| Aisha | arabic_dh_female | Virtual human | Arabic | 8000/16000 | pcm |
| Ahmed | english_dh_male | Virtual human | English | 8000/16000 | pcm |
| Aisha | english_dh_female | Virtual human | English | 8000/16000 | pcm |
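For client code that selects a speaker by language and gender, the rows of Table 3 can be held in a small lookup structure. This is only a sketch: the `Voice` type and `find_voice` helper are hypothetical, and the gender is inferred from the `_male` or `_female` suffix of each property value rather than from any documented field.

```python
from typing import NamedTuple, Tuple


class Voice(NamedTuple):
    name: str
    property_value: str  # value to place in config.property
    language: str
    sample_rates: Tuple[str, ...]


# Rows transcribed from Table 3.
VOICES = (
    Voice("Ahmed", "arabic_dh_male", "Arabic", ("8000", "16000")),
    Voice("Aisha", "arabic_dh_female", "Arabic", ("8000", "16000")),
    Voice("Ahmed", "english_dh_male", "English", ("8000", "16000")),
    Voice("Aisha", "english_dh_female", "English", ("8000", "16000")),
)


def find_voice(language: str, gender: str) -> Voice:
    """Look up a voice by language and by the gender suffix of its property value."""
    suffix = "_" + gender.lower()
    for voice in VOICES:
        same_language = voice.language.lower() == language.lower()
        if same_language and voice.property_value.endswith(suffix):
            return voice
    raise KeyError(f"no voice for language={language!r}, gender={gender!r}")
```

For example, `find_voice("English", "female").property_value` yields `english_dh_female`, the value used in the request example.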

Example

{
    "command": "START",
    "text": "Nice to meet you.",
    "config": {
        "audio_format": "pcm",
        "sample_rate": "16000",
        "property": "english_dh_female"
    }
}

Status Codes

See Status Codes.

Error Codes

See Error Codes.