Updated on 2025-08-26 GMT+08:00

Sending Audio Data

After receiving the "recognition starting" response, the client starts to send audio data. To save traffic, audio is sent in binary messages.

Audio data is sent in segments: the client sends one binary message each time it has collected a certain amount of audio. Each segment should be 50 ms to 1000 ms long. Use segments of about 100 ms when real-time feedback is required, and about 500 ms otherwise.
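
The byte size of a segment follows from its duration once the audio format is fixed. The sketch below assumes uncompressed 16-bit mono PCM, which this section does not state; the function name `segment_bytes` and its parameters are illustrative and are not part of the SIS API.

```python
def segment_bytes(sample_rate_hz: int, duration_ms: int,
                  bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Return the number of bytes in one segment of raw PCM audio.

    Assumes uncompressed PCM. The defaults (16-bit samples, one channel)
    are an assumption, not something the service documentation specifies here.
    """
    if sample_rate_hz <= 0 or duration_ms <= 0:
        raise ValueError("sample rate and duration must be positive")
    # Number of samples captured in the given duration.
    samples = sample_rate_hz * duration_ms // 1000
    return samples * bytes_per_sample * channels


# Real-time feedback: 100 ms segments of 16 kHz audio.
print(segment_bytes(16000, 100))  # 3200
# Less latency-sensitive use: 500 ms segments of 8 kHz audio.
print(segment_bytes(8000, 500))   # 8000
```

For compressed formats the relationship between duration and byte count differs, so this calculation would not apply as written.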

Currently, SIS limits the size of each segment (shard) to [160, 32768] bytes for 8 kHz audio and [320, 65536] bytes for 16 kHz audio. If a segment is smaller than the lower limit or larger than the upper limit, an error is reported.
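
A client can enforce these limits before sending anything. The following is a minimal sketch, not an official client: `SIZE_LIMITS` and `split_segments` are hypothetical names, and the handling of a short final segment (merging it into the previous one, or padding it with zero bytes, which represent silence only in signed PCM) is an assumption about what is acceptable for the audio being sent.

```python
# Inclusive per-segment size limits in bytes, keyed by sample rate in Hz.
# The values come from the limits stated above.
SIZE_LIMITS = {8000: (160, 32768), 16000: (320, 65536)}


def split_segments(audio: bytes, sample_rate_hz: int,
                   segment_size: int) -> list[bytes]:
    """Split raw audio into segments that all satisfy the size limits."""
    if sample_rate_hz not in SIZE_LIMITS:
        raise ValueError(f"unsupported sample rate: {sample_rate_hz}")
    low, high = SIZE_LIMITS[sample_rate_hz]
    if not low <= segment_size <= high:
        raise ValueError(
            f"segment size {segment_size} is outside [{low}, {high}] bytes")
    if not audio:
        return []

    segments = [audio[i:i + segment_size]
                for i in range(0, len(audio), segment_size)]

    # Only the final segment can be shorter than segment_size.
    tail = segments[-1]
    if len(tail) < low:
        if len(segments) > 1 and len(segments[-2]) + len(tail) <= high:
            # Merge the short tail into the previous segment.
            segments[-2:] = [segments[-2] + tail]
        else:
            # Assumption: zero padding is acceptable (silence in signed PCM).
            segments[-1] = tail + bytes(low - len(tail))
    return segments


# 16 kHz audio split into 3200-byte segments with a 100-byte remainder.
audio = bytes(3200 * 3 + 100)
print([len(s) for s in split_segments(audio, 16000, 3200)])
# [3200, 3200, 3300]
```

Each returned segment would then be sent as one binary message.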