Updated on 2025-09-18 GMT+08:00

What Is SIS?

Speech Interaction Service (SIS) provides APIs that you call in real time to get immediate responses. For example, Real-Time Automatic Speech Recognition (RASR) transcribes spoken audio or speech files into editable text, and Text To Speech (TTS) converts text into lifelike audio for an enhanced user experience. SIS applies to scenarios such as speech-based customer service quality checks, meeting transcription, voice messaging, audiobooks, and call-back services.

Prerequisites

You must have programming skills and be familiar with Java, Python, and iOS development.

SIS provides APIs that convert speech into editable text and return the recognition result in JSON format. You need to write code that processes the recognition result and then saves it to a service system or exports it as a TXT or Excel file.
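
The processing step described above can be sketched in Python as follows. This is a minimal illustration, not the official SDK usage: the response body, its field names (`trace_id`, `result`, `text`, `score`), and the helper names `extract_text` and `save_transcript` are assumptions made for the example. Check the SIS API reference for the actual response structure.

```python
import json
from pathlib import Path

# Hypothetical response body used only for illustration.
# The real SIS field names and nesting may differ.
SAMPLE_RESPONSE = """
{
  "trace_id": "example-trace-id",
  "result": {"text": "Hello, welcome to the meeting.", "score": 0.93}
}
"""


def extract_text(response_body: str) -> str:
    """Return the transcript from a JSON response, or "" if it is absent."""
    data = json.loads(response_body)
    return data.get("result", {}).get("text", "")


def save_transcript(text: str, path: Path) -> None:
    """Write the transcript to a UTF-8 encoded TXT file."""
    path.write_text(text + "\n", encoding="utf-8")


if __name__ == "__main__":
    transcript = extract_text(SAMPLE_RESPONSE)
    save_transcript(transcript, Path("transcript.txt"))
```

Reading fields with `dict.get` and a default keeps the sketch from raising an error when a response carries no recognition result. A production system would more likely log such cases and handle them explicitly.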

Using SIS for the First Time

If you are a first-time user, the following information will help you get familiar with SIS: