What Is SIS?

Speech Interaction Service (SIS) exposes speech capabilities through APIs that you call in real time and that return results immediately. For example, Real-Time Automatic Speech Recognition (RASR) transcribes live speech or audio files into editable text, and Text To Speech (TTS) converts text into lifelike audio for a better user experience. SIS applies to scenarios such as quality checks on customer service calls, meeting transcription, voice messaging, audiobooks, and call-back services.

Prerequisites

You need programming skills and should be familiar with Java, Python, and iOS development.

SIS provides APIs that convert speech into editable text and return the recognition result in JSON format. You need to write code that processes the recognition result and either passes it to your service system or saves it in TXT or Excel format.
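As a rough illustration of that processing step, the following Python sketch parses a recognition result and saves the transcript as a TXT file and as a CSV file that Excel can open. The JSON shape ({"result": {"text": ...}}), the function names, and the file names are assumptions made for this example, not the documented SIS response format; check the Speech Interaction Service API Reference for the actual fields.

```python
import csv
import json
from pathlib import Path


def extract_text(result_json: str) -> str:
    """Return the transcript from a JSON recognition result.

    Assumes a hypothetical payload shaped like {"result": {"text": "..."}}.
    Returns an empty string when those fields are missing.
    """
    payload = json.loads(result_json)
    return payload.get("result", {}).get("text", "")


def save_as_txt(text: str, path: Path) -> None:
    """Write the transcript to a UTF-8 text file, one transcript per line."""
    path.write_text(text + "\n", encoding="utf-8")


def save_as_csv(rows, path: Path) -> None:
    """Write (source, text) pairs to a CSV file that Excel can open."""
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "text"])
        writer.writerows(rows)


if __name__ == "__main__":
    # Sample payload invented for illustration; not real SIS output.
    sample = '{"result": {"text": "Hello, this is a test call."}}'
    text = extract_text(sample)
    save_as_txt(text, Path("transcript.txt"))
    save_as_csv([("sample.wav", text)], Path("transcript.csv"))
```

CSV is used here because it needs only the standard library; writing a native Excel workbook would require a third-party package.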

Using SIS for the First Time

If you are a first-time user, the following information will help you get familiar with SIS:

Functions
Functions describes the features of SIS, including Real-Time ASR, Short Sentence Recognition, and TTS.
Getting Started
SIS provides services through open APIs. You can learn how to use SIS by referring to the Speech Interaction Service Getting Started.
Using SIS
If you are a developer who is comfortable writing code and want to call SIS APIs directly, see the Speech Interaction Service API Reference or the Speech Interaction Service SDK Reference.
From Beginners to Experts
You can learn how to use SIS by referring to Progressive Knowledge.