Updated on 2025-05-28 GMT+08:00

Accessing a Real-Time Service Using Server-Sent Events

Context

Server-Sent Events (SSE) is a server push technology that lets a server push events to a client over an HTTP connection. It is typically used to deliver real-time data to the client, for example, in chat applications or real-time news updates.

SSE primarily facilitates unidirectional real-time communication from the server to the client, such as streaming ChatGPT responses. In contrast to WebSockets, which provide bidirectional real-time communication, SSE is designed to be more lightweight and simpler to implement.
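To make the one-way stream more concrete, the following Python sketch parses the SSE wire format: each event is a group of "field: value" lines terminated by a blank line, and lines starting with a colon are comments. It is a minimal illustration rather than a complete client. The function name parse_sse is hypothetical, and only the event and data fields are handled.

```python
def parse_sse(lines):
    """Yield one dict per event from an iterable of decoded text lines.

    Illustrative sketch: handles only the "event" and "data" fields.
    """
    event, data = "message", []          # "message" is the default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":                   # a blank line ends the current event
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):         # comment line, often used as keep-alive
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):        # one leading space after the colon is dropped
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)           # several data lines are joined with "\n"


if __name__ == "__main__":
    sample = ["data: hello", "", ": keep-alive", "event: done", "data: a", "data: b", ""]
    for evt in parse_sse(sample):
        print(evt)
```

Because the client only reads from the response, no extra handshake or framing is needed beyond a normal HTTP request, which is what makes SSE simpler than WebSockets for this use case.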

Prerequisites

The custom image used to import the model supports SSE.

Constraints

  • SSE is available only for real-time services.
  • SSE is available only for real-time services deployed using models imported from custom images.
  • When you call an API to access a real-time service, the size of the prediction request body and the prediction time are subject to the following limitations:
    • The size of a request body cannot exceed 12 MB. Otherwise, the request will fail.
    • Due to a limitation of API Gateway, the prediction duration of each request cannot exceed 40 seconds.
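The limits above can be checked on the client before a request is sent. The following Python sketch assumes that the body is JSON and that 12 MB means 12 × 1024 × 1024 bytes; both are assumptions for illustration, and the names MAX_BODY_BYTES, MAX_PREDICTION_SECONDS, and encode_request_body are hypothetical.

```python
import json

# Assumption: "12 MB" is interpreted as 12 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 12 * 1024 * 1024
# The 40-second prediction limit from the constraints above.
MAX_PREDICTION_SECONDS = 40


def encode_request_body(payload):
    """Serialize payload to UTF-8 JSON and reject bodies above the size limit."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(
            f"request body is {len(body)} bytes, limit is {MAX_BODY_BYTES} bytes"
        )
    return body


if __name__ == "__main__":
    body = encode_request_body({"prompt": "hello"})
    print(len(body), "bytes, within the limit")
```

Failing early on the client avoids uploading a body that the service would reject anyway. The 40-second limit is enforced on the server side, so a client can only use it as guidance, for example when choosing how much work to request in a single call.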

Calling an SSE Real-Time Service

The SSE protocol itself does not introduce new authentication mechanisms; it relies on the same methods as HTTP requests.

The following uses Postman, a GUI tool, with token authentication as an example to describe how to call an SSE service.

Figure 1 Calling an SSE service
Figure 2 Response header Content-Type

In normal cases, the value of Content-Type in the response header is text/event-stream;charset=UTF-8.
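The same call can be scripted instead of using Postman. The following Python sketch uses only the standard library and is not the official client. It assumes that the token is passed in an X-Auth-Token request header and that the service accepts a JSON POST body; the function names and the url, token, and payload parameters are hypothetical placeholders to be replaced with the values of your own service.

```python
import json
import urllib.request


def build_request(url, token, payload):
    """Build a POST request carrying the token (assumed header: X-Auth-Token)."""
    headers = {
        "X-Auth-Token": token,            # assumption: token authentication header
        "Content-Type": "application/json",
        "Accept": "text/event-stream",    # ask the server for an event stream
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def is_event_stream(content_type):
    """Return True if a Content-Type value denotes an SSE stream."""
    return content_type.split(";")[0].strip().lower() == "text/event-stream"


def stream_lines(url, token, payload, timeout=60):
    """Yield decoded response lines as they arrive from the service."""
    request = build_request(url, token, payload)
    # timeout is a socket timeout in seconds, not a limit on the whole response.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        if not is_event_stream(content_type):
            raise RuntimeError(f"unexpected Content-Type: {content_type!r}")
        for raw in response:              # the response object iterates line by line
            yield raw.decode("utf-8").rstrip("\r\n")
```

To use it, iterate over stream_lines(url, token, payload) with the API URL of the real-time service and a valid token, and print each line as it arrives. The Content-Type check mirrors the manual check of the response header described above.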