Updated on 2025-08-28 GMT+08:00

What Is the Maximum Size of a ModelArts Real-Time Service Prediction Request Body?

After a service is deployed and running, you can send inference requests to it. The request content can be text, images, audio, or video, depending on the model used by the service.

If you perform prediction by calling the inference request address (the Huawei Cloud APIG URL) displayed on the Usage Guides tab of the service details page, the request body can be at most 12 MB. Requests with oversized bodies are intercepted.
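Before calling the inference URL, you can verify on the client side that the payload fits within the 12 MB limit. The sketch below is a minimal, hypothetical helper (the function name and limit constant are illustrative, not part of the ModelArts SDK):

```python
import os

# APIG intercepts request bodies larger than 12 MB (assumed here as 12 * 1024 * 1024 bytes).
MAX_BODY_BYTES = 12 * 1024 * 1024

def check_payload_size(path: str, limit: int = MAX_BODY_BYTES) -> bool:
    """Return True if the file at `path` fits within the request body limit."""
    size = os.path.getsize(path)
    if size > limit:
        print(f"Payload is {size} bytes, which exceeds the {limit}-byte limit.")
        return False
    return True
```

Note that if you base64-encode binary data (for example, an image) into a JSON request body, the encoded payload is roughly 33% larger than the original file, so check the size after encoding.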

If you perform prediction on the Prediction tab of the service details page, the request body cannot exceed 8 MB. The limits differ between the two tabs because they use different network links.

Ensure that the request body does not exceed the applicable limit. If your workload involves high-concurrency, heavy-traffic inference requests, submit a service ticket to obtain professional support.