Updated on 2025-08-04 GMT+08:00

Modifying the QPS of a Model Service in ModelArts Studio (MaaS)

Queries per second (QPS) is a key metric for evaluating a model service's processing capability. It measures the number of requests the service can handle per second, which matters most in high-concurrency scenarios and directly affects response speed and efficiency. A QPS limit that does not match actual traffic can increase user waiting time and reduce satisfaction. Adjusting the QPS as traffic changes helps maintain service performance, optimize user experience, ensure service continuity, and control costs.
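
Before choosing a new limit, it can help to know the peak request rate your clients actually generate. The following Python code is a minimal sketch, not part of ModelArts Studio. It assumes you have request arrival times (in seconds) from your own client logs. The function names peak_qps and suggested_limit and the 20% headroom factor are illustrative assumptions.

```python
import math
from collections import Counter


def peak_qps(timestamps):
    """Return the largest number of requests that fall into any one-second bucket.

    timestamps: iterable of request arrival times in seconds (for example, epoch seconds).
    """
    # Group requests by the whole second in which they arrived.
    buckets = Counter(math.floor(t) for t in timestamps)
    return max(buckets.values()) if buckets else 0


def suggested_limit(timestamps, headroom=1.2):
    """Round the observed peak up after applying a headroom factor (an assumed 20% here)."""
    return math.ceil(peak_qps(timestamps) * headroom)


if __name__ == "__main__":
    # Hypothetical arrival times: three requests in second 0, one in second 1, two in second 2.
    arrivals = [0.1, 0.5, 0.9, 1.2, 2.0, 2.1]
    print(peak_qps(arrivals), suggested_limit(arrivals))
```

Fixed one-second buckets keep the sketch simple. A sliding window would catch bursts that straddle a second boundary, at the cost of more code.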

ModelArts Studio allows you to manually modify the QPS limit of a model service instance without disrupting service operation.

Notes and Constraints

The QPS can be modified only when the model service is in the Running or Alarm state.
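
If you track service states in your own tooling, the constraint above can be expressed as a simple check. This is a sketch only. The helper name can_modify_qps is hypothetical, and it assumes the state is available as a string spelled exactly as shown on the console.

```python
# States in which the QPS can be modified, according to the constraint above.
MODIFIABLE_STATES = frozenset({"Running", "Alarm"})


def can_modify_qps(service_state: str) -> bool:
    """Return True if the given state allows a QPS change.

    Assumes service_state uses the console spelling ("Running" or "Alarm").
    """
    return service_state.strip() in MODIFIABLE_STATES


if __name__ == "__main__":
    for state in ("Running", "Alarm"):
        print(state, can_modify_qps(state))
```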

Modifying QPS

  1. Log in to the ModelArts Studio (MaaS) console and select the target region in the top navigation bar.
  2. In the navigation pane on the left, choose Real-Time Inference.
  3. On the My Services tab of the Real-Time Inference page, choose More > Set QPS in the Operation column of the target service. In the displayed dialog box, modify the QPS value and click Submit.
    Figure 1 Modifying QPS

    On the My Services tab, click the service name to open its details page and check whether the change has taken effect.
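
After the new limit takes effect, clients that call the service may want to pace their own requests so that they stay within it. The following Python code is a minimal sketch of a client-side token bucket. It is not part of ModelArts Studio and does not call any MaaS API. The class name TokenBucket and the injectable clock parameter are illustrative assumptions.

```python
import time


class TokenBucket:
    """Client-side limiter that allows at most `qps` requests per second on average."""

    def __init__(self, qps, clock=time.monotonic):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.qps = float(qps)
        self.capacity = float(qps)   # allow a burst of up to one second of traffic
        self.tokens = self.capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self):
        """Return True if a request may be sent now, otherwise False."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(qps=2)  # set this to the QPS value configured for the service
    sent = 0
    while sent < 4:
        if bucket.try_acquire():
            sent += 1  # send the request here
        else:
            time.sleep(0.05)  # back off briefly before retrying
    print("sent", sent, "requests")
```

Set qps to the value configured in the Set QPS dialog box. The clock parameter exists only so that the limiter can be tested without waiting for real time to pass.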