Modifying the QPS of a Model Service in ModelArts Studio (MaaS)
Queries per second (QPS) is a key metric for evaluating a model service's processing capacity. It measures the number of requests the service can handle per second in high-concurrency scenarios and directly affects response speed and efficiency. An improperly configured QPS limit can increase user waiting time and reduce satisfaction, so adjust it as demand changes to maintain service performance, keep the service continuously available, and control costs.
ModelArts Studio allows you to manually modify the QPS limit of a model service instance without disrupting service operation.
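To illustrate what a per-second request limit means for a caller, the following is a minimal client-side sketch that paces outgoing requests so that no more than a given number are sent within any one-second window. `QpsLimiter` and its parameters are hypothetical names used for illustration only; they are not part of any ModelArts Studio API, and the service enforces its own limit independently of this code.

```python
import time
from collections import deque


class QpsLimiter:
    """Client-side sliding-window limiter (hypothetical helper, not a MaaS API)."""

    def __init__(self, qps_limit, clock=time.monotonic, sleep=time.sleep):
        if qps_limit <= 0:
            raise ValueError("qps_limit must be positive")
        self.qps_limit = qps_limit
        self._clock = clock      # injectable so the limiter can be tested without waiting
        self._sleep = sleep
        self._sent = deque()     # timestamps of requests sent within the last second

    def acquire(self):
        """Block until one more request can be sent without exceeding the limit."""
        while True:
            now = self._clock()
            # Discard timestamps that have left the one-second window.
            while self._sent and now - self._sent[0] >= 1.0:
                self._sent.popleft()
            if len(self._sent) < self.qps_limit:
                self._sent.append(now)
                return
            # Wait until the oldest request in the window expires.
            self._sleep(1.0 - (now - self._sent[0]))
```

A caller would create one limiter with the value configured for the service, for example `limiter = QpsLimiter(5)`, and call `limiter.acquire()` before sending each request.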
Notes and Constraints
The QPS can be modified only when the model service is in the Running or Alarm state.
Modifying QPS
- Log in to the ModelArts Studio (MaaS) console and select the target region on the top navigation bar.
- In the navigation pane on the left, choose Real-Time Inference.
- On the My Services tab of the Real-Time Inference page, choose More > Set QPS in the Operation column of the target service. In the displayed dialog box, enter the new QPS value and click Submit.
Figure 1 Modifying QPS
On the My Services tab, click the service name to open its details page and check that the new QPS value has taken effect.
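When deciding which value to enter, it can help to know the peak request rate your clients actually generate. The sketch below computes the largest number of requests that fall within any one-second window, given a list of request timestamps in seconds. The function name `peak_qps` and the idea of collecting timestamps from your own client logs are assumptions made for illustration; the console does not require or provide this function.

```python
def peak_qps(timestamps):
    """Return the largest number of requests in any one-second window.

    `timestamps` are request send times in seconds, for example taken
    from client-side logs (hypothetical input, not a MaaS data format).
    """
    ordered = sorted(timestamps)
    peak = 0
    start = 0  # index of the oldest request still inside the window
    for end, t in enumerate(ordered):
        # Advance the window start until the window spans less than one second.
        while t - ordered[start] >= 1.0:
            start += 1
        peak = max(peak, end - start + 1)
    return peak
```

For example, `peak_qps([0.0, 0.1, 0.2, 1.5, 1.6])` returns 3, because the first three requests arrive within the same second.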