How Do I Speed Up Real-Time Prediction?
Updated on 2024-06-15 GMT+08:00
- When deploying a real-time service, select compute nodes with higher specifications for better performance. For example, use GPUs instead of CPUs.
- When deploying a real-time service, increase the number of compute nodes.
If you set Compute Nodes to 1, the service runs in standalone mode. If you set Compute Nodes to a value greater than 1, the service runs in distributed mode. Set this parameter based on your service requirements.
- Inference speed depends heavily on model complexity. Try to optimize the model for faster prediction.
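To check whether any of the changes above actually speeds up prediction, it helps to measure latency before and after each change. The following is a minimal sketch, not a ModelArts API: `measure_latency` and `fake_predict` are hypothetical names introduced for illustration, and `fake_predict` is a stand-in that you would replace with your own function that sends one request to your real-time service.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(predict: Callable[[], object],
                    runs: int = 50,
                    warmup: int = 5) -> Dict[str, float]:
    """Call predict() repeatedly and return latency statistics in milliseconds.

    `predict` is a hypothetical zero-argument callable that performs one
    prediction request. Warm-up calls are executed first and not timed.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Untimed warm-up calls, so one-off startup costs do not skew the result.
    for _ in range(warmup):
        predict()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        predict()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def fake_predict() -> None:
    """Stand-in for a real request (assumption): sleeps for about 2 ms."""
    time.sleep(0.002)


if __name__ == "__main__":
    print(measure_latency(fake_predict, runs=20, warmup=2))
```

Run the same measurement with the same input data against each configuration (for example, a CPU flavor and a GPU flavor, or one node and several nodes) and compare the median and 95th-percentile values rather than a single request time.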