End-to-End O&M of Inference Services
The end-to-end O&M of ModelArts inference services covers the entire AI process, including algorithm development, service O&M, and service running.
Overview
End-to-End O&M Process
- During algorithm development, store service data in Object Storage Service (OBS), then label and manage it using ModelArts data management. After training is complete, obtain an AI model and create AI application images in a development environment.
- During service O&M, use an image to create an AI application and deploy the AI application as a real-time service. You can obtain the monitoring data of the ModelArts real-time service on the Cloud Eye management console. Configure alarm rules so that you can be notified of alarms in real time.
- During service running, integrate real-time service requests into your service system, and then configure service logic and monitoring.
Throughout the O&M process, service request failures and high resource usage are monitored. When resource usage reaches the configured threshold, the system sends you an alarm notification.
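The threshold-based alarming described above can be sketched as follows. This is a minimal illustration of an "N consecutive periods" alarm rule, not Cloud Eye's actual implementation; the function and metric names are examples only.

```python
# Minimal sketch of threshold-based alarm logic similar to what Cloud Eye
# applies to ModelArts metrics. All names here are illustrative, not an API.

def should_alarm(usage_samples, threshold, consecutive=3):
    """Return True if the last `consecutive` samples all meet or exceed
    the threshold, mimicking an 'N consecutive periods' alarm rule."""
    if len(usage_samples) < consecutive:
        return False
    return all(u >= threshold for u in usage_samples[-consecutive:])

cpu_usage = [42.0, 55.5, 91.2, 93.8, 95.1]  # percent, newest sample last
print(should_alarm(cpu_usage, threshold=90.0))  # last three samples >= 90 -> True
```

Requiring several consecutive breaches, rather than alarming on a single sample, avoids notifications for brief usage spikes.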
Advantages
End-to-end service O&M enables you to easily check service running at both peak and off-peak hours and detect the health status of real-time services in real time.
Constraints
End-to-end service O&M applies only to real-time services because Cloud Eye does not monitor batch or edge inference services.
Procedure
This section uses an occupant safety algorithm in travel as an example to describe how to use ModelArts for process-based service deployment and update, as well as automatic service O&M and monitoring.
- Use a locally developed model to create a custom image, and use the image to create an AI application on ModelArts.
- On the ModelArts management console, deploy the created AI application as a real-time service.
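Once the AI application is deployed as a real-time service, your service system calls it over HTTPS. The sketch below shows one way to assemble such a request using only the standard library; the endpoint URL, token, and payload shape are placeholders, and the `X-Auth-Token` header is an assumption about token-based authentication, so check the service's usage guide on the ModelArts console for the actual values and headers.

```python
# Illustrative sketch of invoking a deployed real-time service over HTTPS.
# The endpoint URL, token, and payload below are placeholders.
import json
import urllib.request

def build_request(endpoint, token, payload):
    """Build an HTTP POST request carrying the inference payload and token."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=data,
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed token-based auth; adapt as needed
        },
        method="POST",
    )

req = build_request(
    "https://<modelarts-endpoint>/v1/infers/<service-id>",  # placeholder URL
    "<your-iam-token>",                                     # placeholder token
    {"image": "<base64-encoded frame>"},                    # placeholder payload
)
# Send with urllib.request.urlopen(req) once real values are filled in.
```

Building the request in a helper like this keeps authentication and payload encoding in one place, which simplifies retry and failure logging in the service system.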
- Log in to the Cloud Eye management console, configure ModelArts alarm rules, and enable notifications with a subscribed topic.
After the configuration, choose Cloud Service Monitoring > ModelArts in the navigation pane on the left to view the requests and resource usage of the real-time service.
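An alarm rule configured this way pairs a metric, a comparison condition, and a notification topic. The sketch below models that structure in plain Python; every field name and value here is illustrative (patterned after Cloud Eye concepts such as namespace, threshold, and period), not the Cloud Eye API schema.

```python
# Illustrative model of a Cloud Eye-style alarm rule for a ModelArts
# real-time service. Field names and values are examples only; configure
# actual rules on the Cloud Eye console.

alarm_rule = {
    "name": "modelarts-cpu-high",
    "namespace": "SYS.ModelArts",   # assumed Cloud Eye namespace for ModelArts
    "metric": "cpu_usage",          # illustrative metric name
    "comparison_operator": ">=",
    "threshold": 90,                # percent
    "count": 3,                     # consecutive periods before alarming
    "period_seconds": 300,
    "notify_topic": "urn:smn:<region>:<account>:<topic>",  # placeholder topic
}

def matches(rule, value):
    """Evaluate one metric sample against the rule's operator and threshold."""
    t = rule["threshold"]
    ops = {">=": value >= t, ">": value > t, "<=": value <= t, "<": value < t}
    return ops[rule["comparison_operator"]]

print(matches(alarm_rule, 92))  # 92 >= 90 -> True
```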
Figure 4 Viewing service monitoring metrics
When an alarm is triggered based on the monitored data, subscribers to the target topic receive a notification.