Viewing Performance Metrics of a Real-Time Service on Cloud Eye

Updated on 2024-12-26 GMT+08:00

ModelArts Metrics

The cloud service platform provides Cloud Eye to help you understand the status of ModelArts real-time services and model loads. You can use Cloud Eye to monitor your real-time services and model loads automatically and in real time, and to manage alarms and notifications, so that you stay informed of the performance metrics of ModelArts services and models.

**Table 1** ModelArts metrics
ID	Name	Description	Value Range	Monitored Object	Monitoring Interval
cpu_usage	CPU Usage	CPU usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
mem_usage	Memory Usage	Memory usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
gpu_util	GPU Usage	GPU usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
gpu_mem_usage	GPU Memory Usage	GPU memory usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
npu_util	NPU Usage	NPU usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
npu_mem_usage	NPU Memory Usage	NPU memory usage of ModelArts. Unit: %	≥ 0%	ModelArts model loads	1 minute
successfully_called_times	Number of Successful Calls	Number of times that ModelArts services have been successfully called. Unit: counts/minute	≥ 0 counts/minute	ModelArts model loads, ModelArts real-time services	1 minute
failed_called_times	Number of Failed Calls	Number of times that calls to ModelArts services have failed. Unit: counts/minute	≥ 0 counts/minute	ModelArts model loads, ModelArts real-time services	1 minute
total_called_times	Total Calls	Number of times that ModelArts services have been called. Unit: counts/minute	≥ 0 counts/minute	ModelArts model loads, ModelArts real-time services	1 minute
disk_read_rate	Disk Read Rate	Disk read rate of ModelArts. Unit: bit/minute	≥ 0 bit/minute	ModelArts model loads	1 minute
disk_write_rate	Disk Write Rate	Disk write rate of ModelArts. Unit: bit/minute	≥ 0 bit/minute	ModelArts model loads	1 minute
send_bytes_rate	Uplink Rate	Outbound network traffic rate of ModelArts. Unit: bit/minute	≥ 0 bit/minute	ModelArts model loads	1 minute
recv_bytes_rate	Downlink Rate	Inbound network traffic rate of ModelArts. Unit: bit/minute	≥ 0 bit/minute	ModelArts model loads	1 minute
req_count_2xx	2xx Responses	Number of times that the API returns a 2xx response. Unit: counts/minute	≥ 0 counts/minute	ModelArts real-time services	1 minute
req_count_4xx	4xx Errors	Number of times that the API returns a 4xx error. Unit: counts/minute	≥ 0 counts/minute	ModelArts real-time services	1 minute
req_count_5xx	5xx Errors	Number of times that the API returns a 5xx error. Unit: counts/minute	≥ 0 counts/minute	ModelArts real-time services	1 minute
avg_latency	Average Latency	Average latency of the API. Unit: ms	≥ 0 ms	ModelArts real-time services	1 minute
tp_99	TP99	The response durations of all calls over the last minute are sorted in ascending order and the top 1% of values are excluded. The highest remaining value is the TP99. Unit: ms	≥ 0 ms	ModelArts real-time services	1 minute
tp_999	TP99.9	The response durations of all calls over the last minute are sorted in ascending order and the top 0.1% of values are excluded. The highest remaining value is the TP99.9. Unit: ms	≥ 0 ms	ModelArts real-time services	1 minute
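
The TP99 and TP99.9 definitions in Table 1 can be illustrated with a short Python sketch. It is not Cloud Eye's implementation: the function name `tail_latency` and its parameters are hypothetical, and rounding the number of excluded samples down is an assumption, since the table does not say how fractional counts are handled.

```python
def tail_latency(durations_ms, excluded_per_mille):
    """Return the highest duration left after dropping the slowest calls.

    excluded_per_mille is the share of calls to exclude, in thousandths:
    10 for TP99 (top 1%), 1 for TP99.9 (top 0.1%).
    """
    if not durations_ms:
        raise ValueError("at least one response duration is required")
    ordered = sorted(durations_ms)  # ascending order, as in Table 1
    # Assumption: a fractional number of excluded samples is rounded down.
    drop = len(ordered) * excluded_per_mille // 1000
    kept = ordered[:len(ordered) - drop]
    return kept[-1]


# 1,000 calls in one minute with durations of 1 ms, 2 ms, ..., 1000 ms.
samples = list(range(1, 1001))
tp99 = tail_latency(samples, 10)   # drops the 10 slowest calls -> 990
tp999 = tail_latency(samples, 1)   # drops the single slowest call -> 999
print(tp99, tp999)
```

Integer per-mille values are used instead of 0.01 and 0.001 to avoid floating-point rounding when computing how many samples to exclude.
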

If a monitored object has multiple dimensions, all dimensions are mandatory when you use APIs to query the metrics.

The following is an example of using multiple dimensions (dim) to query a single monitoring metric:

dim.0=service_id,530cd6b0-86d7-4818-837f-935f6a27414d&dim.1=model_id,3773b058-5b4f-4366-9035-9bbd9964714a

The following is an example of using multiple dimensions to query monitoring metrics in batches:

"dimensions": [
    { "name": "service_id", "value": "530cd6b0-86d7-4818-837f-935f6a27414d" },
    { "name": "model_id", "value": "3773b058-5b4f-4366-9035-9bbd9964714a" }
]

**Table 2** Dimension description
Key	Value
service_id	Real-time service ID
model_id	Model ID
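
As a minimal sketch, the two dimension formats described above can be generated from the same list of key-value pairs. The helper names `dim_query` and `dim_batch` are hypothetical. Only the dimension fragment is built here; the endpoint, authentication, and the other query parameters are outside the scope of this page and are not shown.

```python
import json
from urllib.parse import urlencode


def dim_query(dimensions):
    """Render dimensions as dim.0=key,value&dim.1=key,value (single-metric query)."""
    pairs = [(f"dim.{i}", f"{key},{value}") for i, (key, value) in enumerate(dimensions)]
    # Keep the comma unescaped so the output matches the documented format.
    return urlencode(pairs, safe=",")


def dim_batch(dimensions):
    """Render dimensions as a list of name/value objects (batch query)."""
    return [{"name": key, "value": value} for key, value in dimensions]


# Both dimensions from Table 2 are supplied, because all dimensions are mandatory.
DIMS = [
    ("service_id", "530cd6b0-86d7-4818-837f-935f6a27414d"),
    ("model_id", "3773b058-5b4f-4366-9035-9bbd9964714a"),
]

print(dim_query(DIMS))
print(json.dumps({"dimensions": dim_batch(DIMS)}, indent=2))
```

Building both forms from one list keeps the dimension order and values consistent between single-metric and batch queries.
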

Setting Alarm Rules

Alarm rules let you customize the monitored objects and notification policies so that you learn about status changes of ModelArts real-time services and models promptly.

An alarm rule includes the alarm rule name, monitored object, metric, threshold, monitoring interval, and whether to send a notification. This section describes how to set alarm rules for ModelArts services and models.
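
To make the parts of an alarm rule concrete, the following Python sketch models them as a small data structure. It is purely illustrative: the class `AlarmRule`, its field names, and the `comparison` operator field are hypothetical and do not reflect the Cloud Eye console fields or API schema.

```python
import operator
from dataclasses import dataclass

# Hypothetical comparison operators for checking a metric value against a threshold.
_COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlarmRule:
    name: str               # alarm rule name
    monitored_object: str   # for example, a real-time service ID
    metric: str             # metric ID from Table 1, such as req_count_5xx
    comparison: str         # hypothetical field: how the value is compared
    threshold: float        # threshold that triggers the alarm
    interval_minutes: int   # monitoring interval
    notify: bool            # whether to send a notification

    def breached(self, value: float) -> bool:
        """Return True if the observed metric value violates the threshold."""
        return _COMPARATORS[self.comparison](value, self.threshold)


rule = AlarmRule(
    name="high-5xx-errors",
    monitored_object="530cd6b0-86d7-4818-837f-935f6a27414d",
    metric="req_count_5xx",
    comparison=">=",
    threshold=10,
    interval_minutes=1,
    notify=True,
)
print(rule.breached(12), rule.breached(3))
```
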

NOTE:

Only real-time services in the Running state can be connected to Cloud Eye.

Prerequisites:

- A ModelArts real-time service has been created.
- ModelArts monitoring has been enabled on Cloud Eye. To enable it, log in to the Cloud Eye console, click Custom Monitoring, and enable ModelArts monitoring as prompted.

Set an alarm rule in any of the following ways:

- Set an alarm rule for all ModelArts services.
- Set an alarm rule for a single ModelArts service.
- Set an alarm rule for a model version.
- Set an alarm rule for a metric of a service or model version.

Method 1: Setting an Alarm Rule for All ModelArts Services

1. Log in to the management console.
2. In the Service List, click Cloud Eye under Management & Governance.
3. In the navigation pane on the left, choose Alarm Management > Alarm Rules and click Create Alarm Rule.
4. On the Create Alarm Rule page, set Resource Type to ModelArts, Dimension to Service, and Method to Configure manually, and then set the alarm policies. Confirm the settings and click Create.

Method 2: Setting an Alarm Rule for a Single Service

1. Log in to the management console.
2. In the Service List, click Cloud Eye under Management & Governance.
3. In the navigation pane, choose Cloud Service Monitoring > ModelArts.
4. Locate the real-time service for which you want to create an alarm rule and click Create Alarm Rule in the Operation column.
5. On the Create Alarm Rule page, create an alarm rule for ModelArts real-time services and models as prompted.

Method 3: Setting an Alarm Rule for a Model Version

1. Log in to the management console.
2. In the Service List, click Cloud Eye under Management & Governance.
3. In the navigation pane, choose Cloud Service Monitoring > ModelArts.
4. Click the down arrow next to the target real-time service name. Then, click Create Alarm Rule in the Operation column of the target version.
5. On the Create Alarm Rule page, create an alarm rule for model loads as prompted.

Method 4: Setting an Alarm Rule for a Metric of a Service or Model Version

1. Log in to the management console.
2. In the Service List, click Cloud Eye under Management & Governance.
3. In the navigation pane, choose Cloud Service Monitoring > ModelArts.
4. Click the down arrow next to the target real-time service name. Then, click the target version to view the alarm rule details.
5. On the alarm rule details page, click the plus sign (+) in the upper right corner of a metric and set an alarm rule for the metric.

Viewing Monitoring Metrics

Cloud Eye monitors the status of ModelArts real-time services and model loads. You can obtain the monitoring metrics of each ModelArts real-time service and model on the management console. Because monitored data takes time to transmit and display, the status shown on the Cloud Eye console reflects data from 5 to 10 minutes earlier. For the same reason, the monitoring data of a newly created real-time service becomes available 5 to 10 minutes after the service is created.

Prerequisites:

- The ModelArts real-time service is running properly. Cloud Eye does not display the metrics of a faulty or deleted real-time service. The metrics can be viewed again after the real-time service starts or recovers.
- Alarm rules have been configured on Cloud Eye. Monitoring data is unavailable until alarm rules are configured. For details, see Setting Alarm Rules.
- The real-time service has been running properly for at least 10 minutes. The monitoring data and graphs of a new real-time service are available only after that time.

1. Log in to the management console.
2. In the Service List, click Cloud Eye under Management & Governance.
3. In the navigation pane, choose Cloud Service Monitoring > ModelArts.
4. View monitoring graphs.
   - To view the monitoring graphs of a real-time service, click View Metric in the Operation column.
   - To view the monitoring graphs of model loads, click the down arrow next to the target real-time service, and then click View Metric in the Operation column of the target model.
5. In the monitoring area, select a period to view the monitoring data.

You can view monitoring data from the last 1 hour, 3 hours, or 12 hours. To view the monitoring curve over a longer time range, click to enlarge the graph.
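
If you retrieve the same data programmatically instead of on the console, you need an explicit start and end time for the window. The sketch below computes such a window for the three console periods. The function name `window_ms` is hypothetical, and expressing the bounds as millisecond UNIX timestamps is an assumption about what a metric query expects; check the Cloud Eye API reference before relying on it.

```python
from datetime import datetime, timedelta, timezone

# Periods offered in the console monitoring area, in hours.
VIEW_WINDOWS_HOURS = (1, 3, 12)


def window_ms(hours, now=None):
    """Return (start, end) of the last `hours` hours as millisecond timestamps."""
    if hours not in VIEW_WINDOWS_HOURS:
        raise ValueError(f"hours must be one of {VIEW_WINDOWS_HOURS}")
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


# A fixed reference time keeps the example reproducible.
fixed = datetime(2024, 12, 26, 8, 0, tzinfo=timezone.utc)
start_ms, end_ms = window_ms(3, now=fixed)
print(end_ms - start_ms)  # 10800000 (3 hours in milliseconds)
```

Because displayed data lags by 5 to 10 minutes, the most recent minutes of any such window may still be empty.
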
