Monitoring Clusters Using Cloud Eye
This section describes the metrics that GES reports to Cloud Eye, along with the metric namespace, the metric list, and the dimensions. You can query these metrics through the APIs provided by Cloud Eye.
Namespace
SYS.GES
Monitoring Metrics
| Metric ID | Metric | Description | Value Range | Monitored Object |
|---|---|---|---|---|
| ges001_vertex_util | Vertex Capacity Usage | Vertex usage of a graph instance. The value is the ratio of used vertices to total vertices. Unit: % | 0–100 (type: float) | GES instance |
| ges002_edge_util | Edge Capacity Usage | Edge usage of a graph instance. The value is the ratio of used edges to total edges. Unit: % | 0–100 (type: float) | GES instance |
| ges003_average_import_rate | Average Import Rate | Average rate of importing vertices or edges into a graph instance. Unit: count/s | 0–400000 (type: float) | GES instance |
| ges004_request_count | Request Quantity | Number of requests received by a graph instance. Unit: count | ≥ 0 (type: integer) | GES instance |
| ges005_average_response_time | Average Response Time | Average response time of requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges006_min_response_time | Minimum Response Time | Minimum response time of requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges007_max_response_time | Maximum Response Time | Maximum response time of requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges008_read_task_pending_queue_size | Length of the Waiting Queue for Read Tasks | Length of the waiting queue for read requests received by a graph instance, that is, the number of read requests waiting in the queue. Unit: count | ≥ 0 (type: integer) | GES instance |
| ges009_read_task_pending_max_time | Maximum Waiting Duration of Read Tasks | Maximum waiting duration of read requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges010_pending_max_time_read_task_type | Type of the Read Task That Waits the Longest | Type of the read request that waits the longest in a graph instance. For the corresponding task name, see Mapping Between Task Types and Names. | ≥ 1 (type: integer) | GES instance |
| ges011_read_task_running_queue_size | Length of the Running Queue for Read Tasks | Length of the running queue for read requests received by a graph instance, that is, the number of running read requests. Unit: count | ≥ 0 (type: integer) | GES instance |
| ges012_read_task_running_max_time | Maximum Running Duration of Read Tasks | Maximum running duration of read requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges013_running_max_time_read_task_type | Type of the Read Task That Runs the Longest | Type of the read request that runs the longest in a graph instance. For the corresponding task name, see Mapping Between Task Types and Names. | ≥ 1 (type: integer) | GES instance |
| ges014_write_task_pending_queue_size | Length of the Waiting Queue for Write Tasks | Length of the waiting queue for write requests received by a graph instance, that is, the number of write requests waiting in the queue. Unit: count | ≥ 0 (type: integer) | GES instance |
| ges015_write_task_pending_max_time | Maximum Waiting Duration of Write Tasks | Maximum waiting duration of write requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges016_pending_max_time_write_task_type | Type of the Write Task That Waits the Longest | Type of the write request that waits the longest in a graph instance. For the corresponding task name, see Mapping Between Task Types and Names. | ≥ 1 (type: integer) | GES instance |
| ges017_write_task_running_queue_size | Length of the Running Queue for Write Tasks | Length of the running queue for write requests received by a graph instance, that is, the number of running write requests. Unit: count | ≥ 0 (type: integer) | GES instance |
| ges018_write_task_running_max_time | Maximum Running Duration of Write Tasks | Maximum running duration of write requests received by a graph instance. Unit: ms | ≥ 0 (type: integer) | GES instance |
| ges019_running_max_time_write_task_type | Type of the Write Task That Runs the Longest | Type of the write request that runs the longest in a graph instance. For the corresponding task name, see Mapping Between Task Types and Names. | ≥ 1 (type: integer) | GES instance |
| ges020_computer_resource_usage | Computing Resource Usage | Compute resource usage of each graph instance. Unit: % | 0–100 (type: float) | GES instance |
| ges021_memory_usage | Memory Usage | Memory usage of each graph instance. Unit: % | 0–100 (type: float) | GES instance |
| ges022_iops | IOPS | Number of I/O requests processed by each graph instance per second. Unit: count/s | ≥ 0 (type: integer) | GES instance |
| ges023_bytes_in | Network Input Throughput | Data input to each graph instance per second over the network. Unit: byte/s | ≥ 0 (type: float) | GES instance |
| ges024_bytes_out | Network Output Throughput | Data sent to the network per second from each graph instance. Unit: byte/s | ≥ 0 (type: float) | GES instance |
| ges025_disk_usage | Disk Usage | Disk usage of each graph instance. Unit: % | 0–100 (type: float) | GES instance |
| ges026_disk_total_size | Total Disk Size | Total data disk space of each graph instance. Unit: GB | ≥ 0 (type: float) | GES instance |
| ges027_disk_used_size | Disk Space Used | Used data disk space of each graph instance. Unit: GB | ≥ 0 (type: float) | GES instance |
| ges028_disk_read_throughput | Disk Read Throughput | Data volume read from the disk in a graph instance per second. Unit: byte/s | ≥ 0 (type: float) | GES instance |
| ges029_disk_write_throughput | Disk Write Throughput | Data volume written to the disk in a graph instance per second. Unit: byte/s | ≥ 0 (type: float) | GES instance |
| ges030_avg_disk_sec_per_read | Average Time per Disk Read | Average time per disk read for a graph instance. Unit: s | ≥ 0 (type: float) | GES instance |
| ges031_avg_disk_sec_per_write | Average Time per Disk Write | Average time per disk write for a graph instance. Unit: s | ≥ 0 (type: float) | GES instance |
| ges032_avg_disk_queue_length | Average Disk Queue Length | Average I/O queue length of the disk in a graph instance. Unit: count | ≥ 0 (type: integer) | GES instance |
Dimensions
| Key | Value |
|---|---|
| instance_id | GES instance |
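The namespace, metric IDs, and dimension key above are the pieces a Cloud Eye metric query needs. The following is a minimal sketch of how they might be combined into a query string. The parameter names (`namespace`, `metric_name`, `dim.0`, `from`, `to`, `period`, `filter`) are assumptions based on the general shape of the Cloud Eye metric data query and should be checked against the Cloud Eye API Reference. The function name and the instance ID are hypothetical, and the host, path, and authentication are not covered here.

```python
from urllib.parse import urlencode

NAMESPACE = "SYS.GES"          # namespace listed in this section
DIMENSION_KEY = "instance_id"  # dimension key listed in the Dimensions table


def build_metric_query(metric_id, instance_id, start_ms, end_ms,
                       period=300, stat="average"):
    """Build a query string for one GES metric of one graph instance.

    Parameter names are assumptions; verify them in the Cloud Eye API Reference.
    start_ms and end_ms are timestamps in milliseconds.
    """
    if end_ms <= start_ms:
        raise ValueError("end_ms must be later than start_ms")
    params = {
        "namespace": NAMESPACE,
        "metric_name": metric_id,
        # A dimension is passed as "key,value".
        "dim.0": f"{DIMENSION_KEY},{instance_id}",
        "from": start_ms,
        "to": end_ms,
        "period": period,   # aggregation granularity in seconds (assumed)
        "filter": stat,     # aggregation method (assumed)
    }
    return urlencode(params)


# Hypothetical instance ID, for illustration only.
query = build_metric_query("ges021_memory_usage", "my-instance-id", 1000, 2000)
print(query)
```

Keeping the namespace and dimension key as constants means only the metric ID and instance ID vary between calls.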
Mapping Between Task Types and Names
| Type | Name |
|---|---|
| 100 | Querying vertices |
| 101 | Creating a vertex |
| 102 | Deleting a vertex |
| 103 | Modifying a vertex property |
| 104 | Adding a vertex label |
| 105 | Deleting a vertex label |
| 200 | Querying edges |
| 201 | Creating an edge |
| 202 | Deleting an edge |
| 203 | Modifying an edge property |
| 300 | Querying schema details |
| 301 | Adding a label |
| 302 | Modifying a label |
| 303 | Querying a label |
| 304 | Modifying a property |
| 400 | Querying graph details |
| 401 | Clearing graphs |
| 402 | Incrementally importing graph data online |
| 403 | Creating a graph |
| 405 | Deleting a graph |
| 406 | Exporting graphs |
| 407 | filtered_khop |
| 408 | Querying path details |
| 409 | Incrementally importing graph data offline |
| 500 | Creating a graph backup |
| 501 | Restoring a graph from a backup |
| 601 | Creating an index |
| 602 | Querying indexes |
| 603 | Updating an index |
| 604 | Deleting an index |
| 700 | Running an algorithm |
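The task-type metrics (ges010, ges013, ges016, and ges019) report an integer code rather than a name. A minimal sketch of decoding such a value with the mapping above follows. The dictionary copies the table as published, and the function name is hypothetical. Codes that are not in the table, such as 404, are reported as unknown rather than guessed.

```python
# Task type codes and names, copied from the mapping table in this section.
TASK_TYPE_NAMES = {
    100: "Querying vertices",
    101: "Creating a vertex",
    102: "Deleting a vertex",
    103: "Modifying a vertex property",
    104: "Adding a vertex label",
    105: "Deleting a vertex label",
    200: "Querying edges",
    201: "Creating an edge",
    202: "Deleting an edge",
    203: "Modifying an edge property",
    300: "Querying schema details",
    301: "Adding a label",
    302: "Modifying a label",
    303: "Querying a label",
    304: "Modifying a property",
    400: "Querying graph details",
    401: "Clearing graphs",
    402: "Incrementally importing graph data online",
    403: "Creating a graph",
    405: "Deleting a graph",
    406: "Exporting graphs",
    407: "filtered_khop",
    408: "Querying path details",
    409: "Incrementally importing graph data offline",
    500: "Creating a graph backup",
    501: "Restoring a graph from a backup",
    601: "Creating an index",
    602: "Querying indexes",
    603: "Updating an index",
    604: "Deleting an index",
    700: "Running an algorithm",
}


def task_type_name(value):
    """Translate a task-type metric value into its task name.

    The value is converted with int() in case it arrives as a float.
    """
    code = int(value)
    return TASK_TYPE_NAMES.get(code, f"Unknown task type {code}")


print(task_type_name(407.0))
print(task_type_name(404))
```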
Viewing Instance Monitoring Information
- Log in to the GES management console and choose Graph Management.
- In the graph list, locate the row that contains the target graph, choose More, and select View Metric to access the Cloud Eye management console. By default, the graph instance monitoring information is displayed.
You can select a monitoring metric name and time range to check the performance curve.
Creating an Alarm Rule
You can set alarm rules for GES to customize the monitored objects and notification policies, so that you learn about changes in the running status of GES promptly and receive early warnings.
An alarm rule for GES includes settings such as the rule name, monitored object, metrics, alarm threshold, monitoring period, and notification settings.
This part describes how to set an alarm rule for GES.
- Log in to the GES management console and choose Graph Management from the navigation pane on the left.
- Locate the row containing the target instance, choose More in the Operation column, and select View Metric to access the Cloud Eye management console and check the GES monitoring information.
Figure 1 Selecting View Metrics
Ensure that the status of the instance whose monitoring information you want to view is Running. Otherwise, you cannot create an alarm.
- In the navigation pane on the left of the Cloud Eye management console, choose Alarm Management > Alarm Rules. On the page displayed, click Create Alarm Rule in the upper right corner or in the middle.
- On the Create Alarm Rule page, set parameters as prompted.
- Setting alarm parameters
Figure 2 Setting parameters
Table 3 Alarm parameters

| Parameter | Description | Example Value |
|---|---|---|
| Alarm Type | Alarm type the alarm rule applies to. The value can be Metric or Event. | Metric |
| Cloud Product | Name of the cloud service the alarm rule is created for. | Graph Engine Service - Graph Instances |
| Resource Level | This parameter is available only when Alarm Type is set to Metric. The options are Cloud product (recommended) and Specific dimension. For example, a user purchases a cloud product (GES VMs), and the product is divided into multiple sub-dimensions based on metrics, such as disks, mount points, and processes. | Cloud product |
| Monitoring Scope | Resource scope the alarm rule applies to. Select Specific resources and select one or more monitored objects. For GES, select the ID of the cluster instance you have created. Then, set Instance. | Specific resources |
| Method | There are three options: Associate template, Use existing template, and Configure manually. | Associate template |
| Template | This parameter is available only when a template option is selected for Method. Select the template to be used. If no alarm template is available, click Create Custom Template to create one that meets your requirements. | - |
| Alarm Policy | This parameter is available only when Configure manually is selected for Method. Set the policy that triggers an alarm. For example, trigger an alarm if the CPU usage is greater than or equal to 80% for 3 consecutive periods. For details about GES monitoring metrics, see Monitoring Metrics. | - |
| Alarm Severity | Alarm severity, which can be Critical, Major, Minor, or Informational. | Major |
- Configure the alarm notification parameters as prompted.
Figure 3 Setting alarm notification parameters
Table 4 Alarm notification parameters

| Parameter | Description | Example Value |
|---|---|---|
| Alarm Notification | Whether to send email, SMS, HTTP, or HTTPS notifications to users when an alarm is triggered. You can enable (recommended) or disable this function. | Enable this function |
| Notification Recipient | The options are Notification policies, Notification group, and Topic subscription. | Topic subscription |
| Notification Policy | If Notification policies is selected for Notification Recipient, select one or more notification policies. You can specify the notification group, template, window, and other parameters in a notification policy. For how to create a notification policy, see Creating, Modifying, or Deleting a Notification Policy. | - |
| Notification Object | This parameter is mandatory when Notification Recipient is set to Topic subscription. Name of the topic the alarm notification is to be sent to. If you have enabled Alarm Notification, select a topic. If no suitable topic is available, create one first, which invokes the SMN service. For details about how to create a topic, see Creating a Topic. | SMN topic |
| Notification Group | This parameter is mandatory when Notification Recipient is set to Notification group. You can select or create a notification group. After creating a notification group, click Add Notification Recipient in the Operation column of the notification group list to add group members and notification methods. | Notification group name |
| Notification Template | You can select a system template or create a custom notification template. | System template |
| Notification Window | Notifications are sent only within the notification window specified in the alarm rule. For example, if Notification Window is set to 00:00–08:00, Cloud Eye sends notifications only within this period. | - |
| Trigger Condition | Condition for triggering the alarm notification. You can select Generated alarm (when an alarm is generated), Cleared alarm (when an alarm is cleared), or both. | - |
- Click Create. After the alarm rule is created, if the metric data reaches the specified threshold, Cloud Eye will immediately inform you that an exception has occurred.
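The alarm policy example in Table 3 (usage greater than or equal to 80% for 3 consecutive periods) can be read as a simple run-length check over per-period samples. The sketch below illustrates that reading only. It is not Cloud Eye's implementation, and the function name, parameters, and sample values are hypothetical.

```python
def breaches_threshold(samples, threshold=80.0, consecutive=3):
    """Return True if `consecutive` samples in a row are >= threshold.

    `samples` holds one aggregated metric value per monitoring period,
    ordered from oldest to newest.
    """
    run = 0
    for value in samples:
        # Extend the current run on a breach; any lower value resets it.
        run = run + 1 if value >= threshold else 0
        if run >= consecutive:
            return True
    return False


# Illustrative values only, for example a usage metric in percent.
print(breaches_threshold([70, 85, 90, 81]))      # three breaches in a row
print(breaches_threshold([85, 90, 70, 95, 99]))  # run interrupted by 70
```

Under this reading, a single period below the threshold resets the count, so brief dips between spikes keep the alarm from firing.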
Transferring Data to OBS
Cloud Eye stores raw metric data for only two days. If you have subscribed to OBS, you can synchronize the raw data to OBS to retain it for a longer period.