One-Click Monitoring
Application Scenarios
One-click monitoring lets you quickly enable or disable monitoring of common events for certain services. Table 1 describes the differences between one-click monitoring and common monitoring.

| Alarm Type | Objective | Scope | Alarm Object | Trigger Condition |
|---|---|---|---|---|
| One-click monitoring | When an event occurs, Cloud Eye triggers alarms immediately. Advantages: The configuration is simple. | Key events of ECS, EIP, and RDS. For detailed events, see Supported Cloud Services and Alarm Rules. | Event monitoring | Immediate trigger |
| Common monitoring | Cloud Eye triggers alarms based on the preset alarm policies. For example, Cloud Eye triggers an alarm if the average CPU usage is 80% or more for five consecutive times within 5 minutes. Advantages: Alarm policies are flexible and can be configured based on service requirements. | All services supported by Cloud Eye | | Accumulative trigger |
| Common monitoring | When an event occurs, Cloud Eye triggers alarms based on the alarm policy. Advantages: The configuration is flexible. Only event alarms are supported. | For details about services that support event monitoring, see Events Supported by Event Monitoring. | Event monitoring | Immediate trigger or accumulative trigger |

This topic describes how to use the one-click monitoring function to monitor key events.
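The two trigger conditions in Table 1 can be illustrated with a minimal sketch. This is not Cloud Eye code: the function names, parameters, and sample values are hypothetical and only model the behavior described in the table, using its example of CPU usage at 80% or more for five consecutive samples.

```python
from typing import Sequence


def immediate_trigger(event_occurred: bool) -> bool:
    """Immediate trigger: alarm as soon as a single event is observed."""
    return event_occurred


def accumulative_trigger(
    samples: Sequence[float], threshold: float, consecutive: int
) -> bool:
    """Accumulative trigger: alarm only after `consecutive` samples in a row
    are at or above `threshold`."""
    run = 0  # length of the current streak of samples at or above the threshold
    for value in samples:
        run = run + 1 if value >= threshold else 0
        if run >= consecutive:
            return True
    return False


# Average CPU usage (%) per sampling period; the values are illustrative only.
cpu_usage = [70, 82, 85, 90, 81, 88]
print(accumulative_trigger(cpu_usage, threshold=80, consecutive=5))      # True
print(accumulative_trigger(cpu_usage[:5], threshold=80, consecutive=5))  # False
print(immediate_trigger(True))                                           # True
```

The streak counter resets whenever a sample falls below the threshold, which is why the first five samples alone do not raise an alarm.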
Constraints
- One-click monitoring sends notifications only when alarms are generated and does not send notifications when alarms are cleared.
- Once the alarm threshold is reached, one-click monitoring will trigger alarms immediately.
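The notification constraint can be sketched as a check on state transitions. The state names "ok" and "alarm" and the function name are assumptions made for this illustration, not identifiers used by Cloud Eye.

```python
def should_notify(previous_state: str, current_state: str) -> bool:
    """Return True only when a rule moves into the alarm state.

    Models the constraint above: a notification is sent when an alarm is
    generated, and none is sent when the alarm is cleared.
    """
    return previous_state != "alarm" and current_state == "alarm"


print(should_notify("ok", "alarm"))  # True: alarm generated
print(should_notify("alarm", "ok"))  # False: alarm cleared
```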
Procedure
- Log in to the management console.
- Under Management & Deployment, select Cloud Eye.
- In the navigation pane on the left, choose Alarm Management > One-Click Monitoring.
- Locate the target cloud service, and enable One-Click Monitoring.
For details about the cloud services and alarm rules supported by one-click monitoring, see Supported Cloud Services and Alarm Rules.
Figure 1 One-Click Monitoring
- Click the arrow to the left of the cloud service name to view the automatically generated alarm rules.
The notification object of a one-click monitoring rule is the account contact. Alarm notifications are sent to the phone number or email address provided during registration.
Figure 2 Viewing alarm rules
Supported Cloud Services and Alarm Rules

| Alarm Name | Alarm Policy | Description | Procedure |
|---|---|---|---|
| alarm-StartAutoRecovery | Elastic Cloud Server-Start auto recovery; Immediate trigger | When the host where the ECS resides becomes faulty, the system automatically migrates the ECS to a functional host. This process will cause the ECS to restart and send a "Start auto recovery" event. After the migration is complete and a "Stop auto recovery" event is sent, the ECS is restored. | "Start auto recovery" indicates that a fault has occurred and the ECS cannot be used. In this case, you need to replace the ECS or direct traffic to other ECSs. |
| alarm-EndAutoRecovery | Elastic Cloud Server-Stop auto recovery; Immediate trigger | Same as alarm-StartAutoRecovery. | This alarm indicates that the ECS is working properly and can be used again. |

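The relationship between the two ECS events can be sketched as a small state replay. The event strings are taken from the table above. The function name is hypothetical, and the sketch assumes that the ECS is usable before any event is seen.

```python
from typing import Iterable

START = "Start auto recovery"
STOP = "Stop auto recovery"


def ecs_usable(events: Iterable[str]) -> bool:
    """Replay events in order and report whether the ECS is usable afterwards.

    Assumption: the ECS is usable before any event has been received.
    """
    usable = True
    for event in events:
        if event == START:
            usable = False  # fault occurred; the ECS is being migrated
        elif event == STOP:
            usable = True   # migration complete; the ECS is restored
    return usable


print(ecs_usable([START]))        # False: redirect traffic to other ECSs
print(ecs_usable([START, STOP]))  # True: the ECS can be used again
```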

| Alarm Name | Alarm Policy | Event Description | Procedure |
|---|---|---|---|
| alarm-BlockEIP | Elastic IP-EIP blocked; Immediate trigger | If the bandwidth usage exceeds 5 Gbit/s, the traffic will be discarded. This indicates that the bandwidth usage exceeds the threshold or the system experiences attacks (generally DDoS attacks). An event will be received when the EIP is unblocked. | Change the EIP to prevent services from being affected. In addition, check the root cause and rectify the fault. |
| alarm-UnblockEIP | Elastic IP-EIP unblocked; Immediate trigger | Same as alarm-BlockEIP. | Use the unblocked EIP again to avoid a waste of resources. |
| alarm-EIPBandwidthOverflow | Elastic IP-EIP bandwidth overflow; Immediate trigger | If this event is reported, the data traffic exceeds the purchased bandwidth, which may decrease your network speed or cause packet loss. | Check whether the EIP data traffic continues to increase and whether services are normal. Increase the bandwidth if required. |


| Alarm Name | Alarm Policy | Event Description | Procedure |
|---|---|---|---|
| alarm-CreateInstanceFailed | Relational Database Service-DB instance creation failure; Immediate trigger | DB instance creation failed because of insufficient disks or quota, or because underlying resources have been used up. | Check the number and quota of disks. Release resources and create DB instances again. |
| alarm-FullBackupFailed | Relational Database Service-Full backup failure; Immediate trigger | Full backup failed. A single full backup failure does not affect the files that have been successfully backed up, but prolongs the incremental backup time during point-in-time restore (PITR). | Create a manual backup again. |
| alarm-ActiveStandBySwitchFailed | Relational Database Service-Primary/standby switchover failure; Immediate trigger | The standby DB instance does not take over services from the primary DB instance due to network or server failures. The original primary DB instance continues to provide services within a short time. | Check whether the connection between the application and the database is re-established. |
| alarm-AbnormalReplicationStatus | Relational Database Service-Replication status abnormal; Immediate trigger | The replication delay between the primary and standby DB instances is too long (usually occurs when a large amount of data is written to databases or a large transaction is performed). During off-peak hours, the replication delay between the primary and standby DB instances gradually decreases. Another possible cause is that the network between the primary and standby DB instances is interrupted. However, the network interruption does not interrupt data reads from or writes into a single DB instance, and customers' applications are unaware of the interruption. | Submit a service ticket for processing. |
| alarm-FaultyDBInstance | Relational Database Service-DB instance faulty; Immediate trigger | A single or primary DB instance is faulty due to a disaster or a server failure. This event is critical and may cause the database service to be unavailable. | Check whether an automated backup policy has been configured for the DB instance and submit a service ticket for processing. |
| alarm-SingleToHAFailed | Relational Database Service-Failure of changing single DB instance to primary/standby; Immediate trigger | When the standby DB instance is created or after the standby DB instance is created, the configuration synchronization between the primary DB instance and the standby DB instance is faulty. Generally, the fault is caused by insufficient resources of the data center where the standby DB instance is located. This event does not interrupt the data reads and writes of the original single DB instance, and customers' applications are unaware of this event. | Submit a service ticket for processing. |
| alarm-ReplicationStatusRecovered | Relational Database Service-Replication status recovered; Immediate trigger | The replication delay between the primary and standby DB instances has been restored to the normal range, or the network connection between them has been restored. | No action is required. |
| alarm-DBInstanceRecovered | Relational Database Service-DB instance recovered; Immediate trigger | RDS uses high availability tools to rebuild the standby DB instance for disaster recovery. After the recovery, this event will be reported. | No action is required. |
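If alarm notifications are handled by a script, the Procedure columns of the ECS, EIP, and RDS tables above can be condensed into a lookup table. The following is a minimal sketch: the alarm names come from the tables, while the dictionary, the function name, the selection of alarms, and the shortened action wording are illustrative assumptions.

```python
# Alarm name -> condensed recommended action, paraphrased from the tables above.
RECOMMENDED_ACTIONS = {
    "alarm-StartAutoRecovery": "Replace the ECS or direct traffic to other ECSs.",
    "alarm-BlockEIP": "Change the EIP, then check the root cause and rectify the fault.",
    "alarm-EIPBandwidthOverflow": "Check traffic and services; increase the bandwidth if required.",
    "alarm-CreateInstanceFailed": "Check disks and quota, release resources, and create the DB instance again.",
    "alarm-FullBackupFailed": "Create a manual backup again.",
    "alarm-AbnormalReplicationStatus": "Submit a service ticket.",
    "alarm-FaultyDBInstance": "Check the automated backup policy and submit a service ticket.",
    "alarm-SingleToHAFailed": "Submit a service ticket.",
    "alarm-ReplicationStatusRecovered": "No action is required.",
    "alarm-DBInstanceRecovered": "No action is required.",
}


def recommended_action(alarm_name: str) -> str:
    """Look up the condensed action for an alarm name, with a fallback."""
    return RECOMMENDED_ACTIONS.get(
        alarm_name, "Unknown alarm; see the alarm rule tables."
    )


print(recommended_action("alarm-FullBackupFailed"))  # Create a manual backup again.
```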