Overview
Huawei Cloud can predict and proactively prevent hardware or software faults of hosts accommodating ECSs.
If host failures cannot be avoided, the system generates and reports events for the affected ECSs to minimize the impact of instance unavailability or performance deterioration. These events include instance redeployment and local disk replacement. For details, see Event Type. Events are reported infrequently.
You can view event details on the ECS console, including the event type, instance name/ID, and event status. You can also check ECS event details on the Event Monitoring page of the Cloud Eye console. For details, see Viewing Event Monitoring Data.
Event Type
Table 1 describes events that can be reported by the system.
Event Type | Generated When | Impact | Handling Suggestion |
---|---|---|---|
Instance redeployment | The system detects that the host accommodating the ECSs is faulty and plans to redeploy the ECSs on a new host. | During instance redeployment, the ECSs are temporarily unavailable. The system sends the event notification 24 to 72 hours before the scheduled execution time. NOTICE: For ECSs using local disks, all data stored on the local disks will be lost. | Refer to Handling an Instance Redeployment Event to rectify the fault. After the fault is rectified, check the impact on services. If any problems occur, contact technical support. You are advised to select an off-peak time as the scheduled start time during authorization. If you do not specify a start time, the current time is used by default. |
Local disk replacement | The system detects that a disk of the host accommodating the ECSs (including bare metal ECSs) is faulty. | Local disk replacement will cause data loss on local disks. | Refer to the following to rectify the fault. After the fault is rectified, check the impact on services. If any problems occur, contact technical support. NOTICE: Local disk replacement will cause data loss on local disks. If you do not need to retain data on local disks, use one of the following methods: |
Instance migration | The system detects that the host accommodating the ECSs is faulty and needs to be restarted, stopped, or brought offline, and plans to migrate the ECSs. | The system first attempts a live migration of the ECSs. If an exception occurs, the HA mechanism is triggered, and the ECSs are temporarily unavailable during this period. | After the fault is rectified, check the impact on services. If any problems occur, contact technical support. |
System maintenance | The system detects hardware or software faults in the host accommodating the ECSs (including bare metal ECSs) and plans to perform maintenance operations on the affected instances. | During system maintenance, the host may be powered off, and the ECSs running on it become unavailable. | Refer to Handling a System Maintenance Event to rectify the fault. After the fault is rectified, check the impact on services. If any problems occur, contact technical support. Ensure that services running on the instances have been stopped, and select an off-peak time as the scheduled start time during authorization. If you do not specify a start time, the current time is used by default. The time required for system maintenance varies with the fault. System maintenance is generally completed within five working days after authorization starts. |
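
The timing rules in Table 1 can be illustrated with a short sketch. It assumes only what the table states: the notification for an instance redeployment is sent 24 to 72 hours before the scheduled execution time, and the current time is used when no start time is specified during authorization. The function and constant names (`resolve_start_time`, `notification_window`, `NOTICE_MIN`, `NOTICE_MAX`) are hypothetical and are not part of any Huawei Cloud API or SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# Notification lead time stated in Table 1 for instance redeployment.
NOTICE_MIN = timedelta(hours=24)
NOTICE_MAX = timedelta(hours=72)


def resolve_start_time(requested: Optional[datetime], now: datetime) -> datetime:
    """Return the scheduled start time (hypothetical helper).

    If no start time is specified during authorization, the current
    time is used as the start time by default.
    """
    return requested if requested is not None else now


def notification_window(scheduled: datetime) -> Tuple[datetime, datetime]:
    """Return the (earliest, latest) time at which the notification is expected.

    The notification is sent 24 to 72 hours before the scheduled execution time.
    """
    return scheduled - NOTICE_MAX, scheduled - NOTICE_MIN


if __name__ == "__main__":
    scheduled = datetime(2024, 5, 10, 2, 0, tzinfo=timezone.utc)  # an off-peak slot
    earliest, latest = notification_window(scheduled)
    print(earliest.isoformat(), latest.isoformat())
```

For a redeployment scheduled at 02:00 UTC on 10 May 2024, the sketch places the notification between 02:00 UTC on 7 May and 02:00 UTC on 9 May.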
Event Status
Table 2 lists the statuses of events reported by the system. You can check the progress of events and filter events by status.
Status | Description |
---|---|
Pending authorization | The event is waiting to be authorized, with a start time specified during authorization. The system will complete the operations within the specified time. For details, see Handling an Event. |
To be executed | The event is waiting for the system to schedule resources. |
Executing | The system has scheduled resources and is rectifying the fault. |
Execution succeeded | The system has completed event execution. Check the impact on services. If any problems occur, contact technical support. |
Execution failed | The system failed to rectify the fault automatically. |
Canceled | The event has been canceled. |
The event status changes with the operations performed by users and the system.
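
Filtering events by status can be sketched as follows. This is a minimal illustration, assuming events have already been exported from the console into simple records. The `EcsEvent` record shape and the helper names are hypothetical and do not correspond to an official SDK type. Only the status names come from Table 2.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class EventStatus(Enum):
    """Event statuses listed in Table 2 (illustrative enum, not an SDK type)."""
    PENDING_AUTHORIZATION = "Pending authorization"
    TO_BE_EXECUTED = "To be executed"
    EXECUTING = "Executing"
    EXECUTION_SUCCEEDED = "Execution succeeded"
    EXECUTION_FAILED = "Execution failed"
    CANCELED = "Canceled"


@dataclass(frozen=True)
class EcsEvent:
    """Hypothetical record holding the fields shown on the console."""
    event_type: str
    instance_id: str
    status: EventStatus


def filter_by_status(events: Iterable[EcsEvent], *statuses: EventStatus) -> List[EcsEvent]:
    """Keep only the events whose status is one of the given statuses."""
    wanted = set(statuses)
    return [event for event in events if event.status in wanted]


def count_by_status(events: Iterable[EcsEvent]) -> Counter:
    """Count events per status to give a quick view of overall progress."""
    return Counter(event.status for event in events)


if __name__ == "__main__":
    events = [
        EcsEvent("Instance redeployment", "ecs-0001", EventStatus.PENDING_AUTHORIZATION),
        EcsEvent("System maintenance", "ecs-0002", EventStatus.EXECUTING),
        EcsEvent("Instance migration", "ecs-0003", EventStatus.EXECUTION_SUCCEEDED),
    ]
    for event in filter_by_status(events, EventStatus.PENDING_AUTHORIZATION):
        print(event.instance_id, event.event_type)
```

The instance IDs in the usage example are placeholders. Filtering on `PENDING_AUTHORIZATION` surfaces the events that are still waiting to be authorized.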