Can ECSs Automatically Recover After the Physical Host Accommodating the ECSs Becomes Faulty?
ECSs run on physical hosts. Although the cloud platform offers multiple mechanisms to ensure system reliability, fault tolerance, and high availability, host hardware can still be damaged or lose power. If a physical host cannot be powered on or restarted because of such damage, CPU and memory data is lost and live migration cannot be used to recover the ECSs.
By default, the cloud platform provides automatic recovery, which restarts ECSs through cold migration to ensure high availability and dynamic migration of ECSs. If a physical host accommodating ECSs breaks down, the ECSs are automatically migrated to a functional physical host, which minimizes service interruption. The ECSs restart during the migration.
- Automatic recovery does not ensure user data consistency.
- An ECS can be automatically recovered only if the physical host on which it is deployed becomes faulty. This function does not take effect if the fault is caused by the ECS itself.
- An ECS can be automatically recovered only after the physical host on which it is deployed has shut down. If a fault, for example a memory fault, leaves the physical host running, automatic recovery does not take effect.
- An ECS can be automatically recovered only once within 12 hours if the host on which it is deployed becomes faulty.
- ECS automatic recovery may fail in the following scenarios:
  - No physical host is available for migration due to a system fault.
  - The target physical host does not have sufficient temporary capacity.
- An ECS with any of the following resources cannot be automatically recovered:
  - Local disk
  - Passthrough FPGA card
  - Passthrough InfiniBand NIC
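
The constraints listed above can be read as a simple eligibility check. The following Python sketch is illustrative only and does not call any cloud platform API: the `EcsState` class, its field names, the resource labels, and the `recovery_blockers` function are all hypothetical names introduced here, and treating the 12-hour limit as a window measured from the last recovery is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional, Set

# Hypothetical labels for the resources that rule out automatic recovery.
BLOCKING_RESOURCES = {"local_disk", "passthrough_fpga", "passthrough_infiniband_nic"}

# Assumption: "once within 12 hours" is measured from the last recovery.
RECOVERY_WINDOW = timedelta(hours=12)


@dataclass
class EcsState:
    """Hypothetical snapshot of an ECS and the host it runs on."""
    host_faulty: bool                 # the fault is on the physical host
    host_shut_down: bool              # the faulty host has actually shut down
    resources: Set[str] = field(default_factory=set)
    last_recovery: Optional[datetime] = None


def recovery_blockers(ecs: EcsState, now: datetime) -> List[str]:
    """Return the reasons automatic recovery would not apply (empty if none)."""
    reasons = []
    if not ecs.host_faulty:
        reasons.append("fault is not on the physical host")
    if not ecs.host_shut_down:
        reasons.append("physical host is not shut down")
    if ecs.last_recovery is not None and now - ecs.last_recovery < RECOVERY_WINDOW:
        reasons.append("already recovered within the last 12 hours")
    blocked = sorted(ecs.resources & BLOCKING_RESOURCES)
    if blocked:
        reasons.append("unsupported resources: " + ", ".join(blocked))
    return reasons
```

An empty list means none of the listed restrictions apply. Recovery can still fail afterwards if no physical host is available or the target host lacks capacity, which this sketch does not model.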