Updated on 2024-11-14 GMT+08:00

Automatically Recovering ECSs

Scenarios

ECSs run on physical servers. Although multiple mechanisms ensure system reliability, fault tolerance, and high availability, server hardware might still be damaged or a power failure might occur. If a physical server cannot be powered on or restarted because of such damage, CPU and memory data is lost, and the ECSs on that server cannot be recovered through live migration.

The cloud platform provides automatic recovery, which restarts ECSs through cold migration to keep them highly available. You can enable automatic recovery during or after ECS creation. If a physical server accommodating ECSs breaks down, the ECSs with automatic recovery enabled are automatically migrated to a functional server to minimize the impact on your services. The ECSs restart during this process.

Notes

  • Automatic recovery does not ensure user data consistency.
  • An ECS can be automatically recovered only if the physical server on which it is deployed becomes faulty. This function does not take effect if the fault is caused by the ECS itself.
  • An ECS can be automatically recovered only after the physical server on which it is deployed is shut down. If a fault, for example a memory fault, does not cause the physical server to shut down, automatic recovery does not take effect.
  • An ECS can be automatically recovered only once within 12 hours if the server on which it is deployed becomes faulty.
  • ECS automatic recovery may fail in the following scenarios:
    • No physical server is available for migration due to a system fault.
    • The target physical server does not have sufficient temporary capacity.
  • An ECS with any of the following resources cannot be automatically recovered:
    • Local disk
    • Passthrough FPGA card
    • Passthrough InfiniBand NIC
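
The notes above amount to a set of preconditions. The following is a minimal sketch of those preconditions as a single check, not the platform's actual logic. The `EcsState` fields, the resource labels, and the reading of the 12-hour rule as a cooldown since the last recovery are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Hypothetical labels for the resources that rule out automatic recovery.
BLOCKING_RESOURCES = frozenset(
    {"local_disk", "passthrough_fpga", "passthrough_infiniband_nic"}
)
# Assumed reading of "only once within 12 hours".
RECOVERY_COOLDOWN = timedelta(hours=12)


@dataclass
class EcsState:
    """Hypothetical snapshot of an ECS and its host (not a platform data model)."""
    auto_recovery_enabled: bool
    host_faulty: bool
    host_shut_down: bool
    resources: frozenset = frozenset()
    last_recovery: Optional[datetime] = None


def can_auto_recover(ecs: EcsState, now: datetime) -> Tuple[bool, str]:
    """Return (eligible, reason) according to the notes in this section."""
    if not ecs.auto_recovery_enabled:
        return False, "automatic recovery is disabled"
    if not ecs.host_faulty:
        return False, "the physical server is not faulty"
    if not ecs.host_shut_down:
        return False, "the physical server has not shut down"
    blocked = ecs.resources & BLOCKING_RESOURCES
    if blocked:
        return False, "unsupported resources: " + ", ".join(sorted(blocked))
    if ecs.last_recovery is not None and now - ecs.last_recovery < RECOVERY_COOLDOWN:
        return False, "already recovered within the last 12 hours"
    return True, "eligible"
```

Even when this check passes, recovery may still fail if no physical server is available or the target server lacks capacity, so a positive result indicates eligibility only.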

Procedure

  1. Log in to the management console.
  2. In the upper left corner, select your region and project.
  3. Under Compute, click Elastic Cloud Server.
  4. Click the name of the target ECS.

    The page providing details about the ECS is displayed.

  5. Set Auto Recovery to Enable or Disable.
    Automatic recovery is enabled by default.
    • If Auto Recovery is enabled and the physical server accommodating the ECS breaks down, the ECS is automatically migrated to a functional server to minimize the impact on your services. The ECS restarts during this process.
    • If Auto Recovery is disabled, you must wait for the system administrator to recover the ECS when hardware becomes faulty.
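
The console setting in step 5 could also be scripted. The sketch below only builds the HTTP request and does not send it. It assumes the ECS API exposes an endpoint of the form `PUT /v1/{project_id}/cloudservers/{server_id}/autorecovery` that takes a `support_auto_recovery` field with the string values `"true"` and `"false"`. The path, field name, value format, and token header are assumptions to verify against the ECS API reference, and `build_auto_recovery_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_auto_recovery_request(endpoint, project_id, server_id, token, enable):
    """Build (but do not send) a request that toggles automatic recovery.

    The URL path and body field below are assumptions; confirm them in the
    ECS API reference before use.
    """
    url = f"https://{endpoint}/v1/{project_id}/cloudservers/{server_id}/autorecovery"
    # Assumed body format: the flag is passed as the string "true" or "false".
    body = json.dumps({"support_auto_recovery": "true" if enable else "false"})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,  # assumed token-based authentication header
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` after replacing the endpoint, project ID, server ID, and token with real values for your account.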