ALM-12010 Manager Heartbeat Interruption Between the Active and Standby Nodes (For MRS 2.x or Earlier)

This alarm is generated when the active Manager does not receive any heartbeat signal from the standby Manager within 7 seconds.

This alarm is cleared when the active Manager receives heartbeat signals from the standby Manager.
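The raise/clear rule above can be sketched as a small shell function. This is an illustration only, not Manager's actual implementation: the names `HEARTBEAT_TIMEOUT` and `heartbeat_alarm_state` are hypothetical, and treating exactly 7 seconds of silence as a timeout is an assumption.

```shell
#!/bin/sh
# Illustrative model of the alarm rule; all names here are hypothetical.
HEARTBEAT_TIMEOUT=7  # seconds, taken from the alarm description

# Usage: heartbeat_alarm_state <last_heartbeat_epoch> <now_epoch>
# Prints "raised" if no heartbeat arrived within the timeout, else "cleared".
heartbeat_alarm_state() {
    last=$1
    now=$2
    elapsed=$((now - last))
    # Assumption: reaching the full timeout without a heartbeat raises the alarm.
    if [ "$elapsed" -ge "$HEARTBEAT_TIMEOUT" ]; then
        echo "raised"
    else
        echo "cleared"
    fi
}
```

For example, `heartbeat_alarm_state 100 103` models a heartbeat received 3 seconds ago.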

Alarm ID	Alarm Severity	Auto Clear
12010	Major	Yes

Parameter	Description
ServiceName	Specifies the service for which the alarm is generated.
RoleName	Specifies the role for which the alarm is generated.
HostName	Specifies the host for which the alarm is generated.
Local Manager HA Name	Specifies a local Manager HA.
Peer Manager HA Name	Specifies a peer Manager HA.

When the active Manager process is abnormal, an active/standby failover cannot be performed, and services are affected.

The link between the active and standby Manager servers is abnormal.

Check whether the network between the active and standby Manager servers is normal.
1. Go to the MRS cluster details page. In the alarm list on the alarm management tab page, click the row that contains the alarm. In the alarm details, view the address of the standby Manager server.
2. Log in to the active management node.
3. Run the following command to check whether the standby Manager is reachable:
  ping <heartbeat IP address of the standby Manager>
  - If yes, skip to the step that deletes sed files on the Master nodes.
  - If no, go to step 4.
4. Contact the O&M personnel to check whether the network is faulty.
  - If yes, go to step 5.
  - If no, skip to the step that deletes sed files on the Master nodes.
5. Rectify the network fault and check whether the alarm is cleared from the alarm list.
  - If yes, no further action is required.
  - If no, continue with the step that deletes sed files on the Master nodes.
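The reachability check in the steps above can be wrapped in a small helper, sketched below. It assumes the Linux iputils `ping` options `-c` (packet count) and `-W` (reply timeout in seconds). The function name `check_standby_reachable` and the `PING_CMD` override are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Sketch: report whether the standby Manager answers ping.
# PING_CMD is a hypothetical override for dry runs; it defaults to ping.
PING_CMD=${PING_CMD:-ping}

# Usage: check_standby_reachable <standby_heartbeat_ip>
check_standby_reachable() {
    ip=$1
    if [ -z "$ip" ]; then
        echo "usage: check_standby_reachable <standby_heartbeat_ip>" >&2
        return 2
    fi
    # -c 3: send three echo requests; -W 2: wait up to 2 seconds per reply.
    if "$PING_CMD" -c 3 -W 2 "$ip" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
        return 1
    fi
}
```

Call it on the active management node with the standby address taken from the alarm details, for example `check_standby_reachable <heartbeat IP address of the standby Manager>`. A result of `unreachable` corresponds to the "If no" branch of the ping step.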
Log in to all Master nodes in the cluster, run the following commands to find all sed files, and then delete the files that the commands list:

find /srv/BigData/ -name "sed*"

find /opt -name "sed*"
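The commands above only list matching files. A minimal sketch of a list-then-delete workflow follows. The helper names `list_sed_files` and `delete_sed_files` are hypothetical, and the `-type f` filter is an added safeguard that is not part of the original commands. Because the pattern `sed*` matches any file whose name starts with "sed", review the listing before deleting anything.

```shell
#!/bin/sh
# Sketch: list leftover sed* files, and delete them only on explicit request.
# Both helper names are hypothetical.

# Usage: list_sed_files <dir>...
list_sed_files() {
    for dir in "$@"; do
        # -type f is an added safeguard so that only regular files match.
        if [ -d "$dir" ]; then
            find "$dir" -type f -name "sed*"
        fi
    done
    return 0
}

# Usage: delete_sed_files <dir>...
delete_sed_files() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            find "$dir" -type f -name "sed*" -exec rm -f {} +
        fi
    done
    return 0
}
```

On each Master node, run `list_sed_files /srv/BigData /opt` first, check the output, and only then run `delete_sed_files /srv/BigData /opt`.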
Collect fault information.
1. On MRS Manager, choose System > Export Log.
2. Contact O&M engineers and send the collected logs.

Related information: None
