Updated on 2024-09-23 GMT+08:00

ALM-27003 DBService Heartbeat Interruption Between the Active and Standby Nodes (For MRS 2.x or Earlier)

Description

This alarm is generated when the active or standby DBService node does not receive heartbeat messages from the peer node.

This alarm is cleared when the heartbeat recovers.

Attribute

Alarm ID | Alarm Severity | Auto Clear
27003 | Major | Yes

Parameters

Parameter | Description
ServiceName | Specifies the service for which the alarm is generated.
RoleName | Specifies the role for which the alarm is generated.
HostName | Specifies the host for which the alarm is generated.
Local DBService HA Name | Specifies the name of the local DBService HA.
Peer DBService HA Name | Specifies the name of the peer DBService HA.

Impact on the System

While the DBService heartbeat is interrupted, only one node can provide the service. If that node becomes faulty, no standby node is available for failover and the service becomes unavailable.

Possible Causes

The link between the active and standby DBService nodes is abnormal.

Procedure

  1. Check whether the network between the active and standby DBService servers is normal.

    1. Go to the cluster details page and choose Alarms.
    2. In the alarm list, locate the row that contains the alarm and view the IP address of the standby DBService server in the alarm details.
    3. Log in to the active DBService server.
    4. Run the ping command with the heartbeat IP address of the standby DBService server as the argument to check whether the standby DBService server is reachable.
      • If yes, go to 2.
      • If no, go to 1.5.
    5. Contact the network administrator to check whether the network is faulty.
      • If yes, go to 1.6.
      • If no, go to 2.
    6. Rectify the network fault and check whether the alarm is cleared from the alarm list.
      • If yes, no further action is required.
      • If no, go to 2.

  2. Collect fault information.

    1. On MRS Manager, choose System > Export Log.
    2. Contact the O&M engineers and send the collected logs.
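
The reachability check in step 1 can also be scripted. The following is a minimal sketch, not part of MRS: it assumes a Linux host with the standard ping utility, and the function names and the default packet count and timeout are hypothetical choices made for illustration. It validates the address, builds the ping command, and maps the result to the follow-up action described in the procedure.

```python
import ipaddress
import subprocess
from typing import List


def build_ping_command(ip: str, count: int = 3, timeout_s: int = 2) -> List[str]:
    """Build a Linux ping command for the standby heartbeat IP address.

    The count and timeout defaults are illustrative assumptions.
    """
    # Raises ValueError if the string is not a literal IPv4/IPv6 address.
    ipaddress.ip_address(ip)
    # -c: number of echo requests, -W: seconds to wait for each reply.
    return ["ping", "-c", str(count), "-W", str(timeout_s), ip]


def standby_is_reachable(ip: str) -> bool:
    """Return True if ping exits with status 0 (replies were received)."""
    result = subprocess.run(build_ping_command(ip), capture_output=True, text=True)
    return result.returncode == 0


def next_action(reachable: bool) -> str:
    """Map the ping result to the follow-up action in the procedure."""
    if reachable:
        # The network is fine, so the cause lies elsewhere.
        return "collect fault information"
    return "contact the network administrator"
```

Run on the active DBService server, next_action(standby_is_reachable("192.0.2.10")) returns the action to take, where 192.0.2.10 is a placeholder for the heartbeat IP address shown in the alarm details.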

Reference

None