Updated on 2024-11-29 GMT+08:00

ALM-46003 MOTService Heartbeat Interruption Between the Active and Standby Nodes

Alarm Description

This alarm is generated when the active or standby MOTService node has not received heartbeat messages from the peer node for 7 seconds.

This alarm is cleared when the heartbeat recovers.
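The trigger and clear conditions above can be sketched as a simple timeout check. This is only an illustration of the rule as described, not MOTService code: the function name `heartbeat_state` and the choice to raise the alarm at exactly 7 seconds of silence are assumptions.

```shell
#!/bin/sh
# Illustrative sketch only; not part of MOTService.
# Threshold taken from the alarm description (7 seconds without a heartbeat).
HEARTBEAT_TIMEOUT_SECONDS=7

# heartbeat_state LAST_HEARTBEAT_EPOCH NOW_EPOCH
# Prints "ALARM" if the peer has been silent for the timeout or longer,
# otherwise "OK" (which also models the alarm clearing after recovery).
heartbeat_state() {
    silent_for=$(( $2 - $1 ))
    if [ "$silent_for" -ge "$HEARTBEAT_TIMEOUT_SECONDS" ]; then
        echo "ALARM"
    else
        echo "OK"
    fi
}
```

For example, `heartbeat_state 100 106` reports a healthy peer, while `heartbeat_state 100 107` reports the alarm condition.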

Alarm Attributes

Alarm ID    Alarm Severity    Alarm Type    Service Type    Auto Cleared
46003       Major             Heartbeat     MOTService      Yes

Alarm Parameters

Type                      Parameter                   Description
Location Information      Source                      Specifies the cluster for which the alarm was generated.
                          ServiceName                 Specifies the service for which the alarm was generated.
                          RoleName                    Specifies the role for which the alarm was generated.
                          HostName                    Specifies the host for which the alarm was generated.
Additional Information    Local MOTService HA Name    Specifies a local MOTService HA.
                          Peer MOTService HA Name     Specifies a peer MOTService HA.
                          SYNC_PERCENT                Specifies the synchronization percentage of the active and standby MOTService nodes.

Impact on the System

While the MOTService heartbeat is interrupted, only one node provides the service. If that node becomes faulty, MOTService cannot be switched to the standby node and becomes unavailable.

Possible Causes

The network between the active and standby MOTService nodes is abnormal.

Handling Procedure

Check whether the network between the active and standby MOTService nodes is normal.

  1. On FusionInsight Manager, choose Cluster > Services > MOTService > Instance. View and record the service IP addresses of MOTServer(Active) and MOTServer(Standby) instances.
  2. Log in to the MOTServer(Active) node as user omm.
  3. Run the following command to check whether the network connection between the active and standby MOTService nodes is normal:

    ping <service IP address of the MOTServer(Standby) node>

    • If yes, go to 6.
    • If no, go to 4.

  4. Contact the network administrator to check whether the network is faulty.

    • If yes, go to 5.
    • If no, go to 6.

  5. Rectify the network fault and check whether the alarm is cleared in the alarm list.

    • If yes, no further action is required.
    • If no, go to 6.
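The connectivity check in the steps above can be wrapped in a small helper, run as user omm on the MOTServer(Active) node. This is a minimal sketch: the function name `next_step_for_peer`, the overridable `PING_CMD` variable, and the example address are hypothetical, and `ping -c 4` assumes a Linux `ping`.

```shell
#!/bin/sh
# Sketch of the ping check; names and defaults here are illustrative assumptions.
# PING_CMD can be overridden for a dry run; by default it sends 4 echo requests.
PING_CMD="${PING_CMD:-ping -c 4}"

# next_step_for_peer STANDBY_IP
# Prints "reachable" if the standby node answers ping (continue with collecting
# fault information), or "unreachable" (contact the network administrator).
next_step_for_peer() {
    if $PING_CMD "$1" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

# Example (replace 192.0.2.10 with the standby service IP recorded earlier):
# next_step_for_peer 192.0.2.10
```

A successful ping shows only that ICMP traffic passes between the two nodes, so an alarm that persists after a "reachable" result still calls for log collection.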

Collect fault information.

  6. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
  7. Expand the Service drop-down list and select MOTService.
  8. Expand the Hosts drop-down list. In the Select Host dialog box that is displayed, select the hosts to which the role belongs.
  9. Click the edit icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
  10. Contact O&M personnel or technical support and provide the collected logs.
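The log collection window (10 minutes before and after the alarm generation time) can be computed with a short helper before entering Start Date and End Date. This is a sketch under assumptions: the function name `log_window` is hypothetical, it relies on GNU `date` (the `-d` option), and it treats the input as UTC, so adjust for the time zone shown on FusionInsight Manager.

```shell
#!/bin/sh
# Sketch only: computes a +/- 10 minute window around an alarm time.
# Assumes GNU date; the input is interpreted as UTC because of -u.

# log_window "YYYY-MM-DD HH:MM:SS"
# Prints the start and end of the collection window, one per line.
log_window() {
    alarm_epoch=$(date -u -d "$1" +%s) || return 1
    date -u -d "@$(( alarm_epoch - 600 ))" "+start=%Y-%m-%d %H:%M:%S"
    date -u -d "@$(( alarm_epoch + 600 ))" "+end=%Y-%m-%d %H:%M:%S"
}

# Example:
# log_window "2024-11-29 10:05:00"
# start=2024-11-29 09:55:00
# end=2024-11-29 10:15:00
```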

Alarm Clearance

This alarm is automatically cleared after the fault is rectified.

Related Information

None