ALM-12002 HA Resource Is Abnormal

Updated at: Mar 25, 2021 GMT+08:00

Description

The high availability (HA) software periodically checks the WebService floating IP addresses and databases of Manager. This alarm is generated when the HA software detects that the WebService floating IP addresses or databases are abnormal.

This alarm is cleared when the HA software detects that the floating IP addresses or databases are normal.

Attribute

Alarm ID    Alarm Severity    Auto Clear
12002       Major             Yes

Parameter

Parameter      Description
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.
RESName        Specifies the resource for which the alarm is generated.

Impact on the System

If the WebService floating IP addresses of Manager are abnormal, users cannot log in to or use Manager. If databases are abnormal, all core services and related service processes, such as alarms and monitoring functions, are affected.

Possible Causes

  • The floating IP address is abnormal.
  • The database is abnormal.

Procedure

  1. Check the floating IP address status of the active management node.

    1. Go to the MRS cluster details page. In the alarm list on the alarm management tab page, click the row that contains the alarm. In the alarm details, view the host address and resource name of the alarm.
    2. Log in to the active management node. Run the following commands to switch the user:

      sudo su - root

      su - omm

    3. Go to the ${BIGDATA_HOME}/om-0.0.1/sbin/ directory and run the status-oms.sh script to check whether the floating IP address of the active Manager is normal. In the command output, locate the row where ResName is floatip and check whether it matches the following example.

      Example:

      10-10-10-160 floatip Normal Normal Single_active
      • If yes, go to 2.
      • If no, go to 1.4.
    4. Contact the O&M personnel to check whether the floating IP NIC exists.
      • If yes, go to 2.
      • If no, go to 1.5.
    5. Contact O&M personnel to rectify the NIC fault.

      Wait 5 minutes and check whether the alarm is cleared.

      • If yes, no further action is required.
      • If no, go to 2.

  2. Check the database status of the active and standby management nodes.

    1. Log in to the active and standby management nodes. On each node, run the sudo su - root and su - ommdba commands to switch to user ommdba, then run the gs_ctl query command and check whether the command output matches the following.

      Command output of the active management node:

      Ha state:
      LOCAL_ROLE: Primary
      STATIC_CONNECTIONS: 1
      DB_STATE: Normal
      DETAIL_INFORMATION: user/password invalid
       Senders info:
      No information
       Receiver info:
      No information

      Command output of the standby management node:

      Ha state:
      LOCAL_ROLE: Standby
      STATIC_CONNECTIONS: 1
      DB_STATE : Normal
      DETAIL_INFORMATION: user/password invalid
       Senders info:
      No information
       Receiver info:
      No information
      • If yes, go to 2.3.
      • If no, go to 2.2.
    2. Contact the O&M personnel to check whether a network fault has occurred and rectify the fault.
      • If yes, go to 2.3.
      • If no, go to step 3.
    3. Wait 5 minutes and check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to step 3.

  3. Collect fault information.

    1. On MRS Manager, choose System > Export Log.
    2. Contact the O&M personnel and send the collected logs.
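
The output checks in steps 1 and 2 can be sketched as two small shell helpers. This is only an illustration, not part of MRS: the function names floatip_is_normal and db_is_normal are hypothetical, and the sketch assumes the status-oms.sh row layout and the gs_ctl query fields look like the examples shown in this procedure (second column is the resource name, third and fourth columns are status values).

```shell
#!/bin/sh
# Hypothetical helpers: they only parse text on stdin and run no MRS commands.

# Succeeds if a row whose second column is "floatip" exists and every such row
# shows "Normal" in columns 3 and 4 (column meaning assumed from the example row).
floatip_is_normal() {
  awk '
    $2 == "floatip" { found = 1; if ($3 != "Normal" || $4 != "Normal") bad = 1 }
    END { if (found && !bad) exit 0; exit 1 }
  '
}

# Succeeds if LOCAL_ROLE equals the expected role ($1) and DB_STATE is Normal.
# Whitespace around the colon is tolerated, since the sample outputs differ.
db_is_normal() {
  awk -F':' -v role="$1" '
    {
      key = $1; val = $2
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
    }
    key == "LOCAL_ROLE" && val == role  { role_ok = 1 }
    key == "DB_STATE" && val == "Normal" { state_ok = 1 }
    END { if (role_ok && state_ok) exit 0; exit 1 }
  '
}

# Possible usage on a management node (not executed here):
#   ./status-oms.sh | floatip_is_normal && echo "floatip looks normal"
#   gs_ctl query | db_is_normal Primary && echo "database looks normal"
```

A zero exit status corresponds to the "If yes" branch of the matching step, and a nonzero status to the "If no" branch. On the standby management node, pass Standby instead of Primary.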

Reference

None
