Updated on 2024-09-23 GMT+08:00

ALM-12076 GaussDB Resource Is Abnormal

Description

HA checks the Manager database every 10 seconds. This alarm is generated when HA detects that the database is abnormal three consecutive times.

This alarm is cleared when the database is normal.
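The trigger rule above can be pictured with a small counter. This is only an illustrative sketch, not the HA implementation: the names THRESHOLD, INTERVAL, record_check, and check_db are hypothetical, and it assumes that a single normal check is enough to clear the alarm.

```shell
# Illustrative sketch only; this is not the actual HA code.
THRESHOLD=3   # consecutive abnormal checks before the alarm is raised (from the text)
INTERVAL=10   # seconds between two checks (from the text)

failures=0
alarm="cleared"

# record_check RESULT
# RESULT is "ok" or "fail". Updates the counter and the alarm state.
record_check() {
    if [ "$1" = "ok" ]; then
        # Assumption: one normal check resets the counter and clears the alarm.
        failures=0
        alarm="cleared"
    else
        failures=$((failures + 1))
        if [ "$failures" -ge "$THRESHOLD" ]; then
            alarm="raised"
        fi
    fi
}

# Hypothetical polling loop (check_db stands for whatever probe HA really uses):
#   while true; do
#       if check_db; then record_check ok; else record_check fail; fi
#       sleep "$INTERVAL"
#   done
```

With this rule, one or two isolated failed checks do not raise the alarm; only an unbroken run of three does.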

Attribute

Alarm ID    Alarm Severity    Auto Clear
12076       Major             Yes

Parameters

Name           Meaning
Source         Specifies the cluster or system for which the alarm is generated.
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

If the database is abnormal, all core services and related service processes of Manager, such as the alarm, monitoring, and query functions, are affected.

Possible Causes

An exception occurs in the database.

Procedure

Check the database status of the active and standby management nodes.

  1. Log in to the active and standby management nodes as user root. On each node, run the su - ommdba command to switch to user ommdba, and then run the gs_ctl query command. Check whether the command output matches the following information.

    Command output of the active management node:

     Ha state:  
            LOCAL_ROLE                    : Primary
            STATIC_CONNECTIONS            : 1  
            DB_STATE                      : Normal  
            DETAIL_INFORMATION            : user/password invalid  
     Senders info:  
            No information  
     Receiver info:  
            No information     

    Command output of the standby management node:

     Ha state:  
            LOCAL_ROLE                    : Standby
            STATIC_CONNECTIONS            : 1  
            DB_STATE                      : Normal  
            DETAIL_INFORMATION            : user/password invalid  
     Senders info:  
            No information  
     Receiver info:  
            No information

    • If the output on both nodes matches, go to 3.
    • If it does not, go to 2.

  2. Contact the network administrator to check whether the network is faulty.

    • If it is, go to 3.
    • If it is not, go to 5.

  3. Five minutes later, check whether the alarm is cleared.

    • If it is, no further action is required.
    • If it is not, go to 4.

  4. Log in to the active and standby management nodes, run the su - omm command to switch to user omm, go to the ${BIGDATA_HOME}/om-server/om/sbin/ directory, and run the status-oms.sh script. Check whether the floating IP addresses and GaussDB resources of the active and standby FusionInsight Managers are in the status shown in the following figure.

    • If they are, find the alarm in the alarm list and manually clear the alarm.
    • If they are not, go to 5.
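The comparison in step 1 can be scripted instead of read by eye. The following is a minimal sketch that assumes the gs_ctl query output has the "FIELD : value" layout shown above. The helper names ha_field and ha_state_ok are hypothetical and are not part of GaussDB or FusionInsight Manager.

```shell
# ha_field NAME
# Reads `gs_ctl query` output on stdin and prints the value of field NAME
# (for example LOCAL_ROLE or DB_STATE), with surrounding whitespace removed.
ha_field() {
    awk -v name="$1" '
        index($0, ":") > 0 {
            key = $0; sub(/:.*$/, "", key)       # text before the first colon
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            if (key == name) {
                val = $0; sub(/^[^:]*:/, "", val) # text after the first colon
                gsub(/^[ \t]+|[ \t]+$/, "", val)
                print val
                exit
            }
        }'
}

# ha_state_ok EXPECTED_ROLE
# Succeeds only if stdin shows LOCAL_ROLE equal to EXPECTED_ROLE
# and DB_STATE equal to Normal.
ha_state_ok() {
    ha_out=$(cat)
    ha_role=$(printf '%s\n' "$ha_out" | ha_field LOCAL_ROLE)
    ha_db=$(printf '%s\n' "$ha_out" | ha_field DB_STATE)
    [ "$ha_role" = "$1" ] && [ "$ha_db" = "Normal" ]
}

# Possible usage as user ommdba:
#   gs_ctl query | ha_state_ok Primary && echo "active node matches"   # on the active node
#   gs_ctl query | ha_state_ok Standby && echo "standby node matches"  # on the standby node
```

The sketch checks only LOCAL_ROLE and DB_STATE. Whether the remaining fields also need to match is a judgment left to the operator.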

Collect fault information.

  5. On FusionInsight Manager, choose O&M > Log > Download.
  6. Select OmmServer for Service and click OK.
  7. Click in the upper right corner. In the displayed dialog box, set Start Date and End Date to 10 minutes before and after the alarm generation time, respectively, and click OK. Then, click Download.
  8. Contact the O&M personnel and send the collected log information.
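The Start Date and End Date for the log download can be worked out from the alarm generation time with a short helper. This is a sketch under two assumptions: GNU date is available (for the -d option), and the alarm time is in the local time zone of the machine where the helper runs. The function name log_window and the example timestamp are hypothetical.

```shell
# log_window "YYYY-MM-DD HH:MM:SS" [MINUTES]
# Prints the start and the end of the log collection window, one per line.
# MINUTES defaults to 10, the margin given in the procedure.
# Assumes GNU date; times are interpreted in the local time zone.
log_window() {
    alarm_epoch=$(date -d "$1" +%s) || return 1
    margin=$(( ${2:-10} * 60 ))
    date -d "@$((alarm_epoch - margin))" '+%Y-%m-%d %H:%M:%S'
    date -d "@$((alarm_epoch + margin))" '+%Y-%m-%d %H:%M:%S'
}

# Example with a made-up alarm time:
#   log_window "2024-09-23 14:05:00"
# prints 2024-09-23 13:55:00 as Start Date and 2024-09-23 14:15:00 as End Date.
```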

Alarm Clearing

This alarm will be automatically cleared after the fault is rectified.

Related Information

None