Updated on 2024-11-29 GMT+08:00

ALM-27001 DBService Is Unavailable

Alarm Description

The alarm module checks the DBService status every 30 seconds. This alarm is generated when the system detects that DBService is unavailable.

This alarm is cleared when DBService recovers.

Alarm Attributes

| Alarm ID | Alarm Severity | Alarm Type | Service Type | Auto Cleared |
| --- | --- | --- | --- | --- |
| 27001 | Critical | Quality of service | FusionInsight Manager | Yes |

Alarm Parameters

| Type | Parameter | Description |
| --- | --- | --- |
| Location Information | Source | Specifies the cluster for which the alarm is generated. |
| Location Information | ServiceName | Specifies the service for which the alarm is generated. |
| Location Information | RoleName | Specifies the role for which the alarm is generated. |
| Location Information | HostName | Specifies the host for which the alarm is generated. |

Impact on the System

The database service is unavailable and cannot provide data import and query functions for upper-layer services, which results in service exceptions.

Possible Causes

  • The floating IP address does not exist.
  • There is no active DBServer instance.
  • The active and standby DBServer processes are abnormal.

Handling Procedure

Check whether the floating IP address exists in the cluster environment.

  1. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > DBService > Instance.
  2. Check whether the active instance exists.

    • If yes, go to 3.
    • If no, go to 9.

  3. Select the active DBServer instance and record the IP address.
  4. Log in to the host that corresponds to the preceding IP address as user root, and run the ifconfig command to check whether the floating IP address of DBService exists on the node.

    • If yes, go to 5.
    • If no, go to 9.

  5. Ping the DBService floating IP address (run the ping command followed by the floating IP address) to check whether the address is reachable.

    • If yes, go to 6.
    • If no, go to 9.

  6. Log in to the host that corresponds to the DBService floating IP address as user root, and run the following command to delete the floating IP address, replacing interface with the name of the network interface that carries the floating IP address:

    ifconfig interface down

  7. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > DBService. On the displayed page, click More > Restart Service to restart DBService. Check whether DBService is started successfully.

    • If yes, go to 8.
    • If no, go to 9.

  8. Wait about 2 minutes and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 14.
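The checks in steps 4 and 5 can be scripted on the active DBServer node. The following is a minimal sketch and not part of the product: the function name has_ip and the sample address 192.0.2.10 are hypothetical, and the sketch assumes that the node's address listing (for example, the output of ifconfig) is piped in on standard input.

```shell
#!/bin/sh
# has_ip IP: succeed if IP appears as a complete address in the text on stdin.
# Hypothetical helper; feed it the output of ifconfig (or ip -o addr).
has_ip() {
    # Escape the dots so that they match literally.
    pattern=$(printf '%s' "$1" | sed 's/\./\\./g')
    # Require a non-address character (or a line boundary) on both sides,
    # so that 10.0.0.1 does not match inside 10.0.0.12.
    grep -Eq "(^|[^0-9.])${pattern}([^0-9.]|\$)"
}

# Example usage (replace 192.0.2.10 with the recorded floating IP address):
#   ifconfig | has_ip 192.0.2.10 && ping -c 3 192.0.2.10
```

The boundary check matters because a plain substring search would report a match when a longer address merely starts with the floating IP address.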

Check the status of the active DBServer instance.

  9. Select the DBServer instance whose role status is abnormal and record the IP address.
  10. On the Alarms page, check whether the Process Fault alarm is generated for the DBServer instance on the host corresponding to the preceding IP address.

    • If yes, go to 11.
    • If no, go to 19.

  11. Rectify the fault by following the procedure provided in ALM-12007 Process Fault.
  12. Wait about 5 minutes and check whether the alarm is cleared in the alarm list.

    • If yes, no further action is required.
    • If no, go to 19.

Check the status of the active and standby DBServer processes.

  13. Log in to the host that corresponds to the IP address of DBService as user root, and run the su - omm command to switch to user omm.
  14. Run the cd ${DBSERVER_HOME} command to access the installation directory of DBService.
  15. Run the sh sbin/status-dbserver.sh command to view the status of the active and standby HA processes of DBService. Determine whether the status can be viewed successfully.

    HAMode 
    double 
    
    NodeName                  HostName               HAVersion                StartTime                HAActive             HAAllResOK           HARunPhase          
    10_5_89_12                host01                 V100R001C01              2019-06-13 21:33:09      active               normal               Actived             
    10_5_89_66                host03                 V100R001C01              2019-06-13 21:33:09      standby              normal               Deactived           
    
    NodeName                  ResName                ResStatus                ResHAStatus              ResType             
    10_5_89_12                floatip                Normal                   Normal                   Single_active       
    10_5_89_12                gaussDB                Active_normal            Normal                   Active_standby      
    10_5_89_66                floatip                Stopped                  Normal                   Single_active       
    10_5_89_66                gaussDB                Standby_normal           Normal                   Active_standby  
    • If yes, go to 16.
    • If no, go to 19.

  16. Check whether the active and standby HA processes are abnormal.

    • If yes, go to 17.
    • If no, go to 19.

  17. On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > DBService. On the displayed page, click More > Restart Service to restart DBService. Check whether DBService is restarted successfully.

    • If yes, go to 18.
    • If no, go to 19.

  18. Wait about 2 minutes and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 19.
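The status check above can be partly automated by scanning the output of sbin/status-dbserver.sh. The following is a minimal sketch under stated assumptions: the function name ha_status_ok is hypothetical, the output is assumed to have the same column layout as the sample above, and "abnormal" is assumed to mean a node whose HAAllResOK is not normal or a resource whose ResHAStatus is not Normal. The source does not define these criteria, so confirm them for your version before relying on the result.

```shell
#!/bin/sh
# ha_status_ok: read status-dbserver.sh output on stdin, print every row that
# looks abnormal, and return non-zero if any such row was found.
# Hypothetical helper; the "abnormal" criteria below are assumptions.
ha_status_ok() {
    awk '
        # Header rows identify which table the following rows belong to.
        $1 == "NodeName" && $2 == "HostName" { table = "node"; next }
        $1 == "NodeName" && $2 == "ResName"  { table = "res";  next }
        NF == 0 { next }
        # Node table: StartTime spans two fields, so HAAllResOK is field 7.
        table == "node" && NF >= 8 && $7 != "normal" {
            print "node " $1 ": HAAllResOK=" $7; bad = 1
        }
        # Resource table: ResHAStatus is field 4.
        table == "res" && NF >= 5 && $4 != "Normal" {
            print "resource " $2 " on " $1 ": ResHAStatus=" $4; bad = 1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Example usage, as user omm in the DBService installation directory:
#   sh sbin/status-dbserver.sh | ha_status_ok || echo "HA status needs attention"
```

The function prints nothing and succeeds when every row matches the assumed healthy values, so it can be used directly in a condition.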

Collect fault information.

  19. On FusionInsight Manager, choose O&M. In the navigation pane on the left, choose Log > Download.
  20. Expand the Service drop-down list, and select DBService and NodeAgent for the target cluster.
  21. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
  22. Contact O&M engineers and provide the collected logs.
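The log collection window described above is simple clock arithmetic on the alarm generation time. The following is a minimal sketch for working it out on the command line: the function name log_window is hypothetical, the timestamp format is assumed to be YYYY-MM-DD HH:MM:SS, and the -d option requires GNU date.

```shell
#!/bin/sh
# log_window "YYYY-MM-DD HH:MM:SS": print the start and the end of the log
# collection window, 1 hour before and 1 hour after the given alarm time.
# Hypothetical helper; the time is handled as UTC so that the arithmetic is
# not affected by daylight saving changes. Requires GNU date.
log_window() {
    alarm_epoch=$(date -u -d "$1" +%s) || return 1
    date -u -d "@$((alarm_epoch - 3600))" '+%Y-%m-%d %H:%M:%S'
    date -u -d "@$((alarm_epoch + 3600))" '+%Y-%m-%d %H:%M:%S'
}

# Example usage: the first line is Start Date, the second is End Date.
#   log_window "2019-06-13 21:33:09"
```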

Alarm Clearance

This alarm is automatically cleared after the fault is rectified.

Related Information

None.