ALM-27001 DBService Service Unavailable

Description

The alarm module checks the DBService service status every 30 seconds. This alarm is generated when the system detects that DBService service is unavailable.

This alarm is cleared when DBService service recovers.

Attribute

Alarm ID	Alarm Severity	Automatically Cleared
27001	Critical	Yes

Parameters

Name	Meaning
Source	Specifies the cluster for which the alarm is generated.
ServiceName	Specifies the service for which the alarm is generated.
RoleName	Specifies the role for which the alarm is generated.
HostName	Specifies the host for which the alarm is generated.

Impact on the System

The database service is unavailable and cannot provide data import and query functions for upper-layer services, which results in some services exceptions.

Possible Causes

The floating IP address does not exist.
There is no active DBServer instance.
The active and standby DBServer processes are abnormal.

Procedure

Check whether the floating IP address exists in the cluster environment.

On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > DBService > Instance.
Check whether the active instance exists.
- If yes, go to 3.
- If no, go to 9.
Select the active DBServer instance and record the IP address.
Log in to the host that corresponds to the preceding IP address as user root, and run the ifconfig command to check whether the DBService floating IP address exists on the node.
- If yes, go to 5.
- If no, go to 9.
Run the ping floatip command to check whether the DBService floating IP address can be pinged successfully.
- If yes, go to 6.
- If no, go to 9.
Log in to the host that corresponds to the DBService floating IP address as user root, and run the command to delete the floating IP address.

ifconfig interface down
On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > DBService > More > Restart Service to restart DBService, and check whether DBService is restarted successfully.
- If yes, go to 8.
- If no, go to 9.
Wait for about 2 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to 14.

Check the status of the active DBServer instance.

Select the DBServer instance whose role status is abnormal and record the IP address.
On the Alarm page, check whether Process Fault occurs in the DBServer instance on the host that corresponds to the IP address.
- If yes, go to 11.
- If no, go to 14.
Handle the alarm according to "ALM-12007 Process Fault".
Wait for about 5 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to 19.

Check the status of the active and standby DBServers.

Log in to the host that corresponds to the preceding IP address as user root, and run the su - omm command to switch to user omm.
Run the cd ${DBSERVER_HOME} command to go to the installation directory of the DBService.

Run the sh sbin/status-dbserver.sh command to view the status of the active and standby HA processes of DBService. Determine whether the status can be viewed successfully.

HAMode 
double 

NodeName                  HostName               HAVersion                StartTime                HAActive             HAAllResOK           HARunPhase          
10_5_89_12                host01                 V100R001C01              2019-06-13 21:33:09      active               normal               Actived             
10_5_89_66                host03                 V100R001C01              2019-06-13 21:33:09      standby              normal               Deactived           

NodeName                  ResName                ResStatus                ResHAStatus              ResType             
10_5_89_12                floatip                Normal                   Normal                   Single_active       
10_5_89_12                gaussDB                Active_normal            Normal                   Active_standby      
10_5_89_66                floatip                Stopped                  Normal                   Single_active       
10_5_89_66                gaussDB                Standby_normal           Normal                   Active_standby

If yes, go to 16.
If no, go to 19.

Check whether the active and standby HA processes are in the abnormal state.
- If yes, go to 17.
- If no, go to 19.
On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > DBService > More > Restart Service to restart DBService, and check whether the system displays a message indicating that the restart is successful.
- If yes, go to 18.
- If no, go to 19.
Wait for about 2 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to 19.

Collect fault information.

On FusionInsight Manager, choose O&M > Log > Download.
Select DBService in the required cluster and NodeAgent from the Service.
Click in the upper right corner, and set Start Date and End Date for log collection to 1 hour ahead of and after the alarm generation time, respectively. Then, click Download.
Contact the O&M personnel and send the collected logs.