ALM-27001 DBService Service Unavailable
Description
The alarm module checks the DBService status every 30 seconds. This alarm is generated when the system detects that the DBService service is unavailable.
This alarm is cleared when the DBService service recovers.
Attribute
Alarm ID | Alarm Severity | Automatically Cleared |
---|---|---|
27001 | Critical | Yes |
Parameters
Name | Meaning |
---|---|
Source | Specifies the cluster for which the alarm is generated. |
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
Impact on the System
The database service is unavailable and cannot provide data import and query functions for upper-layer services. As a result, some services become abnormal.
Possible Causes
- The floating IP address does not exist.
- There is no active DBServer instance.
- The active and standby DBServer processes are abnormal.
Procedure
Check whether the floating IP address exists in the cluster environment.
- On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > DBService > Instance.
- Check whether the active instance exists.
- Select the active DBServer instance and record the IP address.
- Log in to the host that corresponds to the preceding IP address as user root, and run the ifconfig command to check whether the DBService floating IP address exists on the node.
- Run the ping floatip command, where floatip is the DBService floating IP address, to check whether the address can be pinged successfully.
- Log in to the host that corresponds to the DBService floating IP address as user root, and run the following command to delete the floating IP address, where interface is the name of the network interface that holds the floating IP address:
ifconfig interface down
- On the FusionInsight Manager home page, choose Cluster > Name of the desired cluster > Services > DBService > More > Restart Service to restart DBService, and check whether DBService is restarted successfully.
- Wait for about 2 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to "Check the status of the active and standby DBServers".
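The floating IP checks above can be partly scripted. The following is a minimal sketch, not part of the product: the function names ip_in_ifconfig and float_ip_reachable are hypothetical, and it assumes a grep that supports the -w option (GNU and BSD grep do) and a Linux ping that accepts -c and -W.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given IPv4 address appears as a whole
# token in ifconfig-style text read from standard input.
ip_in_ifconfig() {
    # -F: treat the address as a fixed string (dots are literal)
    # -w: whole-word match, so 10.5.89.1 does not match 10.5.89.12
    # -q: print nothing, report the result through the exit status
    grep -qwF -- "$1"
}

# Hypothetical helper: succeeds if the address answers at least one of
# three ping requests (2-second wait per reply, Linux iputils options).
float_ip_reachable() {
    ping -c 3 -W 2 "$1" >/dev/null 2>&1
}
```

For example, running `ifconfig | ip_in_ifconfig 192.0.2.10` on the active DBServer host exits with status 0 when the address is configured there. 192.0.2.10 is a placeholder; substitute the floating IP address of your DBService.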
Check the status of the active DBServer instance.
- Select the DBServer instance whose role status is abnormal and record the IP address.
- On the Alarm page, check whether Process Fault occurs in the DBServer instance on the host that corresponds to the IP address.
- Handle the alarm according to "ALM-12007 Process Fault".
- Wait for about 5 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to "Collect fault information".
Check the status of the active and standby DBServers.
- Log in to the host that corresponds to the preceding IP address as user root, and run the su - omm command to switch to user omm.
- Run the cd ${DBSERVER_HOME} command to go to the installation directory of the DBService.
- Run the sh sbin/status-dbserver.sh command to view the status of the active and standby HA processes of DBService. Determine whether the status can be viewed successfully.
HAMode
double
NodeName    HostName  HAVersion    StartTime            HAActive  HAAllResOK  HARunPhase
10_5_89_12  host01    V100R001C01  2019-06-13 21:33:09  active    normal      Actived
10_5_89_66  host03    V100R001C01  2019-06-13 21:33:09  standby   normal      Deactived
NodeName    ResName  ResStatus       ResHAStatus  ResType
10_5_89_12  floatip  Normal          Normal       Single_active
10_5_89_12  gaussDB  Active_normal   Normal       Active_standby
10_5_89_66  floatip  Stopped         Normal       Single_active
10_5_89_66  gaussDB  Standby_normal  Normal       Active_standby
- Check whether the active and standby HA processes are in an abnormal state.
- On FusionInsight Manager, choose Cluster > Name of the desired cluster > Services > DBService > More > Restart Service to restart DBService, and check whether the system displays a message indicating that the restart is successful.
- Wait for about 2 minutes and check whether the alarm is cleared in the alarm list.
- If yes, no further action is required.
- If no, go to "Collect fault information".
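The HA status output shown above can also be evaluated by a script. The sketch below is illustrative only: ha_status_ok is a hypothetical function name, it assumes the output layout matches the sample (node rows with eight whitespace-separated fields, because StartTime spans two fields), and the health criteria (exactly one active node, at least one standby node, HAAllResOK equal to normal on every node) are an assumption based on that sample rather than a documented rule.

```shell
#!/bin/sh
# Hypothetical helper: reads the output of sbin/status-dbserver.sh from
# standard input and succeeds only if the HA node table looks healthy.
ha_status_ok() {
    awk '
        # Node rows have 8 fields: NodeName HostName HAVersion date time
        # HAActive HAAllResOK HARunPhase. Header and resource rows have fewer.
        NF == 8 && $6 == "active"  { active++ }
        NF == 8 && $6 == "standby" { standby++ }
        NF == 8 && $7 != "normal"  { bad++ }
        # awk exit status 0 means healthy, 1 means abnormal or unparsable.
        END { exit !(active == 1 && standby >= 1 && bad == 0) }
    '
}
```

As user omm in ${DBSERVER_HOME}, a call such as `sh sbin/status-dbserver.sh | ha_status_ok || echo "HA state abnormal"` would flag a state that needs the restart step. A deployment without a standby node would need different criteria.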
Collect fault information.
- On FusionInsight Manager, choose O&M > Log > Download.
- In Service, select DBService and NodeAgent for the required cluster.
- Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
- Contact the O&M personnel and send the collected logs.
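The log collection window described above (1 hour before and 1 hour after the alarm generation time) can be computed on the command line. This is a minimal sketch assuming GNU date; log_window is a hypothetical function name, and the timestamp is interpreted as UTC only to keep the arithmetic independent of the local time zone, so the results are in the same time frame as the input.

```shell
#!/bin/sh
# Hypothetical helper: prints the Start Date and End Date for log collection,
# one per line, given the alarm generation time as "YYYY-MM-DD HH:MM:SS".
log_window() {
    # Convert to seconds since the epoch first; appending "-1 hour" directly
    # to a timestamp can be misread by GNU date as a time zone offset.
    alarm_epoch=$(date -u -d "$1" +%s) || return 1
    date -u -d "@$((alarm_epoch - 3600))" '+%Y-%m-%d %H:%M:%S'
    date -u -d "@$((alarm_epoch + 3600))" '+%Y-%m-%d %H:%M:%S'
}
```

Calling `log_window '2019-06-13 21:33:09'` with an example timestamp prints the two values to enter as Start Date and End Date.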
Alarm Clearing
After the fault is rectified, the system automatically clears this alarm.
Related Information
None