ALM-12012 NTP Service Abnormal (For MRS 2.x or Earlier)
Description
This alarm is generated when the NTP service on the current node fails to synchronize time with the NTP service on the active OMS node.
This alarm is cleared when the NTP service on the current node synchronizes time properly with the NTP service on the active OMS node.
Attribute
Alarm ID | Alarm Severity | Auto Clear |
---|---|---|
12012 | Major | Yes |
Parameters
Parameter | Description |
---|---|
ServiceName | Specifies the service for which the alarm is generated. |
RoleName | Specifies the role for which the alarm is generated. |
HostName | Specifies the host for which the alarm is generated. |
Impact on the System
The time on the node is inconsistent with that on other nodes in the cluster. Therefore, some MRS applications on the node may not run properly.
Possible Causes
- The NTP service on the current node cannot start properly.
- The current node fails to synchronize time with the NTP service on the active OMS node.
- The authentication key used by the NTP service on the current node is inconsistent with that on the active OMS node.
- The time offset between the node and the NTP service on the active OMS node is large.
Procedure
- Check the NTP service on the current node.
- Check whether the ntpd process is running on the node. Log in to the node for which the alarm is generated and run the sudo su - root command to switch to user root. Then run the following command and check whether the output contains the ntpd process:
ps -ef | grep ntpd | grep -v grep
- If the output does not contain the ntpd process, run the service ntp start command to start the NTP service.
- Wait 10 minutes and check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 2.a.
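The process check and restart in step 1 can be sketched as a small shell helper. This is only an illustration: the function names ntpd_listed and ensure_ntpd are hypothetical and not part of MRS, and the sketch assumes it is run as user root on the node for which the alarm is generated.

```shell
#!/bin/sh
# Succeeds (exit 0) if the `ps -ef` listing read from stdin contains an
# ntpd process. The `grep -v grep` stage drops the grep command itself.
# (Hypothetical helper name, used for illustration only.)
ntpd_listed() {
    grep ntpd | grep -v grep >/dev/null
}

# Hypothetical wrapper: start the NTP service only when ntpd is not running.
# Assumes root privileges on the node that reported the alarm.
ensure_ntpd() {
    if ps -ef | ntpd_listed; then
        echo "ntpd is running"
    else
        echo "ntpd is not running; starting the NTP service"
        service ntp start
    fi
}
```

Calling ensure_ntpd performs the check and, if needed, the start. The 10-minute wait before rechecking the alarm still applies afterwards.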
- Check whether the current node can synchronize time properly with the NTP service on the active OMS node.
- Check whether the node can synchronize time with the NTP service on the active OMS node based on additional information of the alarm.
If yes, go to 2.b.
If no, go to 3.
- Check whether the synchronization with the NTP service on the active OMS node is faulty.
Log in to the node for which the alarm is generated, run the sudo su - root command to switch to user root, and run the ntpq -np command.
If an asterisk (*) precedes the IP address of the NTP service on the active OMS node in the command output, synchronization is normal. For example:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*10.10.10.162    .LOCL.           1 u    1   16  377    0.270   -1.562   0.014
If there is no asterisk (*) before the IP address of the NTP service on the active OMS node, as shown in the following command output, and the value of refid is .INIT., the synchronization is abnormal.
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 10.10.10.162    .INIT.           1 u    1   16  377    0.270   -1.562   0.014
- Rectify the fault, wait 10 minutes, and then check whether the alarm is cleared.
An NTP synchronization failure is usually related to the system firewall. If the firewall can be disabled, disable it and then check whether the fault is rectified. If the firewall cannot be disabled, check the firewall configuration policies and ensure that UDP port 123 is open (follow the firewall configuration policies specific to each system).
- If yes, no further action is required.
- If no, go to 3.
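The asterisk check in step 2.b can also be done mechanically. The following is a minimal sketch, assuming the ntpq -np output has the column layout shown above. The function name peer_selected is hypothetical, and the IP address 10.10.10.162 is only the example value from this page.

```shell
#!/bin/sh
# Succeeds (exit 0) if the `ntpq -np` output read from stdin marks the
# given server with a leading asterisk (*), that is, the server is the
# peer currently selected for synchronization.
# $1: IP address of the NTP service on the active OMS node.
# (Hypothetical helper name, used for illustration only.)
peer_selected() {
    awk -v ip="$1" '$1 == ("*" ip) { found = 1 } END { exit !found }'
}
```

As user root on the alarm node, a possible invocation is `ntpq -np | peer_selected 10.10.10.162 && echo "synchronization is normal"`, with the address replaced by that of the active OMS node.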
- Check whether the authentication key used by the NTP service on the current node is consistent with that on the active OMS node.
Run cat to check whether the authentication code whose key value index is 1 is the same as the value of the NTP service on the active OMS node.
- Check whether the time offset between the node and the NTP service on the active OMS node is large.
- Check the additional information of the alarm to determine whether the time offset is large.
- On the Hosts page, select the host of the node, and choose More > Stop All Roles to stop all the services on the node.
If the time on the alarm node is later than that on the NTP service of the active OMS node, adjust the time of the alarm node. After adjusting the time, choose More > Start All Roles to start the services on the node.
If the time on the alarm node is earlier than that on the NTP service of the active OMS node, wait until a period equal to the time offset has elapsed, and then adjust the time of the alarm node. After adjusting the time, choose More > Start All Roles to start the services on the node.
If you do not wait, data loss may occur.
- Wait 10 minutes and check whether the alarm is cleared.
- If yes, no further action is required.
- If no, go to 5.
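The procedure reads the time offset from the additional information of the alarm. As a cross-check, the offset column of ntpq -np (reported in milliseconds) can be inspected on the node. The sketch below assumes the standard ntpq column layout. The function names are hypothetical, and the limit passed to offset_exceeds is an arbitrary illustrative value, because this page does not define what counts as a large offset.

```shell
#!/bin/sh
# Print the offset column (milliseconds) that `ntpq -np` reports for the
# server whose IP address is $1. A leading tally character such as * or +
# is stripped from the first column before comparing.
# (Hypothetical helper name, used for illustration only.)
peer_offset_ms() {
    awk -v ip="$1" '{ name = $1; sub(/^[^0-9]/, "", name) }
                    name == ip { print $9 }'
}

# Succeeds (exit 0) if the absolute value of offset $1 exceeds limit $2.
# Both values are in milliseconds. The limit is an assumption chosen by
# the operator; it is not a threshold documented for this alarm.
offset_exceeds() {
    awk -v off="$1" -v limit="$2" \
        'BEGIN { if (off < 0) off = -off; exit !(off > limit) }'
}
```

A possible use as user root is `off=$(ntpq -np | peer_offset_ms 10.10.10.162)` followed by `offset_exceeds "$off" 1000 && echo "offset is large"`, where both the address and the 1000 ms limit are placeholders.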
- Collect fault information.
- On MRS Manager, choose .
- Contact the O&M engineers and send the collected logs.
Reference
None