Replacing the NTP Server for the Cluster

Scenario

After FusionInsight Manager is installed, you can specify an NTP server for the cluster if none is configured, or replace the configured NTP server if it is no longer used, so that the cluster synchronizes time with the new NTP clock source.

Impact on the System

Replacing the NTP server is a high-risk operation and may change the system time in the cluster.
If the time difference between the NTP server and the cluster exceeds 150 seconds before the NTP server replacement, stop the cluster first to prevent data loss. Services are unavailable while the cluster is stopped.

Prerequisites

You have prepared a new NTP server, obtained its IP address, and configured the network between the cluster and the new NTP server. The NTP service on the new server must be running normally. Otherwise, the operations in this section will fail.

Procedure

Log in to FusionInsight Manager and check whether there are uncleared alarms.
- If yes, clear the alarms. After the alarms are cleared, go to 2.
- If no, go to 2.
Log in to the active and standby management nodes as user omm.
Run the following command on the active management node to check the management plane gateway:

cat ${BIGDATA_HOME}/om-server/OMS/workspace/conf/oms-config.ini | grep om_gateway
On the active and standby management nodes, run the ping command against the management plane gateway address obtained in the previous step (ping Management plane gateway) and check whether the nodes can reach the gateway.
- If yes, go to 5.
- If no, contact the network administrator to rectify the network fault. After the fault is rectified, go to 5.
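
The gateway lookup and connectivity check above can be sketched in shell. This is only an illustration: it assumes that the om_gateway entry in oms-config.ini has the form om_gateway=<IP address>, which this section does not confirm, and parse_gateway is a hypothetical helper that is not part of FusionInsight.

```shell
# Hypothetical helper: print the value after "=" on the first om_gateway line.
# Assumption: the entry looks like "om_gateway=192.0.2.1" (format not confirmed here).
parse_gateway() {
    sed -n 's/^[[:space:]]*om_gateway[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*/\1/p' | head -n 1
}

# Demonstration with a made-up line (192.0.2.1 is a documentation-only address):
printf 'om_gateway=192.0.2.1\n' | parse_gateway

# On a management node, the result could then be passed to ping, for example:
#   gw=$(grep om_gateway "${BIGDATA_HOME}/om-server/OMS/workspace/conf/oms-config.ini" | parse_gateway)
#   ping -c 3 "$gw" && echo "gateway reachable" || echo "gateway unreachable"
```

If the entry in your environment uses a different format, adjust the pattern or read the address manually from the command output.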
Run the following command on the active management node to obtain the domain name of the NTP server in the current environment. This section uses ntp.myhuaweicloud.com as an example.

cat /opt/Bigdata_func/cloudinit/cloudinit_params | grep ntpserver
On the active management node, check the time difference between the new NTP server and the cluster. The difference is reported in seconds.

For example, to check the time difference from the NTP server at ntp.myhuaweicloud.com, run the ntpdate -d ntp.myhuaweicloud.com command. Information similar to the following is displayed:
```
 6 Dec 15:16:10 ntpdate[2861453]: step time server 10.79.3.251 offset +2.118107 sec
```
In the preceding information, +2.118107 sec indicates the time offset. A positive value indicates that the NTP server time is ahead of the current cluster time. A negative value indicates that the NTP server time is behind the current cluster time.
- You can run the ntpq -v or ntpq --version command to query the NTP version. The command output may vary with the actual service environment.
  
  Output of the ntpq -v command:
  10.1.1.112: ~# ntpq -v
  ntpq - standard NTP query program - Ver. 4.2.4p8
  
  Output of the ntpq --version command:
  10.1.1.112: ~# ntpq --version
  ntpq 4.2.8p10@1.3728-o Mon Jun 6 08:01:59 UTC 2016 (1)
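
When the check is scripted, the offset can be pulled out of the ntpdate output. The following is a minimal sketch that assumes the summary line ends with "offset <value> sec", as in the sample output in this step. extract_offset is a hypothetical helper name.

```shell
# Hypothetical helper: read ntpdate output on stdin and print the signed offset
# (in seconds) from the last line of the form "... offset +2.118107 sec".
extract_offset() {
    sed -n 's/.*offset \([+-]\{0,1\}[0-9][0-9.]*\) sec.*/\1/p' | tail -n 1
}

# Demonstration with the sample line from this section:
sample=' 6 Dec 15:16:10 ntpdate[2861453]: step time server 10.79.3.251 offset +2.118107 sec'
printf '%s\n' "$sample" | extract_offset
# prints: +2.118107

# On a management node, the live output could be piped in instead, for example:
#   ntpdate -d ntp.myhuaweicloud.com 2>&1 | extract_offset
```

Because the output format may vary with the NTP version, confirm that the pattern matches the output in your environment before relying on it.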
Check whether the absolute value of the time difference exceeds 150 seconds.
- If yes, go to 8.
- If no, perform 10 as user omm.
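
The comparison in this step can be sketched as a small shell function. It assumes the offset has already been read from the ntpdate output as a signed decimal number. exceeds_threshold is a hypothetical helper name, and the default limit of 150 seconds is taken from this section.

```shell
# Hypothetical helper: succeed (exit status 0) if |offset| is greater than the limit.
# $1 = signed offset in seconds, $2 = limit in seconds (defaults to 150).
exceeds_threshold() {
    awk -v off="$1" -v limit="${2:-150}" 'BEGIN {
        off = off + 0        # force numeric interpretation, e.g. "+2.118107"
        limit = limit + 0
        if (off < 0) off = -off
        if (off > limit) exit 0
        exit 1
    }'
}

# Demonstration with the sample offset from this section:
if exceeds_threshold "+2.118107"; then
    echo "go to 8"
else
    echo "perform 10"
fi
# prints: perform 10
```

awk is used because POSIX shell arithmetic does not handle decimal values.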
Check whether the cluster can be stopped.
- If yes, stop upper-layer services and the cluster, and go to 9.
- If no, no further action is required.
Check whether the time of the NTP server is behind the time of the cluster.
- If yes, after the message Operation successful is displayed on the UI, wait for a period equal to the time difference obtained in 6, and then perform 11 as user omm.
- If no, after the message Operation successful is displayed on the UI, perform 11 as user omm.
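
The waiting period in this step can be computed from the offset. The sketch below assumes the ntpdate convention that a negative offset means the NTP server is behind the local clock, and it rounds the wait up to a whole second. wait_seconds is a hypothetical helper name.

```shell
# Hypothetical helper: print how many whole seconds to wait before forcing
# synchronization. A non-negative offset (server not behind) needs no wait.
wait_seconds() {
    awk -v off="$1" 'BEGIN {
        off = off + 0
        if (off >= 0) { print 0; exit }
        w = -off             # server is behind by w seconds
        c = int(w)
        if (c < w) c++       # round up to the next whole second
        print c
    }'
}

# Demonstration with made-up offsets:
wait_seconds "-200.4"   # prints: 201
wait_seconds "+2.118107" # prints: 0

# Possible use after the cluster has been stopped:
#   sleep "$(wait_seconds "$offset")"
```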
Run the following command on the active management node to replace the NTP server:

sh ${BIGDATA_HOME}/om-server/om/bin/tools/modifyntp.sh --ntp_server_ip ntp.myhuaweicloud.com

The IP address of the NTP server cannot be set to the IP address of a node in the cluster. Otherwise, the service network between the node and the active/standby OMS node may be disconnected.
Run the following command on the active management node to forcibly synchronize time from the NTP server at ntp.myhuaweicloud.com immediately and replace the NTP server:

sh ${BIGDATA_HOME}/om-server/om/bin/tools/modifyntp.sh --ntp_server_ip ntp.myhuaweicloud.com --force_sync_time
- If the cluster is stopped, start the cluster after the NTP server is replaced.
- After the command for forcible time synchronization is executed, the cluster nodes take about five minutes to synchronize time.