Updated on 2024-09-23 GMT+08:00

Replacing the NTP Server for an MRS Cluster

If no NTP server is configured for an MRS cluster, or the configured NTP server is no longer in use, you can specify a new NTP server so that the cluster synchronizes time with the new NTP clock source.

This section applies only to MRS 3.x or later.

Prerequisites

  • You have prepared a new NTP server and obtained its IP address, and have configured the network between the cluster and the new NTP server.
  • The NTP service on the new server is running properly. Otherwise, the operations in this section will fail.

Impact on the System

  • Replacing the NTP server is a high-risk operation and may change the time in the cluster.
  • If the time difference between the NTP server and the cluster is greater than 150 seconds before the NTP server replacement, you need to stop the cluster first to prevent data loss. Services are unavailable while the cluster is stopped.
  • If the time difference between the NTP server and the cluster exceeds 15 minutes, the cluster will be unable to access OBS.
  • If your clusters use Kerberos authentication and the time difference between the NTP server and the cluster exceeds 5 minutes, authentication will not work.
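
The thresholds above can be summarized in a small shell helper. This is a minimal sketch and not part of MRS: the function name offset_impact is hypothetical, and it assumes the time difference is passed in seconds as the only argument.

```shell
# Hypothetical helper (not an MRS tool): print the documented impacts
# for a given time difference in seconds. The sign is ignored.
offset_impact() {
    awk -v off="$1" 'BEGIN {
        off += 0                      # force a numeric value
        if (off < 0) off = -off       # use the absolute value
        if (off > 150) print "stop the cluster before replacing the NTP server"
        if (off > 300) print "Kerberos authentication (if enabled) will not work"
        if (off > 900) print "the cluster cannot access OBS"
        if (off <= 150) print "no cluster stop required"
    }'
}

offset_impact 2.118107
offset_impact -400
```

Here 300 and 900 are the 5-minute and 15-minute limits expressed in seconds, so that all checks use the same unit as the 150-second limit.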

Modifying the NTP Server of an MRS Cluster

  1. Log in to FusionInsight Manager and check whether there are uncleared alarms.

    • If yes, clear the alarm. After the alarm is cleared, go to 2.
    • If no, go to 2.

  2. Log in to the active and standby management nodes as user omm.
  3. Run the following command on the active management node to check the management plane gateway:

    cat ${BIGDATA_HOME}/om-server/OMS/workspace/conf/oms-config.ini | grep om_gateway

  4. On the active and standby management nodes, run the ping command with the management plane gateway address obtained in 3, and check whether the nodes can reach the management plane gateway.

    • If yes, go to 5.
    • If no, contact the network administrator to rectify the network fault. After the fault is rectified, go to 5.

  5. Run the following command on the active management node to obtain the domain name of the NTP server in the current environment:

    cat /opt/Bigdata_func/cloudinit/cloudinit_params | grep ntpserver

    This section uses ntp.myhuaweicloud.com as an example.

  6. On the active management node, check the time difference, in seconds, between the new NTP server and the cluster.

    For example, to check the time difference with the NTP server at ntp.myhuaweicloud.com, run the ntpdate -d ntp.myhuaweicloud.com command. Information similar to the following is displayed:

     6 Dec 15:16:10 ntpdate[2861453]: step time server 10.79.3.251 offset +2.118107 sec

    In the preceding information, +2.118107 sec indicates the time offset. A positive value indicates that the NTP server time is ahead of the current cluster time. A negative value indicates that the NTP server time is behind the current cluster time.
    • You can run the ntpq -v or ntpq --version command to query the NTP version. The command output may vary with the actual service environment.

      • Output of the ntpq -v command:
        10.1.1.112: ~# ntpq -v
        ntpq - standard NTP query program - Ver. 4.2.4p8
      • Output of the ntpq --version command:
        10.1.1.112: ~# ntpq --version
        ntpq 4.2.8p10@1.3728-o Mon Jun  6 08:01:59 UTC 2016 (1)

  7. Check whether the absolute value of the time difference exceeds 150 seconds.

    • If yes, go to 8.
    • If no, perform 10 as user omm.

  8. Check whether the cluster can be stopped.

    • If yes, stop upper-layer services and the cluster, and go to 9.
    • If no, no further action is required.

  9. Check whether the time of the NTP server is behind the time of the cluster.

    • If yes, after the message Operation successful is displayed on the UI, wait for a period equal to the time difference obtained in 6, and then perform 11 as user omm.
    • If no, after the message Operation successful is displayed on the UI, perform 11 as user omm.

  10. Run the following command on the active management node to replace the NTP server:

    sh ${BIGDATA_HOME}/om-server/om/bin/tools/modifyntp.sh --ntp_server_ip ntp.myhuaweicloud.com

    The IP address of the NTP server cannot be set to the IP address of a node in the cluster. Otherwise, the service network between the node and the active/standby OMS node may be disconnected.

  11. Run the following command on the active management node to replace the NTP server and immediately force time synchronization with the NTP server at ntp.myhuaweicloud.com:

    sh ${BIGDATA_HOME}/om-server/om/bin/tools/modifyntp.sh --ntp_server_ip ntp.myhuaweicloud.com --force_sync_time

    • If the cluster is stopped, start the cluster after the NTP server is replaced.
    • After the command for forcible time synchronization is executed, it takes about five minutes for the cluster nodes to synchronize time.
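
The decision logic of steps 6 to 9 can be sketched in shell as follows. This is an illustrative sketch only: parse_offset and next_step are hypothetical helper names, not MRS tools. It assumes the ntpdate summary line has the format shown in step 6, and it follows the standard ntpdate convention that a negative offset means the NTP server is behind the local clock.

```shell
# Hypothetical helpers (not MRS tools) illustrating steps 6 to 9.

# Extract the signed offset from an ntpdate summary line (format as in step 6).
parse_offset() {
    printf '%s\n' "$1" | sed -n 's/.* offset \([-+]*[0-9.]*\) sec.*/\1/p'
}

# Map a signed offset in seconds to the step to perform next.
next_step() {
    awk -v off="$1" 'BEGIN {
        off += 0                              # force a numeric value, keep the sign
        mag = (off < 0) ? -off : off          # absolute value
        if (mag <= 150)   print "step 10: replace the NTP server"
        else if (off < 0) print "step 8: stop the cluster, wait " mag " s, then step 11"
        else              print "step 8: stop the cluster, then step 11"
    }'
}

line=' 6 Dec 15:16:10 ntpdate[2861453]: step time server 10.79.3.251 offset +2.118107 sec'
next_step "$(parse_offset "$line")"
```

The helpers only report which step applies. Whether the cluster can actually be stopped (step 8) remains a manual decision.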