Updated on 2024-09-23 GMT+08:00

ALM-12062 OMS Parameter Configurations Mismatch with the Cluster Scale

Description

At the top of each hour, the system checks whether the OMS parameter configurations match the cluster scale. If the configurations do not meet the requirements of the cluster scale, the system generates this alarm. The alarm is automatically cleared after the OMS parameter configurations are modified to meet those requirements.

Attribute

Alarm ID    Alarm Severity    Auto Clear
12062       Major             Yes

Parameters

Parameter      Description
Source         Specifies the cluster or system for which the alarm is generated.
ServiceName    Specifies the name of the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

If the parameters configured for the current cluster are lower than the standard required for the cluster scale, jobs may be delayed and service pages may respond slowly. In severe cases, the Agent or OMS process on a cluster node becomes abnormal. As a result, component jobs fail to be submitted and OMS data fails to be synchronized.

Possible Causes

The OMS parameter configurations do not match the cluster scale.

Procedure

Check whether the OMS parameter configurations match the cluster scale.

  1. In the alarm list on FusionInsight Manager, locate the row that contains the alarm, and view the IP address of the host for which the alarm is generated.
  2. Log in to the host where the alarm is generated as user root.
  3. Run the su - omm command to switch to user omm.
  4. Run the vi $BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log command to open the log file and search for an entry containing the following information: Current oms configurations cannot support xx nodes. Here, xx indicates the number of nodes in the cluster.
  5. Optimize the current cluster configuration by following the instructions in Optimizing Manager Configurations Based on the Number of Cluster Nodes.
  6. One hour later, check whether the alarm is cleared.

    • If it is, no further action is required.
    • If it is not, go to 7.
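The log search in step 4 can also be done without opening an editor. The following is a minimal sketch, not part of the product: the function name oms_mismatch_nodes is hypothetical, it assumes the message text appears in the log exactly as quoted in step 4, and it relies on grep -o, which GNU and BSD grep support. It prints the node count from the most recent matching entry and prints nothing if no entry is found.

```shell
# Hypothetical helper (not shipped with MRS): print the node count reported
# by the most recent "cannot support xx nodes" entry in the log.
# Argument 1 (optional): log file path; defaults to the path used in step 4.
oms_mismatch_nodes() {
  log_file="${1:-$BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log}"
  # Keep only the matched message, take the last occurrence,
  # then strip everything except the digits.
  grep -o 'Current oms configurations cannot support [0-9][0-9]* nodes' "$log_file" \
    | tail -n 1 \
    | sed 's/[^0-9]//g'
}
```

Run oms_mismatch_nodes as user omm on the host for which the alarm is generated. The printed number is the node count to use when optimizing the Manager configurations.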

Collect fault information.

  7. On FusionInsight Manager, choose O&M > Log > Download.
  8. Select Controller for Service and click OK.
  9. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
  10. Contact the O&M personnel and send the collected log information.
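The Start Date and End Date for log collection are simple offsets from the alarm generation time. The following sketch computes them on a Linux host. It assumes GNU date (the -d option and the @epoch form), and the function name alarm_window is hypothetical.

```shell
# Hypothetical helper: print the log collection window for an alarm.
# Argument 1: alarm generation time, for example "2024-09-23 10:00:00".
# Assumes GNU date; times are interpreted in the current time zone.
alarm_window() {
  alarm_epoch=$(date -d "$1" +%s) || return 1
  # 600 seconds = 10 minutes before and after the alarm generation time.
  date -d "@$((alarm_epoch - 600))" '+%Y-%m-%d %H:%M:%S'   # Start Date
  date -d "@$((alarm_epoch + 600))" '+%Y-%m-%d %H:%M:%S'   # End Date
}
```

The first line printed is the value for Start Date and the second is the value for End Date.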

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

Optimizing Manager Configurations Based on the Number of Cluster Nodes

  1. Log in to the active Manager node as user omm.
  2. Run the following command to switch the directory:

    cd ${BIGDATA_HOME}/om-server/om/sbin

  3. Run the following command to view the current Manager configurations.

    sh oms_config_info.sh -q

  4. Run the following command to specify the number of nodes in the current cluster.

    Command format: sh oms_config_info.sh -s <number of nodes>

    Example:

    sh oms_config_info.sh -s 1000

    Enter y as prompted.

    The following configurations will be modified:
         Module       Parameter                                    Current      Target
         Controller   controller.Xmx                               4096m    =>  16384m
         Controller   controller.Xms                               1024m    =>  8192m
         Controller   controller.node.heartbeat.error.threshold    30000    =>  60000
         Pms          pms.mem                                      8192m    =>  10240m
    Do you really want to do this operation? (y/n):

    The configurations are updated successfully if the following information is displayed:

    ...
    Operation has been completed. Now restarting OMS server.                  [done]
    Restarted oms server successfully.

    • OMS is automatically restarted during the configuration update process.
    • Clusters with similar numbers of nodes share the same Manager configurations. For example, when the number of nodes changes from 100 to 101, no configuration item needs to be updated.
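The steps above can be grouped into a small wrapper so that the node count is validated before OMS is touched. This is a sketch only: the function names is_node_count and update_oms_for_nodes are hypothetical, and the wrapper uses nothing beyond the two documented invocations of oms_config_info.sh (-q and -s). The -s step still prompts for y/n, and OMS restarts once the update completes.

```shell
# Hypothetical helpers; run as user omm on the active Manager node.

# Succeeds only if the argument is a positive integer.
is_node_count() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) [ "$1" -gt 0 ] ;;
  esac
}

# Show the current Manager configurations, then set the node count.
# The body runs in a subshell so the caller's working directory is unchanged.
update_oms_for_nodes() (
  is_node_count "$1" || {
    echo "usage: update_oms_for_nodes <number of nodes>" >&2
    exit 2
  }
  cd "${BIGDATA_HOME}/om-server/om/sbin" || exit 1
  sh oms_config_info.sh -q          # view the current Manager configurations
  sh oms_config_info.sh -s "$1"     # interactive: enter y as prompted
)
```

For example, update_oms_for_nodes 1000 performs the same operation as the example in step 4.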