ALM-12062 OMS Parameter Configurations Mismatch with the Cluster Scale

Alarm Description

The system checks on the hour, every hour, whether the OMS parameter configurations match the cluster scale. If the OMS parameter configurations do not meet the cluster scale requirements, the system generates this alarm. The alarm is automatically cleared after the OMS parameter configurations are modified accordingly.

Alarm Attributes

Alarm ID	Alarm Severity	Auto Cleared
12062	Major	Yes

Alarm Parameters

Parameter	Description
Source	Specifies the cluster or system for which the alarm is generated.
ServiceName	Specifies the name of the service for which the alarm is generated.
RoleName	Specifies the role for which the alarm is generated.
HostName	Specifies the host for which the alarm is generated.

Impact on the System

If the parameter values configured for the current cluster are lower than those required for the cluster scale, jobs may be delayed and service pages may respond slowly. In severe cases, the Agent or OMS process on a cluster node becomes abnormal. As a result, component jobs cannot be submitted and OMS data cannot be synchronized.

Possible Causes

The OMS parameter configurations do not match the cluster scale.

Handling Procedure

Check whether the OMS parameter configurations match the cluster scale.

Log in to the active management node of the MRS cluster as user root.

Run the following command to switch to user omm:
```
su - omm
```
Run the following command to open the Manager log:
```
vi $BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log
```
Search the log for "Current oms configurations can not support xx nodes", where xx is the number of nodes in the current cluster.
Optimize the current cluster configuration by referring to Optimizing Manager Configurations Based on the Number of Cluster Nodes.
One hour later, check whether the alarm is cleared.
- If yes, no further action is required.
- If no, collect fault information as described in the following steps.
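
The log search in the steps above can also be done without opening the file in vi. The following is a minimal sketch, not part of the product tooling: latest_unsupported_nodes is a hypothetical helper name, and the sketch assumes that xx appears in the log as plain digits.

```shell
# Print the node count from the most recent
# "Current oms configurations can not support xx nodes" message in a log file.
# Assumption: xx is written as plain digits. Prints nothing if no message is found.
latest_unsupported_nodes() {
    grep -oE 'Current oms configurations can not support [0-9]+ nodes' "$1" \
        | tail -n 1 \
        | grep -oE '[0-9]+'
}

# Usage (as user omm on the active management node):
# latest_unsupported_nodes "$BIGDATA_LOG_HOME/controller/scriptlog/modify_manager_param.log"
```

If the helper prints a number, that is the node count the current configuration cannot support, and it is the value to consider when optimizing the Manager configurations.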

Collect fault information.

On FusionInsight Manager, choose O&M > Log > Download.
For Service, select Controller and click OK.
Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and 10 minutes after the alarm generation time, respectively. Then, click Download.
Contact the O&M personnel and send the collected log information.
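
The Start Date and End Date values for log collection can be worked out from the alarm generation time. The following is a rough sketch that assumes GNU date is available on the machine where it is run; log_window is a hypothetical helper name and the timestamp in the usage line is an invented example.

```shell
# Print the start and end of a log collection window around an alarm time.
# $1: alarm generation time, for example "2024-05-01 10:00:00" (invented example)
# $2: margin in minutes (defaults to 10, as in the step above)
# Assumption: GNU date, which accepts -d with a timestamp or "@epoch".
log_window() {
    alarm_epoch=$(date -d "$1" +%s) || return 1
    margin_seconds=$(( ${2:-10} * 60 ))
    date -d "@$((alarm_epoch - margin_seconds))" '+%Y-%m-%d %H:%M:%S'   # Start Date
    date -d "@$((alarm_epoch + margin_seconds))" '+%Y-%m-%d %H:%M:%S'   # End Date
}

# Usage:
# log_window "2024-05-01 10:00:00"
```

The first line printed is the value for Start Date and the second is the value for End Date, both in the local time zone of the shell.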

Alarm Clearance

After the fault is rectified, the system automatically clears this alarm.

Related Information

Optimizing Manager Configurations Based on the Number of Cluster Nodes

Log in to the active management node of the MRS cluster as user root.

Run the following command to switch to user omm:
```
su - omm
```
Run the following command to switch the directory:
```
cd ${BIGDATA_HOME}/om-server/om/sbin
```
Run the following command to view the current Manager configurations:
```
sh oms_config_info.sh -q
```

Run the following command to specify the number of nodes in the current cluster:

Command format:

sh oms_config_info.sh -s <number of nodes>

Example:

sh oms_config_info.sh -s 1000

Enter y as prompted.

The following configurations will be modified:
     Module       Parameter                                    Current      Target
     Controller   controller.Xmx                               4096m    =>  16384m
     Controller   controller.Xms                               1024m    =>  8192m
     Controller   controller.node.heartbeat.error.threshold    30000    =>  60000
     Pms          pms.mem                                      8192m    =>  10240m
Do you really want to do this operation? (y/n):

The configurations are updated successfully if the following information is displayed:

...
Operation has been completed. Now restarting OMS server.                  [done]
Restarted oms server successfully.

OMS is automatically restarted during the configuration update process.
Clusters with similar numbers of nodes have the same Manager configurations. For example, when the number of nodes changes from 100 to 101, no configuration item needs to be updated.
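
Before entering y at the confirmation prompt, it can help to reduce the change preview to one line per parameter, for example to record it in a change ticket. The following is a minimal sketch, assuming each preview row is on its own line with five whitespace-separated fields (Module, Parameter, Current, =>, Target) as in the example output above. summarize_changes is a hypothetical helper and preview.txt is a hypothetical file that holds a copy of the preview text.

```shell
# Read a change preview on stdin and print "parameter: current -> target" per row.
# Assumption: rows look like "Controller  controller.Xmx  4096m  =>  16384m".
# The header line and the confirmation prompt are skipped because they lack "=>"
# as the fourth field.
summarize_changes() {
    awk 'NF == 5 && $4 == "=>" { printf "%s: %s -> %s\n", $2, $3, $5 }'
}

# Usage:
# summarize_changes < preview.txt
```

The helper only reformats text that was already displayed. It does not run oms_config_info.sh or change any configuration.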
