Updated on 2024-09-23 GMT+08:00

ALM-12041 Incorrect Permission on Key Files

Description

Every 5 minutes, the system checks whether the permission, user, and user group information of critical directories and files is normal. This alarm is generated when the information is abnormal.

This alarm is cleared when the information becomes normal.

Attribute

Alarm ID    Alarm Severity    Auto Clear
12041       Major             Yes

Parameters

Name           Meaning
Source         Specifies the cluster or system for which the alarm is generated.
ServiceName    Specifies the service name for which the alarm is generated.
RoleName       Specifies the role name for which the alarm is generated.
HostName       Specifies the object (host ID) for which the alarm is generated.
PathName       Specifies the path or name of the abnormal file.

Impact on the System

System functions are unavailable.

  • If the permission on the okerberos and oldap key files is abnormal, authentication fails and jobs may fail.
  • If the permission on the controller and pms key files is abnormal, the process may be faulty, which may affect the elastic scaling performance.
  • If the permission on key Tomcat files is abnormal, the login and viewing functions of FusionInsight Manager are affected.

Possible Causes

The file permission is abnormal or the file is lost because a user manually modified the permission, user, or user group of the file, or because the system was powered off unexpectedly.

Procedure

Check whether the abnormal file exists and whether the permission on the abnormal file is correct.

  1. On the FusionInsight Manager portal, choose O&M > Alarm > Alarms.
  2. Check the value of HostName to obtain the host name involved in this alarm. Check the value of PathName to obtain the path or name of the abnormal file.
  3. Log in to the node for which the alarm is generated as user root.
  4. Run the ll pathName command to obtain the user, permission, and user group information about the file or directory, where pathName is the path or name of the abnormal file.
  5. Go to the ${BIGDATA_HOME}/om-agent/nodeagent/etc/agent/autocheck directory. Then run the vi keyfile command, search for the name of the abnormal file, and check the expected permission of the file.

    To ensure proper configuration synchronization between the active and standby OMS servers, the files and directories configured in $OMS_RUN_PATH/workspace/ha/module/hasync/plugin/conf/filesync.xml, together with the files and sub-directories under those directories, are monitored in addition to the files and directories listed in keyfile. User omm must have read and write permissions on the files and read and execute permissions on the directories.

  6. Compare the actual permission of the file with the expected permission obtained in 5, and correct the permission, user, and user group information of the file.
  7. Wait an hour and check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 8.

    If the disk partition where the cluster installation directory resides is full, the sed command fails and leaves temporary files in the program installation directory. Users do not have read, write, or execute permissions on these temporary files. If the files are within the monitoring range of this alarm, the system reports that their permissions are abnormal. Perform the preceding alarm handling procedure to clear the alarm, or delete the files directly after confirming that the files with abnormal permissions are temporary. A temporary file left behind by a failed sed command typically has a name that starts with sed followed by random characters.
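The comparison and correction in the steps above can be sketched as a small shell helper. This is a minimal sketch rather than part of the product tooling: the function names show_perm, fix_perm, and owner_access_ok are hypothetical, GNU stat (coreutils) is assumed, and the expected owner, group, and mode must be taken from keyfile. The last function is a simplified check of the owner bits only, loosely modeled on the requirement that user omm can read and write files and read and execute directories.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; expected values come from keyfile.
# Assumes GNU stat (coreutils) for the -c format option.

# Print "owner:group mode" for a path, for example "omm:wheel 600".
show_perm() {
    stat -c '%U:%G %a' "$1"
}

# Bring a path in line with the expected owner:group and octal mode.
# chown and chmod run only when the current value differs.
# Pass the mode without a leading zero (600, not 0600), as stat prints it.
fix_perm() {
    path=$1; owner_group=$2; mode=$3
    if [ "$(stat -c '%U:%G' "$path")" != "$owner_group" ]; then
        chown "$owner_group" "$path" || return 1
    fi
    if [ "$(stat -c '%a' "$path")" != "$mode" ]; then
        chmod "$mode" "$path" || return 1
    fi
    show_perm "$path"
}

# Return 0 when the owner bits grant read+write (6) on a regular file
# or read+execute (5) on a directory. Group and other bits are ignored.
owner_access_ok() {
    path=$1
    mode=$(stat -c '%a' "$path") || return 1
    owner_bits=$(( (0$mode >> 6) & 7 ))   # leading 0 makes the value octal
    if [ -d "$path" ]; then need=5; else need=6; fi
    [ $(( owner_bits & need )) -eq "$need" ]
}
```

A call such as fix_perm pathName owner:group mode, with the three expected values read from keyfile, prints the resulting owner, group, and mode so that they can be compared with the ll output.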

Collect fault information.

  8. On the FusionInsight Manager portal, choose O&M > Log > Download.
  9. Select NodeAgent for Service and click OK.
  10. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 10 minutes before and after the alarm generation time, respectively. Then, click Download.
  11. Contact the O&M personnel and send the collected log information.
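The log collection window described above (10 minutes before and after the alarm generation time) can be worked out with a short shell function. This is a minimal sketch that assumes GNU date; the function name log_window and the sample timestamp are hypothetical and are not part of FusionInsight Manager.

```shell
#!/bin/sh
# Sketch only: log_window is a hypothetical helper name.
# Prints the start and end of the log collection window, that is,
# 10 minutes (600 seconds) before and after the alarm generation time.
# Assumes GNU date (the -d option and @epoch input).
log_window() {
    alarm_time=$1
    fmt='+%Y-%m-%d %H:%M:%S'
    # Convert the alarm time to seconds since the epoch, then shift it.
    epoch=$(date -d "$alarm_time" +%s) || return 1
    echo "Start Date: $(date -d "@$((epoch - 600))" "$fmt")"
    echo "End Date:   $(date -d "@$((epoch + 600))" "$fmt")"
}
```

For an alarm generated at 2024-09-23 10:00:00, log_window "2024-09-23 10:00:00" prints a start of 2024-09-23 09:50:00 and an end of 2024-09-23 10:10:00, interpreted in the local time zone of the shell.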

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

None