Updated on 2024-01-17 GMT+08:00

ALM-24001 Flume Agent Is Abnormal (For MRS 2.x or Earlier)

Description

This alarm is generated if the Flume agent monitoring module detects that the Flume agent process is abnormal.

This alarm is cleared after the Flume agent process recovers.

Attribute

Alarm ID    Alarm Severity    Auto Clear
24001       Minor             Yes

Parameters

Parameter      Description
ServiceName    Specifies the service for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

The affected Flume agent instance does not function properly, and its data transmission tasks are suspended. In real-time data transmission scenarios, data is lost.

Possible Causes

  • The JAVA_HOME directory does not exist, or the Flume agent user does not have execute permission on Java.
  • The permissions of the Flume agent directory are incorrect.

Procedure

  1. Check the Flume agent's configuration file.

    1. Log in to the host where the faulty node resides. Run the following command to switch to user root:

      sudo su - root

    2. Run the cd Flume installation directory/fusioninsight-flume-1.6.0/conf/ command to go to Flume's configuration directory.
    3. Run the cat ENV_VARS command. Check whether the JAVA_HOME directory exists and whether the Flume agent user has execute permission on Java.
      • If yes, go to 2.1.
      • If no, go to 1.4.
    4. Specify the correct JAVA_HOME directory and grant the Flume agent user execute permission on Java. Then go to 2.4.
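
The two checks in step 1 could be scripted roughly as follows. This is a minimal sketch and not part of the MRS tooling: the function name check_java_home is hypothetical, and the sketch assumes the Java binary is located at bin/java under the JAVA_HOME directory. Run it as the Flume agent user so that the execute test reflects that user's permission.

```shell
# check_java_home DIR  (hypothetical helper, not an MRS command)
# Returns 0 if DIR exists and DIR/bin/java is executable by the current user,
# 1 if the directory is missing, 2 if the java binary cannot be executed.
check_java_home() {
    if [ -z "$1" ] || [ ! -d "$1" ]; then
        echo "JAVA_HOME directory does not exist: ${1:-<unset>}"
        return 1
    fi
    # Assumption: the Java binary is at bin/java under JAVA_HOME.
    if [ ! -x "$1/bin/java" ]; then
        echo "No execute permission on: $1/bin/java"
        return 2
    fi
    echo "JAVA_HOME looks usable: $1"
    return 0
}
```

Call the function with the JAVA_HOME value shown in the output of cat ENV_VARS. A return code of 1 or 2 corresponds to the "If no" branch of the check.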

  2. Check the permission of the Flume agent directory.

    1. Log in to the host where the faulty node resides. Run the following command to switch to user root:

      sudo su - root

    2. Run the following command to access the installation directory of the Flume agent:

      cd Flume agent installation directory

    3. Run the ls -al * -R command. Check whether the owner of all files is the Flume agent user.
      • If yes, go to 3.
      • If no, run the chown command and change the owner of the files to the Flume agent user. Then go to 2.4.
    4. Check whether the alarm is cleared.
      • If yes, no further action is required.
      • If no, go to 3.
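
As an alternative to reading the recursive ls output by eye, the ownership check in step 2 could be narrowed down with find. This is a sketch under assumptions: list_wrong_owner is a hypothetical helper name, and the directory and user you pass in are whatever your cluster actually uses for the Flume agent.

```shell
# list_wrong_owner DIR USER  (hypothetical helper, not an MRS command)
# Prints every file and directory under DIR whose owner is not USER.
# Empty output means all entries are owned by USER.
list_wrong_owner() {
    find "$1" ! -user "$2" -print
}
```

If the function prints any paths, those are the entries whose owner needs to be changed with chown (for example, chown -R applied to the installation directory as user root) before you check whether the alarm is cleared.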

  3. Collect fault information.

    1. On MRS Manager, choose System > Export Log.
    2. Contact O&M engineers and send them the collected logs.

Related Information

N/A