ALM-24001 Flume Agent Exception

Updated at: Mar 25, 2021 GMT+08:00

Description

The Flume Agent instance for which the alarm is generated cannot be started. This alarm is generated when the Flume Agent process is faulty (the system checks every 5 seconds) or when Flume Agent fails to start (the system reports the alarm immediately).

This alarm is cleared when the Flume Agent process recovers, Flume Agent starts successfully, and the alarm handling is completed.

Attribute

Alarm ID    Alarm Severity    Automatically Cleared
24001       Major             Yes

Parameters

Name           Meaning
Source         Specifies the cluster for which the alarm is generated.
ServiceName    Specifies the service for which the alarm is generated.
AgentName      Specifies the agent for which the alarm is generated.
RoleName       Specifies the role for which the alarm is generated.
HostName       Specifies the host for which the alarm is generated.

Impact on the System

The Flume Agent instance for which the alarm is generated cannot provide services properly, and its data transmission tasks are temporarily interrupted. Data that is being transmitted in real time is lost.

Possible Causes

  • The JAVA_HOME directory does not exist or the Java permission is incorrect.
  • The Flume Agent directory permission is incorrect.
  • Flume Agent fails to start.

Procedure

Check whether the JAVA_HOME directory exists and whether the Java permission is correct.

  1. Log in to the host where the alarm is generated as user root.
  2. Run the cd <Flume client installation directory>/fusioninsight-flume-1.9.0/conf/ command to go to the Flume configuration directory.
  3. Run the cat ENV_VARS command.
  4. Check whether the JAVA_HOME directory exists.

    • If yes, go to 6.
    • If no, go to 5.

  5. Specify the correct JAVA_HOME directory.
  6. Check whether the user who runs Flume Agent is assigned the Java execution permission.

    • If yes, go to 8.
    • If no, go to 7.

  7. Grant the Java execution permission to the user who runs Flume Agent.
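The JAVA_HOME checks above can be sketched as a pair of shell functions. This is only an illustration under stated assumptions: it assumes that ENV_VARS stores the path on a line of the form JAVA_HOME=/path or export JAVA_HOME=/path, and the function names read_java_home and check_java_home are hypothetical helpers, not part of Flume or FusionInsight.

```shell
#!/usr/bin/env bash
# Sketch only. Assumption: ENV_VARS contains a line such as
#   export JAVA_HOME=/path/to/jdk    or    JAVA_HOME=/path/to/jdk

# Print the last JAVA_HOME value found in the given ENV_VARS file.
read_java_home() {
    local env_file="$1"
    sed -n -E 's/^[[:space:]]*(export[[:space:]]+)?JAVA_HOME=//p' "$env_file" \
        | tail -n 1 | tr -d "\"'"
}

# Return 0 if the directory exists and bin/java is executable by the caller,
# 1 if the directory is missing, 2 if bin/java cannot be executed.
check_java_home() {
    local java_home="$1"
    if [ ! -d "$java_home" ]; then
        echo "JAVA_HOME directory does not exist: $java_home"
        return 1
    fi
    if [ ! -x "$java_home/bin/java" ]; then
        echo "No execute permission on $java_home/bin/java"
        return 2
    fi
    echo "JAVA_HOME looks usable: $java_home"
}

# Example (hypothetical path):
#   check_java_home "$(read_java_home /path/to/conf/ENV_VARS)"
```

Because the -x test is evaluated for the calling user, the check is most meaningful when it is run as the user who runs Flume Agent rather than as root.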

Check the directory permission of the Flume Agent.

  8. Log in to the host where the alarm is generated as user root.
  9. Run the following command to switch to the Flume Agent installation directory:

    cd <Flume Agent installation directory>

  10. Run the ls -al * -R command to check whether all files are owned by the user who runs the Flume Agent.

    • If yes, go to 11.
    • If no, run the chown command to change the file owner to the user who runs the Flume Agent.
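As an alternative to scanning the ls output by eye, the ownership check above can be sketched with find, which can list only the entries that are not owned by a given user. The helper name list_wrong_owner and the FLUME_USER and FLUME_AGENT_DIR variables are hypothetical placeholders for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. list_wrong_owner is a hypothetical helper name.

# Print every file or directory under "$1" that is NOT owned by user "$2".
# Empty output means that all entries have the expected owner.
list_wrong_owner() {
    local dir="$1" user="$2"
    find "$dir" ! -user "$user" -print
}

# Example (run as root; both variables are placeholders you must set):
#   list_wrong_owner "$FLUME_AGENT_DIR" "$FLUME_USER"
#   # If anything is listed, change the owner recursively:
#   chown -R "$FLUME_USER" "$FLUME_AGENT_DIR"
```

If the output is empty, no ownership change is needed.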

Check the Flume Agent configuration.

  11. Check whether the type of Flume Source is spooldir or Taildir.

    • If yes, go to 12.
    • If no, go to 16.

  12. Check whether the data monitoring directory exists.

    • If yes, go to 14.
    • If no, go to 13.

  13. Specify a correct data monitoring directory.
  14. Check whether the Flume Agent running user has the read, write, and execute permissions on the monitoring directory.

    • If yes, go to 16.
    • If no, go to 15.

  15. Grant the read, write, and execute permissions on the monitoring directory to the Flume Agent running user.
  16. Check whether the components connected to Flume Sink are in security mode.

    • If yes, go to 17.
    • If no, go to 22.

  17. Check whether the configuration file contains the keytab authentication path.

    • If yes, go to 19.
    • If no, go to 18.

  18. Specify the correct keytab directory, and go to 20.
  19. Check whether the Flume Agent running user has the permission to access the keytab authentication file.

    • If yes, go to 21.
    • If no, go to 20.

  20. Grant the read permission on the keytab file in the authentication path to the Flume Agent running user, and restart the Flume Agent process.
  21. Check whether the alarm is cleared.

    • If yes, no further action is required.
    • If no, go to 22.
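The configuration checks above can be sketched as three small shell functions. This is a hedged illustration, not the product's own tooling: it assumes the standard Flume property layout (<agent>.sources.<name>.type = ...), and the function names and example paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only. Function names and example paths are hypothetical.

# Print source definitions whose type is spooldir or TAILDIR.
# Returns 1 (grep's "no match" status) if the file has no such source.
find_dir_sources() {
    local conf_file="$1"
    grep -iE '\.sources\.[^.]+\.type[[:space:]]*=[[:space:]]*(spooldir|taildir)' "$conf_file"
}

# Return 0 if the monitoring directory exists and the calling user has
# read, write, and execute permissions on it.
check_monitor_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "Monitoring directory does not exist: $dir"
        return 1
    fi
    if [ ! -r "$dir" ] || [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "Missing read, write, or execute permission on: $dir"
        return 2
    fi
    echo "Monitoring directory is usable: $dir"
}

# Return 0 if the keytab file exists and is readable by the calling user.
check_keytab() {
    local keytab="$1"
    if [ ! -f "$keytab" ]; then
        echo "Keytab file not found: $keytab"
        return 1
    fi
    if [ ! -r "$keytab" ]; then
        echo "Keytab file is not readable: $keytab"
        return 2
    fi
    echo "Keytab file is readable: $keytab"
}

# Example (placeholders; run as the user who runs Flume Agent):
#   find_dir_sources  /path/to/conf/properties.properties
#   check_monitor_dir /path/to/monitored/dir
#   check_keytab      /path/to/user.keytab
```

The permission tests are evaluated for the calling user, so running them as root can hide a permission that the Flume Agent running user lacks.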

Collect fault information.

  22. On FusionInsight Manager, choose O&M > Log > Download.
  23. Select Flume in the required cluster from the Service drop-down list.
  24. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
  25. Contact the O&M personnel and send the collected logs.
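To work out the log collection window described above, the alarm generation time can be shifted by one hour in each direction. The following is a minimal sketch that assumes GNU date is available; log_window is a hypothetical helper name, and the timestamp format is only an example.

```shell
#!/usr/bin/env bash
# Sketch only. Assumes GNU date (the -d option); log_window is hypothetical.

# Print the Start Date and End Date for log collection:
# 1 hour before and 1 hour after the given alarm generation time.
log_window() {
    local alarm_time="$1" epoch
    epoch=$(date -d "$alarm_time" +%s) || return 1
    echo "Start Date: $(date -d "@$((epoch - 3600))" '+%Y-%m-%d %H:%M:%S')"
    echo "End Date:   $(date -d "@$((epoch + 3600))" '+%Y-%m-%d %H:%M:%S')"
}

# Example:
#   log_window '2021-03-25 10:00:00'
```

Converting to epoch seconds first keeps the arithmetic independent of how the date string itself is parsed.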

Alarm Clearing

After the fault is rectified, the system automatically clears this alarm.

Related Information

None
