ALM-24004 Exception Occurs When Flume Reads Data

Updated at: Sep 02, 2021 GMT+08:00

Description

The alarm module monitors the status of Flume Source. This alarm is generated as soon as the time during which Source fails to read data exceeds the threshold.

The default threshold is 0, indicating that the threshold is disabled. You can change the threshold by modifying the properties.properties file. Specifically, modify the NoDatatime parameter of the required source.
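As a rough illustration of where this threshold lives, the Python sketch below scans the text of a properties.properties file and reports every NoDatatime value it finds. It assumes the usual Flume key layout <agent>.sources.<source>.<property>. The function names are hypothetical helpers, not part of any Flume or MRS API, and this section does not state the unit of the threshold, so the sketch treats it as a plain integer.

```python
import re

# Assumption: keys follow the usual Flume layout
#   <agent>.sources.<source>.NoDatatime = <integer>
_NODATATIME_KEY = re.compile(r"^([^.\s]+)\.sources\.([^.\s]+)\.NoDatatime$")


def read_nodatatime(properties_text):
    """Return {(agent, source): threshold} for every NoDatatime entry found."""
    thresholds = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        match = _NODATATIME_KEY.match(key.strip())
        if match:
            thresholds[(match.group(1), match.group(2))] = int(value.strip())
    return thresholds


def alarm_enabled(threshold):
    """A threshold of 0 (the default) means the check is disabled."""
    return threshold > 0
```

For example, with a hypothetical agent named server and a source named spool_src, text containing the line server.sources.spool_src.NoDatatime = 120 yields {("server", "spool_src"): 120}, and alarm_enabled(120) is True, whereas the default of 0 leaves the check disabled.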

The alarm is cleared when Source reads the data and the alarm handling is complete.

Attribute

Alarm ID    Alarm Severity    Auto Clear
24004       Major             Yes

Parameters

Name             Meaning
Source           Specifies the cluster for which the alarm is generated.
ServiceName      Specifies the name of the service for which the alarm is generated.
HostName         Specifies the name of the host for which the alarm is generated.
AgentId          Specifies the ID of the agent for which the alarm is generated.
ComponentType    Specifies the component type for which the alarm is generated.
ComponentName    Specifies the component name for which the alarm is generated.

Impact on the System

If data is found in the data source and Flume Source continuously fails to read data, the data collection is stopped.

Possible Causes

  • Flume Source is faulty, so data cannot be sent.
  • The network is faulty, so the data cannot be sent.

Procedure

Check whether Flume Source is faulty.

  1. Open the properties.properties configuration file on the local PC, search for the keyword type = spooldir, and check whether the Flume source type is spoolDir.

    • If yes, go to 2.
    • If no, go to 3.

  2. View the spoolDir directory to check whether all files have been transferred.

      The monitoring directory of spoolDir is specified by the .spoolDir parameter in the properties.properties configuration file. Once every file in the monitoring directory has been transferred, all file names in that directory end with the .COMPLETED extension.

    • If yes, no further action is required.
    • If no, go to 5.

  3. Open the properties.properties configuration file on the local PC, search for org.apache.flume.source.kafka.KafkaSource in the file, and check whether the Flume source type is Kafka.

    • If yes, go to 4.
    • If no, go to 7.

  4. Check whether all data in the topic configured for Kafka Source has been consumed.

    • If yes, no further action is required.
    • If no, go to 5.

  5. On the FusionInsight Manager portal, choose Cluster > Name of the desired cluster > Services > Flume > Instance.
  6. Go to the Flume instance page of the faulty node to check whether the indicator Source Speed Metrics in the alarm is 0.

    • If yes, go to 11.
    • If no, go to 7.
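The source-type lookups and the spoolDir check above can be sketched in Python as follows. This is a minimal sketch, not an MRS tool: it assumes the usual Flume key layout <agent>.sources.<source>.type, and the helper names source_types, pending_files, and demo are hypothetical. The .COMPLETED suffix and the Kafka class name are taken from the steps above.

```python
import os
import re
import tempfile

# Suffix carried by fully transferred files, as described in the procedure.
COMPLETED_SUFFIX = ".COMPLETED"
KAFKA_SOURCE_CLASS = "org.apache.flume.source.kafka.KafkaSource"
# Assumption: source types are declared as <agent>.sources.<source>.type
_TYPE_KEY = re.compile(r"^([^.\s]+)\.sources\.([^.\s]+)\.type$")


def source_types(properties_text):
    """Map each source name to 'kafka' or to its lower-cased type value."""
    kinds = {}
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        match = _TYPE_KEY.match(key.strip())
        if not sep or not match:
            continue
        value = value.strip()
        kinds[match.group(2)] = (
            "kafka" if value == KAFKA_SOURCE_CLASS else value.lower()
        )
    return kinds


def pending_files(spool_dir):
    """Return regular files in spool_dir that lack the completed suffix."""
    pending = []
    for entry in sorted(os.listdir(spool_dir)):
        path = os.path.join(spool_dir, entry)
        if os.path.isfile(path) and not entry.endswith(COMPLETED_SUFFIX):
            pending.append(entry)
    return pending


def demo():
    """Build a throwaway spool directory and list what is still pending."""
    with tempfile.TemporaryDirectory() as spool_dir:
        for name in ("a.log.COMPLETED", "b.log"):
            open(os.path.join(spool_dir, name), "w").close()
        return pending_files(spool_dir)
```

An empty list from pending_files corresponds to the "all files are already transferred" branch. Any remaining names indicate files that Source has not finished reading.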

Check the network connection between the faulty node and the node that corresponds to the Flume Source IP address.

  7. Open the properties.properties configuration file on the local PC, search for type = avro in the file, and check whether the Flume source type is Avro.

    • If yes, go to 8.
    • If no, go to 11.

  8. Log in to the faulty node as user root, and run the ping <IP address of the Flume source> command to check whether the peer host can be pinged successfully.

    • If yes, go to 11.
    • If no, go to 9.

  9. Contact the network administrator to restore the network.
  10. In the alarm list, check whether the alarm is cleared after a period.

    • If yes, no further action is required.
    • If no, go to 11.
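If the ping check above needs to be scripted, a minimal Python sketch might look like the following. It assumes a Linux node with the iputils ping utility, whose -c and -W options set the probe count and the per-reply timeout in seconds. The function names and the runner parameter are hypothetical and exist only so the logic can be exercised without network access.

```python
import subprocess


def build_ping_command(host, count=3, timeout_s=2):
    """Build a Linux (iputils) ping command with a probe count and reply timeout."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def host_reachable(host, count=3, timeout_s=2, runner=subprocess.run):
    """Return True if ping exits with status 0, meaning a reply was received."""
    result = runner(
        build_ping_command(host, count, timeout_s),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0
```

Calling host_reachable with the IP address of the Flume source returns False when no reply arrives, which corresponds to the branch that escalates to the network administrator.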

Collect the fault information.

  11. On the FusionInsight Manager portal, choose O&M > Log > Download.
  12. Expand the Service drop-down list, and select Flume for the target cluster.
  13. Click the icon in the upper right corner, and set Start Date and End Date for log collection to 1 hour before and 1 hour after the alarm generation time, respectively. Then, click Download.
  14. Contact O&M personnel and send the collected logs.
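The log collection window described above is simple date arithmetic. The Python sketch below computes it from an alarm generation time. The function name log_window and the sample timestamp are illustrative only.

```python
from datetime import datetime, timedelta


def log_window(alarm_time, margin=timedelta(hours=1)):
    """Return (start, end): one margin before and one margin after the alarm."""
    return alarm_time - margin, alarm_time + margin


if __name__ == "__main__":
    # Hypothetical alarm generation time, used only for illustration.
    start, end = log_window(datetime(2021, 9, 2, 10, 30))
    print("Start Date:", start)
    print("End Date:  ", end)
```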

Alarm Clearing

After the fault that triggers the alarm is rectified, the alarm is automatically cleared.

Related Information

None
