Updated on 2022-02-22 GMT+08:00

Flume Data Collection Is Slow

Symptom

After Flume starts, data collection takes a long time.

Cause Analysis

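  1. The Flume heap memory is set too small, so the Flume process spends most of its time in garbage collection (GC). Check the Flume run log for repeated Full GC entries. In the example below, a Full GC takes about 2.6 seconds, yet the CMS old generation is still almost full afterwards (3843458K of 3853568K), so almost no memory is reclaimed.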
    2019-02-26T13:06:20.666+0800: 1085673.512: [Full GC:[CMS: 3849339k->3843458K(3853568K), 2.5817610 secs] 4153654K->3843458K(4160256K), [CMS Perm : 27335K->27335K(45592K),2.5820080 SECS] [Times: user=2.63, sys0.00, real=2.59 secs]
  2. The deletePolicy parameter of the Spooldir source is set to immediate.

Solution

  1. Increase the maximum heap size of the Flume JVM (-Xmx) so that the process no longer runs Full GC continuously.
  2. Change the deletePolicy parameter of the Spooldir source to never.
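
As an illustration of step 2, a Spooldir source definition in the Flume agent properties file might look like the sketch below. The agent, source, and channel names (agent1, src1, ch1) and the spool directory path are hypothetical placeholders, not values from this article. Only the deletePolicy line reflects the change described above.

```properties
# Hypothetical agent, source, and channel names; substitute your own.
agent1.sources = src1
agent1.channels = ch1

agent1.sources.src1.type = spooldir
# Hypothetical directory that Flume watches for new files.
agent1.sources.src1.spoolDir = /var/log/flume/spool
agent1.sources.src1.channels = ch1

# Keep completed files instead of deleting them right after ingestion.
# Valid values are "never" and "immediate".
agent1.sources.src1.deletePolicy = never
```

For step 1, a stock Apache Flume installation usually takes its JVM options from JAVA_OPTS in conf/flume-env.sh, for example export JAVA_OPTS="-Xms4g -Xmx8g". The sizes here are assumptions and should be chosen based on the memory available on the host and the heap usage shown in the GC log. In a managed cluster, the equivalent setting may be exposed through the cluster management console instead.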