Updated: 2023-04-28 GMT+08:00

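How to Configure the Event Queue Size When the Event Queue Overflows

Question

If the following messages appear in the Driver log, the event queue has overflowed. How do I configure the event queue size when this happens?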

  • Common applications
    Dropping SparkListenerEvent because no remaining room in event queue. 
    This likely means one of the SparkListeners is too slow and cannot keep
    up with the rate at which tasks are being started by the scheduler.
  • Spark Streaming applications
    Dropping StreamingListenerEvent because no remaining room in event queue.
    This likely means one of the StreamingListeners is too slow and cannot keep
    up with the rate at which events are being started by the scheduler.
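
As a quick check for whether a saved Driver log contains either warning, a minimal Python sketch follows. It assumes each warning appears on a single line in the real log file (the quoted text above is wrapped for display), and the function name `count_overflow_warnings` is hypothetical.

```python
import re

# Matches the start of the overflow warnings quoted above for both application types.
OVERFLOW_PATTERN = re.compile(
    r"Dropping (SparkListenerEvent|StreamingListenerEvent) "
    r"because no remaining room in event queue"
)


def count_overflow_warnings(log_lines):
    """Count overflow warnings per event type in an iterable of log lines."""
    counts = {"SparkListenerEvent": 0, "StreamingListenerEvent": 0}
    for line in log_lines:
        match = OVERFLOW_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

A non-zero count for either key indicates that events were dropped at least once during the run.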

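Answer

  1. Stop the application. In the Spark configuration file “spark-defaults.conf”, set “spark.event.listener.logEnable” to “true” and set “spark.eventQueue.size” to 10000000 (10 million). To control how often the log is printed (by default, one log entry every 1000 milliseconds), adjust “spark.event.listener.logRate” as needed. The unit of this configuration item is milliseconds.
  2. Start the application. Log entries like the following appear, showing the consumer rate, the producer rate, the number of events currently in the queue, and the maximum number of events the queue has held.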
    INFO LiveListenerBus: [SparkListenerBus]:16044 events are consumed in 5000 ms.
    INFO LiveListenerBus: [SparkListenerBus]:51381 events are produced in 5000 ms, eventQueue still has 86417 events, MaxSize: 171764.
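  3. Based on the MaxSize value in the log (the maximum number of events the queue has held), set “spark.eventQueue.size” in “spark-defaults.conf” to a suitable queue size. For example, if the maximum number of events in the queue is 250000, a suitable queue size is 300000.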
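
The sizing in step 3 can be sketched in Python as follows. This is only an illustration: the 20% headroom and the rounding step of 10000 are assumptions chosen so that the example above (250000 → 300000) is reproduced, and the function names are hypothetical. The sketch reads the `MaxSize` field from log lines in the format shown in step 2.

```python
import re

# Extracts the MaxSize field from LiveListenerBus log lines like those in step 2.
MAX_SIZE_PATTERN = re.compile(r"MaxSize:\s*(\d+)")


def observed_max_size(log_lines):
    """Return the largest MaxSize value found in the log lines, or None if absent."""
    values = [
        int(match.group(1))
        for line in log_lines
        for match in MAX_SIZE_PATTERN.finditer(line)
    ]
    return max(values) if values else None


def suggest_queue_size(max_size, headroom_percent=20, step=10000):
    """Add headroom to the observed maximum and round up to a multiple of step.

    headroom_percent and step are assumptions for illustration, not documented values.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    scaled = max_size * (100 + headroom_percent)  # value in hundredths
    unit = 100 * step
    return -(-scaled // unit) * step  # ceiling division, then back to events


if __name__ == "__main__":
    sample = [
        "INFO LiveListenerBus: [SparkListenerBus]:51381 events are produced in 5000 ms, "
        "eventQueue still has 86417 events, MaxSize: 171764.",
    ]
    size = suggest_queue_size(observed_max_size(sample))
    # Print a line in the whitespace-separated form used by spark-defaults.conf.
    print(f"spark.eventQueue.size {size}")
```

Because the peak depends on the workload, it is worth collecting the log over a representative run before settling on a value.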