
Why Does Kafka Fail to Receive the Data Written Back by Spark Streaming?

Updated on 2022-09-14 GMT+08:00

Question

When a running Spark Streaming task writes data back to Kafka, Kafka fails to receive the data, and the Kafka logs contain the following error information:

2016-03-02 17:46:19,017 | INFO | [kafka-network-thread-21005-1] | Closing socket connection to /10.91.8.208 due to invalid request: Request of length
122371301 is not valid, it is larger than the maximum size of 104857600 bytes. | kafka.network.Processor (Logging.scala:68)
2016-03-02 17:46:19,155 | INFO | [kafka-network-thread-21005-2] | Closing socket connection to /10.91.8.208. | kafka.network.Processor (Logging.scala:68)
2016-03-02 17:46:19,270 | INFO | [kafka-network-thread-21005-0] | Closing socket connection to /10.91.8.208 due to invalid request:
Request of length 122371301 is not valid, it is larger than the maximum size of 104857600 bytes. | kafka.network.Processor (Logging.scala:68)
2016-03-02 17:46:19,513 | INFO | [kafka-network-thread-21005-1] | Closing socket connection to /10.91.8.208 due to invalid request:
Request of length 122371301 is not valid, it is larger than the maximum size of 104857600 bytes. | kafka.network.Processor (Logging.scala:68)
2016-03-02 17:46:19,763 | INFO | [kafka-network-thread-21005-2] | Closing socket connection to /10.91.8.208 due to invalid request:
Request of length 122371301 is not valid, it is larger than the maximum size of 104857600 bytes. | kafka.network.Processor (Logging.scala:68)

Answer

As shown in Figure 1, the logic defined in the Spark Streaming application is as follows: read data from Kafka > execute processing > write the result data back to Kafka.
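
A minimal sketch of this read-process-write-back pipeline is shown below, assuming the Kafka 0.10 direct-stream integration; the topic names (input-topic, output-topic), the broker address broker1:21005, and the group ID are hypothetical placeholders:

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.{StringDeserializer, StringSerializer}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.KafkaUtils

object KafkaWriteBack {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaWriteBack")
    // 60s batch interval: each batch accumulates 60s of input before writing it back.
    val ssc = new StreamingContext(conf, Seconds(60))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:21005",   // hypothetical broker address
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "spark-streaming-demo",     // hypothetical consumer group
      "auto.offset.reset" -> "latest"
    )

    // Read data from Kafka.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Set("input-topic"), kafkaParams))

    // Execute processing (a trivial transformation here), then write the results back.
    stream.map(_.value.toUpperCase).foreachRDD { rdd =>
      rdd.foreachPartition { partition =>
        // Create one producer per partition task rather than serializing it from the driver.
        val props = new java.util.Properties()
        props.put("bootstrap.servers", "broker1:21005")
        props.put("key.serializer", classOf[StringSerializer].getName)
        props.put("value.serializer", classOf[StringSerializer].getName)
        val producer = new KafkaProducer[String, String](props)
        partition.foreach(v => producer.send(new ProducerRecord[String, String]("output-topic", v)))
        producer.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}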

Imagine that data is written into Kafka at a rate of 10 MB/s and that the interval (defined in Spark Streaming) between write-back operations is 60s. A total of about 600 MB of data then needs to be written back into Kafka in a single request. Because Kafka by default accepts at most 104857600 bytes (100 MB) per request, as the preceding log shows, the size of the written-back data exceeds the threshold and the error is reported.
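
The mismatch can be verified with simple arithmetic; all values below are taken from the scenario above:

// Back-of-the-envelope check: per-batch write-back size vs. Kafka's request size limit.
val rateMBps = 10                                    // input rate: 10 MB/s
val batchIntervalSec = 60                            // Spark Streaming batch interval: 60s
val batchSizeMB = rateMBps * batchIntervalSec        // 600 MB written back per batch
val maxRequestMB = 104857600 / (1024 * 1024)         // 100 MB: default socket.request.max.bytes
println(batchSizeMB > maxRequestMB)                  // true: the request is rejected as invalid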

Figure 1 Scenarios

Troubleshooting solution:

Method 1: In Spark Streaming, reduce the interval between write-back operations so that the amount of data written back at a time does not exceed the threshold defined by Kafka. The recommended interval is 5–10 seconds, as sketched below.
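
For example, with the 10 MB/s input rate from the scenario above, a 5s batch interval yields roughly 50 MB per batch, well below the 100 MB default limit (the application name is a placeholder):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Method 1: a 5s batch interval keeps each write-back at about 10 MB/s * 5s = 50 MB.
val conf = new SparkConf().setAppName("KafkaWriteBack")
val ssc = new StreamingContext(conf, Seconds(5))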

Method 2: Increase the threshold defined in Kafka. It is advisable to do this by adjusting the socket.request.max.bytes parameter of the Kafka service on MRS Manager so that the threshold is larger than the largest amount of data written back at a time.
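
For example, a value sized above the 600 MB batch from the scenario above might look as follows; the exact value is an illustrative assumption and must be chosen for your own workload:

# Kafka service configuration (set via MRS Manager > Kafka > Configurations)
socket.request.max.bytes=734003200    # 700 MB, larger than the 600 MB write-back batch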
