Why Is the "Class Does Not Exist" Error Reported While the SparkStreamingKafka Project Is Running?
Question
When the KafkaWordCount task (org.apache.spark.examples.streaming.KafkaWordCount) is submitted by running the spark-submit script, the log file shows that a Kafka-related class does not exist. The KafkaWordCount sample is provided by the Spark open-source community.
Answer
When Spark is deployed, the following JAR files are saved in the $SPARK_HOME/jars/streamingClient directory on the client and the /opt/Bigdata/MRS/FusionInsight-Spark-2.2.1/spark/jars/streamingClient directory on the server.
- kafka-clients-0.8.2.1.jar
- kafka_2.10-0.8.2.1.jar
- spark-streaming-kafka_2.10-1.5.1.jar
Because $SPARK_HOME/lib/streamingClient/* is not added to the classpath by default, you must configure it manually.
When submitting and running the application, add the following parameter to the command:
--jars $SPARK_CLIENT_HOME/jars/streamingClient/kafka-clients-0.8.2.1.jar,$SPARK_CLIENT_HOME/jars/streamingClient/kafka_2.10-0.8.2.1.jar,$SPARK_CLIENT_HOME/jars/streamingClient/spark-streaming-kafka_2.10-1.5.1.jar
You can use the preceding parameter when submitting both self-developed applications and sample projects.
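As a minimal sketch of how the --jars value above can be assembled, the snippet below builds the comma-separated list and prints the resulting spark-submit command for review instead of running it. The default for SPARK_CLIENT_HOME, the application class com.example.MyStreamingApp, and the file my-streaming-app.jar are hypothetical placeholders, not names from this documentation.

```shell
# Hypothetical default; used only if SPARK_CLIENT_HOME is not already set.
SPARK_CLIENT_HOME="${SPARK_CLIENT_HOME:-/opt/client/Spark/spark}"
STREAMING_DIR="$SPARK_CLIENT_HOME/jars/streamingClient"

# Build the comma-separated list expected by --jars (no spaces between entries).
JARS="$STREAMING_DIR/kafka-clients-0.8.2.1.jar"
JARS="$JARS,$STREAMING_DIR/kafka_2.10-0.8.2.1.jar"
JARS="$JARS,$STREAMING_DIR/spark-streaming-kafka_2.10-1.5.1.jar"

# Hypothetical self-developed application; replace with your own class and JAR.
APP_CLASS="com.example.MyStreamingApp"
APP_JAR="my-streaming-app.jar"

# Print the command so it can be checked before it is actually run.
echo "spark-submit --class $APP_CLASS --jars $JARS $APP_JAR"
```

Keeping the list in a variable makes it easier to spot a misspelled file name, since each JAR appears on its own line.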
To submit sample projects provided by the Spark open-source community, such as KafkaWordCount, you need to add other parameters in addition to --jars. Otherwise, a ClassNotFoundException error occurs. The configurations in yarn-client and yarn-cluster modes are as follows:
- yarn-client mode:
In addition to specifying --jars, add the path of the client dependency package (for example, $SPARK_HOME/lib/streamingClient/*) to the spark.driver.extraClassPath parameter in the spark-defaults.conf configuration file on the client.
- yarn-cluster mode:
In addition to specifying --jars, apply any one of the following configurations:
- In the configuration file spark-defaults.conf on the client, add the path of the server dependency package, for example /opt/huawei/Bigdata/FusionInsight/spark/spark/lib/streamingClient/*, to the spark.yarn.cluster.driver.extraClassPath parameter.
- Delete the spark-examples_2.10-1.5.1.jar package from each server node.
- In the spark-defaults.conf configuration file on the client, set the spark.driver.userClassPathFirst parameter to true (add the parameter if it does not exist).
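The configuration entries described above might look like the following sketch. It writes candidate lines to a scratch file for review rather than editing spark-defaults.conf directly. The scratch file name and the default for SPARK_HOME are hypothetical, and the sketch assumes that neither classpath parameter is already set. If one is, append the new path to its existing value instead of adding a second line.

```shell
# Hypothetical default; used only if SPARK_HOME is not already set.
SPARK_HOME="${SPARK_HOME:-/opt/client/Spark/spark}"
# Scratch file (hypothetical name); merge its lines into spark-defaults.conf by hand.
OUT="./spark-defaults.additions.conf"

# The unquoted heredoc expands $SPARK_HOME now; the trailing * is kept literally.
cat > "$OUT" <<EOF
# yarn-client mode: client dependency directory on the driver classpath
spark.driver.extraClassPath $SPARK_HOME/lib/streamingClient/*

# yarn-cluster mode, first option: server dependency directory
spark.yarn.cluster.driver.extraClassPath /opt/huawei/Bigdata/FusionInsight/spark/spark/lib/streamingClient/*

# yarn-cluster mode, alternative option (uncomment instead of the line above)
# spark.driver.userClassPathFirst true
EOF

# Show the result for review.
cat "$OUT"
```

The yarn-cluster options are alternatives, so only one of them needs to be active at a time.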