How Do I Read Uploaded Files for a Spark Jar Job?
You can use SparkFiles to read a file submitted with --file from a local path: SparkFiles.get("Name of the uploaded file").
- The file path obtained in the Driver differs from the path obtained in the Executor, and the path obtained by the Driver cannot be passed to the Executor (see the sketch after this list).
- In the Executor, you must call SparkFiles.get("filename") again to obtain the file path.
- The SparkFiles.get() method can be called only after Spark is initialized.
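To make the first note concrete, here is a minimal sketch of the wrong and right patterns, assuming a file named test was uploaded with --file; the object name PathScopeSketch is illustrative:

package main.java

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession
import scala.io.Source

object PathScopeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("PathScopeSketch").getOrCreate()
    val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3))

    // WRONG: resolving the path on the Driver and capturing it in the closure.
    // driverPath points to a Driver-local directory that Executors do not share.
    val driverPath = SparkFiles.get("test")
    // rdd.map(_ => Source.fromFile(driverPath).mkString).collect()  // would fail on Executors

    // RIGHT: resolve the path inside the task, so each Executor looks up
    // its own local copy of the distributed file.
    rdd.map(_ => Source.fromFile(SparkFiles.get("test")).mkString).collect()

    spark.stop()
  }
}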
Figure 1 Adding other dependencies
The Scala code is as follows:
package main.java

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession
import scala.io.Source

object DliTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("SparkTest")
      .getOrCreate()

    // Driver: obtains the uploaded file.
    println(SparkFiles.get("test"))

    spark.sparkContext.parallelize(Array(1, 2, 3, 4))
      // Executor: obtains the uploaded file.
      .map(_ => println(SparkFiles.get("test")))
      .map(_ => println(Source.fromFile(SparkFiles.get("test")).mkString))
      .collect()
  }
}
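For local testing outside DLI, the same read path can be exercised with SparkContext.addFile, the Spark API for distributing files to all nodes (assuming the --file upload behaves equivalently). A minimal sketch, using a hypothetical local file /tmp/test:

import org.apache.spark.SparkFiles
import org.apache.spark.sql.SparkSession
import scala.io.Source

object LocalFileTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("LocalFileTest")
      .master("local[*]") // local testing only
      .getOrCreate()

    // Distribute the file to every node; stands in for uploading it with --file.
    spark.sparkContext.addFile("/tmp/test") // hypothetical path

    // Driver: resolve the local copy of the distributed file.
    println(SparkFiles.get("test"))

    // Executor: resolve and read the file inside the task closure.
    spark.sparkContext.parallelize(Seq(1, 2, 3))
      .map(_ => Source.fromFile(SparkFiles.get("test")).mkString)
      .collect()
      .foreach(println)

    spark.stop()
  }
}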
Parent topic: Spark Job Development