Writing and Running the Spark Program in the Local Windows Environment
Scenario
After application code development is complete, you can run the application in the Windows environment. In IDEA, the procedure for running an application is the same whether it was developed in Scala or Java.
- In the Windows environment, only the sample code for accessing Spark SQL using JDBC is provided (see the sketch after these notes for the general shape of such access).
- Ensure that Maven is configured to use the SDK mirror repository on the Huawei mirror site. For details, see Configuring Huawei Open-Source Mirrors.
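For orientation, the following is a minimal sketch of JDBC-based access to Spark SQL, not the sample project's actual code: it uses the standard Hive JDBC driver, and the host, port, and table name are placeholder assumptions to be replaced with your cluster's values (a secured cluster would additionally require authentication parameters in the URL).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SparkSqlJdbcSketch {
    public static void main(String[] args) throws Exception {
        // Standard Hive JDBC driver, also used to reach Spark's JDBCServer.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Placeholder URL: substitute your cluster's JDBCServer host and port.
        String url = "jdbc:hive2://<jdbcserver-host>:<port>/default";

        try (Connection conn = DriverManager.getConnection(url, "", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT * FROM CHILD")) {
            while (rs.next()) {
                // Print each row in the same name,age format as the data file.
                System.out.println(rs.getString(1) + "," + rs.getInt(2));
            }
        }
    }
}
```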
Procedure
- Obtain the sample code.
Download the Maven project source code and configuration files of the sample project. For details, see Obtaining Sample Projects.
Import the sample project into IDEA.
- Obtain configuration files.
Obtain the configuration files from the cluster client: download the hive-site.xml and spark-defaults.conf files from $SPARK_HOME/conf to a local directory.
- Upload data to HDFS.
- Create a data text file on Linux and save the following data to the data file:
Miranda,32
Karlie,23
Candice,27
- On the HDFS client running the Linux OS, run the hadoop fs -mkdir /data command (the hdfs dfs command works the same way) to create a directory.
- On the HDFS client running the Linux OS, run the hadoop fs -put data /data command to upload the data file. (A programmatic equivalent is sketched after this list.)
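The two shell commands above can also be expressed through the HDFS Java API. The sketch below is an illustrative equivalent, assuming the cluster client's core-site.xml and hdfs-site.xml are on the application classpath so that FileSystem.get resolves to the cluster's HDFS; "data" is the local file created in the first step.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadDataSketch {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml/hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            // Equivalent of: hadoop fs -mkdir /data
            fs.mkdirs(new Path("/data"));
            // Equivalent of: hadoop fs -put data /data
            fs.copyFromLocalFile(new Path("data"), new Path("/data"));
        }
    }
}
```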
- Configure related parameters in the sample code.
Change the SQL statement for loading data to LOAD DATA INPATH 'hdfs:/data/data' INTO TABLE CHILD (the sketch below shows how such a statement might be issued over JDBC).
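To illustrate where this statement fits, the sketch below issues it over the same kind of JDBC connection shown earlier. The table definition and connection URL are assumptions for illustration only; the sample project defines its own table schema and connection logic.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadChildDataSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Placeholder URL: substitute your cluster's JDBCServer host and port.
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://<jdbcserver-host>:<port>/default", "", "");
             Statement stmt = conn.createStatement()) {
            // Illustrative table definition matching the name,age data file.
            stmt.execute("CREATE TABLE IF NOT EXISTS CHILD (name STRING, age INT) "
                       + "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','");
            // The load statement configured in this step.
            stmt.execute("LOAD DATA INPATH 'hdfs:/data/data' INTO TABLE CHILD");
        }
    }
}
```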
- When running the application, add the required runtime parameters to the hive-site.xml and spark-defaults.conf files.
- Run the application.