Filtering Partitions Without Paths in a Partitioned Table
Scenarios
When you run a SELECT query on a Hive partitioned table, a "FileNotFoundException" error is reported if the path of a queried partition does not exist in HDFS. To avoid this error, set the spark.sql.hive.verifyPartitionPath parameter so that partitions without paths are filtered out.
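The following is a minimal sketch of how the situation arises and how the parameter addresses it; the database demo_db, the table demo_partitioned_table, and the partition value are hypothetical. A partition directory is removed from HDFS while its metadata remains in the Hive metastore, so a later query tries to read the missing path.

import org.apache.spark.sql.SparkSession

object VerifyPartitionPathDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("VerifyPartitionPathDemo")
      .enableHiveSupport()
      // With the parameter set to true, partitions whose HDFS directory no
      // longer exists are skipped instead of triggering FileNotFoundException.
      .config("spark.sql.hive.verifyPartitionPath", "true")
      .getOrCreate()

    // Suppose the directory of one partition was removed from HDFS while the
    // partition still exists in the Hive metastore. With the parameter set to
    // false, this query would fail with FileNotFoundException; with it set to
    // true, the missing partition is skipped.
    spark.sql("SELECT count(*) FROM demo_db.demo_partitioned_table").show()

    spark.stop()
  }
}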
Configuration Description
Use either of the following methods to filter out partitions without paths.
Method 1: Configure the parameter on the client.
- Install the Spark client.
For details, see Installing a Client.
- Log in to the Spark client node as the client installation user.
- Modify the following parameter in the {Client installation directory}/Spark/spark/conf/spark-defaults.conf file on the Spark client.
Table 1 Parameter description

Parameter: spark.sql.hive.verifyPartitionPath
Description: Indicates whether to filter out partitions without paths when reading Hive partitioned tables.
- true: filters out partitions without paths when reading Hive partitioned tables.
- false: disables the filtering.
Example Value: true
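For example, the parameter can be added to spark-defaults.conf as a single line; this is a minimal sketch, with the value taken from the example in Table 1:

spark.sql.hive.verifyPartitionPath true

Settings in spark-defaults.conf take effect for applications subsequently submitted from this client.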
Method 2: Set the parameter through the --conf option of the spark-submit command when submitting an application. For example:
spark-submit --class org.apache.spark.examples.SparkPi --conf spark.sql.hive.verifyPartitionPath=true $SPARK_HOME/lib/spark-examples_*.jar
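Inside the submitted application, the effective value can be checked through the SparkSession runtime configuration. A minimal sketch, assuming the parameter was passed with --conf as shown above:

import org.apache.spark.sql.SparkSession

object CheckVerifyPartitionPath {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().getOrCreate()
    // Prints "true" when the --conf value from spark-submit has taken effect.
    println(spark.conf.get("spark.sql.hive.verifyPartitionPath"))
    spark.stop()
  }
}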