Updated on 2022-12-14 GMT+08:00

Filtering Partitions without Paths in Partitioned Tables

Scenario

When you run a SELECT query on a Hive partitioned table, a FileNotFoundException is thrown if a specified partition path does not exist in HDFS. To avoid this exception, set the spark.sql.hive.verifyPartitionPath parameter so that partitions without paths are filtered out.

Procedure

Use either of the following methods to filter partitions without paths:

  • Configure the following parameter in the spark-defaults.conf file on the Spark client.
    Table 1 Parameter description

    Parameter | Description | Default Value
    spark.sql.hive.verifyPartitionPath | Whether to filter partitions without paths when reading Hive partitioned tables. true: enables the filtering. false: disables the filtering. | false

  • When running the spark-submit command to submit an application, use the --conf option to set the parameter that filters partitions without paths.
    For example:
    spark-submit --class org.apache.spark.examples.SparkPi --conf spark.sql.hive.verifyPartitionPath=true $SPARK_HOME/lib/spark-examples_*.jar
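
For the first method, the parameter can be added to spark-defaults.conf by hand or with a small script. The following is a minimal sketch, not an official tool: the function name set_spark_default is hypothetical, and it assumes that properties in the file use the whitespace-separated "key value" form. It removes any earlier definition of the key and appends the new value, leaving all other lines unchanged.

```shell
# Set (or replace) one property in a spark-defaults.conf style file.
# Usage: set_spark_default <conf-file> <key> <value>
# Hypothetical helper for illustration; assumes "key value" lines.
set_spark_default() {
    conf_file=$1
    key=$2
    value=$3

    tmp_file=$(mktemp) || return 1

    # Copy every existing line except earlier definitions of the same key.
    # Comments and blank lines are kept because their first field differs.
    if [ -f "$conf_file" ]; then
        awk -v k="$key" '$1 != k' "$conf_file" > "$tmp_file"
    fi

    # Append the new definition.
    printf '%s %s\n' "$key" "$value" >> "$tmp_file"

    # Write the result back through the original path so that the
    # file's owner and permissions are preserved, then clean up.
    cat "$tmp_file" > "$conf_file" && rm -f "$tmp_file"
}
```

A call might look like set_spark_default "$SPARK_HOME/conf/spark-defaults.conf" spark.sql.hive.verifyPartitionPath true, where the path is an assumed location; use the configuration directory of your own Spark client. Back up the file before modifying it.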