Updated on 2024-10-09 GMT+08:00

Filtering Partitions Without Paths in a Partitioned Table

Scenarios

When you run a SELECT query on a Hive partitioned table, a FileNotFoundException is thrown if the path of a specified partition does not exist in HDFS. To avoid this exception, set the spark.sql.hive.verifyPartitionPath parameter to true so that partitions without paths are filtered out.

Configuration Description

Use either of the following methods to filter partitions without paths.

  • Configure the following parameter in the spark-defaults.conf file on the Spark driver.
    Table 1 Parameter description

    | Parameter | Description | Default Value |
    | --- | --- | --- |
    | spark.sql.hive.verifyPartitionPath | Indicates whether to filter partitions without paths when reading Hive partitioned tables. true: filters partitions without paths when reading Hive partitioned tables. false: disables the filtering. | false |

  • When submitting an application with the spark-submit command, use the --conf option to set the parameter so that partitions without paths are filtered.
    Example:
    spark-submit --class org.apache.spark.examples.SparkPi --conf spark.sql.hive.verifyPartitionPath=true $SPARK_HOME/lib/spark-examples_*.jar
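For the first method, the setting can be added to spark-defaults.conf by hand or with a small script. The following is a minimal sketch, not an official procedure: the CONF_FILE variable and its default location under $SPARK_HOME/conf are assumptions for illustration, so adjust the path to wherever spark-defaults.conf actually resides on your client. The script appends the property only if it is not already present, so it is safe to run more than once.

```shell
# Assumed location of spark-defaults.conf; CONF_FILE is a hypothetical
# variable introduced for this sketch. Override it to match your client.
CONF_FILE="${CONF_FILE:-${SPARK_HOME:-.}/conf/spark-defaults.conf}"

# Make sure the directory exists (only relevant for a fresh test location).
mkdir -p "$(dirname "$CONF_FILE")"

# Append the property only if no line for it exists yet.
# spark-defaults.conf uses whitespace-separated "key value" pairs.
if ! grep -q '^spark\.sql\.hive\.verifyPartitionPath' "$CONF_FILE" 2>/dev/null; then
  echo 'spark.sql.hive.verifyPartitionPath true' >> "$CONF_FILE"
fi

# Print the line so you can confirm what is configured.
grep '^spark\.sql\.hive\.verifyPartitionPath' "$CONF_FILE"
```

If the file already contains the property with the value false, this sketch leaves that line unchanged and prints it; edit the value manually in that case.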