Updated on 2025-08-22 GMT+08:00

Filtering Partitions Without Paths in a Partitioned Table

Scenarios

When you run a SELECT query on a Hive partitioned table, a "FileNotFoundException" error is reported if the path of a specified partition does not exist in HDFS. To avoid this error, set the spark.sql.hive.verifyPartitionPath parameter so that partitions without paths are filtered out.

Configuration Description

Use either of the following methods to filter out partitions without paths.

Method 1: Configure the parameter on the client.

  1. Install the Spark client.

    For details, see Installing a Client.

  2. Log in to the Spark client node as the client installation user.

    Set the following parameter in the {Client installation directory}/Spark/spark/conf/spark-defaults.conf file on the Spark client.
    Table 1 Parameter description

    | Parameter | Description | Example Value |
    | --- | --- | --- |
    | spark.sql.hive.verifyPartitionPath | Indicates whether to filter out partitions without paths when reading Hive partitioned tables. true: filters out partitions without paths when reading Hive partitioned tables. false: disables the filtering. | true |
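The file edit in Method 1 can also be scripted. The following is a minimal sketch, not an official tool: the function name set_verify_partition_path is hypothetical, and it assumes that spark-defaults.conf stores one property per line with the key at the start of the line.

```shell
# Hypothetical helper (not part of the Spark client): writes
# spark.sql.hive.verifyPartitionPath into a spark-defaults.conf file.
# Assumes one property per line, with the key at the start of the line.
set_verify_partition_path() {
    conf_file="$1"   # path to spark-defaults.conf
    value="$2"       # true or false
    key="spark.sql.hive.verifyPartitionPath"
    tmp_file="${conf_file}.tmp"

    # Copy every line except an existing setting for this key.
    # grep exits non-zero when it selects no lines, so ignore its status.
    grep -v "^${key}[[:space:]=]" "$conf_file" > "$tmp_file" || true

    # Append the new setting, then replace the original file.
    printf '%s %s\n' "$key" "$value" >> "$tmp_file"
    mv "$tmp_file" "$conf_file"
}
```

For example, calling set_verify_partition_path with the path of the spark-defaults.conf file described above and the value true leaves exactly one line for the parameter, whether or not the file already contained one. Back up the file before trying this on a real client.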

Method 2: When running the spark-submit command to submit an application, use the --conf option to filter out partitions without paths.

Example:
spark-submit --class org.apache.spark.examples.SparkPi --conf spark.sql.hive.verifyPartitionPath=true $SPARK_HOME/lib/spark-examples_*.jar
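
If applications are submitted often, the --conf option from Method 2 can be wrapped so that it is not forgotten. This is a sketch only: the function name submit_with_partition_filter is hypothetical, and it prints the command line for review instead of running it.

```shell
# Hypothetical wrapper (not part of Spark): prints a spark-submit command line
# that always carries the partition path filter from Method 2.
# It only echoes the command; remove `echo` to run spark-submit directly.
submit_with_partition_filter() {
    echo spark-submit --conf spark.sql.hive.verifyPartitionPath=true "$@"
}
```

Calling submit_with_partition_filter with the same --class option and JAR path as in the example above prints an equivalent command with the --conf option placed first.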