Updated on 2025-10-11 GMT+08:00

Configuring the Number of Files in a Single HDFS Directory

Scenario

Generally, multiple services are deployed in a cluster, and most of them store their data in HDFS. While the cluster is running, components such as Spark and Yarn, as well as clients, continuously write files to the same HDFS directory. However, HDFS limits the number of items a single directory can hold, so you must plan data storage in advance to prevent a directory from exceeding the limit and causing task failures.

You can set the maximum number of items (files and subdirectories) allowed in a single directory using the dfs.namenode.fs-limits.max-directory-items parameter in HDFS.
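In MRS, this parameter is changed through FusionInsight Manager (see the procedure below); the following hdfs-site.xml fragment is only an illustration of the property name and its default value, not a recommended way to apply the change, because Manager typically overwrites manual edits:

```xml
<!-- hdfs-site.xml: illustrative fragment only; in MRS, set this value
     through FusionInsight Manager rather than editing the file by hand. -->
<property>
  <name>dfs.namenode.fs-limits.max-directory-items</name>
  <!-- Default: 1048576; value range: 1 to 6400000 -->
  <value>1048576</value>
</property>
```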

Procedure

  1. Log in to FusionInsight Manager.

    For details about how to log in to FusionInsight Manager, see Accessing MRS Manager.

  2. Choose Cluster > Services > HDFS > Configurations > All Configurations.
  3. Search for the following parameters.

    Table 1 Parameters

    | Parameter | Description | Default Value |
    |---|---|---|
    | dfs.namenode.fs-limits.max-directory-items | Maximum number of items in a directory. Value range: 1 to 6,400,000. | 1048576 |

  4. Set the maximum number of files and directories that a single HDFS directory can hold, save the modified configuration, and then restart the expired service or instance for the change to take effect.

    Plan data storage in advance by time and service type to prevent a single directory from accumulating too many files. You are advised to keep the default value, which allows about one million items in a single directory.
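One common way to follow this planning advice is to partition output paths by service and date so that each day's files for each service land in their own directory, keeping every directory's item count well below the limit. The sketch below is illustrative only; the base path, service name, and file name are hypothetical, not part of MRS:

```python
from datetime import date

# Illustrative sketch: build an HDFS path partitioned by service and date
# so that no single directory accumulates files indefinitely. The default
# dfs.namenode.fs-limits.max-directory-items value (1048576) then bounds
# each day's directory, not the whole data set.
def planned_path(base: str, service: str, day: date, filename: str) -> str:
    """Return a per-service, per-day HDFS path such as
    /user/data/spark/2025/10/11/part-00000."""
    return f"{base}/{service}/{day:%Y/%m/%d}/{filename}"

print(planned_path("/user/data", "spark", date(2025, 10, 11), "part-00000"))
# -> /user/data/spark/2025/10/11/part-00000
```

With this layout, a job that writes even a few hundred thousand files per day stays far below the per-directory limit, whereas writing everything into one flat directory would hit it within days.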