Updated on 2025-10-11 GMT+08:00

Improving HDFS Write Performance

Scenario

HDFS write performance directly affects the efficiency of the entire system. Improving it shortens data write time and makes data processing and system response more efficient. It also helps the HDFS cluster better meet its service requirements.

Notes and Constraints

This section applies to MRS 3.x or later.

Procedure

  1. Log in to FusionInsight Manager.

    For details about how to log in to FusionInsight Manager, see Accessing MRS Manager.

  2. Choose Cluster > Services > HDFS > Configurations > All Configurations.
  3. Search for the following parameters and change their values as required.

    Table 1 Parameters for improving HDFS write performance

    Parameter

    Description

    Default Value

    dfs.datanode.drop.cache.behind.reads

    Whether a DataNode automatically drops data from the cache after that data has been transferred to the client. This parameter needs to be configured on the DataNode.

    • true: The cached data is discarded.

      You are advised to set it to true if the same data is read only a few times, so that the cache can be used by other operations.

    • false: The cached data is retained. You are advised to set it to false if the same data is read many times, to improve the read speed.

    This parameter is optional for improving write performance. You can configure it as needed.

    false

    dfs.client-write-packet-size

    Size of each data packet when a client writes data, in bytes.

    When an HDFS client writes data to a DataNode, the client splits the data into multiple packets and sends them to the DataNode over the network for storage. This parameter specifies the size of each transmitted packet and can be set for each job.

    • Larger data packets reduce the number of transmissions and improve bandwidth utilization and write performance, but may increase the delay of each transmission.
    • Smaller data packets have a lower transmission delay but increase the number of transmissions. They suit delay-sensitive scenarios.

    On a 10-Gigabit network, you can increase the value of this parameter to improve transmission throughput.

    262144

  4. Save the settings. Restart the services or instances whose configuration has expired for the settings to take effect.
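
The trade-off described for dfs.client-write-packet-size can be illustrated with some rough arithmetic. The sketch below estimates how many packets a client would send for one block at several packet sizes. It assumes a 128 MiB block size, which this section does not state, and it ignores packet headers and checksum bytes, so real packet counts will be slightly higher. The function name packets_per_block is hypothetical and is not part of any HDFS API.

```python
# Rough estimate only: ignores packet headers and checksum overhead.
# ASSUMPTION: 128 MiB block size; this value is not given in this section.
ASSUMED_BLOCK_SIZE = 128 * 1024 * 1024


def packets_per_block(packet_size: int, block_size: int = ASSUMED_BLOCK_SIZE) -> int:
    """Return the approximate number of packets needed to send one block."""
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    # Integer ceiling division: a partially filled last packet still counts.
    return (block_size + packet_size - 1) // packet_size


if __name__ == "__main__":
    # 262144 is the default value listed in Table 1; the others are
    # illustrative values, not recommendations.
    for size in (65536, 262144, 1048576):
        count = packets_per_block(size)
        print(f"{size:>8} bytes per packet: about {count} packets per block")
```

Under these assumptions, quadrupling the packet size cuts the packet count per block to a quarter. Whether that translates into higher write throughput depends on the network and workload, so test any new value before applying it cluster-wide.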