Improving HDFS Write Performance

Scenario

HDFS write performance directly affects the efficiency of the entire system. Improving it shortens data write time and makes data processing and system response more efficient. It also helps the HDFS cluster better meet its service requirements.

Notes and Constraints

This section applies to MRS 3.x or later.

Procedure

Log in to FusionInsight Manager.
Choose Cluster > Services > HDFS > Configurations > All Configurations.

Search for the following parameters and change their values as required.

**Table 1** Parameters for improving HDFS write performance
Parameter	Description	Default Value
dfs.datanode.drop.cache.behind.reads	Whether a DataNode automatically discards data from the cache after the data has been transferred to the client. true: The cached data is discarded. You are advised to use this value if the same data is read only a few times, so that the cache can be used by other operations. false: The cached data is retained. You are advised to use this value if the same data is read many times, to improve the read speed. This parameter is configured on the DataNode. It is optional for improving write performance and can be configured as needed.	false
dfs.client-write-packet-size	Size of each data packet that the client sends when writing data, in bytes. When an HDFS client writes data to a DataNode, the client splits the data into multiple packets and sends them over the network to the DataNode for storage. This parameter specifies the size of those packets and can be set for each job. Larger packets reduce the number of transmissions and improve bandwidth utilization and write performance, but may increase the latency of each transmission. Smaller packets have lower transmission latency but require more transmissions, so they suit latency-sensitive scenarios. On a 10-Gigabit network, you can increase the value of this parameter to raise transmission throughput.	262144
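
The trade-off described for dfs.client-write-packet-size can be illustrated with a rough calculation. The following Python sketch estimates how many packets a client needs to send one block at different packet sizes. It is an approximation only: the 128 MiB block size and the candidate packet sizes are assumptions chosen for illustration, the helper name packets_needed is hypothetical, and the calculation ignores packet header and checksum overhead, so real packet counts will be somewhat higher.

```python
def packets_needed(total_bytes: int, packet_size: int) -> int:
    """Estimate the packets required to send total_bytes.

    Simplification: treats the whole packet as payload and ignores
    header and checksum overhead.
    """
    if total_bytes < 0:
        raise ValueError("total_bytes must not be negative")
    if packet_size <= 0:
        raise ValueError("packet_size must be positive")
    # Integer ceiling division: round up so a partial packet still counts.
    return (total_bytes + packet_size - 1) // packet_size


# Assumed block size for illustration (128 MiB); not taken from Table 1.
BLOCK_SIZE = 128 * 1024 * 1024

# Candidate values for dfs.client-write-packet-size, in bytes.
# 262144 is the default listed in Table 1; the others are illustrative.
CANDIDATE_SIZES = (65536, 262144, 1048576)

if __name__ == "__main__":
    for size in CANDIDATE_SIZES:
        count = packets_needed(BLOCK_SIZE, size)
        print(f"packet size {size:>8} bytes -> about {count} packets per block")
```

Under these assumptions, one block takes about 2048 packets at 65536 bytes, 512 at 262144 bytes, and 128 at 1048576 bytes. Fewer packets means fewer transmissions, but each one carries more data and takes longer to send, which is why larger values favor throughput and smaller values favor latency.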