Updated on 2022-06-01 GMT+08:00

Suggestions

Notes for Reading and Writing HDFS Files

HDFS does not support random writes: the content of an existing file cannot be modified in place.

In HDFS, data can be appended only to the end of a file.

Only data files stored in HDFS can be appended to. The edit.log and metadata files do not support appending. To use the append function, set dfs.support.append to true in hdfs-site.xml.

By default, dfs.support.append is false in the open source community version and true in MRS.

dfs.support.append is a server-side parameter. Set it to true if you need the append function.
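
As a quick check of the setting described above, the following minimal sketch parses the text of an hdfs-site.xml file and reports whether dfs.support.append is true. The helper names (read_hadoop_property, append_enabled) and the sample configuration are hypothetical and only illustrate the standard configuration/property/name/value layout of Hadoop configuration files. Because the default differs between distributions, the value used when the property is absent is passed in as an argument rather than assumed.

```python
import xml.etree.ElementTree as ET


def read_hadoop_property(xml_text, name, default=None):
    """Return the value of a named property from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


def append_enabled(xml_text, default=False):
    """Report whether dfs.support.append is set to true.

    `default` is returned when the property is missing; it is an assumption
    supplied by the caller, because defaults differ between distributions.
    """
    value = read_hadoop_property(xml_text, "dfs.support.append")
    if value is None:
        return default
    return value.lower() == "true"


# Hypothetical sample content, for illustration only.
SAMPLE_HDFS_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(append_enabled(SAMPLE_HDFS_SITE))
```

In practice the text would be read from the hdfs-site.xml file deployed on the cluster; the path to that file depends on the installation and is not shown here.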

If these constraints make HDFS unsuitable for your workload, use another storage system, such as HBase, to store the data.

HDFS Is Not Suitable for Storing a Large Number of Small Files

The NameNode keeps the metadata of every file in memory, so a large number of small files consumes excessive NameNode memory regardless of how little data the files contain.
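
The effect can be illustrated with a rough estimate. The sketch below assumes the commonly cited rule of thumb of about 150 bytes of NameNode memory per namespace object (one object per file plus one per block) and a 128 MiB block size. Both figures are assumptions for illustration, not values measured on MRS, and the function name is hypothetical.

```python
BYTES_PER_OBJECT = 150            # assumption: rough rule of thumb, not a measured value
BLOCK_SIZE = 128 * 1024 * 1024    # assumption: 128 MiB block size


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def namenode_heap_bytes(total_bytes, file_size,
                        block_size=BLOCK_SIZE,
                        bytes_per_object=BYTES_PER_OBJECT):
    """Estimate NameNode memory needed to track total_bytes stored as files of file_size."""
    file_count = ceil_div(total_bytes, file_size)
    blocks_per_file = max(1, ceil_div(file_size, block_size))
    # One namespace object for each file, plus one for each of its blocks.
    objects = file_count * (1 + blocks_per_file)
    return objects * bytes_per_object


if __name__ == "__main__":
    one_tib = 2 ** 40
    small = namenode_heap_bytes(one_tib, 2 ** 20)   # 1 TiB stored as 1 MiB files
    large = namenode_heap_bytes(one_tib, 2 ** 30)   # 1 TiB stored as 1 GiB files
    print(small, large)
```

Under these assumptions, the same 1 TiB of data needs roughly 300 MiB of NameNode memory when stored as 1 MiB files, but only about 1.3 MiB when stored as 1 GiB files.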

Keep Three Replicas of HDFS Data

Three replicas are enough for backing up DataNode data. More replicas improve data reliability but reduce system efficiency, because every write is copied more times and more storage is consumed. If a node becomes faulty, the blocks it held are re-replicated to other nodes from the remaining replicas.
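
The trade-off can be made concrete with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that each copy of a block is placed on a different node, so a block remains readable as long as at least one copy survives.

```python
def replication_tradeoff(logical_bytes, replicas=3):
    """Return the raw storage cost and loss tolerance for a given number of copies."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return {
        # Every block is stored `replicas` times.
        "raw_bytes": logical_bytes * replicas,
        # Share of raw capacity that holds unique data.
        "usable_fraction": 1 / replicas,
        # Copies of a block that can be lost while it stays readable.
        "tolerated_copy_losses": replicas - 1,
    }


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, replication_tradeoff(10 * 2 ** 40, replicas=n))
```

With three copies, 10 TiB of data occupies 30 TiB of raw storage and each block survives the loss of two copies. Raising the count to five adds tolerance for two more losses at the cost of 20 TiB more raw storage.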