Updated on 2022-11-18 GMT+08:00

Operation on SmallFS

Function

SmallFS provides a background small-file merging feature. It automatically detects small files in the system based on a file size threshold, merges them during idle hours, and stores their metadata in a third-party key-value (KV) system to reduce the NameNode load. It also provides a new FileSystem interface through which users can access these small files transparently.
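
The detect, merge, and index idea described above can be illustrated with a minimal local-disk sketch. It is not the SmallFS implementation or its storage format: the class `SmallFileMergeSketch`, its methods, the `Extent` index entry, and the 1024-byte threshold are all hypothetical, and an in-memory map stands in for the third-party KV system.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/** Illustrative only: NOT the SmallFS implementation or its on-disk format. */
final class SmallFileMergeSketch {

    /** Hypothetical index entry: where one original file lives inside the merged file. */
    static final class Extent {
        final long offset;
        final int length;
        Extent(long offset, int length) { this.offset = offset; this.length = length; }
    }

    /** Returns regular files directly under dir that are smaller than thresholdBytes, sorted by path. */
    static List<Path> findSmallFiles(Path dir, long thresholdBytes) throws IOException {
        List<Path> small = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            for (Path p : (Iterable<Path>) entries::iterator) {
                if (Files.isRegularFile(p) && Files.size(p) < thresholdBytes) {
                    small.add(p);
                }
            }
        }
        Collections.sort(small);
        return small;
    }

    /** Concatenates the files into merged and returns a name-to-extent index (stand-in for the KV store). */
    static Map<String, Extent> merge(List<Path> files, Path merged) throws IOException {
        Map<String, Extent> index = new LinkedHashMap<>();
        long offset = 0;
        try (OutputStream out = Files.newOutputStream(merged)) {
            for (Path f : files) {
                byte[] data = Files.readAllBytes(f);
                out.write(data);
                index.put(f.getFileName().toString(), new Extent(offset, data.length));
                offset += data.length;
            }
        }
        return index;
    }

    /** Reads one original file back out of the merged file by looking up its extent. */
    static byte[] read(Path merged, Map<String, Extent> index, String name) throws IOException {
        Extent e = index.get(name);
        if (e == null) {
            throw new FileNotFoundException(name);
        }
        byte[] buf = new byte[e.length];
        try (RandomAccessFile raf = new RandomAccessFile(merged.toFile(), "r")) {
            raf.seek(e.offset);
            raf.readFully(buf);
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("smallfs-sketch");
        Files.write(dir.resolve("a.txt"), "hello".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.txt"), "world!".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("big.bin"), new byte[4096]); // above the threshold, so not merged
        Path merged = Files.createTempFile("merged", ".bin");
        Map<String, Extent> index = merge(findSmallFiles(dir, 1024), merged);
        System.out.println(new String(read(merged, index, "b.txt"), StandardCharsets.UTF_8));
    }
}
```

In this sketch the caller still asks for a file by its original name, and the index resolves that name to a byte range of the merged file. This mirrors, in a much simplified form, how a FileSystem interface can keep merged small files transparently accessible.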

The new FileSystem interface provides a rich set of file operations and is almost identical to that of the Hadoop Distributed File System (HDFS). Before developing against a SmallFS API, read SmallFS Common API to check whether SmallFS supports the API you need.

Example Code

Secondary development with the SmallFS interfaces that are consistent with HDFS is identical to secondary development with the native HDFS interfaces. For example code, see the corresponding HDFS examples:

  1. Initializing the HDFS
  2. Creating Directories
  3. Writing Data into a File
  4. Appending Data to a File
  5. Reading Data from a File
  6. Deleting a File
  7. Deleting Directories

Prohibited Operations and Restrictions

  • For secondary development with SmallFS interfaces that are not consistent with HDFS, see the SmallFSFileSystem class for example code.
  • SmallFS does not support Colocation.
  • SmallFS does not support setting the storage policy.
  • Each small file directory contains a subdirectory named .sfs, which stores the merged files. Do not modify this subdirectory directly. Otherwise, data loss may occur.
  • Modifying small file directories using the HDFS interface is not recommended because it can easily cause data loss.
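
As a defensive measure, application code can refuse to write to any path that points at or into a .sfs subdirectory before it issues a file system call. The following is a minimal sketch of such a check. The class `SfsPathGuard` and its methods are hypothetical helpers, not part of the SmallFS or HDFS API, and the sketch assumes "/"-separated paths without ".." components.

```java
/** Illustrative client-side guard: NOT part of the SmallFS or HDFS API. */
final class SfsPathGuard {

    /** Name of the subdirectory that holds the merged files. */
    static final String RESERVED_DIR = ".sfs";

    /** Returns true if any component of the "/"-separated path is exactly ".sfs". */
    static boolean touchesReservedDir(String path) {
        for (String part : path.split("/")) {
            if (RESERVED_DIR.equals(part)) {
                return true;
            }
        }
        return false;
    }

    /** Throws IllegalArgumentException if the path must not be modified directly. */
    static void checkWritable(String path) {
        if (touchesReservedDir(path)) {
            throw new IllegalArgumentException(
                "Refusing to modify " + path + ": " + RESERVED_DIR + " stores merged files");
        }
    }

    public static void main(String[] args) {
        checkWritable("/data/logs/app.log"); // allowed, returns normally
        try {
            checkWritable("/data/logs/.sfs/part-0"); // rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Calling `checkWritable` before a delete, rename, or write keeps accidental modifications of .sfs out of application code. It does not replace the restrictions above.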