Updated on 2025-04-14 GMT+08:00

Initializing the HDFS

Function

Hadoop distributed file system (HDFS) initialization is a prerequisite for using application programming interfaces (APIs) provided by the HDFS. The process of initializing the HDFS is

  1. Load the HDFS service configuration file.
  2. Instantiate a FileSystem.

Example Codes

The following is code snippets. For complete codes, see the HdfsExample class in com.huawei.bigdata.hdfs.examples.

The initialization codes used when applications are run in Linux and the codes used when applications are run in Windows are the same. The example codes are as follows:
 //initialization
 /**confLoad();

 // Creating a sample project
 HdfsExample hdfs_examples = new HdfsExample("/user/hdfs-examples", "test.txt");
 /**
  * 
  * If the application is running in the Linux OS, the path of core-site.xml, hdfs-site.xml must be modified to the absolute path of the client file in the Linux OS. 
  *
  * 
  */  
 private static void confLoad() throws IOException {
   conf = new Configuration();
   // conf file
   conf.addResource(new Path(PATH_TO_HDFS_SITE_XML));
   conf.addResource(new Path(PATH_TO_CORE_SITE_XML));
   // conf.addResource(new Path(PATH_TO_SMALL_SITE_XML));
 }

 /**
  *Create a sample project.
  */
 public HdfsExample(String path, String fileName) throws IOException  {
   this.DEST_PATH = path;
   this.FILE_NAME = fileName;
   instanceBuild();
 }
 private void instanceBuild() throws IOException {
   fSystem = FileSystem.get(conf);
 }

(Optional) Specify a user to run the example code. To run the example code related to the Colocation operation, the user must be a member of the supergroup group. The following describes two ways to specify the user who runs the example code:

Add the environment variable HADOOP_USER_NAME: For operation details, see Commissioning an HDFS Application.

Modify the code: If HADOOP_USER_NAME is not specified, modify "USER" in the code to the actual user name:
System.setProperty("HADOOP_USER_NAME", USER);