HDFS Java APIs

Updated on 2024-10-23 GMT+08:00

For details about Hadoop distributed file system (HDFS) APIs, see:

http://hadoop.apache.org/docs/r3.1.1/api/index.html

HDFS Common API

Common HDFS Java classes are as follows:

  • FileSystem: the core class of client applications. For details about common APIs, see Table 1.
  • FileStatus: records the status of files and directories. For details about common APIs, see Table 2.
  • DFSColocationAdmin: API used to manage colocation group information. For details about common APIs, see Table 3.
  • DFSColocationClient: API used to manage colocation files. For details about common APIs, see Table 4.
    NOTE:
    • The system stores only the mapping between nodes and locator IDs, not the mapping between files and locator IDs. When a file is created using a Colocation interface, the file is created on the node that corresponds to a locator ID. File creation and writing must be performed using Colocation interfaces.
    • After the file is written, subsequent operations on the file can use other open-source interfaces in addition to Colocation interfaces.
    • The DFSColocationClient class inherits from the open-source DistributedFileSystem class and contains common file operation functions. If a user uses the DFSColocationClient class to create a Colocation file, the user is advised to use the functions of this class for subsequent file operations.
Table 1 Common FileSystem APIs

API

Description

public static FileSystem get(Configuration conf)

FileSystem is the API class that the Hadoop class library provides for users. Because FileSystem is an abstract class, a concrete instance can be obtained only through the get method. The get method has several overloaded versions; this one is commonly used.

public FSDataOutputStream create(Path f)

This API is used to create files in the HDFS. f indicates a complete file path.

public void copyFromLocalFile(Path src, Path dst)

This API is used to upload local files to a specified directory in the HDFS. src and dst indicate complete file paths.

public boolean mkdirs(Path f)

This API is used to create folders in the HDFS. f indicates a complete folder path.

public abstract boolean rename(Path src, Path dst)

This API is used to rename a specified HDFS file. src and dst indicate complete file paths.

public abstract boolean delete(Path f, boolean recursive)

This API is used to delete a specified HDFS file. f indicates the complete path of the file to be deleted, and recursive specifies whether to delete recursively.

public boolean exists(Path f)

This API is used to check whether a specified HDFS file exists. f indicates a complete file path.

public FileStatus getFileStatus(Path f)

This API is used to obtain the FileStatus object of a file or folder. The FileStatus object records status information of the file or folder, including the modification time and file directory.

public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len)

This API is used to query the block locations of a specified file in an HDFS cluster. file indicates the FileStatus object of the file, and start and len specify the offset and length of the range to query.

public FSDataInputStream open(Path f)

This API is used to open the input stream of a specified file in the HDFS and read the file using the API provided by the FSDataInputStream class. f indicates a complete file path.

public FSDataOutputStream create(Path f, boolean overwrite)

This API is used to create the output stream of a specified file in the HDFS and write the file using the API provided by the FSDataOutputStream class. f indicates a complete file path. If overwrite is true, the file is overwritten if it exists; if overwrite is false, an error is reported if the file exists.

public FSDataOutputStream append(Path f)

This API is used to open the output stream of a specified existing file in the HDFS and append data to the file using the API provided by the FSDataOutputStream class. f indicates a complete file path.
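The following is a minimal sketch of how several of the FileSystem APIs listed above fit together, not a definitive implementation. The class name HdfsBasicOps, the helper roundTrip, and the /tmp/hdfs-api-example path are hypothetical. The sketch assumes the Hadoop client libraries are on the classpath and that fs.defaultFS in the loaded configuration points at the target cluster; with an empty configuration, FileSystem.get returns the local file system instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBasicOps {

    /** Writes text to a file under dir, renames it, reads it back, then deletes dir. */
    static String roundTrip(FileSystem fs, Path dir, String text) throws IOException {
        fs.mkdirs(dir); // creates the folder and any missing parents

        Path file = new Path(dir, "example.txt");
        // overwrite = true: replace the file if it already exists
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }

        Path renamed = new Path(dir, "example-renamed.txt");
        if (!fs.rename(file, renamed)) {
            throw new IOException("rename failed: " + file);
        }

        // Size the read buffer from the length recorded in the FileStatus object.
        byte[] buf = new byte[(int) fs.getFileStatus(renamed).getLen()];
        try (FSDataInputStream in = fs.open(renamed)) {
            in.readFully(buf);
        }

        fs.delete(dir, true); // recursive = true removes the folder and its contents
        return new String(buf, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: core-site.xml and hdfs-site.xml are on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path dir = new Path("/tmp/hdfs-api-example"); // hypothetical path
        System.out.println(roundTrip(fs, dir, "hello hdfs"));
        System.out.println("folder still exists: " + fs.exists(dir));
    }
}
```

The helper takes the FileSystem as a parameter rather than creating it, so the same code can be pointed at any file system that the configuration resolves to.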

Table 2 Common FileStatus APIs

API

Description

public long getModificationTime()

This API is used to query the modification time of a specified HDFS file.

public Path getPath()

This API is used to obtain the path of the file or directory that the FileStatus object describes.
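As a hedged illustration of the FileStatus APIs, the sketch below obtains a FileStatus object, reads its path and modification time, and passes it to getFileBlockLocations. The class name HdfsFileStatusExample, the helper describe, and the default path /tmp/example.txt are hypothetical, and the output format is chosen only for this example.

```java
import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsFileStatusExample {

    /** Builds a one-line summary of a file: path, modification time, block hosts. */
    static String describe(FileSystem fs, Path file) throws IOException {
        FileStatus status = fs.getFileStatus(file);
        StringBuilder sb = new StringBuilder();
        sb.append("path=").append(status.getPath())
          .append(" modified=").append(status.getModificationTime());

        // Query the locations of all blocks that cover the whole file.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            sb.append(" block@").append(block.getOffset())
              .append(" hosts=").append(Arrays.toString(block.getHosts()));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // Hypothetical default path; pass a real file path as the first argument.
        Path file = new Path(args.length > 0 ? args[0] : "/tmp/example.txt");
        System.out.println(describe(fs, file));
    }
}
```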

Table 3 Common DFSColocationAdmin APIs

API

Description

public Map<String, List<DatanodeInfo>> createColocationGroup(String groupId,String file)

This API is used to create a group based on the locatorIds information in the file. file indicates the file path.

public Map<String, List<DatanodeInfo>> createColocationGroup(String groupId,List<String> locators)

This API is used to create a group based on the locatorIds information in the list in the memory.

public void deleteColocationGroup(String groupId)

This API is used to delete a group.

public List<String> listColocationGroups()

This API is used to return all group information of Colocation. The returned list of group IDs is sorted by creation time.

public List<DatanodeInfo> getNodesForLocator(String groupId, String locatorId)

This API is used to obtain the list of all nodes in the locator.

Table 4 Common DFSColocationClient APIs

API

Description

public FSDataOutputStream create(Path f, boolean overwrite, String groupId,String locatorId)

This API is used to create an FSDataOutputStream in colocation mode to allow users to write to the file f.

f is the HDFS path.

overwrite indicates whether an existing file can be overwritten.

The groupId and locatorId that the user specifies for the file must already exist.

public FSDataOutputStream create(final Path f, final FsPermission permission, final EnumSet<CreateFlag> cflags, final int bufferSize, final short replication, final long blockSize, final Progressable progress, final ChecksumOpt checksumOpt, final String groupId, final String locatorId)

The function of this API is the same as that of FSDataOutputStream create(Path f, boolean overwrite, String groupId, String locatorId), except that users can customize the checksum option.

public void close()

This API is used to close the connection.

Table 5 HDFS client WebHdfsFileSystem API

API

Description

public RemoteIterator<FileStatus> listStatusIterator(final Path)

This API fetches information about child files and folders in multiple requests through a remote iterator. This prevents the user interface from becoming slow when information about millions of children must be fetched.
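The iterator-based listing can be sketched as follows. This is an illustration under stated assumptions rather than the documented usage: the class name ListStatusIteratorExample, the helper countChildren, and the default path /tmp are hypothetical. The sketch calls listStatusIterator through the generic FileSystem class; it assumes that fs.defaultFS (for example, a webhdfs:// URI) selects the WebHdfsFileSystem implementation when that client is wanted.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

public class ListStatusIteratorExample {

    /** Counts the direct children of dir without loading the full listing at once. */
    static long countChildren(FileSystem fs, Path dir) throws IOException {
        long count = 0;
        RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
        // hasNext() and next() may each trigger a further request to the server.
        while (it.hasNext()) {
            it.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        // Hypothetical default path; pass a real directory as the first argument.
        Path dir = new Path(args.length > 0 ? args[0] : "/tmp");
        System.out.println(dir + " has " + countChildren(fs, dir) + " children");
    }
}
```

Because each child is processed as it arrives, the caller never holds the whole directory listing in memory.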

Glob Path Pattern-Based APIs to Get LocatedFileStatus and Open a File from FileStatus

The following APIs are added to DistributedFileSystem to get the FileStatus with block locations and to open a file from a FileStatus object. These APIs reduce the number of RPC calls from the client to NameNodes.

Table 6 FileSystem APIs

Interface

Description

public LocatedFileStatus[] globLocatedStatus(Path, PathFilter, boolean) throws IOException

Returns an array of LocatedFileStatus objects whose path names match pathPattern and pass the input path filter.

public FSDataInputStream open(FileStatus stat) throws IOException

If stat is an instance of LocatedFileStatusHdfs that already has the location information, the InputStream is created without contacting the NameNode.
