HDFS Java APIs

Updated on 2024-10-23 GMT+08:00

For details about Hadoop distributed file system (HDFS) APIs, see:

http://hadoop.apache.org/docs/r3.1.1/api/index.html.

HDFS Common API

Common HDFS Java classes are as follows:

  • FileSystem: the core class of client applications. For details about common APIs, see Table 1.
  • FileStatus: records the status of files and directories. For details about common APIs, see Table 2.
  • DFSColocationAdmin: API used to manage colocation group information. For details about common APIs, see Table 3.
  • DFSColocationClient: API used to manage colocation files. For details about common APIs, see Table 4.
    NOTE:
    • The system retains only the mapping between nodes and locator IDs, not the mapping between files and locator IDs. When a file is created using a Colocation interface, it is created on the node that corresponds to the locator ID. File creation and writing must be performed using Colocation interfaces.
    • After the file is written, subsequent operations on the file can use other open-source interfaces in addition to Colocation interfaces.
    • The DFSColocationClient class inherits from the open-source DistributedFileSystem class and contains common file operation functions. If you use the DFSColocationClient class to create a Colocation file, you are advised to use the functions of this class for file operations.
Table 1 Common FileSystem APIs

API

Description

public static FileSystem get(Configuration conf)

FileSystem is the API class that the Hadoop class library provides for users. Because FileSystem is an abstract class, a concrete instance can be obtained only through the get method. The get method has multiple overloads, and this one is commonly used.

public FSDataOutputStream create(Path f)

This API is used to create files in the HDFS. f indicates a complete file path.

public void copyFromLocalFile(Path src, Path dst)

This API is used to upload local files to a specified directory in the HDFS. src and dst indicate complete file paths.

public boolean mkdirs(Path f)

This API is used to create folders in the HDFS. f indicates a complete folder path.

public abstract boolean rename(Path src, Path dst)

This API is used to rename a specified HDFS file. src and dst indicate complete file paths.

public abstract boolean delete(Path f, boolean recursive)

This API is used to delete a specified HDFS file or folder. f indicates the complete path to be deleted, and recursive specifies whether the contents of a folder are deleted recursively.

public boolean exists(Path f)

This API is used to check whether a specified HDFS file exists. f indicates a complete file path.

public FileStatus getFileStatus(Path f)

This API is used to obtain the FileStatus object of a file or folder. The FileStatus object records status information of the file or folder, including the modification time and path.

public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len)

This API is used to query the block locations of a specified file in an HDFS cluster. file indicates the FileStatus object of the file, and start and len specify the byte range within the file.

public FSDataInputStream open(Path f)

This API is used to open the input stream of a specified file in the HDFS and read the file using the API provided by the FSDataInputStream class. f indicates a complete file path.

public FSDataOutputStream create(Path f, boolean overwrite)

This API is used to create the output stream of a specified file in the HDFS and write the file using the API provided by the FSDataOutputStream class. f indicates a complete file path. If overwrite is true, the file is overwritten if it exists; if overwrite is false, an error is reported if the file exists.

public FSDataOutputStream append(Path f)

This API is used to open the output stream of a specified existing file in the HDFS and append data to the file using the API provided by the FSDataOutputStream class. f indicates a complete file path.
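The calls in Table 1 can be combined into a short round trip. The following is a minimal sketch, not code from the product samples: the class name FileSystemBasics, the method roundTrip, and the file names are hypothetical. It assumes the Hadoop client libraries are on the classpath. With an empty Configuration, FileSystem.get returns the local file system; with a cluster's core-site.xml on the classpath, it returns the HDFS client instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class used only for illustration.
public class FileSystemBasics {

    /** Writes text under baseDir/demo, renames the file, reads it back, and cleans up. */
    public static String roundTrip(Configuration conf, String baseDir, String text)
            throws IOException {
        // FileSystem.get returns a cached, shared instance, so this sketch does not close it.
        FileSystem fs = FileSystem.get(conf);
        Path dir = new Path(baseDir, "demo");
        Path file = new Path(dir, "data.txt");
        Path renamed = new Path(dir, "data-renamed.txt");

        fs.mkdirs(dir);

        // create(f, true) overwrites the file if it already exists.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }

        if (!fs.rename(file, renamed) || !fs.exists(renamed)) {
            throw new IOException("rename failed: " + file);
        }

        String content;
        try (FSDataInputStream in = fs.open(renamed)) {
            byte[] buf = new byte[(int) fs.getFileStatus(renamed).getLen()];
            in.readFully(buf);
            content = new String(buf, StandardCharsets.UTF_8);
        }

        // recursive = true removes the directory together with its contents.
        fs.delete(dir, true);
        return content;
    }
}
```

The sketch leaves out append because the checksummed local file system may not support it; on HDFS, append(Path f) returns an FSDataOutputStream that is used in the same way as the stream returned by create.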

Table 2 Common FileStatus APIs

API

Description

public long getModificationTime()

This API is used to query the modification time of a specified HDFS file.

public Path getPath()

This API is used to obtain the path of the file or directory that the FileStatus object describes.
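As an illustration of the two FileStatus getters, the sketch below formats the name and modification time of a path. The class name FileStatusDemo and the method describe are hypothetical, and the Hadoop client libraries are assumed to be on the classpath.

```java
import java.io.IOException;
import java.time.Instant;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class used only for illustration.
public class FileStatusDemo {

    /** Returns "<file name> modified at <ISO-8601 instant>" for the given path. */
    public static String describe(FileSystem fs, Path p) throws IOException {
        FileStatus status = fs.getFileStatus(p);
        // getModificationTime() returns milliseconds since the epoch (UTC).
        Instant modified = Instant.ofEpochMilli(status.getModificationTime());
        return status.getPath().getName() + " modified at " + modified;
    }
}
```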

Table 3 Common DFSColocationAdmin APIs

API

Description

public Map<String, List<DatanodeInfo>> createColocationGroup(String groupId,String file)

This API is used to create a group based on the locatorIds information in the file. file indicates the file path.

public Map<String, List<DatanodeInfo>> createColocationGroup(String groupId,List<String> locators)

This API is used to create a group based on the locatorIds information in the list in the memory.

public void deleteColocationGroup(String groupId)

This API is used to delete a group.

public List<String> listColocationGroups()

This API is used to return the IDs of all Colocation groups. The returned list of group IDs is sorted by creation time.

public List<DatanodeInfo> getNodesForLocator(String groupId, String locatorId)

This API is used to obtain the list of all nodes in the locator.

Table 4 Common DFSColocationClient APIs

API

Description

public FSDataOutputStream create(Path f, boolean overwrite, String groupId,String locatorId)

This API is used to create an FSDataOutputStream in colocation mode to allow users to write the file at f.

f is the HDFS path.

overwrite indicates whether an existing file can be overwritten.

The groupId and locatorId specified for the file must already exist.

public FSDataOutputStream create(final Path f, final FsPermission permission, final EnumSet<CreateFlag> cflags, final int bufferSize, final short replication, final long blockSize, final Progressable progress, final ChecksumOpt checksumOpt, final String groupId, final String locatorId)

The function of this API is the same as that of FSDataOutputStream create(Path f, boolean overwrite, String groupId, String locatorId), except that users are allowed to customize checksum.

public void close()

This API is used to close the connection.

Table 5 HDFS client WebHdfsFileSystem API

API

Description

public RemoteIterator<FileStatus> listStatusIterator(final Path)

This API fetches information about child files and folders through multiple requests by using a remote iterator. This prevents the user interface from becoming slow when a large amount of child information has to be fetched.
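The iterator-based listing can be consumed as in the following sketch. The class name ListingDemo and the method childNames are hypothetical. The sketch calls listStatusIterator on the generic FileSystem type, which also declares this method in open-source Hadoop, so the same loop applies to a WebHdfsFileSystem instance. The Hadoop client libraries are assumed to be on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

// Hypothetical helper class used only for illustration.
public class ListingDemo {

    /** Collects the names of the direct children of dir, sorted alphabetically. */
    public static List<String> childNames(FileSystem fs, Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        // Entries are fetched as the iterator advances instead of in one large array.
        RemoteIterator<FileStatus> it = fs.listStatusIterator(dir);
        while (it.hasNext()) {
            names.add(it.next().getPath().getName());
        }
        Collections.sort(names);
        return names;
    }
}
```

Unlike java.util.Iterator, both hasNext() and next() of RemoteIterator can throw IOException, so the loop must run in a context that handles it.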

SmallFS Common API

Table 6 lists the common interfaces of SmallFileSystem, the Java class of SmallFS.

Table 6 Description of Class SmallFileSystem Common Interfaces

Interface

Description

public void close()

Closes the connection after use.

public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path src,Path dst) throws IOException

This interface is used to upload the local file to the given location of the SmallFileSystem. src and dst indicate complete file paths.

public void copyFromLocalFile(boolean delSrc, boolean overwrite, Path[] srcs, Path dst) throws IOException

This interface is used to upload multiple local files to the given location of the SmallFileSystem. srcs and dst indicate complete file paths.

public void copyToLocalFile(boolean delSrc, Path src, Path dst, boolean useRawLocalFileSystem) throws IOException

This interface is used to download specified files of the SmallFileSystem to the local folder. src and dst indicate complete file paths.

public FSDataOutputStream create(Path path, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException

This interface is used to create files in the given path in the SmallFileSystem.

public FileSystem[] getChildFileSystems()

This interface is used to obtain all sub-file systems of the SmallFileSystem.

public long getDefaultBlockSize()

This interface is used to obtain the default block size of the SmallFileSystem.

public short getDefaultReplication()

This interface is used to obtain the default number of replicas of the SmallFileSystem.

public BlockLocation[] getFileBlockLocations(Path path, long start, long len) throws IOException

This interface is used to obtain the block location of the given file path.

public Path getHomeDirectory()

This interface is used to obtain the home directory of the current user.

public String getScheme()

This interface is used to obtain the URI scheme of the SmallFileSystem.

public FsServerDefaults getServerDefaults() throws IOException

This interface is used to obtain the default configuration of the SmallFileSystem.

public void initialize(URI name, Configuration conf) throws IOException

This interface is used to initialize the SmallFileSystem.

public void setOwner(Path path, String username, String groupname) throws IOException

This interface is used to set the owner of the given path (file or directory).

The parameters username and groupname cannot both be null.

NOTE:

The merged files do not support this interface.

public void setPermission(Path p, FsPermission permission) throws IOException

This interface is used to set the permission of the given path (file or directory).

NOTE:

The merged files do not support this interface.

public boolean setReplication(Path path, short replication) throws IOException

This interface is used to set the number of replicas of the given path (file or directory).

NOTE:

The merged files do not support this interface.

public void setTimes(Path path, long mtime, long atime) throws IOException

This interface is used to set the modification time and access time of the given path (file or directory).

NOTE:

The merged files do not support this interface.

public boolean delete(Path path, boolean recursive) throws IOException

This interface is used to delete the given path (file or directory) in the SmallFileSystem.

public FileStatus getFileStatus(Path path) throws IOException

This interface is used to obtain the FileStatus object of the given path in the SmallFileSystem. The object records status information of the file or directory, such as the modification time and path.

public URI getUri()

This interface is used to return the default URI of the SmallFileSystem.

public Path getWorkingDirectory()

This interface is used to obtain the current working directory for the SmallFileSystem.

public FileStatus[] listStatus(Path path) throws IOException

This interface is used to list the statuses of files/directories in the given path if the path is a directory.

public boolean mkdirs(Path path, FsPermission permission) throws IOException

This interface is used to create a folder at the given path with the given permission.

NOTE:

The merged files do not support this interface.

public FSDataInputStream open(Path path, int bufferSize) throws IOException

This interface is used to open the input stream of the specified file in the SmallFileSystem and read the file through the interface provided by the FSDataInputStream class. path indicates a complete file path.

public boolean rename(Path src, Path dst) throws IOException

This interface is used to rename the specified files of the SmallFileSystem. src and dst indicate complete file paths.

public void setWorkingDirectory(Path path)

This method is overridden in the SmallFileSystem. The working directory cannot be changed to another directory.

public Configuration getConf()

This interface is used to obtain the configuration of the SmallFileSystem.

public Path getInitialWorkingDirectory()

This interface is used to obtain the initial working directory of the SmallFileSystem.

public FsStatus getStatus(Path path) throws IOException

This interface is used to return a status object describing the usage and capacity of the file system. If the file system has multiple partitions, the usage and capacity of the partition that the specified path points to are returned.

public FSDataOutputStream createNonRecursive(Path path, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress) throws IOException

This interface is used to open an FSDataOutputStream at the indicated path with write-progress reporting. It behaves the same as create(), except that it fails if the parent directory does not exist.

NOTE:

This interface is not recommended. Use the create interface instead.

public FSDataOutputStream append(Path path) throws IOException

This interface is used to append content to the file at the specified path in the SmallFileSystem.

NOTE:

The merged files do not support this interface.

public boolean truncate(Path path, long newLength) throws IOException

This interface is used to truncate the file at the specified path in the SmallFileSystem to the length given by newLength.

NOTE:

The merged files do not support this interface.

public FsServerDefaults getServerDefaults(Path path) throws IOException

This interface is used to obtain the FsServerDefaults object of the target file system for the designated path. The object records information such as the block size, number of replicas, and trash retention time.

public long getUsed() throws IOException

This interface is used to return the total size of all files in the filesystem.

public long getDefaultBlockSize(Path path)

This interface is used to obtain the default block size of the specified file path.

public short getDefaultReplication(Path path)

This interface is used to obtain the default number of replicas of the specified file path.
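The upload and download signatures in Table 6 match those declared on the open-source FileSystem base class, so their argument order can be illustrated without a SmallFS deployment. The sketch below is written against the generic FileSystem type and does not instantiate SmallFileSystem. The class name CopyDemo and both method names are hypothetical, and the Hadoop client libraries are assumed to be on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical helper class used only for illustration.
public class CopyDemo {

    /** Uploads a local file, keeping the source and overwriting any existing target. */
    public static void upload(FileSystem fs, Path localSrc, Path dst) throws IOException {
        // delSrc = false keeps the local file; overwrite = true replaces an existing dst.
        fs.copyFromLocalFile(false, true, localSrc, dst);
    }

    /** Downloads a file to a local path without deleting the source. */
    public static void download(FileSystem fs, Path src, Path localDst) throws IOException {
        // useRawLocalFileSystem = true avoids writing a local .crc checksum file.
        fs.copyToLocalFile(false, src, localDst, true);
    }
}
```

Per the notes in Table 6, some interfaces are not supported for merged files, so behavior on a real SmallFileSystem instance may differ from this generic sketch.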

Glob path pattern-based APIs to get LocatedFileStatus and open a file from FileStatus

The following APIs are added to DistributedFileSystem to get the FileStatus with block locations and to open a file from a FileStatus object. These APIs reduce the number of RPC calls from the client to the NameNode.

Table 7 FileSystem APIs

Interface

Description

public LocatedFileStatus[] globLocatedStatus(Path, PathFilter, boolean) throws IOException

Returns an array of LocatedFileStatus objects whose path names match pathPattern and are accepted by the path filter.

public FSDataInputStream open(FileStatus stat) throws IOException

If stat is an instance of LocatedFileStatusHdfs that already has the location information, the InputStream is created without contacting the NameNode.
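The pattern-plus-filter matching described in Table 7 follows the same model as the open-source globStatus(Path, PathFilter) method. The sketch below uses that open-source method only to show how a glob pattern and a PathFilter combine. It does not call globLocatedStatus or open(FileStatus), which are the additions documented here, and the meaning of the boolean argument of globLocatedStatus is not covered. The class name GlobDemo and the method matchingNames are hypothetical, and the Hadoop client libraries are assumed to be on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

// Hypothetical helper class used only for illustration.
public class GlobDemo {

    /** Returns the sorted names of paths that match the pattern and pass the filter. */
    public static List<String> matchingNames(FileSystem fs, Path pattern, PathFilter filter)
            throws IOException {
        List<String> names = new ArrayList<>();
        FileStatus[] matches = fs.globStatus(pattern, filter);
        // globStatus can return null when a non-glob path does not exist.
        if (matches != null) {
            for (FileStatus status : matches) {
                names.add(status.getPath().getName());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

A call such as matchingNames(fs, new Path("/logs/*.log"), p -> !p.getName().startsWith("tmp")) would keep only the .log files whose names do not start with tmp; the /logs path is a placeholder.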
