Java API

Updated on 2022-07-11 GMT+08:00

For the detailed MapReduce API, see the official Apache Hadoop documentation: http://hadoop.apache.org/docs/r3.1.1/api/index.html

Common Interfaces

Common classes in MapReduce are as follows:

  • org.apache.hadoop.mapreduce.Job: the interface through which users submit MapReduce jobs. It is used to set job parameters, submit jobs, control job execution, and query job status.
  • org.apache.hadoop.mapred.JobConf: the configuration class of a MapReduce job and the main configuration interface for submitting jobs to Hadoop.
Table 1 Common interfaces of org.apache.hadoop.mapreduce.Job

  • Job(Configuration conf, String jobName) / Job(Configuration conf): Creates a MapReduce client for configuring job attributes and submitting the job.
  • setMapperClass(Class<? extends Mapper> cls): A core interface that specifies the Mapper class of a MapReduce job. The Mapper class is empty by default. You can also configure mapreduce.job.map.class in mapred-site.xml.
  • setReducerClass(Class<? extends Reducer> cls): A core interface that specifies the Reducer class of a MapReduce job. The Reducer class is empty by default. You can also configure mapreduce.job.reduce.class in mapred-site.xml.
  • setCombinerClass(Class<? extends Reducer> cls): Specifies the Combiner class of a MapReduce job. The Combiner class is empty by default. You can also configure mapreduce.job.combine.class in mapred-site.xml. A Combiner can be used only when the input and output key and value types of the reduce task are the same.
  • setInputFormatClass(Class<? extends InputFormat> cls): A core interface that specifies the InputFormat class of a MapReduce job. The default is TextInputFormat. You can also configure mapreduce.job.inputformat.class in mapred-site.xml. Use this interface to specify an InputFormat class that reads data in a particular format and splits it into input splits.
  • setJarByClass(Class<?> cls): A core interface that specifies the JAR file containing the given class. Hadoop locates the JAR based on the class file and uploads it to the Hadoop distributed file system (HDFS).
  • setJar(String jar): Specifies the local path of the job JAR file directly; the JAR is then uploaded to HDFS. Use either setJar(String jar) or setJarByClass(Class<?> cls). You can also configure mapreduce.job.jar in mapred-site.xml.
  • setOutputFormatClass(Class<? extends OutputFormat> theClass): A core interface that specifies the OutputFormat class of a MapReduce job. The default is TextOutputFormat, which writes each key and value as text. You can also configure mapreduce.job.outputformat.class in mapred-site.xml. In most cases the OutputFormat does not need to be specified.
  • setOutputKeyClass(Class<?> theClass): A core interface that specifies the output key type of a MapReduce job. You can also configure mapreduce.job.output.key.class in mapred-site.xml.
  • setOutputValueClass(Class<?> theClass): A core interface that specifies the output value type of a MapReduce job. You can also configure mapreduce.job.output.value.class in mapred-site.xml.
  • setPartitionerClass(Class<? extends Partitioner> theClass): Specifies the Partitioner class of a MapReduce job, which assigns map output records to reduce tasks. You can also configure mapreduce.job.partitioner.class in mapred-site.xml. HashPartitioner is used by default, distributing the key-value pairs of a map task evenly. For example, in HBase applications, different key-value pairs belong to different regions; in this case, you must specify a Partitioner class to route map output results accordingly.
  • setSortComparatorClass(Class<? extends RawComparator> cls): Specifies the comparator that controls how map output keys are sorted before they are passed to the reduce tasks. By default, the comparator registered for the output key class is used. You can also configure mapreduce.job.output.key.comparator.class in mapred-site.xml.

NOTE:

To compress map output for transmission when a map task produces a large amount of data, configure mapreduce.map.output.compress and mapreduce.map.output.compress.codec in mapred-site.xml. Compression is disabled by default.

  • setPriority(JobPriority priority): Specifies the priority of a MapReduce job. Five priorities are available: VERY_HIGH, HIGH, NORMAL, LOW, and VERY_LOW. The default is NORMAL. You can also configure mapreduce.job.priority in mapred-site.xml.
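The Job interfaces above are typically combined in a driver class. The following is a minimal word-count sketch, assuming the Hadoop 3.1.1 client libraries are on the classpath; the class names WordCountDriver, TokenMapper, and SumReducer and the use of command-line arguments for the paths are illustrative, not part of the API.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {

    // Emits (word, 1) for every token in an input line.
    public static class TokenMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts emitted for each word.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count"); // create the MapReduce client
        job.setJarByClass(WordCountDriver.class);      // locate the job JAR by class
        job.setMapperClass(TokenMapper.class);
        // A Combiner is allowed here because the reduce input and output types match.
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1); // submit and wait
    }
}
```

Because TokenMapper and SumReducer are nested in the driver class, setJarByClass(WordCountDriver.class) is enough for Hadoop to ship the JAR containing all three classes to the cluster.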

Table 2 Common interfaces of org.apache.hadoop.mapred.JobConf

  • setNumMapTasks(int n): A core interface that specifies the number of map tasks in a MapReduce job. You can also configure mapreduce.job.maps in mapred-site.xml.

NOTE:

The InputFormat class controls the number of map tasks. Ensure that the InputFormat class in use supports setting the number of map tasks on the client.

  • setNumReduceTasks(int n): A core interface that specifies the number of reduce tasks in a MapReduce job. Only one reduce task is started by default. You can also configure mapreduce.job.reduces in mapred-site.xml. The number of reduce tasks is controlled by the user; in most cases it is about one quarter of the number of map tasks.
  • setQueueName(String queueName): Specifies the queue to which a MapReduce job is submitted. The default queue is used if none is set. You can also configure mapreduce.job.queuename in mapred-site.xml.
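The JobConf interfaces above belong to the older org.apache.hadoop.mapred API. A minimal sketch of their use follows; the job name, task counts, queue name, and paths are illustrative assumptions, and the identity mapper and reducer are used since none are set explicitly.

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class JobConfExample {
    public static void main(String[] args) throws Exception {
        // The constructor taking a class also locates the job JAR by that class.
        JobConf conf = new JobConf(JobConfExample.class);
        conf.setJobName("tuned job");
        conf.setNumMapTasks(10);      // a hint; the InputFormat decides the actual split count
        conf.setNumReduceTasks(3);    // explicit reduce parallelism (default is 1)
        conf.setQueueName("default"); // YARN queue to submit the job to
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);       // submit and block until completion
    }
}
```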
