Updated on 2024-08-16 GMT+08:00

Introduction to MapReduce Application Development

Hadoop MapReduce is an easy-to-use software framework for parallel computing. Applications built on MapReduce can run on large clusters of thousands of servers and process terabyte-scale data sets in parallel in a fault-tolerant manner.

A MapReduce application, also called a job, splits an input data set into independent data blocks, which Map tasks process in parallel. The framework sorts the output of the Map tasks, sends it to the Reduce tasks, and returns the result to the client. Job input and output are stored in HDFS. The framework schedules and monitors tasks and re-executes any task that fails.
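The data flow described above can be illustrated with a small in-memory sketch. It is not the Hadoop API and does not touch HDFS; it only mimics the split, map, sort, and reduce stages on a word-count problem. All function names (split_input, map_task, reduce_task, run_job) and the sample input are hypothetical and chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter


def split_input(lines, num_splits):
    """Divide the input into independent blocks, one per map task."""
    return [lines[i::num_splits] for i in range(num_splits)]


def map_task(block):
    """Emit a (word, 1) pair for every word in one input block."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def reduce_task(key, values):
    """Combine all values that share one key."""
    return key, sum(values)


def run_job(lines, num_splits=2):
    """Run the split -> map -> sort -> reduce pipeline in memory."""
    blocks = split_input(lines, num_splits)
    # Map tasks are independent of each other, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=num_splits) as pool:
        map_outputs = list(pool.map(map_task, blocks))
    # Sort the map output by key so that equal keys become adjacent,
    # standing in for the sort the framework performs before the reduce stage.
    pairs = sorted(
        (pair for output in map_outputs for pair in output),
        key=itemgetter(0),
    )
    results = {}
    for key, group in groupby(pairs, key=itemgetter(0)):
        word, total = reduce_task(key, (count for _, count in group))
        results[word] = total
    return results


if __name__ == "__main__":
    sample = ["hello hadoop", "hello mapreduce", "hadoop mapreduce hadoop"]
    print(run_job(sample))
```

In a real cluster the framework, not the application, performs the splitting, sorting, scheduling, and retrying of failed tasks; the application typically supplies only the map and reduce logic.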

MapReduce has the following characteristics:

Large-scale parallel computing
Large data set processing
High fault tolerance and reliability
Proper resource scheduling
