What Is MRS?

Updated on 2024-10-11 GMT+08:00

Big data presents both exciting opportunities and a huge challenge. As data volumes and types increase rapidly, conventional data processing technologies, such as standalone storage systems and relational databases, are struggling to keep up. Rising to this challenge, the Apache Software Foundation (ASF) launched an open source project called Hadoop. Hadoop is an open source distributed computing platform that can fully utilize the computing and storage capabilities of large compute clusters to process massive amounts of data. Hadoop is a powerful framework, but it is not easy to deploy and operate. Enterprises that try to deploy Hadoop systems by themselves may encounter problems such as high costs, long rollout times, difficult maintenance, and inflexible use.

MapReduce Service (MRS) is a one-stop service that helps you quickly deploy and manage Hadoop systems on Huawei Cloud. With MRS, you can create an enterprise-class Hadoop cluster with just a few clicks. Tenants have total control over their Hadoop clusters and can run big data components such as Storm, Hadoop, Spark, HBase, and Kafka. MRS supports a full range of open source APIs. Leveraging Huawei Cloud's deep expertise in compute, storage, and big data, it offers customers a full-stack big data platform featuring high performance, high cost-effectiveness, flexibility, and ease of use. The platform can also be easily customized to meet new requirements, helping enterprises quickly build a massive data processing system and discover new value and business opportunities by analyzing and mining massive amounts of data in real time or non-real time.

Product Architecture

For the component versions supported by MRS, see List of MRS Component Versions.

Figure 1 shows the MRS logical architecture.

Figure 1 MRS architecture

MRS includes the infrastructure and an end-to-end big data processing pipeline.

  • Infrastructure
    MRS big data clusters fully utilize the high scalability, reliability, and security features of the virtualization layer powered by the cloud platform.
    • Virtual Private Cloud (VPC) provides virtual private networks for each tenant on the cloud.
    • Elastic Volume Service (EVS) provides reliable and high-performance storage.
    • Elastic Cloud Server (ECS) provides VMs that are easily scalable. It works with VPCs, security groups, and the EVS multi-replica mechanism to build an efficient, reliable, and secure computing environment.
  • Data collection

    The data collection layer efficiently ingests data from various data sources. It consists of Flume (data ingestion), Loader (relational data loading), and Kafka (highly reliable message queue). Alternatively, you can use the Cloud Data Migration (CDM) service to ingest external data into MRS clusters.

  • Data storage

    MRS clusters can store both structured and unstructured data. They support multiple efficient data formats to meet the requirements of different computing engines, including:

    • HDFS, which is a general-purpose distributed file system for big data platforms.
    • Huawei Cloud OBS, which is an object storage service that features high availability and low cost.
  • Converged data processing
    • MRS supports multiple mainstream compute engines, including MapReduce (batch processing), Tez (DAG model), Spark (in-memory computing), Spark Streaming (micro-batch stream computing), Storm (stream computing), and Flink (stream computing). They convert data structures and logic into data models that meet the needs of a variety of big data applications.
    • Based on preset data models and easy-to-use SQL data analysis, users can choose Hive (data warehouse), SparkSQL, or Presto (interactive query engine) to run different types of analytical tasks.
  • Data display and scheduling

    Data analysis results are displayed intuitively. MRS also integrates with DataArts Studio to provide a one-stop, collaborative big data development platform, helping you easily run a range of different tasks, such as data modeling, data integration, script development, job scheduling, and O&M monitoring, making big data more accessible than ever before.

  • Cluster management

    All components of the Hadoop-based big data ecosystem are deployed in distributed mode, and their deployment, management, and O&M are complex.

    MRS provides a unified O&M and management platform for cluster management, supporting one-click cluster deployment, multi-version selection, as well as manual scaling and auto scaling of clusters with zero service interruption. In addition, MRS provides job management, resource tag management, and O&M covering all of the Hadoop components. One-stop O&M capabilities include monitoring, alarm reporting, parameter configuration, and patch upgrade.
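
To make the batch model behind MapReduce (one of the compute engines listed under converged data processing) more concrete, the following is a minimal single-process sketch of its three phases: map, shuffle, and reduce. It is a toy illustration only. The function names are hypothetical, it does not call any MRS or Hadoop API, and a real job would distribute each phase across cluster nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence within one process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input; a real job would read from HDFS or OBS.
    sample = ["big data on MRS", "big clusters process big data"]
    print(word_count(sample))
```

In a distributed setting, the map and reduce functions stay this simple. The framework takes over input splitting, the shuffle across nodes, and retries, which is the part that is hard to operate by hand.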

Product Advantages

MRS is backed by a strong Hadoop kernel team and is built on top of Huawei's enterprise-class FusionInsight big data platform. MRS can guarantee multi-level Service Level Agreements (SLAs).

MRS has the following advantages:

  • High performance

    MRS supports Huawei's own CarbonData storage solution. CarbonData allows a single copy of data to be used for multiple tasks. It supports features such as multi-level indexing, dictionary encoding, pre-aggregation, dynamic partitioning, and quasi-real-time data query. These features improve I/O scanning and computing performance, allowing tens of billions of data records to be analyzed in seconds. In addition, MRS supports Superior Scheduler, also developed by Huawei, which outperforms open-source schedulers in every way and enables efficient scheduling in super large clusters (up to 10,000 nodes).

  • Cost-effectiveness

    MRS supports a heterogeneous compute and storage infrastructure with decoupled storage and compute, offering a cost-effective mass storage solution. MRS supports fast auto scaling to accommodate changing demand, maximizing resource utilization for customers. MRS clusters can be quickly created and scaled out as needed, and can be scaled in or deleted when you no longer need them.

  • High security

    MRS provides enterprise-class multi-tenant permissions management and security management, with support for table-based and column-based access control and data encryption.

  • Easy O&M

    MRS provides an efficient big data cluster management platform that supports one-click rolling patch updates, which ensure the continuity of your services.

  • High reliability

    Tested and proven in numerous projects, the long-term reliability and stability of MRS in large-scale deployments can meet enterprise-class standards for production systems. In addition, MRS supports automatic data backup across AZs and regions, as well as automatic anti-affinity, allowing mission-critical VMs to be distributed on different physical machines.
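
Dictionary encoding, one of the CarbonData features named under high performance, can be illustrated in general terms with the sketch below. It assumes a simple in-memory column of strings and uses hypothetical function names. It is not CarbonData's actual on-disk format or API, only the generic technique: each distinct value is stored once, and rows hold small integer codes that are cheaper to scan and compare.

```python
from typing import Dict, List, Tuple


def dictionary_encode(column: List[str]) -> Tuple[List[str], List[int]]:
    """Return (dictionary, codes), storing each distinct value only once."""
    dictionary: List[str] = []
    index: Dict[str, int] = {}
    codes: List[int] = []
    for value in column:
        if value not in index:
            # First time this value is seen: assign it the next code.
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def dictionary_decode(dictionary: List[str], codes: List[int]) -> List[str]:
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[code] for code in codes]


def filter_rows(dictionary: List[str], codes: List[int], wanted: str) -> List[int]:
    """Return positions of rows equal to `wanted`, comparing integer codes only."""
    if wanted not in dictionary:
        return []  # The value never occurs, so no row scan is needed.
    target = dictionary.index(wanted)
    return [row for row, code in enumerate(codes) if code == target]


if __name__ == "__main__":
    # Hypothetical low-cardinality column.
    column = ["north", "south", "north", "north", "south"]
    dictionary, codes = dictionary_encode(column)
    print(dictionary, codes)
    print(filter_rows(dictionary, codes, "north"))
```

The benefit grows with repetition: a column with few distinct values shrinks to a short dictionary plus compact codes, and a filter on a value absent from the dictionary can skip the scan entirely.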

Using MRS for the First Time

If you are a first-time user, you may get started with the following:

  • Basic concepts

    See Components and Functions to learn the basic information about MRS, including all its components and their enhancements over their open-source counterparts, as well as the unique features of MRS.

  • Getting started

    To learn how to use MRS, see MapReduce Service Getting Started. "Getting Started" provides detailed operation guides with real-world examples. You can create and use MRS clusters by following these guides.

  • Other functions and operation guides

    If you are an MRS cluster user or O&M engineer, you can perform operations such as cluster life cycle management, scaling, and job management by referring to MapReduce Service User Guide. To learn how to use each component, see MapReduce Service Component Operation Guide.

    If you are a developer, you can refer to the operation guides and examples in MapReduce Service Development Guide to develop, run, and commission your own applications. For details about how to call the MRS APIs, see MapReduce Service API Reference.
