Updated on 2024-09-23 GMT+08:00

MRS Cluster Types

MRS consists of multiple big data components, and you can select the cluster type that best fits your service requirements, data types, reliability expectations, and resource budget.

You can quickly buy a cluster using a preset cluster template, or buy one manually by selecting the component list and configuring advanced settings.

Table 1 MRS cluster types

Type

Scenario

Core components

Hadoop analysis cluster

A Hadoop analysis cluster uses components from the open source Hadoop ecosystem to analyze and query vast amounts of data. For example, YARN manages cluster resources, Hive and Spark provide offline storage and computing for large-scale distributed data, Spark Streaming and Flink provide streaming data computing, and Tez provides a distributed computing framework based on directed acyclic graphs (DAGs).

Hadoop, Hive, Spark, Tez, Flink, ZooKeeper, and Ranger

HBase query cluster

An HBase query cluster uses Hadoop and HBase to provide a column-oriented distributed cloud storage system with high reliability, high performance, and elastic scalability. It is suited to the storage and distributed computing of massive amounts of data. You can use HBase to build a storage system that holds TB- or even PB-level data, and to filter and analyze that data with millisecond-level response times, so you can extract value from data quickly.

Hadoop, HBase, ZooKeeper, and Ranger

Kafka streaming cluster

A Kafka streaming cluster uses Kafka and Storm to provide an open source messaging system with high throughput and scalability. It is widely used in scenarios such as log collection and monitoring data aggregation, where it supports efficient streaming data collection and real-time data processing and storage.

Kafka and Storm

ClickHouse cluster

ClickHouse is a column-oriented database management system for online analytical processing (OLAP). It features a high compression ratio and fast query performance, and is widely used in Internet advertising, app and web traffic analysis, telecom, finance, and IoT.

ClickHouse and ZooKeeper

Real-time analysis cluster

A real-time analysis cluster uses Hadoop, Kafka, Flink, and ClickHouse to provide a system for collecting, analyzing in real time, and querying data at scale.

Hadoop, Kafka, Flink, ClickHouse, ZooKeeper, and Ranger
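
To illustrate how Table 1 can guide the choice of cluster type, the following Python sketch transcribes the core components listed above and returns the cluster types that include every component you need. The names CLUSTER_TYPES and matching_cluster_types are hypothetical and are not part of any MRS API or SDK. The sketch covers only the preset types in Table 1, not custom component lists.

```python
# Core components per cluster type, transcribed from Table 1.
# This mapping is illustrative only; it is not read from MRS itself.
CLUSTER_TYPES = {
    "Hadoop analysis cluster": {
        "Hadoop", "Hive", "Spark", "Tez", "Flink", "ZooKeeper", "Ranger",
    },
    "HBase query cluster": {"Hadoop", "HBase", "ZooKeeper", "Ranger"},
    "Kafka streaming cluster": {"Kafka", "Storm"},
    "ClickHouse cluster": {"ClickHouse", "ZooKeeper"},
    "Real-time analysis cluster": {
        "Hadoop", "Kafka", "Flink", "ClickHouse", "ZooKeeper", "Ranger",
    },
}


def matching_cluster_types(required):
    """Return the cluster types whose core components include every required one."""
    needed = set(required)
    # Subset test: every required component must appear in the type's core set.
    return [
        name
        for name, components in CLUSTER_TYPES.items()
        if needed <= components
    ]


if __name__ == "__main__":
    # A workload that needs both Kafka and Flink.
    print(matching_cluster_types(["Kafka", "Flink"]))
```

With the table as given, a workload that needs both Kafka and Flink matches only the real-time analysis cluster, because the Hadoop analysis cluster lacks Kafka and the Kafka streaming cluster lacks Flink.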