Updated on 2023-07-14 GMT+08:00

Cluster Lifecycle Management

MRS supports cluster lifecycle management, including creating and terminating clusters.

  • Creating a cluster: After you specify a cluster type, components, number of nodes of each type, VM specifications, AZ, VPC, and authentication information, MRS automatically creates a cluster that meets the configuration requirements. You can run customized scripts in the cluster. In addition, you can create clusters of different types for multiple application scenarios, such as Hadoop analysis clusters, HBase clusters, and Kafka clusters. The big data platform supports heterogeneous cluster deployment. That is, VMs of different specifications can be combined in a cluster based on CPU types, disk capacities, disk types, and memory sizes. Various VM specifications can be mixed in a cluster.
  • Terminating a cluster: You can terminate a pay-per-use cluster that is no longer needed (including data and configurations in the cluster). MRS will delete all resources related to the cluster.
  • Renewal: MRS provides two billing modes: pay-per-use and yearly/monthly. In pay-per-use mode, fees are deducted every hour and insufficient balance can lead to overdue payments. In yearly/monthly mode, clusters need to be renewed before they expire. If your subscription for the pay-per-use or yearly/monthly cluster is not renewed, your services will keep running, but enter into a retention period, during which the MRS clusters will stop running but data is retained.
  • Unsubscription: If you have purchased a yearly/monthly cluster and do not need the cluster resources before the cluster resources expire, you can unsubscribe from the cluster resources on MRS.

Buying a Cluster

On the MRS management console, you can buy an MRS cluster on a pay-per-use or yearly/monthly basis. You can select a region and cloud resource specifications to buy an MRS cluster that is suitable for enterprise services with one click. MRS automatically installs and deploys the Huawei Cloud enterprise-level big data platform and optimizes parameters based on the selected cluster type, version, and node specifications.

MRS provides you with fully managed big data clusters. When creating a cluster, you can set a VM login mode (password or key pair). You can use all resources of the created MRS cluster. In addition, MRS allows you to deploy a big data cluster on only two ECSs with 4 vCPUs and 8 GB memory, providing more flexible choices for testing and development.

MRS clusters are classified into analysis, streaming, and hybrid clusters.

  • Analysis cluster: is used for offline data analysis and provides Hadoop components.
  • Streaming cluster: is used for streaming tasks and provides stream processing components.
  • Hybrid cluster: is used for not only offline data analysis but also streaming processing, and provides Hadoop components and stream processing components.
  • Custom: You can flexibly combine required components (MRS 3.x and later versions) based on service requirements.

MRS cluster nodes are classified into Master, Core, and Task nodes.

  • Master node: management node in a cluster. Master processes of a distributed system, Manager, and databases are deployed on Master nodes. Master nodes cannot be scaled out. The processing capability of Master nodes determines the upper limit of the management capability of the entire cluster. MRS supports scale-up of Master node specifications to provide support for management of a larger cluster.
  • Core node: used for both storage and computing and can be scaled in or out. Since Core nodes bear data storage, there are many restrictions on scale-in to prevent data loss and auto scaling cannot be performed.
  • Task node: used only for computing only and can be scaled in or out. Task nodes bear only computing tasks. Therefore, auto scaling can be performed.

You can buy a cluster in two modes: custom buying and quick buying.

  • Custom buying: On the Custom Config page, you can flexibly configure cluster parameters based on application scenarios, such as the billing mode and ECS specifications to better suit your service requirements.
  • Quick buying: On the Quick Config page, you can quickly buy a cluster based on application scenarios, improving cluster configuration efficiency. Currently, Hadoop analysis clusters, HBase clusters, Kafka clusters, ClickHouse clusters, and real-time analysis clusters are supported.
    • Hadoop analysis cluster: uses components in the open-source Hadoop ecosystem to analyze and query vast amounts of data. For example, use Yarn to manage cluster resources, Hive and Spark to provide offline storage and computing of large-scale distributed data, Spark Streaming and Flink to offer streaming data computing, and Presto to enable interactive queries, and Tez to provide a distributed computing framework of directed acyclic graphs (DAGs).
    • HBase cluster: uses Hadoop and HBase components to provide a column-oriented distributed cloud storage system featuring enhanced reliability, great performance, and elastic scalability. It applies to the storage and distributed computing of massive amounts of data. You can use HBase to build a storage system capable of storing TB- or even PB-level data. With HBase, you can filter and analyze data with ease and get responses in milliseconds, rapidly mining data value.
    • Kafka cluster: uses Kafka and Storm to provide an open source message system with high throughput and scalability. It is widely used in scenarios such as log collection and monitoring data aggregation to implement efficient streaming data collection and real-time data processing and storage.
    • ClickHouse cluster: ClickHouse is a columnar database management system used for online analysis. It features the optimal compression rate and fast query performance. It is widely used in Internet advertisement, app and web traffic analysis, telecom, finance, and IoT fields.
    • Real-time analysis clusters: uses Hadoop, Kafka, Flink, and ClickHouse components to provide a system for collection, real-time analysis, and query of data at scale.

Terminating a Cluster

MRS allows you to terminate a cluster when it is no longer needed. After the cluster is terminated, all cloud resources used by the cluster will be released. Before terminating a cluster, you are advised to migrate or back up data. Terminate the cluster only when no service is running in the cluster or the cluster is abnormal and cannot provide services based on O&M analysis. If data is stored on EVS disks or pass-through disks in a big data cluster, the data will be deleted after the cluster is terminated. Therefore, exercise caution when terminating a cluster.