Updated on 2022-12-08 GMT+08:00

Dynamic Resources

Overview

Yarn provides distributed resource management for a big data cluster. The total volume of resources allocated to Yarn is configurable, and Yarn allocates and schedules computing resources for task queues. The computing resources of MapReduce, Spark, Flink, and Hive task queues are allocated and scheduled by Yarn.

Yarn queues are basic units of computing resource allocation.

For tenants, the resources obtained using Yarn task queues are dynamic resources. Users can dynamically create and modify the quotas of task queues and view the status and statistics of task queues.

Resource Pool

Enterprise IT systems face complex cluster environments and upper-layer requirements. For example:

  • Heterogeneous cluster: The computing speed, storage capacity, and network performance of each node in the cluster are different. All the tasks of complex applications need to be properly allocated to each compute node in the cluster based on service requirements.
  • Computing isolation: Data must be shared among multiple departments but computing resources must be distributed onto different compute nodes.

To meet these requirements, compute nodes must be partitioned.

Resource pools are used to specify the configuration of dynamic resources. Yarn task queues are associated with resource pools for resource allocation and scheduling.

Only one default resource pool can be set for a tenant. Users can bind to the role of a tenant to use the resources in the resource pool of the tenant. If resources in multiple resource pools need to be used, users can bind themselves to multiple tenant roles.
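The relationship between users, tenant roles, and resource pools can be sketched as a simple lookup. This is a minimal illustration, not how FusionInsight Manager stores the bindings; the tenant names, pool names, and the `pools_for_user` helper are hypothetical.

```python
# Hypothetical mapping: each tenant has exactly one default resource pool.
TENANT_DEFAULT_POOL = {
    "tenant_sales": "pool_a",
    "tenant_ml": "pool_b",
}


def pools_for_user(tenant_roles):
    """Return the resource pools a user can reach through the bound tenant roles."""
    return {TENANT_DEFAULT_POOL[tenant] for tenant in tenant_roles}


# A user bound to one tenant role reaches one pool.
single = pools_for_user(["tenant_sales"])

# A user bound to two tenant roles reaches both pools.
both = pools_for_user(["tenant_sales", "tenant_ml"])
```

In this model, reaching a second pool requires binding a second tenant role, because a tenant cannot carry more than one default pool.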

Scheduling Mechanism

Yarn dynamic resources support label-based scheduling. This policy creates labels for the compute nodes (Yarn NodeManager nodes) of a Yarn cluster and adds compute nodes with the same label to the same resource pool. Yarn then dynamically associates task queues with resource pools based on the resource requirements of the task queues.
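The grouping step can be sketched as follows. This is an illustrative model only, not Yarn's implementation; the host names, the label values, and the `build_resource_pools` helper are assumptions made for the example.

```python
from collections import defaultdict


def build_resource_pools(node_labels):
    """Group NodeManager hosts into resource pools, one pool per label."""
    pools = defaultdict(list)
    for host, label in node_labels.items():
        pools[label].append(host)
    # Sort hosts so the result does not depend on input order.
    return {label: sorted(hosts) for label, hosts in pools.items()}


# Hypothetical host names and labels, for illustration only.
node_labels = {
    "nm-01": "Normal",
    "nm-02": "HighCPU",
    "nm-03": "HighCPU",
    "nm-04": "HighMEM",
}

pools = build_resource_pools(node_labels)
print(pools["HighCPU"])
```

Nodes that share a label end up in the same pool, so a queue associated with that pool sees a uniform set of hardware.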

For example, a cluster has more than 40 nodes. Labels Normal, HighCPU, HighMEM, and HighIO are created based on the hardware and network configurations of the nodes, and the labeled nodes are added to four resource pools. Table 1 describes the performance of each node in the resource pool.

Table 1 Performance of each node in a resource pool

Label   | Number of Nodes | Hardware and Network Configuration | Added To        | Association
--------|-----------------|------------------------------------|-----------------|-------------------------------
Normal  | 10              | Minor                              | Resource pool A | Common task queue
HighCPU | 10              | High-performance CPU               | Resource pool B | Computing-intensive task queue
HighMEM | 10              | Large memory                       | Resource pool C | Memory-intensive task queue
HighIO  | 10              | High-performance network           | Resource pool D | I/O-intensive task queue

Task queues can use only the compute nodes in their associated resource pools.

  • Common task queues are associated with resource pool A and use nodes with hardware and network configurations labeled with Normal.
  • Computing-intensive task queues are associated with resource pool B and use nodes with CPUs labeled with HighCPU.
  • Memory-intensive task queues are associated with resource pool C and use nodes with memory labeled with HighMEM.
  • I/O-intensive task queues are associated with resource pool D and use nodes with the network labeled with HighIO.

Yarn task queues are associated with specified resource pools to efficiently utilize resources in resource pools and ensure node performance.
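The restriction that a queue uses only the nodes of its associated pool can be modeled with the values from Table 1. The sketch below is a simplified illustration, not the scheduler's actual logic; the queue keys, host names, and the `eligible_nodes` helper are hypothetical.

```python
# Resource pool -> node label, following Table 1.
POOL_LABEL = {"A": "Normal", "B": "HighCPU", "C": "HighMEM", "D": "HighIO"}

# Task queue -> associated resource pool, following Table 1.
# The queue keys are shortened, hypothetical names.
QUEUE_POOL = {
    "common": "A",
    "computing-intensive": "B",
    "memory-intensive": "C",
    "io-intensive": "D",
}


def eligible_nodes(queue, node_labels):
    """Return the hosts a queue may use: only those in its associated pool."""
    label = POOL_LABEL[QUEUE_POOL[queue]]
    return sorted(host for host, lbl in node_labels.items() if lbl == label)


# Hypothetical hosts, for illustration only.
cluster = {
    "nm-01": "Normal",
    "nm-11": "HighCPU",
    "nm-21": "HighMEM",
    "nm-31": "HighIO",
    "nm-32": "HighIO",
}

io_nodes = eligible_nodes("io-intensive", cluster)
cpu_nodes = eligible_nodes("computing-intensive", cluster)
```

Here the I/O-intensive queue resolves to pool D and therefore only to hosts labeled HighIO, regardless of how much free capacity the other pools have.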

FusionInsight Manager supports a maximum of 50 resource pools. The system includes one default resource pool.

Introduction to Schedulers

Two schedulers are available: the open source Capacity scheduler and the Superior scheduler. By default, the Superior scheduler is enabled for the MRS cluster.

  • The Capacity scheduler is an open source capacity scheduler.
  • The Superior scheduler is an enhanced version named after Lake Superior, indicating that the scheduler can manage a large amount of data.

To meet enterprise requirements and tackle the scheduling challenges facing the Yarn community, the Superior scheduler not only integrates the advantages of the current Capacity scheduler and Fair scheduler, but also provides the following enhancements:

  • Enhanced resource sharing policy

    The Superior scheduler supports queue hierarchy. It integrates the functions of open source schedulers and shares resources based on configurable policies. For example, MRS cluster administrators can use the Superior scheduler to configure an absolute value or a percentage policy for queue resources. The resource sharing policy of the Superior scheduler enhances the label scheduling policy of Yarn as a resource pool feature. Nodes in a Yarn cluster can be grouped based on the capacity or service type to ensure that queues can utilize resources more efficiently.

  • Tenant-based resource reservation policy

    Tenants running critical tasks must be guaranteed the resources they require. The Superior scheduler provides a resource reservation mechanism. With this mechanism, reserved resources can be allocated to tasks run by tenant queues in a timely manner to ensure proper task execution.

  • Fair sharing among tenants and resource pool users

    The Superior scheduler allows shared resources to be configured for users in a queue. Each tenant may have users with different weights. Heavily weighted users may require more shared resources.

  • Ensured scheduling performance in a big cluster

    The Superior scheduler receives heartbeats from each NodeManager and saves resource information in memory, which enables the scheduler to control cluster resource usage globally. The Superior scheduler uses the push scheduling model, which makes the scheduling more precise and efficient and remarkably improves cluster resource utilization. Additionally, the Superior scheduler delivers excellent performance when the interval between NodeManager heartbeats is long and prevents heartbeat storms in big clusters.

  • Priority policy

    If the minimum resource requirement of a service cannot be met after the service obtains all available resources, a preemption occurs. The preemption function is disabled by default.
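The preemption condition in the priority policy can be expressed as a small check. This is a minimal sketch under stated assumptions, not the Superior scheduler's code: the `ServiceState` type, the `needs_preemption` function, and the use of vcores as the resource unit are all hypothetical. Only the condition itself and the disabled-by-default behavior come from the description above.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    min_required: int  # minimum resources the service needs (assumed unit: vcores)
    allocated: int     # resources the service already holds


def needs_preemption(state, free, preemption_enabled=False):
    """Return True if preemption would occur for this service.

    Preemption applies only when the service still falls short of its
    minimum after taking every free resource, and only if the feature
    has been enabled (it is disabled by default).
    """
    if not preemption_enabled:
        return False
    return state.allocated + free < state.min_required


service = ServiceState(min_required=16, allocated=4)

# 4 allocated + 8 free = 12, which is below the minimum of 16.
shortfall_default = needs_preemption(service, free=8)
shortfall_enabled = needs_preemption(service, free=8, preemption_enabled=True)

# 4 allocated + 12 free = 16, which meets the minimum.
satisfied = needs_preemption(service, free=12, preemption_enabled=True)
```

With the default setting the check never triggers, which matches the statement that the preemption function is disabled by default.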