Updated on 2023-10-10 GMT+08:00

Configuring Auto Scaling Metrics

Auto Scaling Policies by Node Group

When you add a rule, you can refer to Table 1 to configure the corresponding metrics.

Table 1 Auto scaling metrics

| Cluster Type | Metric | Value Type | Description | Value Range |
|---|---|---|---|---|
| Streaming cluster | StormSlotAvailable | Integer | Number of available Storm slots | 0 to 2147483646 |
| Streaming cluster | StormSlotAvailablePercentage | Percentage | Percentage of available Storm slots, that is, the proportion of available slots to total slots | 0 to 100 |
| Streaming cluster | StormSlotUsed | Integer | Number of used Storm slots | 0 to 2147483646 |
| Streaming cluster | StormSlotUsedPercentage | Percentage | Percentage of used Storm slots, that is, the proportion of used slots to total slots | 0 to 100 |
| Streaming cluster | StormSupervisorMemAverageUsage | Integer | Average memory usage of the Storm Supervisor process | 0 to 2147483646 |
| Streaming cluster | StormSupervisorMemAverageUsagePercentage | Percentage | Average percentage of the total system memory used by the Storm Supervisor process | 0 to 100 |
| Streaming cluster | StormSupervisorCPUAverageUsagePercentage | Percentage | Average percentage of the total CPUs used by the Storm Supervisor process | 0 to 6000 |
| Analysis cluster | YARNAppPending | Integer | Number of pending tasks on YARN | 0 to 2147483646 |
| Analysis cluster | YARNAppPendingRatio | Ratio | Ratio of pending tasks on YARN, that is, the ratio of pending tasks to running tasks on YARN | 0 to 2147483646 |
| Analysis cluster | YARNAppRunning | Integer | Number of running tasks on YARN | 0 to 2147483646 |
| Analysis cluster | YARNContainerAllocated | Integer | Number of containers allocated to YARN | 0 to 2147483646 |
| Analysis cluster | YARNContainerPending | Integer | Number of pending containers on YARN | 0 to 2147483646 |
| Analysis cluster | YARNContainerPendingRatio | Ratio | Ratio of pending containers on YARN, that is, the ratio of pending containers to running containers on YARN | 0 to 2147483646 |
| Analysis cluster | YARNCPUAllocated | Integer | Number of virtual CPUs (vCPUs) allocated to YARN | 0 to 2147483646 |
| Analysis cluster | YARNCPUAvailable | Integer | Number of available vCPUs on YARN | 0 to 2147483646 |
| Analysis cluster | YARNCPUAvailablePercentage | Percentage | Percentage of available vCPUs on YARN, that is, the proportion of available vCPUs to total vCPUs | 0 to 100 |
| Analysis cluster | YARNCPUPending | Integer | Number of pending vCPUs on YARN | 0 to 2147483646 |
| Analysis cluster | YARNMemoryAllocated | Integer | Memory allocated to YARN, in MB | 0 to 2147483646 |
| Analysis cluster | YARNMemoryAvailable | Integer | Available memory on YARN, in MB | 0 to 2147483646 |
| Analysis cluster | YARNMemoryAvailablePercentage | Percentage | Percentage of available memory on YARN, that is, the proportion of available memory to total memory on YARN | 0 to 100 |
| Analysis cluster | YARNMemoryPending | Integer | Pending memory on YARN | 0 to 2147483646 |

  • When the value type in Table 1 is percentage or ratio, the value can be accurate to two decimal places. A percentage metric value is written as a decimal number without the percent sign (%). For example, 16.80 represents 16.80%.
  • Hybrid clusters support all metrics of analysis and streaming clusters.
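
The value types and ranges in Table 1 can be checked before a rule is submitted. The following is a minimal sketch, not part of MRS: the `METRICS` lookup and the `validate_threshold` function are hypothetical names, only a few metrics are copied from Table 1 for illustration, and the sketch assumes that integer metrics accept whole numbers only.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical lookup: metric name -> (value type, upper bound), copied from Table 1.
METRICS = {
    "StormSlotAvailable": ("integer", 2147483646),
    "StormSupervisorCPUAverageUsagePercentage": ("percentage", 6000),
    "YARNAppPendingRatio": ("ratio", 2147483646),
    "YARNMemoryAvailablePercentage": ("percentage", 100),
}


def validate_threshold(metric: str, raw: str) -> Decimal:
    """Return the threshold as a Decimal, or raise ValueError if it breaks the rules above."""
    value_type, upper = METRICS[metric]
    try:
        # A trailing percent sign is rejected here: 16.80 is valid, "16.80%" is not.
        value = Decimal(raw)
    except InvalidOperation:
        raise ValueError(f"{raw!r} is not a decimal number") from None
    if not value.is_finite():
        raise ValueError(f"{raw!r} is not a finite number")
    if not 0 <= value <= upper:
        raise ValueError(f"{metric} must be between 0 and {upper}")
    if value_type == "integer":
        # Assumption: integer metrics take whole numbers only.
        if value != value.to_integral_value():
            raise ValueError(f"{metric} must be a whole number")
    elif value != value.quantize(Decimal("0.01")):
        # Percentage and ratio values are accurate to two decimal places.
        raise ValueError(f"{metric} allows at most two decimal places")
    return value
```

For example, `validate_threshold("YARNMemoryAvailablePercentage", "16.80")` is accepted, while `"16.805"`, `"101"`, and `"16.80%"` are rejected for the same metric.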

Auto Scaling Policies by Resource Pool

When adding a rule, you can refer to Table 2 to configure the corresponding metrics.

Auto scaling policies can be configured for a cluster by resource pool in MRS 3.1.5 or later.

Table 2 Rule configuration description

| Cluster Type | Metric | Value Type | Description | Value Range |
|---|---|---|---|---|
| Analysis/Custom cluster | ResourcePoolMemoryAvailable | Integer | Available memory on YARN in the resource pool, in MB | 0 to 2147483646 |
| Analysis/Custom cluster | ResourcePoolMemoryAvailablePercentage | Percentage | Percentage of available memory on YARN in the resource pool, that is, the proportion of available memory to total memory on YARN | 0 to 100 |
| Analysis/Custom cluster | ResourcePoolCPUAvailable | Integer | Number of available vCPUs on YARN in the resource pool | 0 to 2147483646 |
| Analysis/Custom cluster | ResourcePoolCPUAvailablePercentage | Percentage | Percentage of available vCPUs on YARN in the resource pool, that is, the proportion of available vCPUs to total vCPUs | 0 to 100 |
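
The percentage metrics in Table 2 are described as the proportion of an available resource to the total. As a rough illustration of that arithmetic only, the sketch below derives a percentage from an available amount and a total. The function name `available_percentage` is hypothetical, and rounding to two decimal places is an assumption; how MRS computes and rounds these values internally is not stated here.

```python
def available_percentage(available: int, total: int) -> float:
    """Proportion of available to total, expressed as a percentage (0 to 100)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= available <= total:
        raise ValueError("available must be between 0 and total")
    # Assumption: two decimal places, matching the precision of rule thresholds.
    return round(available / total * 100, 2)


# 4096 MB available out of 16384 MB in the resource pool
print(available_percentage(4096, 16384))  # 25.0
```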

When you add a resource plan, you can configure parameters by referring to Table 3.

Table 3 Resource plan configuration items

| Parameter | Description |
|---|---|
| Effective On | Days on which the resource plan takes effect. Daily is selected by default. You can also select one or more days from Monday to Sunday. |
| Time Range | The start time and end time of a resource plan are accurate to the minute and range from 00:00 to 23:59. For example, if a resource plan starts at 8:00 and ends at 10:00, set this parameter to 8:00-10:00. The end time must be at least 30 minutes later than the start time. |
| Node Range | The number of nodes in a resource plan ranges from 0 to 500. Within the time range of the resource plan, if the number of Task nodes is less than the specified minimum, auto scaling increases it to the minimum in a single operation. If the number of Task nodes is greater than the specified maximum, auto scaling reduces it to the maximum in a single operation. The minimum number of nodes must be less than or equal to the maximum number of nodes. |
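
The Time Range and Node Range constraints in Table 3 can be checked mechanically. The following is a minimal sketch under stated assumptions: `ResourcePlan` and `validate_plan` are hypothetical names, times are compared at minute precision, and the Effective On setting is not modeled.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class ResourcePlan:
    """Hypothetical container for the Time Range and Node Range of one plan."""
    start: time
    end: time
    min_nodes: int
    max_nodes: int


def _minutes(t: time) -> int:
    # Resource plan times are accurate to the minute, so seconds are ignored.
    return t.hour * 60 + t.minute


def validate_plan(plan: ResourcePlan) -> list:
    """Return a list of rule violations; an empty list means the plan is valid."""
    errors = []
    if _minutes(plan.end) - _minutes(plan.start) < 30:
        errors.append("end time must be at least 30 minutes later than start time")
    if not (0 <= plan.min_nodes <= 500 and 0 <= plan.max_nodes <= 500):
        errors.append("node counts must be between 0 and 500")
    if plan.min_nodes > plan.max_nodes:
        errors.append("minimum nodes must not exceed maximum nodes")
    return errors


print(validate_plan(ResourcePlan(time(8, 0), time(10, 0), 4, 5)))  # []
```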

  • When a resource plan is enabled, the Default Range value on the auto scaling page is enforced outside the time range specified in the resource plan. For example, if Default Range is set to 1-2 and a resource plan has Time Range set to 08:00-10:00 and Node Range set to 4-5, the number of Task nodes in the other periods of the day (0:00-8:00 and 10:00-23:59) is limited to the default node range (1 to 2). If the number of nodes is greater than 2, auto scale-in is triggered; if the number of nodes is less than 1, auto scale-out is triggered.
  • When no resource plan is enabled, the Default Range takes effect at all times. If the number of nodes is not within the default node range, the number of Task nodes is automatically increased or decreased to fall within the default node range.
  • Time ranges of resource plans cannot overlap. An overlap means that two resource plans are in effect at the same point in time. For example, if resource plan 1 takes effect from 08:00 to 10:00 and resource plan 2 takes effect from 09:00 to 11:00, the period from 09:00 to 10:00 overlaps.
  • The time range of a resource plan must fall within a single day. For example, to configure a resource plan from 23:00 to 01:00 on the next day, configure two resource plans whose time ranges are 23:00-00:00 and 00:00-01:00, respectively.
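
The interaction between resource plans and the Default Range described above can be sketched as follows. This is an illustration under assumptions, not the MRS implementation: the names `Plan`, `overlaps`, `effective_range`, and `target_nodes` are hypothetical, each time range is treated as half-open (start included, end excluded), the Effective On days are ignored, and a plan that ends at midnight (00:00) is not modeled.

```python
from datetime import time
from typing import NamedTuple, Sequence, Tuple


class Plan(NamedTuple):
    """Hypothetical resource plan: a time range within one day and a node range."""
    start: time
    end: time
    node_range: Tuple[int, int]  # (minimum, maximum)


def overlaps(a: Plan, b: Plan) -> bool:
    # Two plans overlap if both are in effect at some point in time.
    return a.start < b.end and b.start < a.end


def effective_range(plans: Sequence[Plan], default_range: Tuple[int, int],
                    now: time) -> Tuple[int, int]:
    # Inside a plan's time range the plan's node range applies;
    # at any other time the Default Range applies.
    for plan in plans:
        if plan.start <= now < plan.end:
            return plan.node_range
    return default_range


def target_nodes(current: int, node_range: Tuple[int, int]) -> int:
    # Scale out to the minimum or scale in to the maximum; otherwise no change.
    low, high = node_range
    return max(low, min(current, high))


plans = [Plan(time(8, 0), time(10, 0), (4, 5))]
default_range = (1, 2)
print(effective_range(plans, default_range, time(9, 0)))   # (4, 5)
print(effective_range(plans, default_range, time(12, 0)))  # (1, 2)
print(target_nodes(3, default_range))                      # 2
```

With the example from the text (Default Range 1-2, plan 08:00-10:00 with Node Range 4-5), a cluster with 3 Task nodes at 12:00 is scaled in to 2, and a cluster with 2 Task nodes at 09:00 is scaled out to 4.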

Automation Script

When you add an automation script, you can configure related parameters by referring to Table 4.

Table 4 Automation script configuration description

Parameter

Description

Name

Name of an automation script

The value can contain only numbers, letters, spaces, hyphens (-), and underscores (_) and must not start with a space.

The value can contain 1 to 64 characters.

NOTE:

A name must be unique in the same cluster. You can configure the same name for different clusters.

Script Path

Script path. The value can be an OBS file system path or a local VM path.

  • An OBS file system path must start with obs:// and end with .sh, for example, obs://mrs-samples/xxx.sh.
  • A local VM path must start with a slash (/) and end with .sh. For example, the path of the example script for installing Zeppelin is /opt/bootstrap/zepelin/zepelin_install.sh.

Execution Node

Type of node on which the automation script is executed.

NOTE:

  • If you select Master nodes, you can use the Active Master switch to choose whether the script runs only on the active Master node.
  • If the switch is enabled, the script runs only on the active Master node. If it is disabled, the script runs on all Master nodes. The switch is disabled by default.

Parameter

Automation script parameter. The following predefined variables can be passed to the script to obtain auto scaling information:

  • ${mrs_scale_node_num}: Number of nodes being scaled out or in. The value is always positive.
  • ${mrs_scale_type}: Scale-out/in type. The value can be scale_out or scale_in.
  • ${mrs_scale_node_hostnames}: Host names of the auto scaling nodes. Use commas (,) to separate multiple host names.
  • ${mrs_scale_node_ips}: IP addresses of the auto scaling nodes. Use commas (,) to separate multiple IP addresses.
  • ${mrs_scale_rule_name}: Name of the triggered auto scaling rule. For a resource plan, this parameter is set to resource_plan.

Executed

When the automation script is executed. Four options are supported: Before scale-out, After scale-out, Before scale-in, and After scale-in.

NOTE:

Assume that the execution nodes include Task nodes.

  • The automation script executed before scale-out cannot run on the Task nodes to be added.
  • The automation script executed after scale-out can run on the added Task nodes.
  • The automation script executed before scale-in can run on Task nodes to be deleted.
  • The automation script executed after scale-in cannot run on the deleted Task nodes.

Action upon Failure

Whether to continue executing subsequent scripts and the scale-out/in operation after the script fails.

NOTE:

  • You are advised to set this parameter to Continue during the commissioning phase so that the cluster continues the scale-out/in operation regardless of whether the script executes successfully.
  • If the script fails, view the logs in /var/log/Bootstrap on the cluster VM.
  • A scale-in operation cannot be rolled back. Therefore, for scripts executed after scale-in, Action upon Failure can only be set to Continue.
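
The Name and Script Path rules in Table 4 lend themselves to a quick pre-check before a script is registered. The sketch below is illustrative only: `is_valid_name` and `is_valid_script_path` are hypothetical helpers, "letters" is assumed to mean ASCII letters, and the path check covers only the required prefix and suffix, not whether the file exists.

```python
import re

# 1 to 64 characters: digits, letters, spaces, hyphens, underscores; no leading space.
# Assumption: "letters" means ASCII letters.
_NAME_RE = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9 _-]{0,63}")


def is_valid_name(name: str) -> bool:
    return _NAME_RE.fullmatch(name) is not None


def is_valid_script_path(path: str) -> bool:
    # An OBS path starts with obs://, a local VM path starts with /; both end with .sh.
    return path.endswith(".sh") and path.startswith(("obs://", "/"))


print(is_valid_name("install zeppelin_v1"))               # True
print(is_valid_name(" leading-space"))                    # False
print(is_valid_script_path("obs://mrs-samples/xxx.sh"))   # True
print(is_valid_script_path("opt/bootstrap/install.sh"))   # False
```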

The automation script is triggered only during auto scaling. It is not triggered when cluster nodes are manually scaled out or in.
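
The predefined variables listed for the Parameter field can be interpreted by a small helper once their values reach the script. The following sketch rests on several assumptions that are not stated in this document: the Parameter field is set to `${mrs_scale_type} ${mrs_scale_node_num} ${mrs_scale_node_hostnames} ${mrs_scale_node_ips} ${mrs_scale_rule_name}`, the substituted values arrive as five positional arguments, and the .sh script forwards them to a Python helper. The names `ScaleEvent` and `parse_scale_event` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ScaleEvent:
    """Hypothetical view of the values carried by the predefined variables."""
    scale_type: str        # scale_out or scale_in
    node_num: int          # always positive
    hostnames: List[str]   # split from a comma-separated list
    ips: List[str]         # split from a comma-separated list
    rule_name: str         # resource_plan when triggered by a resource plan

    @property
    def from_resource_plan(self) -> bool:
        return self.rule_name == "resource_plan"


def parse_scale_event(args: Sequence[str]) -> ScaleEvent:
    # Assumed argument order: type, node count, hostnames, IP addresses, rule name.
    if len(args) != 5:
        raise ValueError("expected five arguments")
    scale_type, node_num, hostnames, ips, rule_name = args
    if scale_type not in ("scale_out", "scale_in"):
        raise ValueError(f"unexpected scale type: {scale_type!r}")
    count = int(node_num)
    if count <= 0:
        raise ValueError("node count must be positive")
    return ScaleEvent(scale_type, count, hostnames.split(","), ips.split(","), rule_name)


event = parse_scale_event(
    ["scale_out", "2", "node-1,node-2", "192.168.0.11,192.168.0.12", "resource_plan"]
)
print(event.hostnames)           # ['node-1', 'node-2']
print(event.from_resource_plan)  # True
```

In a real script the argument list would typically come from `sys.argv[1:]`; the host names and IP addresses above are placeholders.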