
Using Auto Scaling in a Cluster

Updated at: Apr 28, 2020 GMT+08:00

In big data application scenarios, especially real-time data analysis and processing, the number of cluster nodes needs to be dynamically increased or decreased according to data volume changes to add or reduce resources. The auto scaling function of MRS enables clusters to be elastically scaled out or in based on cluster loads. In addition, if the data volume changes regularly every day and you want to scale out or in a cluster before the data volume changes, you can use the MRS resource plan feature (setting a Task node range based on the time range).

  • Auto scaling rules: You can increase or decrease Task nodes based on real-time cluster loads. Auto scaling is triggered when the data volume changes, but there may be some delay.
  • Resource plans (setting a Task node range based on the time range): If the data volume changes periodically, you can create resource plans to resize the cluster before the data volume changes, thereby avoiding a delay in increasing or decreasing resources.

You can configure auto scaling rules, resource plans, or both to trigger auto scaling. Configuring both resource plans and auto scaling rules improves cluster scalability to cope with occasional, unexpected data volume peaks.

In some service scenarios, resources must be reallocated or service logic modified after a cluster is scaled out or in. If you scale a cluster manually, you can log in to cluster nodes to reallocate resources or modify service logic yourself. With auto scaling, MRS lets you customize automation scripts that perform these tasks for you. Automation scripts can be executed before and after auto scaling and adapt automatically to service load changes, eliminating manual operations. Because the scripts are fully customizable and can run at various moments, they can meet personalized requirements and make auto scaling more flexible.

You can configure auto scaling rules when creating a cluster or after a cluster has been created. This section describes how to configure auto scaling rules after a cluster has been created. For details about how to configure auto scaling rules during cluster creation, see Configuring Auto Scaling Rules When Creating a Cluster.

Background

You can configure auto scaling rules, resource plans, or both to trigger auto scaling.

  • Auto scaling rules:
    • A user can set a maximum of five scale-out or scale-in rules.
    • The system evaluates the rules in the configured order, and cluster scale-out rules take priority over cluster scale-in rules. Order the rules by importance, with the most important rule first, to prevent rules from being triggered repeatedly due to an unexpected scale-out or scale-in result.
    • Comparison factors are Greater than, Greater than or equal to, Less than, and Less than or equal to.
    • Cluster scale-out or scale-in is triggered only after the configured metric threshold has been reached for 5n consecutive minutes (n is 1 by default).
    • After each cluster scale-out or scale-in, there is a cooldown period. The default cooldown period is 20 minutes and the minimum cooldown period is 0 minutes.
    • In each cluster scale-out or scale-in operation, at least 1 and at most 100 nodes can be added or removed.
  • Resource plans (setting a Task node range based on the time range):
    • You can specify a Task node range (minimum number to maximum number) in a time range. If the number of Task nodes is beyond the Task node range in a resource plan, the system triggers auto scaling.
    • You can set a maximum of five resource plans for a cluster.
    • Resource plans repeat daily. The start time and end time can be set to any time point between 00:00 and 23:59, and the start time must be at least 30 minutes earlier than the end time. Time ranges configured for different resource plans cannot overlap.
    • After a resource plan triggers cluster scale-out or scale-in, there is a 10-minute cooldown period. Auto scaling will not be triggered again within the cooldown period.
    • When a resource plan is enabled, the number of Task nodes is limited to the default node range at any time except the time range set in the resource plan.
    • If no resource plan is enabled, the number of Task nodes is limited to the default node range at all times.
  • Automation scripts:
    • You can set an automation script so that it can automatically run on cluster nodes when auto scaling is triggered.
    • You can set a maximum of 10 automation scripts for a cluster.
    • You can specify an automation script to be executed on one or more types of nodes.
    • Automation scripts can be executed before or after scale-out or scale-in.
    • Before using automation scripts, upload them to a cluster VM or OBS bucket in the same region as the cluster. The automation scripts uploaded to the cluster VM can be executed only on the existing nodes. If you want to make the automation scripts run on the new nodes, upload them to the OBS bucket.
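
The predefined variables described later in Table 3 (such as ${mrs_scale_type} and ${mrs_scale_node_hostnames}) can be passed to an automation script through its Parameter setting. The following is a minimal sketch of such a script; the function wrapper, message text, and sample host names are illustrative assumptions, not MRS-defined behavior:

```shell
#!/bin/bash
# Hypothetical automation script body, wrapped in a function so it can be
# exercised with sample values. In a real script the values would arrive as
# positional arguments ($1, $2) if Parameter is set to:
#   ${mrs_scale_type} ${mrs_scale_node_hostnames}
handle_scaling() {
  local scale_type="$1"   # scale_out or scale_in
  local hostnames="$2"    # comma-separated host names
  if [ "$scale_type" = "scale_out" ]; then
    echo "scale_out: preparing new nodes: $hostnames"
    # e.g., re-register workers or refresh service configuration here
  else
    echo "scale_in: draining nodes: $hostnames"
    # e.g., drain tasks or flush local state here
  fi
  # Split the comma-separated list and handle each affected host.
  local IFS=','
  local h
  for h in $hostnames; do
    echo "handling $h"
  done
}

handle_scaling scale_out node-0001,node-0002
```

Remember that a script intended to run on newly added nodes must be uploaded to an OBS bucket, as noted above.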

Using Auto Scaling Rules Alone

You can configure auto scaling rules to adjust the number of Task nodes based on data volume changes to increase or decrease resources.

  1. Log in to the MRS management console.
  2. Choose Clusters > Active Clusters, select a running cluster, and click its name to switch to the cluster details page.
  3. On the Nodes tab page, click Auto Scaling in the Operation column of the Task node group. The Auto Scaling page is displayed.
  4. Configure an auto scaling rule.

    You can configure the auto scaling rule to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.

    Figure 1 Auto scaling
    • Auto Scaling: indicates whether to enable auto scaling. Auto scaling is disabled by default. After you enable it, you can configure the following parameters.
    • Node Range
      • Default Range: Enter the minimum and maximum number of nodes. The value range is 0 to 500, and it applies to all scale-out and scale-in rules.
      • Configure Node Range in a Specified Time Range: This parameter is used to configure an auto scaling resource plan. Leave it unconfigured in this scenario.
    • Auto Scaling Rule: To enable Auto Scaling, configure scale-out or scale-in rules.

      Configuration procedure:

      1. Select Scale Out or Scale In.
      2. Click Add Rule. The Add Rule page is displayed.
        Figure 2 Adding a rule
      3. Configure the following parameters: Rule Name, If, Last, Add, and Cooldown Period.
      4. Click OK.

        You can view the rules you configured in the Scale Out or Scale In area on the Auto Scaling page.

  5. Select I agree to authorize MRS to scale out or in nodes based on the above rule.
  6. Click OK.
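
The trigger semantics described in Background, where a metric must breach its threshold for 5n consecutive minutes and no action is taken during a cooldown period, can be sketched as follows. The function name, the one-sample-per-minute model, and the sample values are illustrative assumptions, not an MRS API:

```shell
#!/bin/bash
# Illustrative sketch of the rule-trigger semantics: a scale-out rule
# fires only when every one-per-minute metric sample in the evaluation
# window breaches the threshold, and never during a cooldown period.

should_scale_out() {
  local threshold=$1 window=$2 cooldown_left=$3
  shift 3
  local samples=("$@")
  local total=${#samples[@]}
  local i

  # Still cooling down after the previous scaling action: do nothing.
  if (( cooldown_left > 0 )); then
    echo no; return
  fi
  # Not enough consecutive samples yet to fill the window.
  if (( total < window )); then
    echo no; return
  fi
  # Every sample in the trailing window must exceed the threshold.
  for (( i = total - window; i < total; i++ )); do
    if (( samples[i] <= threshold )); then
      echo no; return
    fi
  done
  echo yes
}

# YARNAppPending > 10 sustained for the last 5 minutes, no cooldown left:
should_scale_out 10 5 0 8 12 13 15 14 16
```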

Using Resource Plans Alone

If the data volume changes regularly every day and you want to scale out or in a cluster before the data volume changes, you can create resource plans to adjust the number of Task nodes as planned in the specified time range.

For example, the service data volume for real-time processing peaks between 7:00 and 13:00 every day and is stable and low at other times. Assume that an MRS streaming cluster is used to process the service data. Between 7:00 and 13:00, five Task nodes are required to process the peak data volume, and only two Task nodes are required at other times. You can perform the following steps to configure a resource plan.

  1. Log in to the MRS management console.
  2. Choose Clusters > Active Clusters, select a running cluster, and click its name to switch to the cluster details page.
  3. On the Nodes tab page, click Auto Scaling in the Operation column of the Task node group. The Auto Scaling page is displayed.
  4. Configure a resource plan.

    You can configure the resource plan to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.

    Configuration procedure:

    1. On the Auto Scaling page, enable Auto Scaling.
      Figure 3 Auto scaling page
    2. Set Default Range to 2-2 for Node Range. This indicates that the number of Task nodes is fixed to 2 outside the time range specified in the resource plan.
    3. Click Configure Node Range in a Specified Time Range under Default Range.
    4. Configure Time Range to 07:00-13:00, and Node Range to 5-5. This indicates that the number of Task nodes is fixed to 5 in the specified time range. For details about the parameters, see Table 2.

      You can click Configure Node Range in a Specified Time Range to configure multiple resource plans.

  5. (Optional) Configure an automation script.

    1. In Automation Script under Advanced Settings, click Add. The Automation Script page is displayed.
      Figure 4 Automation script
    2. Set the following parameters: Name, Script Path, Execution Node Type, Parameter, Execution Time, and Action upon Failure. For details about the parameters, see Table 3.
    3. Click OK to save the automation script configurations.

  6. Select I agree to authorize MRS to scale out or in nodes based on the above policy.
  7. Click OK.
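
The way a resource plan and the Default Range divide a day can be sketched as follows. The function and the minutes-since-midnight encoding are illustrative assumptions, not an MRS API:

```shell
#!/bin/bash
# Illustrative sketch of which node range is in effect at a given time:
# inside a resource plan's time range the plan's range applies; at all
# other times the Default Range applies. Times are minutes since midnight.

effective_range() {
  local now=$1                      # current time, e.g. 450 for 07:30
  local plan_start=$2 plan_end=$3   # plan window, minutes since midnight
  local plan_range=$4 default_range=$5
  if (( now >= plan_start && now < plan_end )); then
    echo "$plan_range"
  else
    echo "$default_range"
  fi
}

# A plan fixes 5-5 nodes between 07:00 (420) and 13:00 (780); the
# Default Range is 2-2, matching the example above.
effective_range 450 420 780 "5-5" "2-2"   # 07:30 falls inside the plan
effective_range 840 420 780 "5-5" "2-2"   # 14:00 falls back to the default
```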

Using Auto Scaling Rules and Resource Plans Together

If the data volume fluctuates and unexpected peaks may occur, a fixed Task node range cannot guarantee that requirements are met in some service scenarios. In this case, the number of Task nodes must be adjusted based on both real-time loads and resource plans.

For example, the service data volume for real-time processing peaks between 7:00 and 13:00 every day, but the exact volume within that window is unstable.

Assume that five to eight Task nodes are needed from 7:00 to 13:00 and two to four Task nodes are required at other times. You can therefore set an auto scaling rule in addition to a resource plan: when the data volume exceeds the expected value, the number of Task nodes is adjusted based on the loads, without exceeding the node range specified in the resource plan. When a resource plan is triggered, the number of nodes is adjusted within the specified node range with the minimum amount of change: nodes are increased only to the minimum of the range, and decreased only to the maximum of the range. Perform the following steps to configure both the auto scaling rule and the resource plan:
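
This minimum-adjustment behavior can be sketched as a simple clamp. The function and sample values are illustrative assumptions, not an MRS API:

```shell
#!/bin/bash
# Illustrative sketch of the minimum-adjustment behavior when a resource
# plan takes effect: raise the node count to the range minimum, lower it
# to the range maximum, and leave it unchanged if already inside the range.

adjust_nodes() {
  local current=$1 min=$2 max=$3
  if (( current < min )); then
    echo "$min"
  elif (( current > max )); then
    echo "$max"
  else
    echo "$current"
  fi
}

adjust_nodes 3 5 8    # below the 5-8 range: scale out to the minimum
adjust_nodes 6 5 8    # inside the range: unchanged
adjust_nodes 10 5 8   # above the range: scale in to the maximum
```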

  1. Log in to the MRS management console.
  2. Choose Clusters > Active Clusters, select a running cluster, and click its name to switch to the cluster details page.
  3. On the Nodes tab page, click Auto Scaling in the Operation column of the Task node group. The Auto Scaling page is displayed.
  4. Configure the auto scaling rule.

    You can configure the auto scaling rule to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.

    Figure 5 Auto scaling
    • Auto Scaling: indicates whether to enable auto scaling. Auto scaling is disabled by default. After you enable it, you can configure the following parameters.
    • Default Range of Node Range: Enter the node range of Task nodes. This constraint applies to all cluster scale-out and scale-in rules. Set this parameter to 2 to 4.
    • Auto Scaling Rule: To enable Auto Scaling, configure the scale-out or scale-in rules.

      Configuration procedure:

      1. Select Scale Out or Scale In.
      2. Click Add Rule. The Add Rule page is displayed.
        Figure 6 Adding a rule
      3. Configure the following parameters: Rule Name, If, Last, Add, and Cooldown Period.
      4. Click OK.

        You can view the rules you configured in the Scale Out or Scale In area on the Auto Scaling page.

  5. Configure a resource plan.

    You can configure the resource plan to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.

    Configuration procedure:

    1. Click Configure Node Range in a Specified Time Range under Default Range on the Auto Scaling page.
      Figure 7 Auto scaling page
    2. Configure Time Range to 07:00-13:00 and Node Range to 5-8. For details about the parameters, see Table 2.
    3. You can click Configure Node Range in a Specified Time Range to configure multiple resource plans.

  6. Configure automation scripts.

    1. In Automation Script under Advanced Settings, click Add. The Automation Script page is displayed.
      Figure 8 Automation script
    2. Set the following parameters: Name, Script Path, Execution Node Type, Parameter, Execution Time, and Action upon Failure. For details about the parameters, see Table 3.
    3. Click OK to save the automation script configurations.

  7. Select I agree to authorize MRS to scale out or in nodes based on the above policy.
  8. Click OK.

Related Information

When adding rules, you can refer to Table 1 to configure auto scaling metrics.

Table 1 Auto scaling metrics

  • Streaming cluster
    • StormSlotAvailable (Integer): Number of available Storm slots. Value range: 0 to 2147483646.
    • StormSlotAvailablePercentage (Percentage): Percentage of available Storm slots, that is, the proportion of available slots to total slots. Value range: 0 to 100.
    • StormSlotUsed (Integer): Number of used Storm slots. Value range: 0 to 2147483646.
    • StormSlotUsedPercentage (Percentage): Percentage of used Storm slots, that is, the proportion of used slots to total slots. Value range: 0 to 100.
    • StormSupervisorMemAverageUsage (Integer): Average memory usage of the Storm Supervisor process. Value range: 0 to 2147483646.
    • StormSupervisorMemAverageUsagePercentage (Percentage): Average percentage of memory used by the Storm Supervisor process relative to the total system memory. Value range: 0 to 100.
    • StormSupervisorCPUAverageUsagePercentage (Percentage): Average percentage of CPUs used by the Storm Supervisor process relative to the total CPUs. Value range: 0 to 6000.
  • Analysis cluster
    • YARNAppPending (Integer): Number of pending tasks on YARN. Value range: 0 to 2147483646.
    • YARNAppPendingRatio (Ratio): Ratio of pending tasks to running tasks on YARN. Value range: 0 to 2147483646.
    • YARNAppRunning (Integer): Number of running tasks on YARN. Value range: 0 to 2147483646.
    • YARNContainerAllocated (Integer): Number of containers allocated to YARN. Value range: 0 to 2147483646.
    • YARNContainerPending (Integer): Number of pending containers on YARN. Value range: 0 to 2147483646.
    • YARNContainerPendingRatio (Ratio): Ratio of pending containers to running containers on YARN. Value range: 0 to 2147483646.
    • YARNCPUAllocated (Integer): Number of virtual CPUs (vCPUs) allocated to YARN. Value range: 0 to 2147483646.
    • YARNCPUAvailable (Integer): Number of available vCPUs on YARN. Value range: 0 to 2147483646.
    • YARNCPUAvailablePercentage (Percentage): Percentage of available vCPUs on YARN, that is, the proportion of available vCPUs to total vCPUs. Value range: 0 to 100.
    • YARNCPUPending (Integer): Number of pending vCPUs on YARN. Value range: 0 to 2147483646.
    • YARNMemoryAllocated (Integer): Memory allocated to YARN, in MB. Value range: 0 to 214748364.
    • YARNMemoryAvailable (Integer): Available memory on YARN, in MB. Value range: 0 to 2147483646.
    • YARNMemoryAvailablePercentage (Percentage): Percentage of available memory on YARN, that is, the proportion of available memory to total memory on YARN. Value range: 0 to 100.
    • YARNMemoryPending (Integer): Pending memory on YARN. Value range: 0 to 2147483646.

  • For metrics of the percentage or ratio type in Table 1, the value can be accurate to two decimal places. A percentage metric value is a decimal with the percent sign (%) removed; for example, 16.80 represents 16.80%.
  • Hybrid clusters support all metrics of analysis and streaming clusters.
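
Because percentage metrics arrive as plain decimals, a threshold comparison needs floating-point arithmetic. The function below is an illustrative sketch, not an MRS API; awk is used because bash arithmetic is integer-only:

```shell
#!/bin/bash
# Percentage metrics are reported as plain decimals (16.80 means 16.80%),
# so comparing them against a threshold needs floating-point arithmetic.

breaches_below() {
  # Prints yes if metric < threshold, a typical scale-out condition for a
  # metric such as YARNMemoryAvailablePercentage.
  awk -v m="$1" -v t="$2" 'BEGIN { if (m < t) print "yes"; else print "no" }'
}

breaches_below 16.80 20   # 16.80% available memory is below a 20% threshold
breaches_below 25.50 20   # 25.50% is not
```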

When adding a resource plan, you can refer to Table 2 to set the corresponding parameters.

Table 2 Configuration items of a resource plan

  • Time Range: Start time and end time of a resource plan, accurate to minutes. The value ranges from 00:00 to 23:59. For example, if a resource plan starts at 8:00 and ends at 10:00 in the morning, set this parameter to 08:00-10:00. The end time must be at least 30 minutes later than the start time.
  • Node Range: The minimum and maximum number of nodes in a resource plan. The value ranges from 0 to 500, and the minimum must be less than or equal to the maximum. In the time range specified in the resource plan, if the number of Task nodes is less than the specified minimum, auto scaling increases the number of Task nodes to the minimum of the range in one operation; if the number of Task nodes is greater than the specified maximum, auto scaling decreases the number of Task nodes to the maximum of the range in one operation.

  • When a resource plan is enabled, the Default Range parameter on the Auto Scaling page takes effect outside the time range specified in the resource plan. For example, suppose Default Range is set to 1-2 and a resource plan sets the node range to 4-5 between 08:00 and 10:00. Outside that window (00:00-08:00 and 10:00-23:59), the number of Task nodes is forcibly limited to the default range (1 to 2): if the number of nodes is greater than 2, automatic scale-in is triggered; if it is less than 1, automatic scale-out is triggered.
  • When a resource plan is not enabled, the default node range takes effect in all time ranges. If the number of nodes is not within the default node range, the number of Task nodes is automatically increased or decreased to the default node range.
  • Time ranges of different resource plans cannot overlap. An overlap would mean two resource plans are in effect at the same time point. For example, if resource plan 1 takes effect from 08:00 to 10:00 and resource plan 2 from 09:00 to 11:00, the two overlap between 09:00 and 10:00.
  • A resource plan cannot span days. For example, to cover the period from 23:00 on one day to 01:00 on the next day, configure two resource plans whose time ranges are 23:00-00:00 and 00:00-01:00, respectively.
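
The no-overlap constraint can be checked with simple interval arithmetic. This is an illustrative sketch, not an MRS validation routine; times are encoded as minutes since midnight:

```shell
#!/bin/bash
# Illustrative check that two resource-plan time ranges do not overlap.
# Two half-open ranges [s1, e1) and [s2, e2) overlap exactly when each
# starts before the other ends.

ranges_overlap() {
  local s1=$1 e1=$2 s2=$3 e2=$4
  if (( s1 < e2 && s2 < e1 )); then
    echo yes
  else
    echo no
  fi
}

ranges_overlap 480 600 540 660   # 08:00-10:00 vs 09:00-11:00: overlap
ranges_overlap 480 600 600 720   # 08:00-10:00 vs 10:00-12:00: no overlap
```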
When adding an automation script, you can set related parameters by referring to Table 3.

Table 3 Configuration items of an automation script

  • Name: Automation script name. The value can contain 1 to 64 characters, including only digits, letters, spaces, hyphens (-), and underscores (_), and cannot start with a space. A name must be unique within a cluster; the same name can be used in different clusters.
  • Script Path: Path of the script. The value can be an OBS bucket path or a local VM path.
    • An OBS bucket path must start with s3a:// and end with .sh.
    • A local VM path must start with a slash (/) and end with .sh.
  • Execution Node Type: Type of the node on which the script is executed. If you select Master nodes, you can use a switch to choose whether to run the script only on the active Master node. If the switch is enabled, the script runs only on the active Master node; if it is disabled, the script runs on all Master nodes. This switch is disabled by default.
  • Parameter: Automation script parameters. The following predefined variables can be used to obtain auto scaling information:
    • ${mrs_scale_node_num}: Number of nodes added or removed. The value is always positive.
    • ${mrs_scale_type}: Scaling type. The value can be scale_out or scale_in.
    • ${mrs_scale_node_hostnames}: Host names of the added or removed nodes, separated by commas (,).
    • ${mrs_scale_node_ips}: IP addresses of the added or removed nodes, separated by commas (,).
    • ${mrs_scale_rule_name}: Name of the auto scaling rule that was triggered. For a resource plan, this value is resource_plan.
  • Execution Time: Time at which the automation script is executed. Four options are supported: before scale-out, after scale-out, before scale-in, and after scale-in. Assuming that Task nodes are among the execution nodes:
    • A script executed before scale-out cannot run on the newly added Task nodes.
    • A script executed after scale-out can run on the newly added Task nodes.
    • A script executed before scale-in can run on the Task nodes to be deleted.
    • A script executed after scale-in cannot run on the deleted Task nodes.
  • Action upon Failure: Whether to continue executing subsequent scripts and the scale-out/in operation after the script fails to be executed.
    • You are advised to set this parameter to Continue in the commissioning phase so that the cluster continues the scale-out/in operation regardless of whether the script is executed successfully.
    • If the script fails to be executed, view the log in /var/log/Bootstrap on the cluster VM.
    • A scale-in operation cannot be undone, so for scale-in this parameter can only be set to Continue.

Automation scripts are triggered only during auto scaling. They are not triggered when cluster nodes are adjusted manually.
