Updated on 2022-06-09 GMT+08:00

Configuring Auto Scaling Rules When Creating a Cluster

In big data application scenarios, especially real-time data analysis and processing, the number of cluster nodes needs to be dynamically increased or decreased as the data volume changes. The auto scaling function of MRS automatically scales clusters out or in based on cluster loads. In addition, if the data volume follows a daily cycle and you want to scale a cluster out or in before the data volume changes, you can use the MRS resource plan feature (setting the Task node quantity based on the time range).

  • Auto scaling rules: You can increase or decrease Task nodes based on real-time cluster loads. Auto scaling is triggered when the data volume changes, but there may be some delay.
  • Resource plan (setting the Task node quantity based on the time range): If the data volume changes periodically, you can create resource plans to resize the cluster before the data volume changes, thereby avoiding delays in increasing or decreasing resources.

You can configure auto scaling rules, resource plans, or both to trigger auto scaling. Configuring both improves cluster node scalability so that the cluster can cope with occasional unexpected data volume peaks.

In some service scenarios, resources need to be reallocated or service logic needs to be modified after cluster scale-out or scale-in. If you manually scale out or scale in a cluster, you can log in to cluster nodes to reallocate resources or modify service logic. If you use auto scaling, MRS enables you to customize automation scripts for resource reallocation and service logic modification. Automation scripts can be executed before and after auto scaling and automatically adapt to service load changes, all of which eliminates manual operations. In addition, automation scripts can be fully customized and executed at various moments, which can meet your personalized requirements and improve auto scaling flexibility.

You can configure auto scaling rules when creating a cluster or after a cluster has been created. This section describes how to configure auto scaling rules during cluster creation. For details about how to configure auto scaling rules after cluster creation, see Configuring an Auto Scaling Rule.

Background

You can configure either auto scaling rules or resource plans or both of them to trigger the auto scaling.

  • Auto scaling rules:
    • You can set a maximum of five scale-out rules and a maximum of five scale-in rules for a cluster.
    • The system evaluates scale-out rules first and then scale-in rules, in the order in which you configured them. Place important rules first so that they take precedence over other rules. This prevents repeated triggering when a scale-out or scale-in does not achieve the expected effect.
    • Comparison factors include greater than, greater than or equal to, less than, and less than or equal to.
    • Cluster scale-out or scale-in is triggered only after the configured metric threshold has been reached for 5n consecutive minutes (the default value of n is 1).
    • After each scale-out or scale-in, there is a cooling duration. It must be greater than 0 and is 20 minutes by default.
    • In each cluster scale-out or scale-in, at least one node and at most 100 nodes can be added or removed.
  • Resource plans (setting the number of Task nodes by time range):
    • You can specify a Task node range (minimum number to maximum number) for a time range. If the number of Task nodes falls outside the Task node range of a resource plan, the system triggers cluster scale-out or scale-in.
    • You can set a maximum of five resource plans for a cluster.
    • Resource plans repeat daily. The start time and end time can be set to any time point between 00:00 and 23:59. The start time must be at least 30 minutes earlier than the end time. Time ranges configured for different resource plans cannot overlap.
    • After a resource plan triggers cluster scale-out or scale-in, there is a 10-minute cooling duration. Auto scaling will not be triggered again within the cooling duration.
    • When a resource plan is enabled, the number of Task nodes in the cluster is limited to the default node range that you configured during all periods outside the time range of the resource plan.
    • If the resource plan is not enabled, the number of Task nodes is not limited to the default node range.
  • Automation scripts:
    • You can set an automation script so that it can automatically run on cluster nodes when auto scaling is triggered.
    • You can set a maximum of 10 automation scripts for a cluster.
    • You can specify an automation script to be executed on one or more types of nodes.
    • Automation scripts can be executed before or after scale-out or scale-in.
    • Before using automation scripts, upload them to a cluster VM or OBS file system in the same region as the cluster. The automation scripts uploaded to the cluster VM can be executed only on the existing nodes. If you want to make the automation scripts run on the new nodes, upload them to the OBS file system.
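
The resource plan constraints listed above can be illustrated with a short Python sketch. This is not MRS code: the names ResourcePlan, validate_plans, node_range_at, and target_nodes are hypothetical, and the sketch assumes that a time range includes its start time and excludes its end time, and that two ranges that merely touch do not count as overlapping.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Tuple

MAX_PLANS = 5          # at most five resource plans per cluster
MIN_SPAN_MINUTES = 30  # start must be at least 30 minutes before end


def minutes(t: time) -> int:
    """Minutes since midnight."""
    return t.hour * 60 + t.minute


@dataclass(frozen=True)
class ResourcePlan:  # hypothetical model, not an MRS type
    start: time
    end: time
    min_nodes: int
    max_nodes: int


def validate_plans(plans: List[ResourcePlan]) -> None:
    """Raise ValueError if the plans break one of the listed constraints."""
    if len(plans) > MAX_PLANS:
        raise ValueError("at most five resource plans are allowed")
    for p in plans:
        if minutes(p.end) - minutes(p.start) < MIN_SPAN_MINUTES:
            raise ValueError("start must be at least 30 minutes before end")
        if p.min_nodes > p.max_nodes:
            raise ValueError("node range minimum exceeds maximum")
    ordered = sorted(plans, key=lambda p: minutes(p.start))
    for earlier, later in zip(ordered, ordered[1:]):
        # Assumption: ranges that only touch (end == next start) are allowed.
        if minutes(later.start) < minutes(earlier.end):
            raise ValueError("time ranges of resource plans overlap")


def node_range_at(now: time, plans: List[ResourcePlan],
                  default_range: Tuple[int, int]) -> Tuple[int, int]:
    """Return the (min, max) Task node range that applies at `now`."""
    for p in plans:
        if minutes(p.start) <= minutes(now) < minutes(p.end):
            return (p.min_nodes, p.max_nodes)
    # Outside every plan, the default node range applies.
    return default_range


def target_nodes(current: int, node_range: Tuple[int, int]) -> int:
    """Clamp the current Task node count into the applicable range."""
    low, high = node_range
    return max(low, min(current, high))
```

With a single illustrative plan of 07:00 to 13:00 and node range 5-5, plus a default node range of 2-2, node_range_at returns (5, 5) at 09:30 and (2, 2) at 14:00, and target_nodes gives the node count that a scale-out or scale-in would move toward.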

Adding an Auto Scaling Rule

  1. Log in to the MRS management console.
  2. Click Create Cluster. The Create Cluster page is displayed.
  3. Configure the cluster software and hardware by referring to Creating a Custom Cluster.
  4. On the Set Advanced Options tab page, click Add in the Auto Scaling area.
  5. Add an auto scaling rule.

    You can configure the auto scaling rule to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.

    • Node type: Select the type of Task nodes for which an auto scaling rule is to be added. For an analysis cluster, the option is Analysis Task. For a streaming cluster, the option is Streaming Task. For a hybrid cluster, the options are Analysis Task and Streaming Task.
    • Default node range: Enter a Task node range, in which auto scaling is performed. This constraint applies to all scale-in and scale-out rules. The value ranges from 0 to 500.
    • To add the auto scaling rule, perform the following operations:
      1. In Type, select Scale-out or Scale-in.
      2. Configure the Rule Name, If, Last for, Add, and Cooldown Period parameters. For details about monitoring metrics that trigger auto scaling, see Table 1.
      3. Click OK.

        You can view the added scaling rules in the Add Auto Scaling Rule area and edit or delete the rule in the Operation column.

      4. Add more rules by clicking Add Auto Scaling Rule.

  6. Click OK.

    You can view the added scaling rules in the Add Auto Scaling Rule area and edit or delete the rule in the Operation column.
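
How a rule configured this way might be evaluated can be sketched in Python. This is an illustration of the limits described in the Background section, not the MRS implementation. ScalingRule, should_trigger, next_node_count, and check_rule_count are hypothetical names, the metric is assumed to be sampled once per minute, and n is assumed to be a positive integer.

```python
import operator
from dataclasses import dataclass

# Comparison factors named in the Background section.
COMPARATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class ScalingRule:  # hypothetical model, not an MRS type
    name: str
    comparator: str             # one of the keys in COMPARATORS
    threshold: float
    last_for_n: int = 1         # rule must hold for 5 * n consecutive minutes
    node_step: int = 1          # nodes added or removed per scaling action
    cooldown_minutes: int = 20  # default cooling duration

    def __post_init__(self):
        if self.comparator not in COMPARATORS:
            raise ValueError("unsupported comparison factor")
        if self.last_for_n < 1:
            raise ValueError("n must be at least 1 (assumption)")
        if not 1 <= self.node_step <= 100:
            raise ValueError("each action adds or removes 1 to 100 nodes")
        if self.cooldown_minutes <= 0:
            raise ValueError("cooling duration must be greater than 0")


def check_rule_count(scale_out_rules, scale_in_rules):
    """At most five rules are allowed for each direction."""
    if len(scale_out_rules) > 5 or len(scale_in_rules) > 5:
        raise ValueError("at most five scale-out and five scale-in rules")


def should_trigger(rule, samples, minutes_since_last_scaling=None):
    """samples: one metric value per minute, oldest first (assumed sampling)."""
    if (minutes_since_last_scaling is not None
            and minutes_since_last_scaling < rule.cooldown_minutes):
        return False  # still inside the cooling duration
    window = 5 * rule.last_for_n
    if len(samples) < window:
        return False  # not enough history yet
    compare = COMPARATORS[rule.comparator]
    return all(compare(v, rule.threshold) for v in samples[-window:])


def next_node_count(current, rule, scale_out, node_range):
    """Apply the rule's step, then clamp to the default node range."""
    low, high = node_range
    wanted = current + rule.node_step if scale_out else current - rule.node_step
    return max(low, min(wanted, high))
```

The clamp in next_node_count reflects the statement that the default node range constrains all scale-in and scale-out rules: a rule that fires when the cluster is already at the range limit leaves the node count unchanged.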

Adding a Resource Plan

If the data volume changes regularly every day and you want to scale out or in a cluster before the data volume changes, you can create resource plans to adjust the number of Task nodes as planned in the specified time range.

For example, the service data volume for real-time processing peaks between 7:00 and 13:00 every day and is stable and low at other times. Assume that an MRS streaming cluster is used to process the service data. Between 7:00 and 13:00, five Task nodes are required to process the peak data volume, and only two Task nodes are required at other times. You can perform the following steps to configure a resource plan.

  1. Log in to the MRS management console.
  2. Click Create Cluster. The Create Cluster page is displayed.
  3. Configure the cluster software and hardware by referring to Creating a Custom Cluster.
  4. On the Set Advanced Options tab page, click Add in the Auto Scaling area.
  5. Add a resource plan.

    You can configure the resource plan to adjust the number of nodes, which affects the actual price. Therefore, exercise caution when performing this operation.
    • Node type: Select the type of Task nodes for which a resource plan is to be added. For an analysis cluster, the option is Analysis Task. For a streaming cluster, the option is Streaming Task. For a hybrid cluster, the options are Analysis Task and Streaming Task.
    • Default node range: Enter a Task node range, in which auto scaling is performed. This constraint applies to all scale-in and scale-out rules. The value ranges from 0 to 500. For example, the default node range 2-2 indicates that the number of Task nodes is fixed at 2 outside the time range specified in the resource plan.
    • To add the resource plan, perform the following operations:
      1. Configure the Time Range and Node Range parameters. For example, set Time Range to 07:00-13:00 and Node Range to 5-5. This indicates that the number of Task nodes is fixed at 5 within the time range specified in the resource plan. For details about the parameters, see Table 2.
      2. Add more resource plans by clicking Add Resource Plan.
      3. Click OK.

        You can view or modify the added resource plans in the Auto Scaling area and delete a plan in the Operation column.

  6. (Optional) Add an automation script. Currently, MRS 3.x does not support the Bootstrap action.

    1. Click Create.
    2. Configure the Name, Script Path, Parameter, Execution Node, Execution Time, and Action upon Failure parameters. For details about the parameters, see Table 1.
      Table 1 Parameter description

      | Parameter | Description |
      |---|---|
      | Name | Name of a bootstrap action script. The value can contain only digits, letters, spaces, hyphens (-), and underscores (_) and must not start with a space. The value can contain 1 to 64 characters. NOTE: A name must be unique in the same cluster. You can set the same name for different clusters. |
      | Script Path | Script path. The value can be an OBS file system path or a local VM path. An OBS file system path must start with s3a:// and end with .sh, for example, s3a://mrs-samples/xxx.sh. A local VM path must start with a slash (/) and end with .sh. |
      | Parameter | Automation script parameter. |
      | Execution Node | Type of node on which the bootstrap action script is executed. |
      | Executed | Time when the bootstrap action script is executed: Before scale-out, After scale-out, Before scale-in, or After scale-in. |
      | Action upon Failure | Whether to continue to execute subsequent scripts and create a cluster after the script fails to be executed. NOTE: You are advised to set this parameter to Continue in the debugging phase so that the cluster can continue to be installed and started regardless of whether the bootstrap action succeeds. |

    3. Click OK to save the bootstrap action.
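
The Name and Script Path rules in Table 1 can be checked before you enter the values on the console. The following Python sketch is only an illustration: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and the console may apply checks that are not described here.

```python
import re

MAX_SCRIPTS = 10  # at most 10 automation scripts per cluster

# 1 to 64 characters: digits, letters, spaces, hyphens, and underscores.
# The first character must not be a space.
# Assumption: "letters" means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9 _-]{0,63}")


def is_valid_script_name(name: str) -> bool:
    """Check a script name against the Name rules in Table 1."""
    return NAME_PATTERN.fullmatch(name) is not None


def script_path_kind(path: str) -> str:
    """Return "obs" or "local" for a valid path, else raise ValueError."""
    if not path.endswith(".sh"):
        raise ValueError("script path must end with .sh")
    if path.startswith("s3a://"):
        return "obs"    # OBS file system path
    if path.startswith("/"):
        return "local"  # path on a cluster VM
    raise ValueError("script path must start with s3a:// or a slash (/)")


def check_script_names(names) -> None:
    """Names must be valid, unique within one cluster, and at most 10."""
    if len(names) > MAX_SCRIPTS:
        raise ValueError("at most 10 automation scripts are allowed")
    if len(set(names)) != len(names):
        raise ValueError("script names must be unique in the same cluster")
    for name in names:
        if not is_valid_script_name(name):
            raise ValueError("invalid script name: " + repr(name))
```

Distinguishing the two path kinds matters because scripts stored on a cluster VM can run only on existing nodes, whereas scripts stored in the OBS file system can also run on newly added nodes.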