Switching the Scheduler

Updated at: Mar 25, 2021 GMT+08:00

Scenario

A newly installed MRS cluster uses the Superior scheduler by default. If the cluster was upgraded from an earlier version, the administrator can switch the Yarn scheduler from the Capacity scheduler to the Superior scheduler in one operation.

Prerequisites

  • The cluster network connection is normal and secure, and the Yarn service status is normal.
  • During scheduler switching, tenants cannot be added, deleted, or modified. In addition, services cannot be started or stopped.

Impact on the System

  • The ResourceManager is restarted during scheduler switching, so tasks submitted to Yarn during the switch fail.
  • During scheduler switching, tasks in a job being executed on Yarn will continue, but new tasks cannot be started.
  • After scheduler switching is complete, tasks on Yarn may fail, causing service interruption.
  • After scheduler switching is complete, parameters of the Superior scheduler are used for tenant management.
  • After the scheduler is switched, the Superior scheduler cannot allocate resources to a tenant queue whose capacity was 0 in the Capacity scheduler. As a result, tasks submitted to that tenant queue fail. Do not set the capacity of a tenant queue to 0 in the Capacity scheduler.
  • After scheduler switching is complete, do not add or delete resource pools, Yarn node labels, or tenants during the trial period if you may need to roll back. Once a resource pool, Yarn node label, or tenant is added or deleted, rollback to the Capacity scheduler is not allowed.

    The recommended trial period for scheduler switching is one week. If resource pools, Yarn node labels, or tenants are added or deleted during this period, the trial period ends immediately.

  • Rollback of scheduler switching may cause the loss of partial or all Yarn task information.
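The zero-capacity restriction in the list above can be checked before switching. The following is a minimal sketch, assuming the Capacity scheduler queue settings are available in a local XML file in the usual Hadoop configuration layout, with each <name> and <value> element on its own line. The helper name find_zero_capacity_queues is hypothetical and is not an MRS tool; the yarn.scheduler.capacity.<queue-path>.capacity property naming follows the Apache Hadoop Capacity scheduler.

```shell
# Sketch: print the queue path of every queue whose capacity is 0.
# find_zero_capacity_queues is a hypothetical helper, not part of MRS.
# Assumption: <name> and <value> are each on a single line of the file.
find_zero_capacity_queues() {
    awk '
        # Remember the most recent property name.
        /<name>/ {
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        # Report the queue when the value of a queue capacity property is 0.
        /<value>/ {
            value = $0
            sub(/.*<value>/, "", value)
            sub(/<\/value>.*/, "", value)
            if (name ~ /^yarn\.scheduler\.capacity\..*\.capacity$/ &&
                value ~ /^[ \t]*0+(\.0*)?[ \t]*$/) {
                queue = name
                sub(/^yarn\.scheduler\.capacity\./, "", queue)
                sub(/\.capacity$/, "", queue)
                print queue
            }
        }
    ' "$1"
}
```

Call the function with the path of the configuration file. Any queue path it prints should be given a non-zero capacity before the scheduler is switched.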

Switching from the Capacity Scheduler to the Superior Scheduler

  1. Ensure that the Yarn service status is normal.

    1. Log in to FusionInsight Manager as user admin.
    2. Choose Cluster > Name of the desired cluster > Services and check whether the Yarn service status is normal.

  2. Log in to the active OMS node as user omm.
  3. Switch the scheduler.

    The following switching modes are available:

    0: The Capacity scheduler is switched to the Superior scheduler, and the Capacity scheduler configurations are converted into the Superior scheduler configurations.

    1: Only the Capacity scheduler configurations are converted into the Superior scheduler configurations.

    2: Only the Capacity scheduler is switched to the Superior scheduler.

    • Mode 0 is recommended if the cluster environment is simple and the number of tenants is less than 20.

      Run the following command:

      sh ${BIGDATA_HOME}/om-server/om/sbin/switchScheduler.sh -c Cluster ID -m 0

      Replace Cluster ID with the ID of the target cluster. To query the ID, choose Cluster > Name of the desired cluster > Cluster Properties on FusionInsight Manager.

      Start to convert Capacity scheduler to Superior Scheduler, clusterId=1
      Start to convert Capacity scheduler configurations to Superior. Please wait... 
      Convert configurations successfully. Start to switch the Yarn scheduler to Superior. Please wait... 
      Switch the Yarn scheduler to Superior successfully.
    • If the cluster environment or tenant information is complex and you need to retain the queue information of the Capacity scheduler on the Superior scheduler, use mode 1 first to convert the Capacity scheduler configurations. After checking the converted configurations, use mode 2 to switch the Capacity scheduler to the Superior scheduler.
      1. Run the following command to convert the Capacity scheduler configurations into the Superior scheduler configurations:

        sh ${BIGDATA_HOME}/om-server/om/sbin/switchScheduler.sh -c Cluster ID -m 1

        Start to convert Capacity scheduler to Superior Scheduler, clusterId=1
        Start to convert Capacity scheduler configurations to Superior. Please wait... 
        Convert configurations successfully.
      2. Run the following command to switch the Capacity scheduler to the Superior scheduler:

        sh ${BIGDATA_HOME}/om-server/om/sbin/switchScheduler.sh -c Cluster ID -m 2

        Start to convert Capacity scheduler to Superior Scheduler, clusterId=1
        Start to switch the Yarn scheduler to Superior. Please wait... 
        Switch the Yarn scheduler to Superior successfully.
    • If you do not need the queue information of the Capacity scheduler, use mode 2.
      1. Log in to FusionInsight Manager and delete all tenants except the default tenant.
      2. Log in to FusionInsight Manager and delete all resource pools except the default resource pool.
      3. Run the following command to switch the Capacity scheduler to the Superior scheduler:

        sh ${BIGDATA_HOME}/om-server/om/sbin/switchScheduler.sh -c Cluster ID -m 2

        Start to convert Capacity scheduler to Superior Scheduler, clusterId=1
        Start to switch the Yarn scheduler to Superior. Please wait... 
        Switch the Yarn scheduler to Superior successfully.

    You can query the scheduler switching logs on the active OMS node.

    • ${BIGDATA_LOG_HOME}/controller/aos/switch_scheduler.log
    • ${BIGDATA_LOG_HOME}/controller/aos/aos.log
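The mode selection and log check described in this procedure can be sketched as two small shell helpers. This is an illustration under stated assumptions, not a supported tool: the names run_switch, show_switch_errors, and SWITCH_SCRIPT are hypothetical, the cluster ID is assumed to be an integer (as in clusterId=1 in the sample output), and searching for the keyword "error" is a guess because the log format is not described here. Only the -c and -m options shown above are used.

```shell
# Sketch only: run_switch, show_switch_errors, and SWITCH_SCRIPT are
# hypothetical names introduced for illustration.
SWITCH_SCRIPT="${SWITCH_SCRIPT:-${BIGDATA_HOME:-}/om-server/om/sbin/switchScheduler.sh}"

# Validate the arguments, then call switchScheduler.sh with -c and -m.
run_switch() {
    cluster_id="${1:-}"
    mode="${2:-}"
    # Assumption: the cluster ID consists of digits only.
    case "$cluster_id" in
        ''|*[!0-9]*) echo "Invalid cluster ID: $cluster_id" >&2; return 2 ;;
    esac
    # Only modes 0, 1, and 2 are described in this section.
    case "$mode" in
        0|1|2) ;;
        *) echo "Invalid mode: $mode" >&2; return 2 ;;
    esac
    sh "$SWITCH_SCRIPT" -c "$cluster_id" -m "$mode"
}

# Print lines containing "error" (any case) from the two switching logs.
show_switch_errors() {
    log_dir="${1:-${BIGDATA_LOG_HOME:-}/controller/aos}"
    for log_file in "$log_dir/switch_scheduler.log" "$log_dir/aos.log"; do
        if [ -r "$log_file" ]; then
            # grep exits non-zero when nothing matches; that is not a failure here.
            grep -i -H "error" "$log_file" || true
        else
            echo "Cannot read $log_file" >&2
        fi
    done
}
```

For a staged switch on a complex cluster, run_switch 1 1 would convert the configurations, and run_switch 1 2 would perform the switch after the converted configurations have been reviewed. show_switch_errors can be run after either step.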

Rollback Procedure

You can manually switch the Superior scheduler back to the Capacity scheduler. However, this operation is only a workaround and is not allowed in most cases.

If you must switch back to the Capacity scheduler, the following conditions must be met:

  • The trial period has not expired.
  • No resource pool, Yarn node label, or tenant is added or deleted during the trial period.

    If resource pools, Yarn node labels, or tenants are added or deleted, resource pools or queues may not exist after the Superior scheduler is switched back to the Capacity scheduler. As a result, the Capacity scheduler cannot run properly.
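The first condition can be sketched as a simple elapsed-time check. This is a minimal sketch that assumes you recorded the time of the switch yourself as seconds since the epoch; the helper name within_trial_period and the TRIAL_SECONDS variable are hypothetical. It uses the recommended one-week trial period and cannot detect whether resource pools, Yarn node labels, or tenants were changed, so the second condition still has to be verified manually.

```shell
# Sketch: within_trial_period and TRIAL_SECONDS are hypothetical names.
# The check covers elapsed time only, not configuration changes.
TRIAL_SECONDS=$((7 * 24 * 3600))   # recommended trial period: one week

within_trial_period() {
    switch_epoch="$1"                # time of the switch, seconds since the epoch
    now_epoch="${2:-$(date +%s)}"    # defaults to the current time
    # Succeeds only while less than one week has elapsed since the switch.
    [ $((now_epoch - switch_epoch)) -lt "$TRIAL_SECONDS" ]
}
```

The function returns success while the trial period is still running, so it can guard a rollback in a condition such as: if within_trial_period "$switch_epoch"; then ... fi.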

The procedure is as follows:

  1. Change the scheduler to the Capacity scheduler and start Yarn.

    1. Log in to FusionInsight Manager.
    2. Go to the Yarn configuration page and modify the parameters listed in Table 1.
      Table 1 Yarn configuration items to be modified

      Parameter                                     Value
      yarn.resourcemanager.scheduler.class          org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
      yarn.http.rmwebapp.external.classes           Null
      hadoop.http.rmwebapp.scheduler.page.classes   Null

    3. Choose Save > OK. Wait until the operation is successful.
    4. Restart the Yarn service in rolling mode, enter the password, and click OK. Wait until the operation is successful.

  2. Log in to the active OMS node and restart the AOS service.

    1. Log in to the active OMS node as user omm.
    2. Run the following command to disable logout on timeout:

      TMOUT=0

      After completing the operations in this section, run the TMOUT=Timeout interval command to restore the timeout interval promptly. For example, TMOUT=600 logs a user out after 600 seconds of inactivity.

    3. Run the following command to restart the AOS service:

      ${BIGDATA_HOME}/om-server/om/sbin/aos_cmd.sh restart
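The TMOUT handling in step 2 can be sketched so that the previous value is restored automatically once the command finishes. This is a minimal sketch; the function name without_timeout is hypothetical, and it must be defined in (or sourced into) the login shell itself for the TMOUT change to affect that session.

```shell
# Sketch: without_timeout is a hypothetical helper.
# It sets TMOUT=0 while a command runs, then restores the previous setting.
without_timeout() {
    # Remember the current setting; empty means TMOUT was not set.
    saved_tmout="${TMOUT:-}"
    TMOUT=0
    status=0
    "$@" || status=$?
    if [ -n "$saved_tmout" ]; then
        TMOUT="$saved_tmout"
    else
        unset TMOUT
    fi
    # Pass the exit status of the wrapped command back to the caller.
    return "$status"
}

# Example (on the active OMS node, as user omm):
# without_timeout "${BIGDATA_HOME}/om-server/om/sbin/aos_cmd.sh" restart
```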
