更新时间:2024-08-05 GMT+08:00

YARN Command介绍

您可以使用YARN Commands对YARN集群进行一些操作,例如启动ResourceManager、提交应用程序、中止应用、查询节点状态、下载container日志等操作。

完整和详细的Command描述可以参考官网文档:

http://hadoop.apache.org/docs/r3.1.1/hadoop-yarn/hadoop-yarn-site/YarnCommands.html

常用Command

YARN Commands可同时供普通用户和管理员用户使用,它包含了少量普通用户可以执行的命令,比如jar、logs。而大部分只有管理员有权限使用。

用户可以通过以下命令查看YARN用法和帮助:

yarn --help

用法:进入Yarn客户端的任意目录,执行source命令导入环境变量,直接运行命令即可。

格式如下所示:

yarn [--config confdir] COMMAND

其中COMMAND内容请参考表1

其中版本号8.1.0.1为示例,具体以实际环境的版本号为准。

表1 常用Command描述

COMMAND

描述

resourcemanager

运行一个ResourceManager。

备注:以omm用户执行服务端命令前需export环境变量(客户端需要有omm用户的执行权限),例如:

  • export YARN_CONF_DIR=${BIGDATA_HOME}/FusionInsight_HD_8.1.0.1/1_10_ResourceManager/etc
  • export HADOOP_CONF_DIR=${BIGDATA_HOME}/FusionInsight_HD_8.1.0.1/1_10_ResourceManager/etc

nodemanager

运行一个NodeManager。

备注:以omm用户执行服务端命令前需export环境变量(客户端需要有omm用户的执行权限),例如:

  • export YARN_CONF_DIR=${BIGDATA_HOME}/FusionInsight_HD_8.1.0.1/1_10_NodeManager/etc
  • export HADOOP_CONF_DIR=${BIGDATA_HOME}/FusionInsight_HD_8.1.0.1/1_10_NodeManager/etc

rmadmin

管理员工具(动态更新信息)。

version

打印版本信息。

jar <jar>

运行jar文件。

logs

获取container日志。

classpath

打印获取Hadoop JAR包和其他库文件所需的CLASSPATH路径。

daemonlog

获取或者设置服务LOG级别。

CLASSNAME

运行一个名字为CLASSNAME的类。

top

运行集群利用率监控工具。

-Dmapreduce.job.hdfs-servers

如果对接了OBS,而服务端依然使用HDFS,那么需要显式在命令行使用该参数指定HDFS的地址。格式为hdfs://{NAMESERVICE}。其中{NAMESERVICE}为hdfs nameservice名称。

如果当前的HDFS具有多个nameservice,那么需要指定所有的nameservice,并以‘,’隔开。例如:hdfs://nameservice1,hdfs://nameservice2

Superior Scheduler Command

Superior Scheduler引擎提供了输出Superior Scheduler引擎具体信息的CLI。为了执行Superior命令,需要使用“<HADOOP_HOME>/bin/superior”脚本。

以下为superior命令格式:

<HADOOP_HOME>/bin/superior

Usage: superior [COMMAND | -help]
   Where COMMAND is one of:
   resourcepool                        prints resource pool status
   queue                               prints queue status
   application                         prints application status
   policy                              prints policy status

不带参数调用大多数命令时会显示帮助信息。

  • Superior resourcepool命令:

    该命令显示Resource Pool和相关策略的相关状态以及配置信息。

    Superior resourcepool命令仅用于管理员用户及拥有yarn管理权限的用户。

    用法输出:

    >superior resourcepool
    
    Usage: resourcepool [-help]
                        [-list]
                        [-status <resourcepoolname>]
     -help                        prints resource pool usage
     -list                        prints all resource pool summary report
     -status <resourcepoolname>   prints status and configuration of specified
                                  resource pool
    • resourcepool -list以表格格式中显示Resource Pool摘要。示例如下:
      > superior resourcepool -list
      NAME       NUMBER_MEMBER     TOTAL_RESOURCE           AVAILABLE_RESOURCE
      Pool1      4                 vcores 30,memory 1000    vcores 21,memory 80
      Pool2      100               vcores 100,memory 12800  vcores 30,memory 1000
      default    2                 vcores 64,memory 128     vcores 40,memory 28
    • resourcepool -status <resourcepoolname>以列表格式显示资源库详细信息。示例如下:
      > superior resourcepool -status default
      NAME: default
      DESCRIPTION: System generated resource pool
      TOTAL_RESOURCE: vcores 64,memory 128
      AVAILABLE_RESOURCE: vcores 40,memory 28
      NUMBER_MEMBER: 2
      MEMBERS: node1,node2
      CONFIGURATION:
      |-- RESOURCE_SELECT:
      |__RESOURCES:
  • Superior queue命令

    该命令输出分层队列信息。

    用法输出:
    >superior queue 
    
    Usage: queue [-help]
                 [-list] [-e] [[-name <queue_name>] [-r|-c]]
                 [-status <queue_name>]
     -c                     only work with -name <queue_name> option. If this
                            option is used, command  will print information of
                            specified queue and its direct children.
     -e                     only work with -list or -list -name option. If
                            this option is used, command will print effective
                            state of specified queue and all of its
                            descendants.
     -help                  prints queue sub command usage
     -list                  prints queue summary report. This option can work
                            with -name <queue_name> and -r options.
     -name <queue_name>     print specified queue, this can work with -r
                            option. By default, it will print queue's own
                            information. When -r is defined, command will
                            print all of its descendant queues. When -c is
                            defined, it will print its direct children queues.
     -r                     only work with -name <queue_name> option. If this
                            option is used, command will print information of
                            specified queue and all of its descendants.
     -status <queue_name>   prints status of specified queue
    • queue -list以表格格式输出队列摘要信息。命令将基于队列分层样式输出信息。用户可通过SUBMIT ACL或ADMIN ACL的队列权限查看队列。示例如下:
      > superior queue -list
      NAME         STATE            NRUN_APP     NPEND_APP     NRUN_CONTAINER   NPEND_REQUEST    RES_INUSE                 RES_REQUEST
      root         OPEN|ACTIVE      10           20            100              200              vcores 100,memory 1000    vcores 200,memory 2000
      root.Q1      OPEN|ACTIVE      5            10            50               100              vcores 50,memory 500      vcores 100,memory 1000
      root.Q1.Q11  OPEN|ACTIVE      5            10            50               100              vcores 50,memory 500      vcores 100,memory 1000
      root.Q1.Q12  CLOSE|INACTIVE   0            0             0                0                vcores 0,memory 0         vcores 0,memory 0
      root.Q2      OPEN|INACTIVE    5            10            50               100              vcores 50,memory 500      vcores 100,memory 1000
      root.Q2.Q21  OPEN|ACTIVE      5            10            50               100              vcores 50,memory 500      vcores 100,memory 1000
    • queue -list -name root.Q1只输出root.Q1。
      > superior queue -list -name root.Q1
      NAME            STATE           NRUN_APP     NPEND_APP      NRUN_CONTAINER  NPEND_REQUEST    RES_INUSE            RES_REQUEST
      root.Q1         OPEN|ACTIVE     5            10             50              100              vcores 50,memory 500 vcores 100,memory 1000
    • queue -list -name root.Q1 -r将输出root.Q1及其所有的分支。
      > superior queue -list -name root.Q1 -r
      NAME         STATE            NRUN_APP     NPEND_APP  NRUN_CONTAINER   NPEND_REQUEST    RES_INUSE               RES_REQUEST
      root.Q1      OPEN|ACTIVE      5            10         50               100              vcores 50,memory 500    vcores 100,memory 1000
      root.Q1.Q11  OPEN|ACTIVE      5            10         50               100              vcores 50,memory 500    vcores 100,memory 1000
      root.Q1.Q12  CLOSE|INACTIVE   0            0          0                0                vcores 0,memory 0       vcores 0,memory 0
    • queue -list -name root -c将会输出root及其直系子目录。
      > superior queue -list -name root -c
      NAME                  STATE             NRUN_APP     NPEND_APP     NRUN_CONTAINER   NPEND_REQUEST    RES_INUSE                RES_REQUEST
      root                  OPEN|ACTIVE       10           20            100              200              vcores 100,memory 1000   vcores 200,memory 2000
      root.Q1               OPEN|ACTIVE       5            10            50               100              vcores 50,memory 500     vcores 100,memory 1000
      root.Q2               OPEN|INACTIVE     5            10            50               100              vcores 50,memory 500     vcores 100,memory 1000
    • queue -status <queue_name>将会输出具体队列状态和配置。

      用户可通过SUBMIT ACL权限查看除队列ACL外的细节信息。

      用户还可通过ADMIN ACL的队列权限查看包括ACL在内的队列细节信息。

      > superior queue -status root.Q1
      NAME: root.Q1
      OPEN_STATE:CLOSED
      ACTIVE_STATE: INACTIVE
      EOPEN_STATE: CLOSED
      EACTIVE_STATE: INACTIVE
      LEAF_QUEUE: Yes
      NUMBER_PENDING_APPLICATION: 100
      NUMBER_RUNNING_APPLICATION: 10
      NUMBER_PENDING_REQUEST: 10
      NUMBER_RUNNING_CONTAINER: 10
      NUMBER_RESERVED_CONTAINER: 0
      RESOURCE_REQUEST: vcores 3,memory 3072
      RESOURCE_INUSE: vcores 2,memory 2048
      RESOURCE_RESERVED: vcores 0,memory 0
      CONFIGURATION:
      |-- DESCRIPTION: Spark session queue
      |-- MAX_PENDING_APPLICATION: 10000
      |--MAX_RUNNING_APPLICATION: 1000
      |--ALLOCATION_ORDER_POLICY: FIFO
      |--DEFAULT_RESOURCE_SELECT: label1
      |--MAX_MASTER_SHARE: 10%
      |--MAX_RUNNING_APPLICATION_PER_USER : -1
      |--MAX_ALLOCATION_UNIT: vcores 32,memory 12800
      |--ACL_USERS: user1,user2
      |--ACL_USERGROUPS: usergroup1,usergroup2
      |-- ACL_ADMINS: user1
      |--ACL_ADMINGROUPS: usergroup1
  • Superior application命令

    该命令输出应用相关信息。

    用法输出:

    >superior application 
    
    Usage: application [-help]
                       [-list]
                       [-status <application_id>]
     -help                      prints application sub command usage
     -list                      prints all application summary report
     -status <application_id>   prints status of specified application

    用户可通过应用的浏览访问权限查看应用相关信息。

    • application -list以表的形式提供所有应用的信息摘要:
      > superior application -list
      ID                                        QUEUE             USER     NRUN_CONTAINER          NPEND_REQUEST               NRSV_CONTAINER       RES_INUSE                        RES_REQUEST                     RES_RESERVED           
      application_1482743319705_0005            root.SEQ.queueB   hbase    1                       100                         0                    vcores 1,memory 1536             vcores 2000,memory 409600       vcores 0,memory 0
      application_1482743319705_0006            root.SEQ.queueB   hbase    0                       1                           0                    vcores 0,memory 0                vcores 1,memory 1536            vcores 0,memory 0
    • application -status <app_id>命令输出指定应用的详细信息。示例如下:
      > superior application -status application_1443067302606_0609
      ID: application_1443067302606_0609
      QUEUE: root.Q1.Q11
      USER: cchen
      RESOURCE_REQUEST: vcores 3,memory 3072
      RESOURCE_INUSE: vcores 2,memory 2048
      RESOURCE_RESERVED:vcores 1, memory 1024
      NUMBER_RUNNING_CONTAINER: 2
      NUMBER_PENDING_REQUEST: 3
      NUMBER_RESERVED_CONTAINER: 1
      MASTER_CONTAINER_ID: application_1443067302606_0609_01
      MASTER_CONTAINER_RESOURCE: node1.domain.com
      BLACKLIST: node5,node8
      DEMANDS:
      |-- PRIORITY: 20
      |-- MASTER: true
      |-- CAPABILITY: vcores 2, memory 2048
      |-- COUNT: 1
      |-- RESERVED_RES : vcores 1, memory 1024
      |-- RELAXLOCALITY: true
      |-- LOCALITY: node1/1
      |-- RESOURCESELECT: label1
      |-- PENDINGREASON: “application limit reached”
      |-- ID: application_1443067302606_0609_03
      |-- RESOURCE: node1.domain.com
      |-- RESERVED_RES: vcores 1, memory 1024
      |
      |--PRIORITY: 1
      |-- MASTER: false
      |-- CAPABILITY: vcores 1,memory 1024
      |-- COUNT: 2
      |-- RESERVED_RES: vcores 0, memory 0
      |-- RELAXLOCLITY: true
      |--LOCALITY: node1/1,node2/1,rackA/2
      |-- RESOURCESELECT: label1
      |-- PENDINGREASON: “no available resource”
      CONTAINERS:
      |-- ID: application_1443067302606_0609_01
      |-- RESOURCE: node1.domain.com
      |-- CAPABILITY: vcores 1,memory 1024
      |
      |-- ID: application_1443067302606_0609_02
      |-- RESOURCE: node2.domain.com
      |-- CAPABILITY: vcores 1,memory 1024
  • Superior policy 命令

    该命令输出决策相关信息。

    Superior policy命令仅限管理员用户及拥有Yarn管理权限的用户使用。

    用法输出:

    >superior policy
    
    Usage: policy [-help]
                  [-list <resourcepoolname>] [-u] [-detail]
                  [-status <resourcepoolname>]
     -detail                      only work with -list option to show a
                                  summary information of resource pool
                                  distribution on queues, including reserve,
                                  minimum and maximum
     -help                        prints policy sub command usage
     -list <resourcepoolname>     prints a summary information of resource
                                  pool distribution on queue
     -status <resourcepoolname>   prints pool distribution policy
                                  configuration and status of specified
                                  resource pool
     -u                           only work with -list option to show a
                                  summary information of resource pool
                                  distribution on queues and also user
                                  accounts
    • policy -list <resourcepoolname>输出队列分布信息摘要。示例如下:
      >superior policy -list default
      NAME: default
      TOTAL_RESOURCE: vcores 16,memory 16384
      AVAILABLE_RESOURCE: vcores 16,memory 16384
      
      NAMERES_INUSERES_REQUEST
      root.defaultvcores 0,memory 0vcores 0,memory 0
      root.productionvcores 0,memory 0vcores 0,memory 0
      root.production.BU1vcores 0,memory 0vcores 0,memory 0
      root.production.BU2 vcores 0,memory 0vcores 0,memory 0
    • policy -list <resourcepoolname> -u输出用户级信息摘要。
      > superior policy -list default -u
      NAME: default
      TOTAL_RESOURCE: vcores 16,memory 16384
      AVAILABLE_RESOURCE: vcores 16,memory 16384
      
      NAMERES_INUSERES_REQUEST
      root.defaultvcores 0,memory 0vcores 0,memory 0
      root.default.[_others_]vcores 0,memory 0vcores 0,memory 0
      root.productionvcores 0,memory 0vcores 0,memory 0
      root.production.BU1vcores 0,memory 0vcores 0,memory 0
      root.production.BU1.[_others_]vcores 0,memory 0vcores 0,memory 0
      root.production.BU2vcores 0,memory 0vcores 0,memory 0
      root.production.BU2.[_others_]vcores 0,memory 0vcores 0,memory 0
    • policy -status <resourcepoolname> 输出指定资源库的策略详细资料。
      > superior policy -status pool1
      NAME: pool1
      TOTAL_RESOURCE: vcores 64,memory 128
      AVAILABLE_RESOURCE: vcores 40,memory 28
      QUEUES:
      |-- NAME: root.Q1
      |-- RESOURCE_USE: vcores 20, memory 1000
      |-- RESOURCE_REQUEST: vcores 2,memory 100
      |--RESERVE: vcores 10, memory 4096
      |--MINIMUM: vcore 11, memory 4096
      |--MAXIMUM: vcores 500, memory 100000
      |--CONFIGURATION:
      |-- SHARE: 50%
      |-- RESERVE: vcores 10, memory 4096
      |-- MINIMUM: vcores 11, memory 4096
      |-- MAXIMUM: vcores 500, memory 100000
      |-- QUEUES:
      |-- NAME: root.Q1.Q11
      |-- RESOURCE_USE: vcores 15, memory, 500
      |-- RESOURCE_REQUEST: vcores 1, memory 50
      |-- RESERVE: vcores 0, memory 0
      |-- MINIMUM: vcores 0, memory 0
      |-- MAXIMUM: vcores -1, memory -1
      |-- USER_ACCOUNTS:
      |-- NAME: user1
      |-- RESOURCE_USE: vcores 1, memory 10
      |-- RESOURCE_REQUEST: vcores 1, memory 50
      |
      |-- NAME: OTHERS
      |--RESOURCE_USE: vcores 0, memory 0
      |- RESOURCE_REQUEST: vcores 0, memory 0
      |-- CONFIGURATION:
      |-- SHARE: 100%
      |-- USER_POLICY:
      |-- NAME: user1
      |-- WEIGHT: 10
      |
      |-- NAME: OTHERS
      |-- WEIGHT: 1
      |-- MAXIMUM: vcores 10, memory 1000