Updated on 2025-08-22 GMT+08:00

Configuring YARN Big Job Scanning

Scenario

YARN's big job scanning function monitors local temporary files (such as shuffle files) and key HDFS directories (OBS is not supported) for Hive, HetuEngine, and Spark jobs. It reports events when jobs consume excessive storage resources (local disks or key HDFS directories).

For details about the monitored HDFS directories, see Table 1.

Table 1 Monitored HDFS directories

| Component | Monitored HDFS Directory | Threshold |
| --- | --- | --- |
| Hive | hdfs://hacluster/tmp/hive-scratch/*/ | 400 GB |
| HetuEngine | hdfs://hacluster/hetuserverhistory/*/coordinator/ | 100 GB |
| Spark | hdfs://hacluster/sparkJobHistory/ | 100 GB |
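The /*/ segments in the monitored paths above stand in for variable directory levels, such as per-user directories. As a minimal sketch (not the actual monitor implementation; the function name and sample paths are illustrative), such a pattern can be matched segment by segment:

```python
from fnmatch import fnmatch

def matches_monitored_dir(path: str, pattern: str) -> bool:
    """Check whether an HDFS path falls under a monitored pattern.

    Each '*' segment matches exactly one variable directory level
    (for example a user directory), so we compare segment by segment.
    """
    p_parts = [s for s in pattern.split("/") if s]
    d_parts = [s for s in path.split("/") if s]
    if len(d_parts) < len(p_parts):
        return False
    return all(fnmatch(d, p) for p, d in zip(p_parts, d_parts))

# Illustrative paths: the user directory varies, so the pattern uses /*/.
print(matches_monitored_dir(
    "hdfs://hacluster/tmp/hive-scratch/alice/job_001",
    "hdfs://hacluster/tmp/hive-scratch/*/"))  # True
```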

Notes and Constraints

This section applies only to MRS 3.5.0 and later versions.

Procedure

  1. Log in to FusionInsight Manager.

    For details about how to log in to FusionInsight Manager, see Accessing MRS FusionInsight Manager.

  2. Choose Cluster > Services > Yarn > Configurations > All Configurations.
  3. Search for the following parameters and modify them as required.

    • Configuring YARN big job monitoring
      Table 2 Parameters for configuring YARN big job monitoring

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | job.monitor.check.period | Interval for monitoring big jobs, in minutes. The value 0 disables big job monitoring. Value range: 1 to 720. | 10 |
      | job.monitor.local.thread.pool | Number of threads for obtaining information about big jobs monitored by NodeManager. Value range: 1 to 500. | 50 |
      | max.job.count | Number of big jobs displayed in the reported event. Value range: 1 to 500. | 5 |
      | job.monitor.local.dir.threshold | Threshold, in GB, for the size of a job directory on a NodeManager's local disk. An event is reported once this threshold is reached. Value range: 1 to 1800000000000. Default value: 20. | 20 |
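Taken together, the local-disk parameters above amount to: report the jobs whose local directory size reaches job.monitor.local.dir.threshold, capped at max.job.count entries in the event. A hedged sketch of that selection logic (the function name and sizes are illustrative; the real check runs inside NodeManager):

```python
GB = 1024 ** 3

def jobs_to_report(job_sizes: dict[str, int], threshold_gb: int, max_job_count: int) -> list[str]:
    """Jobs whose local directory size reaches job.monitor.local.dir.threshold,
    largest first, capped at max.job.count entries in the reported event."""
    offenders = [(size, app) for app, size in job_sizes.items()
                 if size >= threshold_gb * GB]
    offenders.sort(reverse=True)
    return [app for _, app in offenders[:max_job_count]]

# With the default threshold of 20 GB, two of these three jobs qualify.
sizes = {"application_1": 30 * GB, "application_2": 10 * GB, "application_3": 25 * GB}
print(jobs_to_report(sizes, threshold_gb=20, max_job_count=5))  # ['application_1', 'application_3']
```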

    • Configuring HetuEngine large job directory monitoring
      Table 3 Parameters for configuring HetuEngine big job monitoring

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | hetu.job.hdfs.monitor.dir | Big directory monitoring path of HetuEngine jobs. The root directory cannot be monitored. If the monitored path includes variable directories such as user directories, replace them with /*/. | hdfs://hacluster/hetuserverhistory/*/coordinator/ |
      | hetu.job.appId.parser.rule | Rule for extracting job IDs from the monitored path. {subdir}/{appid}: the job ID is in a subdirectory (with a non-fixed name) of the monitoring directory. {appid}: the job ID is directly in the monitoring directory. | {appid} |
      | hetu.job.hdfs.dir.threshold | Big directory threshold, in GB, of HetuEngine jobs. If the threshold is exceeded, an event is reported. Value range: 1 to 1800000000000. Default value: 100. | 100 |
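The appId.parser.rule values describe where the job ID sits relative to the monitored directory. A minimal sketch of the two rules, assuming paths expressed relative to the monitoring directory (the function and sample names are illustrative, not the service's parser):

```python
def extract_appid(relative_path: str, rule: str) -> str:
    """Extract the job ID from a path relative to the monitored directory.

    {appid}          -> the first path segment is the job ID
    {subdir}/{appid} -> the job ID sits one (variably named) level deeper
    """
    parts = [s for s in relative_path.split("/") if s]
    if rule == "{appid}":
        return parts[0]
    if rule == "{subdir}/{appid}":
        return parts[1]
    raise ValueError(f"unsupported rule: {rule}")

# HetuEngine uses {appid}: the job ID is directly under the monitored directory.
print(extract_appid("application_001/task.log", "{appid}"))  # application_001
# Hive uses {subdir}/{appid}: a variably named subdirectory comes first.
print(extract_appid("tmpdir_xyz/application_002", "{subdir}/{appid}"))  # application_002
```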

    • Configuring Hive large job directory monitoring
      • To enable big job scanning for the Hive component, set hive-ext.record.mr.applicationid to true as follows:

        On FusionInsight Manager, choose Cluster > Services > Hive and click Configurations > All Configurations. In the navigation pane on the left, choose HiveServer(Role) > Customization. Add hive-ext.record.mr.applicationid to hive.server.customized.configs, set its value to true, and save the configuration.

        Then, on the All Configurations page of the YARN service, modify the following parameters.

      • Currently, the Hive big job scanning feature applies only to the MapReduce engine.
      Table 4 Parameters for configuring Hive big job monitoring

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | hive.job.hdfs.monitor.dir | Big directory monitoring path of Hive jobs. The root directory cannot be monitored. If the monitored path includes variable directories such as user directories, replace them with /*/. | hdfs://hacluster/tmp/hive-scratch/*/ |
      | hive.job.appId.parser.rule | Rule for extracting job IDs from the monitored path. {subdir}/{appid}: the job ID is in a subdirectory (with a non-fixed name) of the monitoring directory. {appid}: the job ID is directly in the monitoring directory. | {subdir}/{appid} |
      | hive.job.hdfs.dir.threshold | Big directory threshold, in GB, of monitored Hive jobs. If the threshold is exceeded, an event is reported. Value range: 1 to 1800000000000. Default value: 400. | 400 |

    • Configuring Spark large job directory monitoring
      Table 5 Parameters for configuring Spark big job monitoring

      | Parameter | Description | Example Value |
      | --- | --- | --- |
      | spark.job.hdfs.monitor.dir | Big directory monitoring path of Spark jobs. The root directory cannot be monitored. If the monitored path includes variable directories such as user directories, replace them with /*/. | hdfs://hacluster/sparkJobHistory/ |
      | spark.job.appId.parser.rule | Rule for extracting job IDs from the monitored path. {subdir}/{appid}: the job ID is in a subdirectory (with a non-fixed name) of the monitoring directory. {appid}: the job ID is directly in the monitoring directory. | {appid} |
      | spark.job.hdfs.dir.threshold | Big directory threshold, in GB, of monitored Spark jobs. If the threshold is exceeded, an event is reported. Value range: 1 to 1800000000000. Default value: 100. | 100 |
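Putting the HDFS-side parameters together, the check amounts to: aggregate each job's usage under the monitored path (grouping files by the job ID that the parser rule extracts), then report the jobs whose total reaches the threshold. A hedged end-to-end sketch with made-up file sizes (not the actual service code):

```python
GB = 1024 ** 3

def usage_by_job(file_sizes: dict[str, int], rule: str) -> dict[str, int]:
    """Aggregate file sizes (path relative to the monitored directory -> bytes)
    per job ID, using the configured appId parser rule."""
    totals: dict[str, int] = {}
    for rel_path, size in file_sizes.items():
        parts = [s for s in rel_path.split("/") if s]
        appid = parts[0] if rule == "{appid}" else parts[1]
        totals[appid] = totals.get(appid, 0) + size
    return totals

def over_threshold(totals: dict[str, int], threshold_gb: int) -> list[str]:
    """Job IDs whose aggregated usage reaches the threshold
    (for example spark.job.hdfs.dir.threshold, 100 GB by default)."""
    return sorted(app for app, size in totals.items() if size >= threshold_gb * GB)

# Made-up listing under hdfs://hacluster/sparkJobHistory/ (rule {appid}).
listing = {"app_1/part-0": 60 * GB, "app_1/part-1": 60 * GB, "app_2/part-0": 40 * GB}
totals = usage_by_job(listing, "{appid}")
print(over_threshold(totals, threshold_gb=100))  # ['app_1']
```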

  4. Save the modified configuration. Restart the expired service or instance for the configuration to take effect.