Updated on 2024-04-11 GMT+08:00

Running a HadoopStreaming Job

You can submit your own programs to MRS, run them there, and obtain the results. This topic describes how to submit a HadoopStreaming job on the MRS management console.
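For illustration, a self-developed program for a HadoopStreaming job can be quite small. Hadoop Streaming passes input records to the mapper and reducer on standard input and reads tab-separated key/value lines from their standard output. The following Python word count is only a sketch under that assumption: the function names and the map/reduce command-line argument are hypothetical conventions, not something this topic or MRS prescribes.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one "word<TAB>1" record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts of consecutive records that share a key.

    Hadoop Streaming sorts mapper output by key before the reducer reads it,
    so records with the same word arrive next to each other.
    """
    records = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(records, key=lambda rec: rec[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def main(argv):
    # Hypothetical convention: the first argument selects the role.
    if len(argv) != 2 or argv[1] not in ("map", "reduce"):
        print("usage: <script> map|reduce", file=sys.stderr)
        return
    step = map_lines if argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)


if __name__ == "__main__":
    main(sys.argv)
```

If the sketch is saved under a hypothetical name such as wordcount.py, the logic can be tried locally by piping a text file through the map role, a sort, and then the reduce role before the script is used in a job.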

Submitting a Job on the UI

  1. Log in to the MRS console.
  2. Choose Clusters > Active Clusters, select a running cluster, and click its name to go to the cluster details page.
  3. If Kerberos authentication is enabled for the cluster, synchronize IAM users as follows. Otherwise, skip this step.

    In the Basic Information area on the Dashboard page, click Synchronize on the right side of IAM User Sync to synchronize IAM users. For details, see Synchronizing IAM Users to MRS.

    • When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait 5 minutes after the synchronization is complete before you submit a job. The SSSD (System Security Services Daemon) cache on the cluster nodes needs this time to update before the new policy takes effect. A job submitted earlier may fail.
    • When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, wait 5 minutes after the synchronization is complete for the new policy to take effect, because the SSSD cache on the cluster nodes needs time to update.

  4. Click the Jobs tab.
  5. Click Create. The Create Job page is displayed.
  6. Set Type to HadoopStreaming. Configure job information by referring to Table 1.

    Table 1 Job parameters

    Parameter

    Description

    Name

    Job name. It contains 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed.

    NOTE:

    You are advised to set different names for different jobs.

    Program Parameter

    (Optional) Optimization parameters for the job, such as threads, memory, and vCPUs. Use them to optimize resource usage and improve job execution performance.

    Table 2 describes the common program parameters.

    Parameters

    (Optional) Key parameters for program execution. The parameters are defined by the custom program, and MRS is only responsible for loading them. Separate parameters with spaces. To prevent a parameter from being saved in plaintext, add an at sign (@) before it.

    The value can be left blank or contain a maximum of 150,000 characters. It cannot contain the special characters ;|&><'$

    CAUTION:

    If you enter a parameter with sensitive information (such as the login password), the parameter may be exposed in the job details and logs. Exercise caution when performing this operation.

    Service Parameter

    (Optional) Service parameters for the job. The parameters apply only to this job. To apply the changes to the entire cluster, follow the instructions in Configuring Service Parameters.

    To add more parameters, click the add icon on the right. To delete a parameter, click Delete on the right.

    Table 3 describes the typical service parameters.

    Command Reference

    Commands submitted to the background when the job is submitted.

    Table 2 Program parameters

    Parameter | Description | Example Value
    -ytm | Memory size of each TaskManager container. The unit is optional and defaults to MB. | 1024
    -yjm | Memory size of the JobManager container. The unit is optional and defaults to MB. | 1024
    -yn | Number of Yarn containers allocated to the application. The value is the same as the number of TaskManagers. | 2
    -ys | Number of TaskManager cores | 2
    -ynm | Custom name of the application on Yarn | test
    -c | Class that contains the program entry method (for example, the main or getPlan() method). This parameter is required only when the JAR file does not specify the class in its manifest. | com.bigdata.mrs.test

    For MRS 3.x or later, the -yn parameter is not supported.

    Table 3 Service parameters

    Parameter | Description | Example Value
    fs.obs.access.key | Key ID for accessing OBS | -
    fs.obs.secret.key | Key (corresponding to the key ID) for accessing OBS | -

  7. Confirm the job configuration and click OK.

    After the job is created, you can manage it.
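
The Name and Parameters constraints listed in Table 1 can be checked before a job is submitted, for example in a script that prepares job definitions. The following Python sketch encodes only the limits stated in Table 1. The function and constant names are hypothetical, and treating "letters" as ASCII letters is an assumption. The console may apply further checks that are not covered here.

```python
import re

# Name: 1 to 64 characters; letters, digits, hyphens, and underscores only.
# Assumption: "letters" means ASCII letters A-Z and a-z.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]{1,64}")

# Parameters: special characters that Table 1 says are not allowed.
FORBIDDEN_PARAM_CHARS = set(";|&><'$")
MAX_PARAM_LENGTH = 150_000


def check_job_name(name):
    """Return True if the name satisfies the Name rule in Table 1."""
    return NAME_PATTERN.fullmatch(name) is not None


def check_parameters(value):
    """Return a list of problems; an empty list means the value is acceptable."""
    problems = []
    if len(value) > MAX_PARAM_LENGTH:
        problems.append(f"longer than {MAX_PARAM_LENGTH} characters")
    bad = sorted(FORBIDDEN_PARAM_CHARS.intersection(value))
    if bad:
        problems.append("contains forbidden characters: " + " ".join(bad))
    return problems
```

An empty Parameters value passes the check because Table 1 allows the field to be left blank.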