Updated on 2022-12-02 GMT+08:00

Submitting a Streaming Job

Scenario

This section describes how to submit an Oozie job of the Streaming type on the Hue web UI.

Procedure

  1. Create a workflow. For details, see Creating a Workflow.
  2. On the workflow editing page, select the icon next to Streaming and drag it to the operation area.
  3. In the Streaming window that is displayed, set Mapper, for example, to /bin/cat. Set Reducer, for example, to /usr/bin/wc. Click Add.
  4. Click FILE+ to add the files required for running, for example, /user/oozie/share/lib/mapreduce-streaming/hadoop-streaming-xxx.jar and /user/oozie/share/lib/mapreduce-streaming/oozie-sharelib-streaming-5.1.0.jar.
  5. Click the configuration button in the upper right corner. On the configuration page that is displayed, click Delete+ to specify a directory to delete, for example, /user/admin/examples/output-data/streaming_workflow.
  6. Click PROPERTIES+ to add the following properties:

    • Enter the property name mapred.input.dir in the left box and enter the property value /user/admin/examples/input-data/text in the right box.
    • Enter the property name mapred.output.dir in the left box and enter the property value /user/admin/examples/output-data/streaming_workflow in the right box.

  7. Click the save button in the upper right corner of the Oozie editor.

    To change the job name (default value: My Workflow) before saving the job, click the name and edit it, for example, to Streaming-Workflow.

  8. After the configuration is saved, click the submit button to submit the job.

    After the job is submitted, you can view job information on Hue, such as details, logs, and processes.
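
For reference, the settings entered in the procedure above correspond roughly to an Oozie workflow definition with a map-reduce action that contains a streaming element. The following Python sketch builds such a definition from the example values. It is an illustration only: the schema version (uri:oozie:workflow:0.5), the node names (streaming-node, kill, end), the kill message, and the ${jobTracker} and ${nameNode} parameters are assumptions, and the XML that Hue actually generates may differ. The xxx in the JAR file name is kept as it appears in the procedure.

```python
import xml.etree.ElementTree as ET

# Example values taken from the procedure above.
MAPPER = "/bin/cat"
REDUCER = "/usr/bin/wc"
OUTPUT_DIR = "/user/admin/examples/output-data/streaming_workflow"
PROPERTIES = {
    "mapred.input.dir": "/user/admin/examples/input-data/text",
    "mapred.output.dir": OUTPUT_DIR,
}
FILES = [
    "/user/oozie/share/lib/mapreduce-streaming/hadoop-streaming-xxx.jar",
    "/user/oozie/share/lib/mapreduce-streaming/oozie-sharelib-streaming-5.1.0.jar",
]


def build_workflow(name="Streaming-Workflow"):
    """Build a workflow-app element for a single streaming action."""
    # Assumption: schema version 0.5; Hue may emit a different version.
    root = ET.Element(
        "workflow-app", {"name": name, "xmlns": "uri:oozie:workflow:0.5"}
    )
    # Assumption: node names are hypothetical placeholders.
    ET.SubElement(root, "start", {"to": "streaming-node"})
    action = ET.SubElement(root, "action", {"name": "streaming-node"})
    mr = ET.SubElement(action, "map-reduce")
    # Assumption: these parameters are supplied in the job properties.
    ET.SubElement(mr, "job-tracker").text = "${jobTracker}"
    ET.SubElement(mr, "name-node").text = "${nameNode}"

    # Step 5: directory listed under Delete+.
    prepare = ET.SubElement(mr, "prepare")
    ET.SubElement(prepare, "delete", {"path": "${nameNode}" + OUTPUT_DIR})

    # Step 3: Mapper and Reducer commands.
    streaming = ET.SubElement(mr, "streaming")
    ET.SubElement(streaming, "mapper").text = MAPPER
    ET.SubElement(streaming, "reducer").text = REDUCER

    # Step 6: properties added with PROPERTIES+.
    config = ET.SubElement(mr, "configuration")
    for key, value in PROPERTIES.items():
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = key
        ET.SubElement(prop, "value").text = value

    # Step 4: files added with FILE+.
    for path in FILES:
        ET.SubElement(mr, "file").text = path

    ET.SubElement(action, "ok", {"to": "end"})
    ET.SubElement(action, "error", {"to": "kill"})
    kill = ET.SubElement(root, "kill", {"name": "kill"})
    ET.SubElement(kill, "message").text = "Streaming action failed"
    ET.SubElement(root, "end", {"name": "end"})
    return root


if __name__ == "__main__":
    workflow = build_workflow()
    ET.indent(workflow)  # Requires Python 3.9 or later.
    print(ET.tostring(workflow, encoding="unicode"))
```

Running the script prints the XML so that it can be compared with the workflow definition saved by Hue. With /bin/cat as the mapper and /usr/bin/wc as the reducer, the mapper passes input through unchanged and each reducer emits the line, word, and byte counts of the data it receives.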