Updated on 2022-11-18 GMT+08:00

Description

Development Process

  1. Configure the workflow configuration file workflow.xml (coordinator.xml schedules the workflow, and bundle.xml manages a set of Coordinators) and job.properties.
  2. If custom code is required, develop the relevant JAR packages, for example, for a Java Action. If Hive is used, develop the SQL files.
  3. Upload the configuration files and JAR packages (including dependency JAR packages) to HDFS. The upload path is the value of oozie.wf.application.path in job.properties.
  4. Run the workflow by using one of the following three methods. For details, see More Information.
    • Shell command
    • Java API
    • Hue
  5. The Oozie client provides examples for your reference, involving various Actions and how to use Coordinator and Bundle. For example, if the installation directory of the Oozie client is /opt/client, the examples directory is /opt/client/Oozie/oozie-client-*/examples.
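
As an illustration of how job.properties ties the steps above together, a minimal sketch might look like the following. The NameNode address, queue name, and application directory are placeholder assumptions rather than values from this guide; replace them with the values of your cluster.

```properties
# Hypothetical NameNode address; use the address of your own cluster.
nameNode=hdfs://namenode.example.com:8020
# Hypothetical YARN queue name.
queueName=default
# HDFS directory that holds the uploaded workflow.xml and its lib/ directory.
# The directory layout below is an assumption for illustration only.
oozie.wf.application.path=${nameNode}/user/${user.name}/apps/map-reduce
```

Oozie reads workflow.xml from the directory named by oozie.wf.application.path, so the files uploaded to HDFS must be placed under that path.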

The following example shows how to configure a MapReduce workflow and run it by using Shell commands.

Description

Suppose that a user needs to analyze website logs offline every day and collect statistics on the access frequency of each module of the website. Log files are stored in HDFS.

Jobs are submitted through templates and configuration files in the client.