Development Procedure
- Analyze the service.
- Implement the service.
- Log in to the node where the client is located and create the dataLoad directory, for example, /opt/client/Oozie/oozie-client-*/examples/apps/dataLoad/. This directory serves as the program running directory and stores the files edited in the following steps.
You can copy the content of the map-reduce directory in the examples directory to the dataLoad directory and then edit it.
Replace oozie-client-* in the path with the directory name that carries the actual version number.
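The copy step above can be sketched as a small shell function. This is only an illustration: the function name create_dataload_dir is hypothetical, and it assumes the map-reduce example sits directly under the examples/apps directory passed as the argument.

```shell
# Sketch: create dataLoad from the bundled map-reduce example.
# $1 is the examples/apps directory of the installed Oozie client.
create_dataload_dir() {
    local apps_dir="$1"
    if [ ! -d "$apps_dir/map-reduce" ]; then
        echo "map-reduce example not found in $apps_dir" >&2
        return 1
    fi
    mkdir -p "$apps_dir/dataLoad"
    # The trailing "/." copies the directory contents, including hidden files.
    cp -r "$apps_dir/map-reduce/." "$apps_dir/dataLoad/"
}

# Example call (replace oozie-client-* with the actual version directory):
# create_dataload_dir /opt/client/Oozie/oozie-client-*/examples/apps
```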
- Write the workflow job property file, job.properties.
For details, see job.properties.
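As a rough illustration, a job.properties for this workflow might be written as follows. The property names nameNode, resourceManager, queueName, dataLoadRoot, oozie.coord.application.path, and workflowAppUri are the ones this procedure refers to. Every value is a placeholder assumption, and start and end are hypothetical properties that a coordinator.xml could read for its schedule window.

```shell
# Sketch: write a sample job.properties into the current directory.
# The quoted 'EOF' keeps the shell from expanding ${...}; Oozie resolves them.
cat > job.properties <<'EOF'
# Placeholder cluster addresses (assumptions, adjust to your cluster)
nameNode=hdfs://hacluster
resourceManager=resourcemanager-host:8032
queueName=default
dataLoadRoot=examples
# Assumed HDFS location of the uploaded dataLoad directory
oozie.coord.application.path=${nameNode}/user/oozie_cli/${dataLoadRoot}/apps/dataLoad
workflowAppUri=${nameNode}/user/oozie_cli/${dataLoadRoot}/apps/dataLoad
# Hypothetical schedule window for the Coordinator job
start=2024-01-01T00:00Z
end=2024-12-31T00:00Z
EOF
```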
- Define the workflow job in workflow.xml.
Table 1 Actions in a Workflow

| No. | Procedure | Description |
| --- | --- | --- |
| 1 | Define the start action. | For details, see Start Action. |
| 2 | Define the MapReduce action. | For details, see MapReduce Action. |
| 3 | Define the FS action. | For details, see FS Action. |
| 4 | Define the end action. | For details, see End Action. |
| 5 | Define the kill action. | For details, see Kill Action. |

Dependent or newly developed JAR packages must be saved in dataLoad/lib.
The following provides an example workflow file:
<workflow-app xmlns="uri:oozie:workflow:1.0" name="data_load">
    <start to="mr-dataLoad"/>
    <action name="mr-dataLoad">
        <map-reduce>
            <resource-manager>${resourceManager}</resource-manager>
            <name-node>${nameNode}</name-node>
            <prepare>
                <delete path="${nameNode}/user/${wf:user()}/${dataLoadRoot}/output-data/map-reduce"/>
            </prepare>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>mapred.mapper.class</name>
                    <value>org.apache.oozie.example.SampleMapper</value>
                </property>
                <property>
                    <name>mapred.reducer.class</name>
                    <value>org.apache.oozie.example.SampleReducer</value>
                </property>
                <property>
                    <name>mapred.map.tasks</name>
                    <value>1</value>
                </property>
                <property>
                    <name>mapred.input.dir</name>
                    <value>/user/oozie/${dataLoadRoot}/input-data/text</value>
                </property>
                <property>
                    <name>mapred.output.dir</name>
                    <value>/user/${wf:user()}/${dataLoadRoot}/output-data/map-reduce</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="copyData"/>
        <error to="fail"/>
    </action>
    <action name="copyData">
        <fs>
            <delete path='${nameNode}/user/oozie/${dataLoadRoot}/result'/>
            <move source='${nameNode}/user/${wf:user()}/${dataLoadRoot}/output-data/map-reduce' target='${nameNode}/user/oozie/${dataLoadRoot}/result'/>
            <chmod path='${nameNode}/user/oozie/${dataLoadRoot}/result' permissions='-rwxrw-rw-' dir-files='true'></chmod>
        </fs>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>This workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
- Define a Coordinator job in coordinator.xml.
The Coordinator job is used to analyze data every day. For details, see coordinator.xml.
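A daily Coordinator definition could look roughly like the sketch below. It is not the reference coordinator.xml: the application name data_load_coord is hypothetical, and the sketch assumes that job.properties defines start, end, workflowAppUri, and the properties the workflow reads (resourceManager, nameNode, queueName, dataLoadRoot).

```shell
# Sketch: write a coordinator.xml that triggers the workflow once a day.
# The quoted 'EOF' keeps ${...} literal so that Oozie resolves it at run time.
cat > coordinator.xml <<'EOF'
<coordinator-app name="data_load_coord" frequency="${coord:days(1)}"
                 start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.2">
    <action>
        <workflow>
            <!-- HDFS directory that contains workflow.xml -->
            <app-path>${workflowAppUri}</app-path>
            <configuration>
                <property>
                    <name>resourceManager</name>
                    <value>${resourceManager}</value>
                </property>
                <property>
                    <name>nameNode</name>
                    <value>${nameNode}</value>
                </property>
                <property>
                    <name>queueName</name>
                    <value>${queueName}</value>
                </property>
                <property>
                    <name>dataLoadRoot</name>
                    <value>${dataLoadRoot}</value>
                </property>
            </configuration>
        </workflow>
    </action>
</coordinator-app>
EOF
```

The frequency expression ${coord:days(1)} is what makes the job run daily; the start and end attributes bound the period in which runs are materialized.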
- Upload the workflow file.
- Use or switch to a user account that has permission to upload files to HDFS.
- Run the HDFS upload command to upload the dataLoad folder to a specified directory on HDFS (user oozie_cli must have read and write permission for that directory).
The specified directory must be the same as oozie.coord.application.path and workflowAppUri defined in job.properties.
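The upload can be sketched with the standard hdfs dfs commands. The function name upload_dataload and the example paths are assumptions for illustration; the target must match the paths configured in job.properties.

```shell
# Sketch: upload a local dataLoad directory into an HDFS parent directory.
# $1 = local dataLoad directory, $2 = HDFS parent directory.
upload_dataload() {
    local local_dir="$1" hdfs_parent="$2"
    # Create the parent directory if it does not exist yet.
    hdfs dfs -mkdir -p "$hdfs_parent" &&
    # Copy the whole directory; it ends up as <hdfs_parent>/dataLoad.
    hdfs dfs -put "$local_dir" "$hdfs_parent/"
}

# Example call with assumed paths:
# upload_dataload ./dataLoad /user/oozie_cli/examples/apps
```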
- Execute the workflow file.
Command:
oozie job -oozie https://<Oozie server hostname>:<port>/oozie -config <job.properties file path> -run
Parameter list:
Table 2 Parameters

| Parameter | Description |
| --- | --- |
| job | Indicates that a job is to be executed. |
| -oozie | Indicates the address of the Oozie server (any instance). |
| -config | Indicates the path of job.properties. |
| -run | Starts the workflow. |

For example:
oozie job -oozie https://10-1-130-10:21003/oozie -config job.properties -run
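When the job is submitted from a script, the submission can be wrapped so that the returned job ID is captured for later status checks. This is a sketch under the assumption that the client prints a line of the form "job: <job-id>" on success; the function name submit_job is hypothetical.

```shell
# Sketch: submit a job and print only its ID.
# $1 = Oozie server URL, $2 = path of job.properties.
submit_job() {
    local oozie_url="$1" props="$2" out
    out=$(oozie job -oozie "$oozie_url" -config "$props" -run) || return 1
    # Keep only the ID from the "job: <job-id>" line.
    printf '%s\n' "$out" | sed -n 's/^job: //p'
}

# Example usage, followed by a status query with the -info option:
# job_id=$(submit_job https://10-1-130-10:21003/oozie job.properties)
# oozie job -oozie https://10-1-130-10:21003/oozie -info "$job_id"
```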