Updated on 2022-08-17 GMT+08:00

Creating a Job

Function

This API is used to create a job. A job consists of one or more nodes, such as Hive SQL and CDM Job nodes. DLF supports two types of jobs: batch jobs and real-time jobs.

URI

URI format
POST /v1/{project_id}/jobs

Parameter description

**Table 1** URI parameter
Parameter	Mandatory	Type	Description
project_id	Yes	String	Project ID. For details about how to obtain a project ID, see Project ID and Account ID.

Request

**Table 2** Request header parameter
Parameter	Mandatory	Type	Description
workspace	No	String	Workspace ID. If this parameter is not set, the default workspace is used. To operate on another workspace, include this header.
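
As a rough illustration of the URI and header tables above, the following sketch builds the POST request without sending it. The endpoint host, the project ID, the token value, and the use of an X-Auth-Token header for authentication are assumptions made for this example; only the path /v1/{project_id}/jobs and the workspace header come from this page.

```python
import json
import urllib.request


def build_create_job_request(endpoint, project_id, token, body, workspace=None):
    """Build (but do not send) the POST /v1/{project_id}/jobs request."""
    url = f"{endpoint}/v1/{project_id}/jobs"
    headers = {
        "Content-Type": "application/json",
        # Assumption: token-based authentication; not described on this page.
        "X-Auth-Token": token,
    }
    # The workspace header is optional (Table 2).
    if workspace is not None:
        headers["workspace"] = workspace
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Hypothetical endpoint, project ID, token, and workspace ID.
req = build_create_job_request(
    "https://dataarts.example.com",
    "my_project_id",
    "my_token",
    {"name": "demo_job"},
    workspace="my_workspace_id",
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send the request; that step is left out so the sketch can be run without network access.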

**Table 3** Parameters
Parameter	Mandatory	Type	Description
name	Yes	String	Job name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). The job name must be unique.
nodes	Yes	List<Node>	Node definition. For details, see Table 4.
schedule	Yes	Schedule data structure	Scheduling configuration. For details, see Table 5.
params	No	List<Param>	Job parameter definition. For details, see Table 6.
directory	No	String	Directory for saving the job. The value must be an existing directory, for example, /dir/a/. The default value is the root directory.
processType	Yes	String	Job type. Options: REAL_TIME (real-time processing) and BATCH (batch processing).
basicConfig	No	BasicConfig data structure	Basic job information. For details, see Table 26.
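
The constraints in Table 3 can be checked on the client before the request is sent. The helper below is a minimal sketch, not part of the API: the function name is hypothetical, and it assumes that "letters" and "numbers" mean ASCII letters and digits.

```python
import re

# Assumption: "letters" and "numbers" in Table 3 mean ASCII letters and digits.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")
PROCESS_TYPES = ("REAL_TIME", "BATCH")


def build_job_body(name, nodes, schedule, process_type, directory=None, params=None):
    """Assemble the request body described in Table 3 (illustrative helper)."""
    if not JOB_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job name: {name!r}")
    if process_type not in PROCESS_TYPES:
        raise ValueError(f"processType must be one of {PROCESS_TYPES}")
    if not nodes:
        raise ValueError("a job consists of one or more nodes")
    body = {
        "name": name,
        "nodes": nodes,
        "schedule": schedule,
        "processType": process_type,
    }
    # Optional fields are included only when the caller sets them.
    if directory is not None:
        body["directory"] = directory
    if params:
        body["params"] = params
    return body


# Placeholder node: a real Hive SQL node also needs the properties in Table 13.
placeholder_node = {
    "name": "node_1",
    "type": "Hive SQL",
    "location": {"x": 0, "y": 0},
    "properties": [],
}
body = build_job_body(
    name="demo_job",
    nodes=[placeholder_node],
    schedule={"type": "EXECUTE_ONCE"},
    process_type="BATCH",
    directory="/dir/a/",
)
print(sorted(body))
```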

**Table 4** Node data structure description
Parameter	Mandatory	Type	Description
name	Yes	String	Node name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). Names of the nodes in a job must be unique.
type	Yes	String	Node type. The options are as follows: Hive SQL: Runs Hive SQL scripts. Spark SQL: Runs Spark SQL scripts. DWS SQL: Runs DWS SQL scripts. DLISQL: Runs DLI SQL scripts. Shell: Runs shell scripts. CDM Job: Runs CDM jobs. CloudTable Manager: Manages CloudTable tables, including creating and deleting tables. OBS Manager: Manages OBS paths, including creating and deleting paths. RESTAPI: Sends REST API requests. SMN: Sends short messages or emails. MRS Spark: Runs Spark jobs of MRS. MapReduce: Runs MapReduce jobs of MRS. DLI Spark: Runs Spark jobs of DLI. RDS SQL: Transfers SQL statements to RDS for execution.
location	Yes	Location data structure	Location of a node on the job canvas. For details, see Table 7.
preNodeName	No	List<String>	Name of the previous node on which the current node depends.
conditions	No	List<Condition>	Node execution condition. Whether the node is executed or not depends on the calculation result of the EL expression saved in the expression field of condition. For details, see Table 8.
properties	Yes	List	Node property. Each type of node has its own property definition. Hive SQL: For details, see Table 13. Spark SQL: For details, see Table 14. DWS SQL: For details, see Table 15. DLI SQL: For details, see Table 16. Shell: For details, see Table 17. CDM Job: For details, see Table 18. CloudTableManager: For details, see Table 19. OBSManager: For details, see Table 20. RESTAPI: For details, see Table 21. SMN: For details, see Table 22. MRS Spark: For details, see Table 23. MapReduce: For details, see Table 24. DLI Spark: For details, see Table 25.
pollingInterval	No	Int	Interval at which node running results are checked. Unit: second. Value range: 1 to 60. Default value: 10.
maxExecutionTime	No	Int	Maximum execution time of a node. If the node does not finish executing within this time, it is set to the failed state. Unit: minute. Value range: 5 to 1440. Default value: 60.
retryTimes	No	Int	Number of retries for the node. Value range: 0 to 5, where 0 indicates no retry. Default value: 0.
retryInterval	No	Int	Interval between retries after a failure. Unit: second. Value range: 5 to 120. Default value: 120.
failPolicy	No	String	Node failure policy. FAIL: Terminate the execution of the current job. IGNORE: Continue to execute the next node. SUSPEND: Suspend the execution of the current job. FAIL_CHILD: Terminate the execution of the subsequent node. The default value is FAIL.
eventTrigger	No	Event data structure	Node event triggering configuration. For details, see Table 10.
cronTrigger	No	Cron data structure	Node Cron triggering configuration. For details, see Table 9.
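
The numeric ranges and the failure policies in Table 4 lend themselves to a small client-side check. The sketch below is illustrative only: the helper name is hypothetical, and the name/value pair shape used for the properties entry is an assumption, because this page describes properties only as a List.

```python
FAIL_POLICIES = ("FAIL", "IGNORE", "SUSPEND", "FAIL_CHILD")


def build_node(name, node_type, x, y, properties, pre_node_names=None,
               polling_interval=10, max_execution_time=60,
               retry_times=0, retry_interval=120, fail_policy="FAIL"):
    """Build a Node structure (Table 4) after checking the documented ranges."""
    ranges = [
        ("pollingInterval", polling_interval, 1, 60),        # seconds
        ("maxExecutionTime", max_execution_time, 5, 1440),   # minutes
        ("retryTimes", retry_times, 0, 5),
        ("retryInterval", retry_interval, 5, 120),           # seconds
    ]
    for field, value, low, high in ranges:
        if not low <= value <= high:
            raise ValueError(f"{field} must be between {low} and {high}, got {value}")
    if fail_policy not in FAIL_POLICIES:
        raise ValueError(f"failPolicy must be one of {FAIL_POLICIES}")
    node = {
        "name": name,
        "type": node_type,
        "location": {"x": x, "y": y},
        "properties": properties,
        "pollingInterval": polling_interval,
        "maxExecutionTime": max_execution_time,
        "retryTimes": retry_times,
        "retryInterval": retry_interval,
        "failPolicy": fail_policy,
    }
    if pre_node_names:
        node["preNodeName"] = list(pre_node_names)
    return node


node = build_node(
    "hive_node", "Hive SQL", 100, 200,
    # Assumption: properties are sent as name/value pairs.
    properties=[{"name": "scriptName", "value": "my_script"}],
    retry_times=2,
    retry_interval=30,
)
print(node["retryTimes"], node["failPolicy"])  # 2 FAIL
```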

**Table 5** Schedule data structure description
Parameter	Mandatory	Type	Description
type	Yes	String	Scheduling type. EXECUTE_ONCE: The job runs immediately and runs only once. CRON: The job runs periodically. EVENT: The job is triggered by events.
cron	No	Data structure	When type is set to CRON, configure the scheduling frequency and start time. For details, see Table 9.
event	No	Data structure	When type is set to EVENT, configure information such as the event source. For details, see Table 10.

**Table 6** Param data structure description
Parameter	Mandatory	Type	Description
name	Yes	String	Name of a parameter. The name contains a maximum of 64 characters, including only letters, numbers, hyphens (-), and underscores (_).
value	Yes	String	Value of the parameter. It cannot exceed 1024 characters.
type	No	String	Parameter type. Options: variable and constants. Default value: variable.

**Table 7** Location data structure description
Parameter	Mandatory	Type	Description
x	Yes	Int	Position of the node on the horizontal axis of the job canvas.
y	Yes	Int	Position of the node on the vertical axis of the job canvas.

**Table 8** Condition data structure description
Parameter	Mandatory	Type	Description
preNodeName	Yes	String	Name of the previous node on which the current node depends.
expression	Yes	String	EL expression. If the calculation result of the EL expression is true, this node is executed.

**Table 9** Cron data structure description
Parameter	Mandatory	Type	Description
startTime	Yes	String	Scheduling start time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job starts to be scheduled at 23:59:59 on October 22nd, 2018.
endTime	No	String	Scheduling end time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job stops being scheduled at 23:59:59 on October 22nd, 2018. If the end time is not set, the job continues to be executed based on the scheduling period.
expression	Yes	String	Cron expression in the format of <second> <minute> <hour> <day> <month> <week>. For details about the values allowed in each field, see Table 11.
expressionTimeZone	No	String	Time zone corresponding to the Cron expression, for example, GMT+8. Default value: time zone where DataArts Studio is located
dependPrePeriod	No	Boolean	Indicates whether to depend on the execution result of the current job's dependent job in the previous scheduling period. Default value: false
dependJobs	No	DependJobs data structure	Job dependency configuration. For details, see Table 12.
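
The startTime and endTime values can be produced from a timezone-aware datetime. This is a sketch under assumptions: the helper name is hypothetical, and the offset is emitted as +0800, which is what the Z letter of the yyyy-MM-dd'T'HH:mm:ssZ pattern yields in Java-style formatters. The example on this page writes the offset as +08, and whether both forms are accepted is not stated here. The Cron expression assumes space-separated fields.

```python
from datetime import datetime, timedelta, timezone


def format_schedule_time(moment):
    """Format a timezone-aware datetime as yyyy-MM-dd'T'HH:mm:ss plus UTC offset."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    return moment.strftime("%Y-%m-%dT%H:%M:%S%z")


utc_plus_8 = timezone(timedelta(hours=8))
start = datetime(2018, 10, 22, 23, 59, 59, tzinfo=utc_plus_8)

schedule = {
    "type": "CRON",
    "cron": {
        "startTime": format_schedule_time(start),
        # Second is 0 and month is *, as Table 11 requires in the current version.
        "expression": "0 0 1 * * ?",
        "expressionTimeZone": "GMT+8",
    },
}
print(schedule["cron"]["startTime"])  # 2018-10-22T23:59:59+0800
```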

**Table 10** Event data structure description
Parameter	Mandatory	Type	Description
failPolicy	No	String	Job failure policy. SUSPEND: Suspend the event. IGNORE: Ignore the failure and proceed with the next event. Default value: SUSPEND
concurrent	No	Int	Number of concurrently scheduled jobs. Value range: 1 to 128. Default value: 1.
readPolicy	No	String	Access policy. LAST: Access data from the last location. NEW: Access data from a new location. Default value: LAST

**Table 11** Values in the Cron expression fields
Field	Value Range	Allowed Special Character	Description
Second	0-59	, - * /	In the current version, only 0 is allowed.
Minute	0-59	, - * /	-
Hour	0-23	, - * /	-
Day	1-31	, - * ? / L W C	-
Month	1-12	, - * /	In the current version, only * is allowed.
Week	1-7	, - * ? / L C #	Starting from Sunday.
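
The two version-specific restrictions in Table 11 (second must be 0, month must be *) are easy to overlook. A minimal pre-flight check might look like the following; the function name is hypothetical, the fields are assumed to be separated by spaces, and the sketch does not validate value ranges or special characters.

```python
CRON_FIELDS = ("second", "minute", "hour", "day", "month", "week")


def check_cron_expression(expression):
    """Split a six-field Cron expression and apply the Table 11 restrictions."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected {len(CRON_FIELDS)} fields, got {len(parts)}")
    fields = dict(zip(CRON_FIELDS, parts))
    if fields["second"] != "0":
        raise ValueError("in the current version, only 0 is allowed for the second field")
    if fields["month"] != "*":
        raise ValueError("in the current version, only * is allowed for the month field")
    return fields


fields = check_cron_expression("0 30 2 * * ?")
print(fields["hour"], fields["minute"])  # 2 30
```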

**Table 12** DependJobs data structure description
Parameter	Mandatory	Type	Description
jobs	Yes	List<String>	List of jobs that this job depends on. Only existing jobs can be specified.
dependPeriod	No	String	Dependency period. SAME_PERIOD: Whether the job runs depends on the execution result of the jobs it depends on in the current scheduling period. PRE_PERIOD: Whether the job runs depends on the execution result of the jobs it depends on in the previous scheduling period. Default value: SAME_PERIOD
dependFailPolicy	No	String	Policy applied when a job that this job depends on fails. FAIL: Stop the job and set it to the failed state. IGNORE: Continue to run the job. SUSPEND: Suspend the job. Default value: FAIL
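
To show where the DependJobs structure sits, the sketch below nests it inside a Cron structure. The helper and the job name upstream_job are hypothetical; the startTime value is the example given in Table 9, and the Cron expression assumes space-separated fields.

```python
DEPEND_PERIODS = ("SAME_PERIOD", "PRE_PERIOD")
DEPEND_FAIL_POLICIES = ("FAIL", "IGNORE", "SUSPEND")


def build_depend_jobs(jobs, depend_period="SAME_PERIOD", depend_fail_policy="FAIL"):
    """Build a DependJobs structure (Table 12) with the documented defaults."""
    if not jobs:
        raise ValueError("jobs must list at least one existing job")
    if depend_period not in DEPEND_PERIODS:
        raise ValueError(f"dependPeriod must be one of {DEPEND_PERIODS}")
    if depend_fail_policy not in DEPEND_FAIL_POLICIES:
        raise ValueError(f"dependFailPolicy must be one of {DEPEND_FAIL_POLICIES}")
    return {
        "jobs": list(jobs),
        "dependPeriod": depend_period,
        "dependFailPolicy": depend_fail_policy,
    }


cron = {
    "startTime": "2018-10-22T23:59:59+08",
    "expression": "0 0 1 * * ?",
    # Wait for the previous period's run of the (hypothetical) upstream job.
    "dependJobs": build_depend_jobs(["upstream_job"], depend_period="PRE_PERIOD"),
}
print(cron["dependJobs"])
```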

**Table 13** Parameters of the Hive SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in the MRS Hive. The default value is default.
connectionName	No	String	Name of a connection.
scriptArgs	No	String	Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 14** Parameters of the Spark SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in the MRS Spark SQL. The default value is default.
connectionName	No	String	Name of a connection.
scriptArgs	No	String	Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 15** Parameters of the DWS SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in DWS. The default value is postgres.
connectionName	No	String	Name of a connection.
scriptArgs	No	String	Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 16** Parameters of the DLI SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in DLI.
connectionName	No	String	Name of a connection.
scriptArgs	No	String	Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.
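
The scriptArgs value used by the SQL nodes in Tables 13 to 16 is a single string in which the parameters are joined by newlines. A small sketch of that encoding follows; the helper name is hypothetical, and rejecting keys that contain an equals sign or a newline is a defensive choice made for this example, not a documented rule.

```python
def encode_script_args(args):
    """Join key/value pairs into the newline-separated scriptArgs string."""
    lines = []
    for key, value in args.items():
        key, value = str(key), str(value)
        # A newline would split one parameter into two; "=" in a key is ambiguous.
        if "\n" in key or "\n" in value or "=" in key:
            raise ValueError(f"cannot encode parameter {key!r}")
        lines.append(f"{key}={value}")
    return "\n".join(lines)


script_args = encode_script_args({"key1": "value1", "key2": "value2"})
print(repr(script_args))  # 'key1=value1\nkey2=value2'
```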

**Table 17** Parameters of the Shell node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
connectionName	Yes	String	Name of a connection.
arguments	No	String	Shell script parameter.

**Table 18** Parameters of the CDM Job node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	Cluster name. You can obtain the cluster name from the CDM cluster list on the DataArts Migration page of the DataArts Studio console.
jobName	Yes	String	Job name. To obtain the job name, access the DataArts Studio console, choose DataArts Migration, click a cluster name on the Cluster Management page, and click Job Management on the displayed page.

**Table 19** Parameters of the CloudTableManager node
Parameter	Mandatory	Type	Description
namespace	No	String	Namespace. Default value: default
action	Yes	String	Action type. CREATE_TABLE: Create a table. DELETE_TABLE: Delete a table.
table	No	String	Table name.
columnFamily	No	String	Column family.

**Table 20** Parameters of the OBSManager node
Parameter	Mandatory	Type	Description
action	Yes	String	Action type. CREATE_PATH: Create an OBS path. DELETE_PATH: Delete an OBS path.
path	Yes	String	OBS path.

**Table 21** Parameters of the RESTAPI node
Parameter	Mandatory	Type	Description
url	Yes	String	URL address. URL of the cloud service.
method	Yes	String	HTTP method. Options: GET, POST, PUT, and DELETE.
headers	No	String	HTTP message header in the format of <message header name>=<value>. Multiple message headers are separated by newlines.
body	No	String	Message body.

**Table 22** Parameters of the SMN node
Parameter	Mandatory	Type	Description
topic	Yes	String	SMN topic URN. Perform the following operations to obtain an SMN topic URN: Log in to the management console. Click Simple Message Notification and choose Topic Management > Topics from the list on the left. You can obtain the SMN topic URN in the topic list.
subject	Yes	String	Message title, which is used as the subject of an email sent to a subscriber.
messageType	Yes	String	Message type. Options: NORMAL, STRUCTURE, and TEMPLATE.
message	Yes	String	Message to be sent.

**Table 23** Parameters of the MRS Spark node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
resourcePath	Yes	String	OBS resource path of the custom Spark JAR package.
parameters	Yes	String	Custom parameters of the Spark JAR package. You can specify parameters for a custom JAR package.
input	No	String	Input path. Input data path of the MRS Spark job. The path can be an HDFS or OBS path.
output	No	String	Output path. Output data path of the MRS Spark job. The path can be an HDFS or OBS path.
programParameter	No	String	Program parameter. Multiple key-value pairs are allowed and are separated by vertical bars (|).

**Table 24** Parameters of the MapReduce node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
resourcePath	Yes	String	Resource path.
parameters	Yes	String	Job parameter.
input	Yes	String	Input path. Input data path of the MapReduce job. The path can be an HDFS or OBS path.
output	Yes	String	Output path. Output data path of the MapReduce job. The path can be an HDFS or OBS path.

**Table 25** Parameters of the DLI Spark node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	DLI queue name. Perform the following operations to obtain the DLI queue name: Log in to the management console. Click Data Lake Insight and then Queue Management. You can obtain the queue name from the queue management list.
jobName	Yes	String	DLI job name. Perform the following operations to obtain the job name: Log in to the management console. Click Data Lake Insight and then Spark Jobs. Choose Job Management. You can obtain the job name from the job management list.
resourceType	No	String	Resource type of the DLI job. CUSTOMIZED is returned when the parameter is customized.
jobClass	No	String	Main class name. When the application type is .jar, the main class name cannot be empty.
resourcePath	Yes	String	JAR package resource path.
jarArgs	No	String	Main-class entry parameter.
sparkConfig	No	String	Running parameter of the Spark job.

**Table 26** BasicConfig job information
Parameter	Mandatory	Type	Description
owner	No	String	Job owner. The length cannot exceed 128 characters.
priority	No	Int	Job priority. The value ranges from 0 to 2, where 0 indicates top priority, 1 indicates medium priority, and 2 indicates low priority. The default value is 0.
executeUser	No	String	Job execution user. The value must be an existing username.
instanceTimeout	No	Int	Instance timeout interval. Unit: minute. Value range: 5 to 1440. Default value: 60.
customFields	No	Map<String,String>	User-defined field. The length cannot exceed 2048 characters.
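
Because priority is a bare integer in which 0 is the highest level, a named mapping can make client code easier to read. The labels and the helper below are hypothetical conveniences for this sketch; only the numeric values, the ranges, and the defaults come from Table 26.

```python
# Hypothetical labels for the documented priority values.
PRIORITY_LEVELS = {"top": 0, "medium": 1, "low": 2}


def build_basic_config(owner=None, priority="top", execute_user=None,
                       instance_timeout=60, custom_fields=None):
    """Build a BasicConfig structure (Table 26) after checking the documented limits."""
    if priority not in PRIORITY_LEVELS:
        raise ValueError(f"priority must be one of {sorted(PRIORITY_LEVELS)}")
    if not 5 <= instance_timeout <= 1440:
        raise ValueError("instanceTimeout must be between 5 and 1440 minutes")
    if owner is not None and len(owner) > 128:
        raise ValueError("owner cannot exceed 128 characters")
    config = {
        "priority": PRIORITY_LEVELS[priority],
        "instanceTimeout": instance_timeout,
    }
    # Optional fields are included only when the caller sets them.
    if owner is not None:
        config["owner"] = owner
    if execute_user is not None:
        config["executeUser"] = execute_user
    if custom_fields:
        config["customFields"] = dict(custom_fields)
    return config


basic_config = build_basic_config(owner="data_team", priority="medium",
                                  instance_timeout=120)
print(basic_config)
```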

Response

None.

Parent topic: APIs to Be Taken Offline
