Creating a Job
Function
This API is used to create a job. A job consists of one or more nodes, such as Hive SQL and CDM Job nodes. DLF supports two types of jobs: batch jobs and real-time jobs.
URI
- Parameter description
Table 1 URI parameter
Parameter | Mandatory | Type | Description |
---|---|---|---|
project_id | Yes | String | Project ID. For details about how to obtain a project ID, see Project ID and Account ID. |
Request
Table 2 Workspace parameter
Parameter | Mandatory | Type | Description |
---|---|---|---|
workspace | No | String | Workspace ID. |
Table 3 Job parameters
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | Job name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). The job name must be unique. |
nodes | Yes | List<Node> | Node definition. For details, see Table 4. |
schedule | Yes | Schedule data structure | Scheduling configuration. For details, see Table 5. |
params | No | List<Param> | Job parameter definition. For details, see Table 6. |
directory | No | String | Directory for saving the job. The value must be an existing directory, for example, /dir/a/. The default value is the root directory. |
processType | Yes | String | Job type. |
basicConfig | No | BasicConfig data structure | Basic job information. For details, see Table 27. |
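The mandatory fields in the job parameter table can be assembled into a request body along the following lines. This is a minimal sketch rather than an official client: `build_job_body` is a hypothetical helper, the regular expression is one reading of the name rule (ASCII letters are assumed), and the node contents, schedule contents, and the `BATCH` job type value are placeholder assumptions because the allowed values are not listed in this section.

```python
import json
import re

# One reading of the job name rule: 1 to 128 characters drawn from ASCII
# letters, digits, hyphens, underscores, and periods (assumption).
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")


def build_job_body(name, nodes, schedule, process_type, directory=None):
    """Assemble a request body holding the mandatory job fields."""
    if not JOB_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid job name: {name!r}")
    if not nodes:
        raise ValueError("a job needs at least one node")
    body = {
        "name": name,
        "nodes": nodes,
        "schedule": schedule,
        "processType": process_type,
    }
    # directory is optional; when omitted the service uses the root directory.
    if directory is not None:
        body["directory"] = directory
    return body


# Placeholder values: node contents, schedule contents, and the job type
# string are illustrative assumptions, not values confirmed by this page.
body = build_job_body(
    name="demo_job.v1",
    nodes=[{"name": "node_1"}],
    schedule={"type": "CRON"},
    process_type="BATCH",
    directory="/dir/a/",
)
print(json.dumps(body, indent=2))
```

Checking the name locally only catches format problems early; uniqueness of the job name can only be verified by the service.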
Table 4 Node data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | Node name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). Names of the nodes in a job must be unique. |
type | Yes | String | Node type. The options are as follows: |
location | Yes | Location data structure | Location of a node on the job canvas. For details, see Table 7. |
preNodeName | No | List<String> | Name of the previous node on which the current node depends. |
conditions | No | List<Condition> | Node execution condition. Whether the node is executed depends on the result of the EL expression saved in the expression field of the condition. For details, see Table 8. |
properties | Yes | List | Node property. Each type of node has its own property definition. |
pollingInterval | No | Int | Interval at which node running results are checked. Unit: second. Value range: 1 to 60. Default value: 10. |
maxExecutionTime | No | Int | Maximum execution time of a node. If the node does not finish running within this time, it is set to the failed state. Unit: minute. Value range: 5 to 1440. Default value: 60. |
retryTimes | No | Int | Number of node retries. Value range: 0 to 5. 0 indicates no retry. Default value: 0. |
retryInterval | No | Int | Interval between retries after a failure. Unit: second. Value range: 5 to 120. Default value: 120. |
failPolicy | No | String | Node failure policy. |
eventTrigger | No | Event data structure | Event trigger for the real-time job node. For details, see Table 11. |
cronTrigger | No | Cron data structure | Cron trigger for the real-time job node. For details, see Table 9. |
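The numeric node fields above each carry a value range and a default. A small local check, sketched below, can catch out-of-range values before the request is sent. `NODE_LIMITS` and `apply_node_defaults` are hypothetical names; the service applies its own defaults, so filling them in on the client side is only for illustration.

```python
# (minimum, maximum, default) for the optional numeric node fields.
NODE_LIMITS = {
    "pollingInterval": (1, 60, 10),      # seconds
    "maxExecutionTime": (5, 1440, 60),   # minutes
    "retryTimes": (0, 5, 0),             # 0 means no retry
    "retryInterval": (5, 120, 120),      # seconds
}


def apply_node_defaults(node):
    """Return a copy of node with defaults filled in and ranges checked."""
    result = dict(node)
    for field, (low, high, default) in NODE_LIMITS.items():
        value = result.setdefault(field, default)
        if not isinstance(value, int) or not low <= value <= high:
            raise ValueError(f"{field} must be an integer from {low} to {high}")
    return result


# The node name is a placeholder; the other mandatory node fields are omitted.
node = apply_node_defaults({"name": "node_1", "retryTimes": 3})
print(node)
```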
Table 5 Schedule data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
type | Yes | String | Scheduling type. |
cron | No | Data structure | When type is set to CRON, configure the scheduling frequency and start time. For details, see Table 10. |
event | No | Data structure | When type is set to EVENT, configure information such as the event source. For details, see Table 11. |
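The relationship between type and its companion block can be checked before a request is sent. The sketch below assumes that a CRON schedule must carry a cron block and an EVENT schedule must carry an event block, which is how the descriptions above read. `check_schedule` is a hypothetical helper, and any other type value passes through unchecked.

```python
def check_schedule(schedule):
    """Check that the block matching the scheduling type is present."""
    # Only the two type values named in the table are covered (assumption).
    required_block = {"CRON": "cron", "EVENT": "event"}
    kind = schedule.get("type")
    if kind is None:
        raise ValueError("type is mandatory")
    block = required_block.get(kind)
    if block is not None and block not in schedule:
        raise ValueError(f"type {kind} needs a '{block}' block")
    return schedule


# The cron contents here are placeholders.
checked = check_schedule({"type": "CRON", "cron": {"expression": "0 0 2 * * ?"}})
print(checked["type"])
```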
Table 6 Param data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
name | Yes | String | Name of a parameter. The name contains a maximum of 64 characters, including only letters, numbers, hyphens (-), and underscores (_). |
value | Yes | String | Value of the parameter. It cannot exceed 1024 characters. |
type | No | String | Parameter type. |
Table 7 Location data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
x | Yes | Int | Position of the node on the horizontal axis of the job canvas. |
y | Yes | Int | Position of the node on the vertical axis of the job canvas. |
Table 8 Condition data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
preNodeName | Yes | String | Name of the previous node on which the current node depends. |
expression | Yes | String | EL expression. If the calculation result of the EL expression is true, this node is executed. |
Table 9 Cron data structure (node cronTrigger)
Parameter | Mandatory | Type | Description |
---|---|---|---|
startTime | Yes | String | Scheduling start time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08 indicates that scheduling of the job starts at 23:59:59 on October 22, 2018. |
endTime | No | String | Scheduling end time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08 indicates that scheduling of the job stops at 23:59:59 on October 22, 2018. If the end time is not set, the job keeps being executed based on the scheduling period. |
expression | Yes | String | Cron expression in the format of <second><minute><hour><day><month><week>. For details about the value input in each field, see Table 12. |
expressionTimeZone | No | String | Time zone corresponding to the Cron expression, for example, GMT+8. Default value: time zone where DataArts Studio is located. |
period | Yes | String | Job execution interval, consisting of a time and a time unit. Examples: 1 hours, 1 days, 1 weeks, 1 months. The value must match the value of expression. |
dependPrePeriod | No | Boolean | Indicates whether to depend on the execution result of the current job's dependent job in the previous scheduling period. Default value: false. |
dependJobs | No | DependJobs data structure | Job dependency configuration. For details, see Table 13. |
concurrent | No | Integer | Number of concurrent executions allowed. |
Table 10 Cron data structure (schedule cron)
Parameter | Mandatory | Type | Description |
---|---|---|---|
startTime | Yes | String | Scheduling start time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08 indicates that scheduling of the job starts at 23:59:59 on October 22, 2018. |
endTime | No | String | Scheduling end time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08 indicates that scheduling of the job stops at 23:59:59 on October 22, 2018. If the end time is not set, the job keeps being executed based on the scheduling period. |
expression | Yes | String | Cron expression in the format of <second><minute><hour><day><month><week>. For details about the value input in each field, see Table 12. |
expressionTimeZone | No | String | Time zone corresponding to the Cron expression, for example, GMT+8. Default value: time zone where DataArts Studio is located. |
dependPrePeriod | No | Boolean | Indicates whether to depend on the execution result of the current job's dependent job in the previous scheduling period. Default value: false. |
dependJobs | No | DependJobs data structure | Job dependency configuration. For details, see Table 13. |
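The startTime and endTime values follow the documented example 2018-10-22T23:59:59+08. The sketch below produces strings in that shape from a timezone-aware `datetime`. `format_schedule_time` is a hypothetical helper, and it follows the example's two-digit hour offset. Offsets that are not whole hours are rejected because this section does not show how they should be written.

```python
from datetime import datetime, timedelta, timezone


def format_schedule_time(moment):
    """Format an aware datetime like the example 2018-10-22T23:59:59+08."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must carry a time zone")
    total_minutes = int(offset.total_seconds()) // 60
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    if minutes:
        # The documented example only shows a whole-hour offset.
        raise ValueError("only whole-hour offsets are covered by this sketch")
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + f"{sign}{hours:02d}"


start = datetime(2018, 10, 22, 23, 59, 59, tzinfo=timezone(timedelta(hours=8)))
print(format_schedule_time(start))  # 2018-10-22T23:59:59+08
```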
Table 11 Event data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
eventType | Yes | String | Event type. Set this parameter to KAFKA, DIS, or OBS. KAFKA: Select the corresponding connection name and topic. When a new Kafka message is received, the job is triggered. DIS: Currently, only newly reported data events from the DIS stream can be monitored. Each time a data record is reported, the job runs once. OBS: Select the OBS path to be listened to. If new files exist in the path, scheduling is triggered. The path name can be referenced using variable Job.trigger.obsNewFiles. The prerequisite is that DIS notifications have been configured for the OBS path. |
failPolicy | No | String | Job failure policy. Default value: SUSPEND. |
concurrent | No | int | Number of jobs that can be scheduled concurrently. Value range: 1 to 128. Default value: 1. |
readPolicy | No | String | Access policy. Default value: LAST. |
Table 12 Cron expression fields
Field | Value Range | Allowed Special Characters | Description |
---|---|---|---|
Second | 0-59 | , - * / | In the current version, only 0 is allowed. |
Minute | 0-59 | , - * / | - |
Hour | 0-23 | , - * / | - |
Day | 1-31 | , - * ? / L W C | - |
Month | 1-12 | , - * / | In the current version, only * is allowed. |
Week | 1-7 | , - * ? / L C # | Starting from Sunday. |
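The two current-version restrictions in the table above (Second must be 0, Month must be *) are easy to check before submitting an expression. The sketch below assumes the six fields are separated by whitespace, which is common for cron expressions but is not stated explicitly in this section. `check_cron_expression` is a hypothetical helper and does not validate value ranges or special characters.

```python
FIELD_NAMES = ("second", "minute", "hour", "day", "month", "week")


def check_cron_expression(expression):
    """Split a six-field expression and apply the current-version limits."""
    # Assumption: fields are separated by whitespace.
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError("expected six fields: " + " ".join(FIELD_NAMES))
    parsed = dict(zip(FIELD_NAMES, fields))
    if parsed["second"] != "0":
        raise ValueError("second must be 0 in the current version")
    if parsed["month"] != "*":
        raise ValueError("month must be * in the current version")
    return parsed


parsed = check_cron_expression("0 0 2 * * ?")
print(parsed["hour"])  # 2
```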
Table 13 DependJobs data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
jobs | Yes | List<String> | List of dependent jobs. Only existing jobs can be depended on. |
dependPeriod | No | String | Dependency period. Default value: SAME_PERIOD. |
dependFailPolicy | No | String | Dependency job failure policy. Default value: FAIL. |
Table 14 Hive SQL node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
scriptName | Yes | String | Script name. |
database | No | String | Name of the database in MRS Hive. The default value is default. |
connectionName | No | String | Name of a connection. |
scriptArgs | No | String | Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2. |
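A scriptArgs string in the key1=value1\nkey2=value2 form can be produced from a dictionary. The sketch below is one way to do it. `build_script_args` is a hypothetical helper, and rejecting newlines or an equals sign inside a key is a precaution added here, not a documented rule.

```python
def build_script_args(arguments):
    """Join key=value pairs with newline characters."""
    lines = []
    for key, value in arguments.items():
        key, value = str(key), str(value)
        # Precaution (assumption): these characters would make the
        # resulting string ambiguous to split again.
        if "\n" in key or "=" in key or "\n" in value:
            raise ValueError(f"unsupported character in argument {key!r}")
        lines.append(f"{key}={value}")
    return "\n".join(lines)


script_args = build_script_args({"key1": "value1", "key2": "value2"})
print(repr(script_args))  # 'key1=value1\nkey2=value2'
```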
Table 15 Spark SQL node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
scriptName | Yes | String | Script name. |
database | No | String | Name of the database in MRS Spark SQL. The default value is default. |
connectionName | No | String | Name of a connection. |
scriptArgs | No | String | Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2. |
Table 16 DWS SQL node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
scriptName | Yes | String | Script name. |
database | No | String | Name of the database in DWS. The default value is postgres. |
connectionName | No | String | Name of a connection. |
scriptArgs | No | String | Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2. |
Table 17 DLI SQL node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
scriptName | Yes | String | Script name. |
database | No | String | Name of the database in DLI. |
connectionName | No | String | Name of a connection. |
scriptArgs | No | String | Script parameters in key=value format. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2. |
Table 18 Shell node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
scriptName | Yes | String | Script name. |
connectionName | Yes | String | Name of a connection. |
arguments | No | String | Shell script parameter. |
Table 19 CDM Job node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
clusterName | Yes | String | Cluster name. You can obtain the cluster name from the CDM cluster list on the DataArts Migration page of the DataArts Studio console. |
jobName | Yes | String | Job name. To obtain the job name, access the DataArts Studio console, choose DataArts Migration, click a cluster name on the Cluster Management page, and click Job Management on the displayed page. |
Table 20 CloudTable Manager node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
namespace | No | String | Namespace. Default value: default. |
action | Yes | String | Action type. |
table | No | String | Table name. |
columnFamily | No | String | Column family. |
Table 21 OBS Manager node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
action | Yes | String | Action type. |
path | Yes | String | OBS path. |
Table 22 REST Client node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
url | Yes | String | URL of the cloud service. |
method | Yes | String | HTTP method. |
headers | No | String | HTTP message headers in the format of <message header name>=<value>. Multiple message headers are separated by newlines. |
body | No | String | Message body. |
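The headers value packs several headers into one newline-separated string. Reading such a string back into a dictionary makes it easier to inspect or test a node definition. The sketch below assumes the first equals sign on each line separates the header name from its value. `parse_headers` is a hypothetical helper, and the header names in the usage line are placeholders.

```python
def parse_headers(text):
    """Turn 'name=value' lines into a dictionary of headers."""
    headers = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Assumption: the first '=' ends the header name, so values
        # may themselves contain '='.
        name, separator, value = line.partition("=")
        if not separator or not name.strip():
            raise ValueError(f"not a name=value header line: {line!r}")
        headers[name.strip()] = value.strip()
    return headers


headers = parse_headers("Content-Type=application/json\nX-Demo=a=b")
print(headers)
```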
Table 23 SMN node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
topic | Yes | String | SMN topic URN. You can obtain the SMN topic URN in the topic list. |
subject | Yes | String | Message title, which is used as the subject of an email sent to a subscriber. |
messageType | Yes | String | Message type. |
message | Yes | String | Message to be sent. |
Table 24 MRS Spark node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
clusterName | Yes | String | MRS cluster name. You can obtain the cluster name from the active clusters. |
jobName | Yes | String | MRS job name. The job name is user-defined. |
resourcePath | Yes | String | OBS resource path of the custom Spark JAR package. |
parameters | Yes | String | Custom parameters of the Spark JAR package. You can specify parameters for a custom JAR package. |
input | No | String | Input data path of the MRS Spark job. The path can be an HDFS or OBS path. |
output | No | String | Output data path of the MRS Spark job. The path can be an HDFS or OBS path. |
programParameter | No | String | Program parameter. Multiple key-value pairs are allowed and separated by vertical bars (\|). |
Table 25 MapReduce node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
clusterName | Yes | String | MRS cluster name. You can obtain the cluster name from the active clusters. |
jobName | Yes | String | MRS job name. The job name is user-defined. |
resourcePath | Yes | String | Resource path. |
parameters | Yes | String | Job parameter. |
input | Yes | String | Input data path of the MapReduce job. The path can be an HDFS or OBS path. |
output | Yes | String | Output data path of the MapReduce job. The path can be an HDFS or OBS path. |
Table 26 DLI Spark node properties
Parameter | Mandatory | Type | Description |
---|---|---|---|
clusterName | Yes | String | DLI queue name. You can obtain the queue name from the queue management list. |
jobName | Yes | String | DLI job name. You can obtain the job name from the job management list. |
resourceType | No | String | Resource type of the DLI job. CUSTOMIZED is returned when the parameter is customized. |
jobClass | No | String | Main class name. When the application type is .jar, the main class name cannot be empty. |
resourcePath | Yes | String | JAR package resource path. |
jarArgs | No | String | Main-class entry parameter. |
sparkConfig | No | String | Running parameter of the Spark job. |
Table 27 BasicConfig data structure
Parameter | Mandatory | Type | Description |
---|---|---|---|
owner | No | String | Job owner. The length cannot exceed 128 characters. |
priority | No | int | Job priority. The value ranges from 0 to 2. 0 indicates top priority, 1 indicates medium priority, and 2 indicates low priority. The default value is 0. |
executeUser | No | String | Job execution user. The value must be an existing username. |
instanceTimeout | No | int | Instance timeout interval. Unit: minute. The value ranges from 5 to 1440. The default value is 60. |
customFields | No | Map<String,String> | User-defined field. The length cannot exceed 2048 characters. |
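The priority values 0, 1, and 2 are easier to read in client code when given names. The sketch below wraps them in an enumeration and applies the documented owner length and instanceTimeout range. `JobPriority` and `build_basic_config` are hypothetical names, and only a subset of the BasicConfig fields is covered.

```python
from enum import IntEnum


class JobPriority(IntEnum):
    """Named aliases for the documented priority values."""
    TOP = 0
    MEDIUM = 1
    LOW = 2


def build_basic_config(owner=None, priority=JobPriority.TOP, instance_timeout=60):
    """Build a basicConfig dictionary after checking the documented limits."""
    if owner is not None and len(owner) > 128:
        raise ValueError("owner cannot exceed 128 characters")
    if not 5 <= instance_timeout <= 1440:
        raise ValueError("instanceTimeout must be from 5 to 1440 minutes")
    config = {
        # JobPriority(...) raises ValueError for values outside 0 to 2.
        "priority": int(JobPriority(priority)),
        "instanceTimeout": instance_timeout,
    }
    if owner is not None:
        config["owner"] = owner
    return config


# The owner name is a placeholder.
basic_config = build_basic_config(owner="alice", priority=JobPriority.LOW)
print(basic_config)
```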
Response
None.