Updated on 2025-11-17 GMT+08:00

Creating a Job

Function

This API is used to create a job. A job consists of one or more nodes, such as Hive SQL and CDM Job nodes. DLF supports two types of jobs: batch jobs and real-time jobs.

URI

URI format
POST /v1/{project_id}/jobs

Parameter description

**Table 1** URI parameters
Parameter	Mandatory	Type	Description
project_id	Yes	String	Project ID. For details about how to obtain a project ID, see Project ID and Account ID.

Request Parameters

**Table 2** Request header parameter
Parameter	Mandatory	Type	Description
workspace	No	String	Workspace ID. If this parameter is not set, the default workspace is used. To operate on another workspace, this header must be carried. NOTE: If you have multiple DataArts Studio instances, you need to specify a workspace. This parameter is mandatory if no default workspace is available; if it is omitted in that case, an error is reported.
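
As a rough illustration of how the URI and the workspace header fit together, the following Python sketch builds the request without sending it. The endpoint host, the `X-Auth-Token` authentication header, and the helper name `build_create_job_request` are assumptions made for this example and are not defined on this page; substitute the values that apply to your region and authentication mode.

```python
import json
import urllib.request


def build_create_job_request(endpoint, project_id, token, job, workspace=None):
    """Build (but do not send) a POST /v1/{project_id}/jobs request."""
    url = f"{endpoint.rstrip('/')}/v1/{project_id}/jobs"
    headers = {
        "Content-Type": "application/json",
        # Assumption: token-based authentication; not described on this page.
        "X-Auth-Token": token,
    }
    if workspace is not None:
        # Optional header from Table 2; omit it to use the default workspace.
        headers["workspace"] = workspace
    data = json.dumps(job).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Hypothetical endpoint and placeholder credentials, for illustration only.
req = build_create_job_request(
    "https://dataarts.example.com",
    "b384b9e9ab9b4ee8994c8633aabc9505",
    "<token>",
    {"name": "myJob"},
    workspace="<workspace-id>",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request. The sketch stops short of that so it can be read and run without network access.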

**Table 3** Parameters
Parameter	Mandatory	Type	Description
name	Yes	String	Job name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). The job name must be unique.
nodes	Yes	List<Node>	Node definition. For details, see Table 4.
schedule	Yes	Schedule data structure	Scheduling configuration. For details, see Table 5.
params	No	List<Param>	Job parameter definition. For details, see Table 6.
directory	No	String	Path of a job in the directory tree. If the directory of the path does not exist during job creation, a directory is automatically created in the root directory /, for example, /dir/a/.
processType	Yes	String	Job type. REAL_TIME: real-time processing. BATCH: batch processing.
singleNodeJobFlag	No	Boolean	Whether the job is a single-task job. The default value is false.
singleNodeJobType	No	String	Single task type. If processType is BATCH, the value can be DLISQL, DWSSQL, HiveSQL, SparkSQL, or RDSSQL. If processType is REAL_TIME, the value can be FlinkSQL, FlinkJar, or DLISpark.
lastUpdateUser	No	String	User who last updated the job
logPath	No	String	OBS path for storing job run logs
basicConfig	No	BasicConfig data structure	Basic job information. For details, see Table 28.
emptyRunningJob	No	String	The value can be 0 or 1. 1 indicates dry run, and 0 indicates canceling dry run. If this parameter is not set, the default value 0 is used.
targetStatus	No	String	This parameter is required if the review function is enabled. It indicates the target status of the job. The value can be SAVED, SUBMITTED, or PRODUCTION. SAVED indicates that the job is saved but cannot be scheduled or executed. It can be executed only after it is submitted and approved. SUBMITTED indicates that the job is automatically submitted after it is saved and can be executed after it is approved. PRODUCTION indicates that the job can be directly executed after it is created. Note: Only the workspace administrator can create jobs in the PRODUCTION state.
approvers	No	List<JobApprover>	Job approver. This parameter is required if the review function is enabled. For details, see Table 32.
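
The naming rule for the `name` parameter can be checked on the client before a request is sent. The sketch below is one possible reading of that rule in Python. It assumes that "letters" means ASCII letters, which this page does not state, and the helper name `is_valid_job_name` is made up for the example.

```python
import re

# 1 to 128 characters: letters, digits, hyphens, underscores, and periods.
# Assumption: "letters" is read as ASCII letters only.
JOB_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")


def is_valid_job_name(name):
    """Return True if name satisfies the documented job-name rule."""
    return JOB_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_job_name("myJob"))    # True
print(is_valid_job_name("my job"))   # False: spaces are not allowed
```

Uniqueness of the job name cannot be verified locally; the service reports a duplicate name in its error response.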

**Table 4** Node data structure description
Parameter	Mandatory	Type	Description
name	Yes	String	Node name. The name contains a maximum of 128 characters, including only letters, numbers, hyphens (-), underscores (_), and periods (.). Names of the nodes in a job must be unique.
type	Yes	String	Node type. The options are as follows: Hive SQL: runs Hive SQL scripts. Spark SQL: runs Spark SQL scripts. DWS SQL: runs DWS SQL scripts. DLI SQL: runs DLI SQL scripts. Shell: runs shell scripts. CDM Job: runs CDM jobs. CloudTable Manager: manages CloudTable tables, including creating and deleting tables. OBS Manager: manages OBS paths, including creating and deleting paths. RESTAPI: sends REST API requests. SMN: sends short messages or emails. MRS Spark: runs Spark jobs of MRS. MapReduce: runs MapReduce jobs of MRS. MRSFlinkJob: runs FlinkJob jobs of MRS. MRS HetuEngine: runs HetuEngine jobs of MRS. DLI Spark: runs Spark jobs of DLI. RDSSQL: transfers SQL statements to RDS for execution. ModelArts Train: executes workflow jobs of ModelArts. Dummy: job with no node.
location	Yes	Location data structure	Location of a node on the job canvas. For details, see Table 7.
preNodeName	No	List<String>	Name of the previous node on which the current node depends
conditions	No	List<Condition>	Node execution condition. Whether the node is executed or not depends on the calculation result of the EL expression saved in the expression field of condition. For details, see Table 8.
properties	Yes	List<Property>	Node properties. For details, see Table 14. Each type of node has its own property definition. Hive SQL: For details, see Table 15. Spark SQL: For details, see Table 16. DWS SQL: For details, see Table 17. DLI SQL: For details, see Table 18. Shell: For details, see Table 19. CDM Job: For details, see Table 20. CloudTableManager: For details, see Table 21. OBSManager: For details, see Table 22. RESTAPI: For details, see Table 23. SMN: For details, see Table 24. MRS Spark: For details, see Table 25. MapReduce: For details, see Table 26. DLI Spark: For details, see Table 27. MRS Flink: For details, see Table 29. MRS HetuEngine: For details, see Table 30. ModelArts Train: For details, see Table 31.
pollingInterval	No	Int	Interval at which node running results are checked. Unit: second. Value range: 1 to 60. Default value: 10
execTimeOutRetry	No	String	Whether to retry a node upon timeout. The default value is false.
maxExecutionTime	No	Int	Maximum execution time of a node. If a node is not executed within the maximum execution time, the node is set to the failed state. The unit is minute. The value ranges from 5 to 7200. Other values do not take effect. Default value: 60
retryTimes	No	Int	Number of the node retries. The value ranges from 1 to 100. Default value: 1
retryInterval	No	Int	Interval at which a retry is performed upon a failure. The value ranges from 5 to 600. Unit: second. Default value: 120
failPolicy	No	String	Node failure policy. FAIL: Terminate the execution of the current job. IGNORE: Continue to execute the next node. SUSPEND: Suspend the execution of the current job. FAIL_CHILD: Terminate the execution of the subsequent node. The default value is FAIL.
eventTrigger	No	Event data structure	Event trigger for the real-time job node. For details, see Table 11.
cronTrigger	No	Cron data structure	Cron trigger for the real-time job node. For details, see Table 9.

**Table 5** Schedule data structure description
Parameter	Mandatory	Type	Description
type	Yes	String	Scheduling type. EXECUTE_ONCE: The job runs immediately and runs only once. CRON: The job runs periodically. EVENT: The job is triggered by events.
cron	No	Data structure	When type is set to CRON, configure the scheduling frequency and start time. For details, see Table 10.
event	No	Data structure	When type is set to EVENT, configure information such as the event source. For details, see Table 11.

**Table 6** Param data structure description
Parameter	Mandatory	Type	Description
name	Yes	String	Name of a parameter. The name contains a maximum of 64 characters, including only letters, numbers, hyphens (-), and underscores (_).
value	Yes	String	Value of the parameter. It cannot exceed 1,024 characters.
type	No	String	Parameter type. The value can be variable or constants. Default value: variable

**Table 7** Location data structure description
Parameter	Mandatory	Type	Description
x	Yes	Int	Position of the node on the horizontal axis of the job canvas
y	Yes	Int	Position of the node on the vertical axis of the job canvas

**Table 8** Condition data structure description
Parameter	Mandatory	Type	Description
preNodeName	Yes	String	Name of the previous node on which the current node depends
expression	Yes	String	EL expression. If the calculation result of the EL expression is true, this node is executed.

**Table 9** CronTrigger data structure description
Parameter	Mandatory	Type	Description
startTime	Yes	String	Scheduling start time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job starts to be scheduled at 23:59:59 on October 22nd, 2018.
endTime	No	String	Scheduling end time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job stops being scheduled at 23:59:59 on October 22nd, 2018. If the end time is not set, the job continues to be executed based on the scheduling period.
expression	Yes	String	Cron expression in the format of <second><minute><hour><day><month><week>. For details about the value input in each field, see Table 12.
expressionTimeZone	No	String	Time zone corresponding to the Cron expression, for example, GMT+8. Default value: time zone where DataArts Studio is located
period	Yes	String	Job execution interval consisting of a time and a time unit, for example, 1 hour, 1 day, 1 week, or 1 month. The value must match the value of expression.
dependPrePeriod	No	Boolean	Indicates whether to depend on the execution result of the current job's dependent job in the previous scheduling period. Default value: false
dependJobs	No	DependJobs data structure	Job dependency configuration. For details, see Table 13.
concurrent	No	Integer	Number of concurrent executions allowed

**Table 10** Cron data structure description
Parameter	Mandatory	Type	Description
startTime	Yes	String	Scheduling start time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job starts to be scheduled at 23:59:59 on October 22nd, 2018.
endTime	No	String	Scheduling end time in the format of yyyy-MM-dd'T'HH:mm:ssZ, which is an ISO 8601 time format. For example, 2018-10-22T23:59:59+08, which indicates that a job stops being scheduled at 23:59:59 on October 22nd, 2018. If the end time is not set, the job continues to be executed based on the scheduling period.
expression	Yes	String	Cron expression in the format of <second><minute><hour><day><month><week>. For details about the value input in each field, see Table 12.
expressionTimeZone	No	String	Time zone corresponding to the Cron expression, for example, GMT+8. Default value: time zone where DataArts Studio is located
dependPrePeriod	No	Boolean	Indicates whether to depend on the execution result of the current job's dependent job in the previous scheduling period. Default value: false
dependJobs	No	DependJobs data structure	Job dependency configuration. For details, see Table 13.

**Table 11** Event data structure description
Parameter	Mandatory	Type	Description
eventType	Yes	String	Event type. KAFKA: The job is triggered when a new Kafka message is received. Select the corresponding connection name and topic. DIS: Currently, only newly reported data events from the DIS stream can be monitored. Each time a data record is reported, the job runs once. OBS: Select the OBS path to be listened to. If new files exist in the path, scheduling is triggered. The path name can be referenced using the variable Job.trigger.obsNewFiles. The prerequisite is that DIS notifications have been configured for the OBS path.
failPolicy	No	String	Job failure policy. SUSPEND: Suspend the event. IGNORE: Ignore the failure and proceed with the next event. Default value: SUSPEND
concurrent	No	int	Number of concurrently scheduled jobs. Value range: 1 to 128. Default value: 1
readPolicy	No	String	Access policy. LAST: Access data from the last location. NEW: Access data from a new location. Default value: LAST

**Table 12** Values in the Cron expression fields
Field	Value Range	Allowed Special Character	Description
Second	0-59	, - * /	In the current version, only 0 is allowed.
Minute	0-59	, - * /	None
Hour	0-23	, - * /	None
Day	1-31	, - * ? / L W C	None
Month	1-12	, - * /	In the current version, only * is allowed.
Week	1-7	, - * ? / L C #	Starting from Sunday.
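
The restrictions in Table 12 can be folded into a small helper so that the Second and Month fields are always filled with their only permitted values. This is a minimal sketch under two assumptions that are not stated on this page: the six fields are separated by single spaces, and `?` is an acceptable placeholder for the Week field when Day is set. The function name `build_cron_expression` is hypothetical.

```python
def build_cron_expression(minute="0", hour="*", day="*", week="?"):
    """Join the six cron fields in <second> <minute> <hour> <day> <month> <week> order."""
    second = "0"  # Table 12: in the current version, only 0 is allowed
    month = "*"   # Table 12: in the current version, only * is allowed
    # Assumption: fields are space-separated.
    return " ".join([second, minute, hour, day, month, week])


# A job that runs every day at 02:30.
print(build_cron_expression(minute="30", hour="2"))  # 0 30 2 * * ?
```

The helper does not validate the value ranges or special characters of the other fields; those still need to follow Table 12.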

**Table 13** DependJobs data structure description
Parameter	Mandatory	Type	Description
jobs	Yes	List<String>	A list of dependent jobs. Only the existing jobs can be depended on.
dependPeriod	No	String	Dependency period. SAME_PERIOD: To run a job or not depends on the execution result of its depended job in the current scheduling period. PRE_PERIOD: To run a job or not depends on the execution result of its depended job in the previous scheduling period. Default value: SAME_PERIOD
dependFailPolicy	No	String	Dependency job failure policy. FAIL: Stop the job and set the job to the failed state. IGNORE: Continue to run the job. SUSPEND: Suspend the job. Default value: FAIL

**Table 14** Property parameters
Parameter	Mandatory	Type	Description
name	Yes	String	Property name
value	Yes	String	Property value
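
Because node properties are sent as a list of name/value pairs rather than as a JSON object, it can be convenient to keep them in a dictionary and convert them just before building the request. The helper below is an illustrative sketch; the name `to_properties` is not part of the API.

```python
def to_properties(settings):
    """Convert a dict of node settings into the List<Property> shape of Table 14."""
    # Both name and value are typed as String in Table 14, so values are stringified.
    return [{"name": key, "value": str(value)} for key, value in settings.items()]


properties = to_properties({
    "scriptName": "test_hive_sql",
    "connectionName": "mrs_hive_test",
    "database": "default",
})
print(properties)
```

The result can be assigned directly to the `properties` field of a node.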

**Table 15** Parameters of the Hive SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in the MRS Hive. The default value is default.
connectionName	No	String	Name of a connection
scriptArgs	No	String	Script parameter in format of key and value. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.
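
The `scriptArgs` value packs several key=value pairs into one string, separated by newline characters. A short sketch of that encoding follows; `encode_script_args` is a hypothetical helper name, and the sketch assumes that keys and values contain no newlines themselves.

```python
import json


def encode_script_args(args):
    """Join key=value pairs with newline characters, as described for scriptArgs."""
    return "\n".join(f"{key}={value}" for key, value in args.items())


value = encode_script_args({"key1": "value1", "key2": "value2"})
# json.dumps escapes the newline as \n inside the JSON string.
print(json.dumps({"name": "scriptArgs", "value": value}))
```

The same encoding applies to the `scriptArgs` property of the Spark SQL, DWS SQL, and DLI SQL nodes, which use the identical description.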

**Table 16** Parameters of the Spark SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in the MRS Spark SQL. The default value is default.
connectionName	No	String	Name of a connection
scriptArgs	No	String	Script parameter in format of key and value. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 17** Parameters of the DWS SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in DWS. The default value is postgres.
connectionName	No	String	Name of a connection
scriptArgs	No	String	Script parameter in format of key and value. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 18** Parameters of the DLI SQL node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
database	No	String	Database name. Database in DLI.
connectionName	No	String	Name of a connection
scriptArgs	No	String	Script parameter in format of key and value. Multiple parameters are separated by newlines (\n), for example, key1=value1\nkey2=value2.

**Table 19** Parameters of the shell node
Parameter	Mandatory	Type	Description
scriptName	Yes	String	Script name.
connectionName	Yes	String	Name of a connection
arguments	No	String	Shell script parameter.

**Table 20** Parameters of the CDM Job node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	Cluster name. You can obtain the cluster name from the CDM cluster list on the DataArts Migration page of the DataArts Studio console.
jobName	Yes	String	Job name. To obtain the job name, access the DataArts Studio console, choose DataArts Migration, click a cluster name on the Cluster Management page, and click Job Management on the displayed page.

**Table 21** Parameters of the CloudTableManager node
Parameter	Mandatory	Type	Description
namespace	No	String	Namespace. Default value: default
action	Yes	String	Action type. CREATE_TABLE: Create a table. DELETE_TABLE: Delete a table.
table	No	String	Table name.
columnFamily	No	String	Column family.

**Table 22** Parameters of the OBSManager node
Parameter	Mandatory	Type	Description
action	Yes	String	Action type. CREATE_PATH: Create an OBS path. DELETE_PATH: Delete an OBS path.
path	Yes	String	OBS path.

**Table 23** Parameters of the RestClient node
Parameter	Mandatory	Type	Description
url	Yes	String	URL address. URL of the cloud service.
method	Yes	String	HTTP method. The value can be GET, POST, PUT, or DELETE.
agentName	Yes	String	Name of the agent cluster. You can obtain the cluster name from the CDM cluster list on the DataArts Migration page of the DataArts Studio console.
securityAuthentication	Yes	String	API authentication mode. IAM: IAM token. NONE: no authentication. usernamePassword: username and password.
connectionName	No	String	Name of the RestClient data connection. This parameter is mandatory when securityAuthentication is usernamePassword.
headers	No	String	HTTP message header in the format of <message header name>=<value>. Multiple message headers are separated by newlines.
body	No	String	Message body.

**Table 24** Parameters of the SMN node
Parameter	Mandatory	Type	Description
topic	Yes	String	SMN topic URN. Perform the following operations to obtain an SMN topic URN: Log in to the management console. Click Simple Message Notification and choose Topic Management > Topics from the list on the left. You can obtain the SMN topic URN in the topic list.
subject	Yes	String	Message title, which is used as the subject of an email sent to a subscriber.
messageType	Yes	String	Message type. The value can be NORMAL, STRUCTURE, or TEMPLATE.
message	Yes	String	Message to be sent.

**Table 25** Parameters of the MRS Spark node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
resourcePath	Yes	String	OBS resource path of the custom Spark JAR package
parameters	Yes	String	Custom parameters of the Spark JAR package. You can specify parameters for a custom JAR package.
input	No	String	Input path. Input data path of the MRS Spark job. The path can be an HDFS or OBS path.
output	No	String	Output path. Output data path of the MRS Spark job. The path can be an HDFS or OBS path.
programParameter	No	String	Program parameter. Multiple key-value pairs are allowed and separated by vertical bars (|).

**Table 26** Parameters of the MapReduce node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
resourcePath	Yes	String	Resource path.
parameters	Yes	String	Job parameter.
input	Yes	String	Input path. Input data path of the MapReduce job. The path can be an HDFS or OBS path.
output	Yes	String	Output path. Output data path of the MapReduce job. The path can be an HDFS or OBS path.

**Table 27** Parameters of the DLI Spark node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	DLI queue name. Perform the following operations to obtain the DLI queue name: Log in to the management console. Click Data Lake Insight and then Queue Management. You can obtain the queue name from the queue management list.
jobName	Yes	String	DLI job name. Perform the following operations to obtain the job name: Log in to the management console. Click Data Lake Insight and then Spark Jobs. Choose Job Management. You can obtain the job name from the job management list.
resourceType	No	String	Type of the running resource of the DLI job. OBS: OBS path. DLIResources: DLI package.
jobClass	No	String	Main class name. When the application type is .jar, the main class name cannot be empty.
resourcePath	Yes	String	JAR package resource path.
jarArgs	No	String	Main-class entry parameter.
sparkConfig	No	String	Running parameter of the Spark job.

**Table 28** BasicConfig job information
Parameter	Mandatory	Type	Description
owner	No	String	Job owner. The length cannot exceed 128 characters.
agency	No	String	Job agency
isIgnoreWaiting	No	int	Whether to ignore the waiting time in the instance timeout duration. The value can be 0 or 1 (default). 0: The waiting time is not ignored. 1: The waiting time is ignored.
priority	No	int	Job priority. The value ranges from 0 to 2. The default value is 0. 0 indicates a top priority, 1 indicates a medium priority, and 2 indicates a low priority.
executeUser	No	String	Job execution user. The value must be an existing username.
instanceTimeout	No	int	Instance timeout interval. The unit is minute. The value ranges from 5 to 1440. The default value is 60.
customFields	No	Map<String,String>	User-defined field. The length cannot exceed 2048 characters.

**Table 29** Parameters of the MRS Flink node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
flinkJobType	Yes	String	Flink job type, which can be Flink SQL or Flink JAR
flinkJobProcessType	Yes	String	Flink job processing mode, which can be batch or stream
scriptName	No	String	SQL script associated with the Flink SQL job
resourcePath	No	String	OBS resource path of the custom Flink JAR package
input	No	String	Input path. Input data path of the MRS Flink job. The path can be an HDFS or OBS path.
output	No	String	Output path. Output data path of the MRS Flink job. The path can be an HDFS or OBS path.
programParameter	No	String	Program parameter. Multiple key-value pairs are allowed and separated by vertical bars (|).

**Table 30** Parameters of the MRS HetuEngine node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. The job name is user-defined.
statementOrScript	Yes	String	Whether to use an SQL statement for the node or associate an SQL script with the node
scriptName	No	String	SQL script to be associated with the node
statement	No	String	Custom content of the SQL statement
Data Warehouse	Yes	String	Data connection required by HetuEngine
Schema	Yes	String	Name of the schema to be accessed through HetuEngine
Database	Yes	String	Name of the database to be accessed through HetuEngine
Queue	No	String	Name of the resource queue required by HetuEngine

**Table 31** Parameters of the ModelArts Train node
Parameter	Mandatory	Type	Description
clusterName	Yes	String	MRS cluster name. Perform the following operations to obtain the MRS cluster name: Log in to the management console. Click MapReduce Service and choose Clusters > Active Clusters from the left navigation pane. You can obtain the cluster name from the active clusters.
jobName	Yes	String	MRS job name. You can set a custom value.
statementOrScript	Yes	String	Whether to use an SQL statement for the node or associate an SQL script with the node
scriptName	No	String	SQL script to be associated with the node

**Table 32** Approver attributes
Parameter	Mandatory	Type	Description
approverName	Yes	String	Approver name

Response Parameters

None.

Example Request

Create a job named myJob whose type is BATCH, scheduling configuration is CRON, path in the directory tree is /myDir, and OBS path for storing job run logs is obs://dlf-test-log.

POST /v1/b384b9e9ab9b4ee8994c8633aabc9505/jobs
{
    "basicConfig": {
        "customFields": {},
        "executeUser": "",
        "instanceTimeout": 0,
        "owner": "test_user",
        "priority": 0
    },
    "directory": "/myDir",
    "logPath": "obs://dlf-test-log",
    "name": "myJob",
    "nodes": [
        {
            "failPolicy": "FAIL_CHILD",
            "location": {
                "x": "-45.5",
                "y": "-134.5"
            },
            "maxExecutionTime": 360,
            "name": "MRS_Hive_SQL",
            "pollingInterval": 20,
            "preNodeName": [],
            "properties": [
                {
                    "name": "scriptName",
                    "value": "test_hive_sql"
                },
                {
                    "name": "connectionName",
                    "value": "mrs_hive_test"
                },
                {
                    "name": "database",
                    "value": "default"
                },
                {
                    "name": "scriptArgs",
                    "value": "test_var=111"
                }
            ],
            "retryInterval": 120,
            "retryTimes": 0,
            "type": "HiveSQL"
        }
    ],
    "processType": "BATCH",
    "schedule": {
        "type": "CRON"
    }
}

Create a job when the review function is enabled.

POST /v1/b384b9e9ab9b4ee8994c8633aabc9505/jobs
{
    "basicConfig": {
        "customFields": {},
        "executeUser": "",
        "instanceTimeout": 0,
        "owner": "test_user",
        "priority": 0
    },
    "directory": "/myDir",
    "logPath": "obs://dlf-test-log",
    "name": "myJob",
    "nodes": [
        {
            "failPolicy": "FAIL_CHILD",
            "location": {
                "x": "-45.5",
                "y": "-134.5"
            },
            "maxExecutionTime": 360,
            "name": "MRS_Hive_SQL",
            "pollingInterval": 20,
            "preNodeName": [],
            "properties": [
                {
                    "name": "scriptName",
                    "value": "test_hive_sql"
                },
                {
                    "name": "connectionName",
                    "value": "mrs_hive_test"
                },
                {
                    "name": "database",
                    "value": "default"
                },
                {
                    "name": "scriptArgs",
                    "value": "test_var=111"
                }
            ],
            "retryInterval": 120,
            "retryTimes": 0,
            "type": "HiveSQL"
        }
    ],
    "processType": "BATCH",
    "schedule": {
        "type": "CRON"
    },
    "targetStatus":"SUBMITTED",
    "approvers": [
        {
            "approverName": "userName1"
        },
        {
            "approverName": "userName2"
        }
    ]
}
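
The only difference between the two request bodies above is the pair of review fields, so the second body can be derived from the first. The following sketch shows one way to do that; `with_review` is a hypothetical helper, and the check against the three documented `targetStatus` values is a client-side convenience rather than a documented requirement.

```python
TARGET_STATUSES = {"SAVED", "SUBMITTED", "PRODUCTION"}


def with_review(job, target_status, approver_names):
    """Return a copy of a job body with targetStatus and approvers added."""
    if target_status not in TARGET_STATUSES:
        raise ValueError(f"unsupported targetStatus: {target_status!r}")
    reviewed = dict(job)  # shallow copy, so the original body is left unchanged
    reviewed["targetStatus"] = target_status
    # Each approver is an object with a single approverName field (Table 32).
    reviewed["approvers"] = [{"approverName": name} for name in approver_names]
    return reviewed


base_job = {"name": "myJob", "processType": "BATCH", "schedule": {"type": "CRON"}}
reviewed_job = with_review(base_job, "SUBMITTED", ["userName1", "userName2"])
print(reviewed_job["approvers"])
```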

Example Response

Success response
HTTP status code 204

Failure response

HTTP status code 400

{
    "error_code":"DLF.0102",
    "error_msg":"The job name already exists."
}
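
A client can tell the two outcomes apart by status code alone, because a successful call returns 204 with no body. The sketch below summarizes a response under that reading. The function name `describe_response` is made up for the example, and it assumes that any failure body, when present, is a JSON object with the `error_code` and `error_msg` fields shown above.

```python
import json


def describe_response(status_code, body_text):
    """Summarize a create-job response as a short human-readable string."""
    if status_code == 204:
        return "created"  # success carries no response body
    try:
        error = json.loads(body_text)
    except (TypeError, ValueError):
        # Body is missing or is not valid JSON; report the status code only.
        return f"HTTP {status_code}"
    if not isinstance(error, dict):
        return f"HTTP {status_code}"
    return f"HTTP {status_code}: {error.get('error_code')} {error.get('error_msg')}"


failure = '{"error_code":"DLF.0102","error_msg":"The job name already exists."}'
print(describe_response(204, ""))
print(describe_response(400, failure))
```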
