
Developing a Real-Time Processing Single-Task DLI Spark Job

Prerequisites

A single-task real-time processing DLI Spark job has been created. For details, see Creating a Job.

Configuring a DLI Spark Job

Table 1 Properties

Job Name (mandatory)

Enter the DLI Spark job name. The name can contain 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed.

DLI Queue (mandatory)

Select a DLI queue.

Spark Version (optional)

Select the Spark version used by the job:

  • 2.3.2
  • 2.4.5
  • 3.1.1

Job Type (optional)

Type of the Spark image used by the job. The following options are available:

  • Basic
  • AI-enhanced
  • Image: If you select this option, select an image; its version is displayed automatically. You can create images by following the instructions in Image Management.

Job Running Resource (optional)

Select the resources available to the job:

  • 8 vCPUs, 32 GB memory
  • 16 vCPUs, 64 GB memory
  • 32 vCPUs, 128 GB memory

Major Job Class (optional)

Java/Scala main class of the job (see the sketch after this table).

Spark program resource package (mandatory)

Resource package on which the Spark program depends.

Resource Type (mandatory)

  • OBS path: The resource package file is not uploaded to the DLI resource management system before the job is executed. The OBS path of the file is part of the message body for starting the job. This type is recommended.
  • DLI program package: The resource package file is uploaded to the DLI resource management system before the job is executed.

Group (optional)

This parameter is required when Resource Type is set to DLI program package. A Spark program resource package is uploaded to a specified group; the main JAR package and its dependency packages are uploaded to the same group.

  • Use Existing: Select an existing group.
  • Create New: Create a group. The group name can contain only letters, digits, periods (.), hyphens (-), and underscores (_).
  • Do not use: Do not put the resource package into any group.

Major-Class Entry Parameters (optional)

Parameters passed to the main class as its args (see the sketch after this table). Press Enter to separate parameters.

Spark Job Running Parameters (optional)

Enter parameters in key=value format and separate them by pressing Enter.

Module Name (optional)

Select one or more module names.

Metadata Access (optional)

Whether the job can access DLI metadata. To access an OBS table created by a DLI SQL job in the DLI Spark job, enable metadata access.
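
For reference, the following minimal sketch shows what the major job class, its entry parameters, and the key=value running parameters correspond to in code. It is hypothetical: the class name com.example.SparkDemo, the OBS path, and the spark.demo.mode key are illustrative values, not ones prescribed by DLI.

  package com.example

  import org.apache.spark.sql.SparkSession

  // Hypothetical main class; the value entered under "Major Job Class"
  // must match this fully qualified name inside the uploaded JAR.
  object SparkDemo {
    def main(args: Array[String]): Unit = {
      // args receives the Major-Class Entry Parameters, one per line in the UI.
      val inputPath = if (args.nonEmpty) args(0) else "obs://demo-bucket/input/"

      val spark = SparkSession.builder()
        .appName("dli-spark-demo")
        // With Metadata Access enabled, Hive support lets spark.sql() see
        // OBS tables created by DLI SQL jobs.
        .enableHiveSupport()
        .getOrCreate()

      // Spark job running parameters (key=value) surface in the Spark conf;
      // spark.demo.mode is a made-up key for illustration.
      val mode = spark.conf.get("spark.demo.mode", "batch")

      val rows = spark.read.text(inputPath).count()
      println(s"mode=$mode, rows=$rows")

      spark.stop()
    }
  }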

Table 2 Advanced settings

Job Status Polling Interval (s) (mandatory)

Set the interval at which the system checks whether the job is complete. The interval can range from 30 to 60 seconds, or be set to 120, 180, 240, or 300 seconds. During job execution, the system checks the job status at the configured interval (see the sketch after this table).

Maximum Wait Time (mandatory)

Set the timeout interval for the job. If the job is not complete within the timeout interval and retry is enabled, the job will be executed again.

NOTE:

If the job is in the starting state and fails to start, it fails upon timeout.

Retry upon Failure (optional)

Whether to re-execute a node if it fails to be executed.

  • Yes: The node task will be re-executed, and the following parameters must be configured:
    • Retry upon Timeout
    • Maximum Retries
    • Retry Interval (seconds)
  • No: The node will not be re-executed. This is the default setting.

    NOTE:

    If both retry and a timeout duration (Maximum Wait Time) are configured for a node, the node can be retried when its execution times out.

    If a node is not re-executed when it fails upon timeout, you can modify this policy on the Default Configuration page.

    Retry upon Timeout is displayed only when Retry upon Failure is set to Yes.
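
Taken together, the advanced settings describe a poll-until-done loop with a deadline and an optional retry. The sketch below illustrates these semantics only; it is not DataArts Factory's implementation, and the helper functions, default values, and status strings are assumptions made for illustration.

  import scala.concurrent.duration._

  object RetrySemantics {
    // submit starts the job and returns a job ID; status maps a job ID to
    // RUNNING, SUCCEEDED, or FAILED. Both are assumed helpers.
    def runWithRetry(
        submit: () => String,
        status: String => String,
        pollInterval: FiniteDuration = 30.seconds,   // Job Status Polling Interval
        maxWait: FiniteDuration = 60.minutes,        // Maximum Wait Time (example value)
        maxRetries: Int = 1,                         // Maximum Retries
        retryInterval: FiniteDuration = 120.seconds, // Retry Interval (example value)
        retryUponTimeout: Boolean = false            // Retry upon Timeout
    ): Boolean = {

      // One execution: poll at the configured interval until the job
      // finishes or the deadline passes.
      def once(): String = {
        val jobId = submit()
        val deadline = maxWait.fromNow
        var s = status(jobId)
        while (s == "RUNNING" && deadline.hasTimeLeft()) {
          Thread.sleep(pollInterval.toMillis)
          s = status(jobId)
        }
        if (s == "RUNNING") "TIMEOUT" else s
      }

      var attempts = 0
      var result = once()
      // A failed run (and, if Retry upon Timeout is enabled, a timed-out
      // run) is re-executed up to maxRetries times.
      while (attempts < maxRetries &&
             (result == "FAILED" || (retryUponTimeout && result == "TIMEOUT"))) {
        attempts += 1
        Thread.sleep(retryInterval.toMillis)
        result = once()
      }
      result == "SUCCEEDED"
    }
  }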