
CS Job

Functions

The CS Job node is used to execute a predefined Cloud Stream Service (CS) job for real-time analysis of streaming data.

Context

This node enables you to start a CS job or query whether a CS job is running. If you do not select an existing real-time job, DLF creates and starts the job based on the configuration of the node. You can customize jobs and use DLF job parameters.

Parameters

Table 1 and Table 2 describe the parameters of the CS Job node.

Table 1 Parameters of CS Job nodes

| Parameter | Mandatory | Description |
| --- | --- | --- |
| Job Type | Yes | Type of the CS job to run: Existing CS job, Flink SQL job, User-defined Flink job, or User-defined Spark job. The remaining parameters depend on the selected type. |

Existing CS job

| Parameter | Mandatory | Description |
| --- | --- | --- |
| Streaming Job Name | Yes | Name of the CS job to be executed. To create a CS job, either create one on the Data Integration page of DLF or go to the CS console. |
| Node Name | Yes | Name of the node. Must consist of 1 to 128 characters and contain only letters, digits, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>). |

Flink SQL job

| Parameter | Mandatory | Description |
| --- | --- | --- |
| SQL Script | Yes | Path of the script to be executed. If the script has not been created yet, create and develop it by referring to Creating a Script and Developing an SQL Script. |
| Script Parameter | Yes | If the associated SQL script uses parameters, the parameter names are displayed. Set each parameter value in the text box next to its name. A value can be a built-in function or an EL expression; for details, see Expression Overview. |
| CloudStream Cluster | Yes | Name of the CS cluster. To create a CS cluster, go to the CS console. |
| SPUs | Yes | Number of SPUs for the job. 1 SPU = 1 core and 4 GB of memory. |
| Parallelism | Yes | Number of tasks that run the CS job concurrently. A value of 1 to 2 times the number of SPUs is recommended; for example, 4 to 8 for a 4-SPU job. |
| UDF JAR | No | After the JAR package is imported, the custom functions in it can be called from SQL statements. Upload the JAR package to the OBS bucket first (see the UDF sketch after this table). |
| Auto Restart upon Exception | No | If this function is enabled, the system automatically restarts and restores the CS job when an exception occurs. |
| Streaming Job Name | Yes | Name of the Flink SQL job. Must consist of 1 to 57 characters and contain only letters, digits, hyphens (-), and underscores (_). |
| Node Name | Yes | Name of the node. Must consist of 1 to 128 characters and contain only letters, digits, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>). |
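The following is a minimal sketch of a scalar function that could be packaged into a UDF JAR, assuming the SQL job runs on Flink's Table API. The class name and logic are illustrative, not part of the CS service:

```java
import org.apache.flink.table.functions.ScalarFunction;

// Illustrative scalar UDF. Package it into a JAR, upload the JAR to the
// OBS bucket, and reference it through the UDF JAR parameter; the function
// can then be registered and called from the associated Flink SQL script.
public class ToUpperUdf extends ScalarFunction {
    // Flink resolves eval(...) by reflection and invokes it once per input row.
    public String eval(String input) {
        return input == null ? null : input.toUpperCase();
    }
}
```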

User-defined Flink job

| Parameter | Mandatory | Description |
| --- | --- | --- |
| JAR Package Path | Yes | Path of the custom JAR package. It can be selected only after the package has been uploaded to the OBS bucket. |
| Main Class | No | Name of the main class in the JAR package, for example, KafkaMessageStreaming (see the skeleton after this table). If this parameter is not specified, the main class name is determined from the Manifest file in the JAR package. |
| Main Class Parameter | No | List of parameters passed to the main class, separated by spaces, for example, test tmp/result.txt. |
| CloudStream Cluster | Yes | Name of the CS cluster. To create a CS cluster, go to the CS console. |
| SPUs | Yes | Number of SPUs for the job. 1 SPU = 1 core and 4 GB of memory. |
| Driver SPU | Yes | Number of SPUs used by each driver node. |
| Parallelism | Yes | Number of tasks that run the CS job concurrently. A value of 1 to 2 times the number of SPUs is recommended. |
| Auto Restart upon Exception | No | If this function is enabled, the system automatically restarts and restores the CS job when an exception occurs. |
| Streaming Job Name | Yes | Name of the user-defined Flink job. Must consist of 1 to 57 characters and contain only letters, digits, hyphens (-), and underscores (_). |
| Node Name | Yes | Name of the node. Must consist of 1 to 128 characters and contain only letters, digits, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>). |
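The Main Class and Main Class Parameter settings map onto an ordinary Java entry point. The skeleton below uses the example name KafkaMessageStreaming from the table; its argument handling is an assumption for illustration. If Main Class is left empty, the JAR's META-INF/MANIFEST.MF must carry a Main-Class entry (for example, Main-Class: KafkaMessageStreaming):

```java
// Illustrative skeleton of a user-defined Flink job entry point.
// Only the class name comes from the table; the body is an assumption.
public class KafkaMessageStreaming {
    public static void main(String[] args) {
        // Main Class Parameter values arrive as space-separated arguments:
        // "test tmp/result.txt" -> args[0] = "test", args[1] = "tmp/result.txt"
        String mode = args.length > 0 ? args[0] : "test";
        String outputPath = args.length > 1 ? args[1] : "tmp/result.txt";
        System.out.printf("mode=%s, output=%s%n", mode, outputPath);
        // The actual streaming pipeline (sources, transformations, sinks)
        // would be built and executed here.
    }
}
```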

User-defined Spark job

| Parameter | Mandatory | Description |
| --- | --- | --- |
| JAR Package Path | Yes | Path of the custom JAR package. It can be selected only after the package has been uploaded to the OBS bucket. |
| Main Class | No | Name of the main class in the JAR package, for example, KafkaMessageStreaming (see the driver sketch after this table). If this parameter is not specified, the main class name is determined from the Manifest file in the JAR package. |
| Main Class Parameter | No | List of parameters passed to the main class, separated by spaces, for example, test tmp/result.txt. |
| CloudStream Cluster | Yes | Name of the CS cluster. To create a CS cluster, go to the CS console. |
| SPUs | Yes | Number of SPUs for the job. 1 SPU = 1 core and 4 GB of memory. |
| Driver SPU | Yes | Number of SPUs used by each driver node. |
| Executors | Yes | Number of executor nodes. |
| Executor SPUs | Yes | Number of SPUs used by each executor node. |
| Auto Restart upon Exception | No | If this function is enabled, the system automatically restarts and restores the CS job when an exception occurs. |
| Streaming Job Name | Yes | Name of the user-defined Spark job. Must consist of 1 to 57 characters and contain only letters, digits, hyphens (-), and underscores (_). |
| Node Name | Yes | Name of the node. Must consist of 1 to 128 characters and contain only letters, digits, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>). |
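A user-defined Spark job follows the same entry-point contract. The sketch below is a minimal, illustrative Spark driver; the class name, input path, and logic are assumptions, not CS documentation:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Illustrative driver for a user-defined Spark job.
public class SimpleSparkDriver {
    public static void main(String[] args) {
        // The input path could be supplied through Main Class Parameter.
        String inputPath = args.length > 0 ? args[0] : "tmp/result.txt";

        SparkSession spark = SparkSession.builder()
                .appName("SimpleSparkDriver")
                .getOrCreate();

        // Driver SPU sizes this driver process; Executors and Executor SPUs
        // size the workers that run the distributed stages below.
        Dataset<Row> lines = spark.read().text(inputPath);
        System.out.println("line count: " + lines.count());

        spark.stop();
    }
}
```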

Table 2 Advanced parameters

| Parameter | Mandatory | Description |
| --- | --- | --- |
| Node Status Polling Interval (s) | Yes | Specifies how often the system checks whether the node task is complete. The value ranges from 1 to 60 seconds. |
| Max. Node Execution Duration | Yes | Execution timeout interval for the node. If retry is configured and the execution is not complete within this interval, the node will not be retried and is set to the failed state. |
| Retry upon Failure | Yes | Whether to re-execute a node task if its execution fails. Yes: the node task will be re-executed; also configure Maximum Retries and Retry Interval (seconds). No (default): the node task will not be re-executed. NOTE: If Max. Node Execution Duration is configured for the node, the node will not be re-executed after the execution times out; instead, it is set to the failed state (the sketch after this table models this interaction). |
| Failure Policy | Yes | Operation to perform if the node task fails to be executed: End the current job execution plan, Go to the next job, Suspend the current job execution plan, or Suspend execution plans of the current and subsequent nodes. |
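Taken together, the advanced parameters work as follows: the node is polled at the configured interval, a timeout marks it failed without further retries, and other failures are retried up to Maximum Retries. The following is a minimal model of that behavior, not DLF's actual scheduler code:

```java
import java.util.concurrent.TimeUnit;

// Minimal model of the Table 2 semantics (illustrative only).
public class NodePollingModel {
    enum State { RUNNING, SUCCEEDED, FAILED }

    public static void main(String[] args) throws InterruptedException {
        int pollIntervalSeconds = 10;   // Node Status Polling Interval (1-60 s)
        long maxDurationSeconds = 300;  // Max. Node Execution Duration
        int maxRetries = 3;             // Maximum Retries (Retry upon Failure = Yes)
        int retryIntervalSeconds = 30;  // Retry Interval (seconds)

        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            long start = System.nanoTime();
            State state = State.RUNNING;
            boolean timedOut = false;
            while (state == State.RUNNING) {
                TimeUnit.SECONDS.sleep(pollIntervalSeconds);
                long elapsed = TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start);
                if (elapsed >= maxDurationSeconds) {
                    timedOut = true;
                    break;
                }
                state = pollJobState();
            }
            if (state == State.SUCCEEDED) {
                return; // node task completed
            }
            if (timedOut) {
                break; // a timed-out node fails without retry, even if retry is enabled
            }
            if (attempt < maxRetries) {
                TimeUnit.SECONDS.sleep(retryIntervalSeconds);
            }
        }
        // The configured Failure Policy (end, skip, or suspend) applies here.
    }

    // Placeholder for the actual status query against the CS job.
    static State pollJobState() {
        return State.SUCCEEDED;
    }
}
```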