Developing a Flink Jar Job
This section describes how to create and schedule a Flink Jar job.
- Log in to the DataArts Studio console by following the instructions in Accessing the DataArts Studio Instance Console.
- On the DataArts Studio console, locate a workspace and click DataArts Factory.
- In the left navigation pane of DataArts Factory, choose .
- In the job directory list, right-click a directory and select Create Job.
- Select Real-time processing for Job Type and Single task Flink JAR for Mode. Set other parameters as needed.
- Click OK.
- Configure parameters for the Flink Jar job.
Table 1 MRS Flink Jar job parameters
Parameter
Mandatory
Description
Flink Job Name
Yes
Name of the Flink job
A name in Workspace-Job name format is automatically generated.
The job name can contain 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed. Chinese characters are not allowed.
MRS Cluster
Yes
MRS cluster.
NOTE: Currently, jobs with a single Flink Jar node support MRS 3.2.0-LTS.1 and later versions.
Program Parameter
No
Job running parameters. This parameter is displayed only after an MRS cluster is selected.
Configure optimization parameters such as threads, memory, and vCPUs for the job to improve resource utilization and job performance.
CAUTION: You can query historical checkpoints and select a specific checkpoint from which to start a Flink Jar job. To make a Flink checkpoint take effect, configure the following two parameters:
NOTE: This parameter is mandatory if the cluster version is MRS 1.8.7 or is later than MRS 2.0.1.
Click Select Template and select one or multiple script templates.
For details about the program parameters of MRS Flink jobs, see Running a Flink Job in the MapReduce Service User Guide.
Flink Job Parameter
No
Flink job parameters
Parameters required for executing the Flink job. These parameters are specified by the functions in the user program. Separate multiple parameters with spaces.
MRS Resource Queue
No
MRS resource queue
Select a queue you configured in the queue permissions of DataArts Security. If you have configured multiple resource queues, the resource queue you select here has the highest priority.
Flink job resource package
Yes
Jar package. You must upload a Jar package to an OBS bucket and create a resource on the Manage Resource page to add the Jar package to the resource list. For details, see Creating a Resource.
Rerun Policy
No
- Rerun from the previous checkpoint
- Rerun the job
Input Data Path
No
Input data path. You can select an HDFS or OBS path.
Output Data Path
No
Output data path. You can select an HDFS or OBS path.
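The naming rule for Flink Job Name and the space-separated format of Flink Job Parameter can be checked locally before you enter the values on the console. The following is a minimal Python sketch, not part of DataArts Studio: the function names is_valid_job_name and split_job_parameters are hypothetical, and the sketch assumes that "letters" means ASCII letters because Chinese characters are not allowed.

```python
import re
from typing import List

# Assumption: "letters" means ASCII letters (A-Z, a-z).
# Allowed: letters, digits, hyphens (-), and underscores (_); length 1 to 64.
_JOB_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_job_name(name: str) -> bool:
    """Return True if the name satisfies the rule stated for Flink Job Name."""
    return _JOB_NAME.fullmatch(name) is not None


def split_job_parameters(raw: str) -> List[str]:
    """Split a Flink Job Parameter string on whitespace.

    The source only says that multiple parameters are separated with spaces,
    so this sketch does not interpret quotes or escape characters.
    """
    return raw.split()
```

For example, is_valid_job_name("ws1-flink_job") returns True, and a 65-character name is rejected. The console performs its own validation, so this check is only a convenience.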
Table 2 Advanced settings
Parameter
Mandatory
Description
Job Status Polling Interval (s)
Yes
Set the interval at which the system checks whether the job is complete. The value can be 30 to 60 seconds, or 120, 180, 240, or 300 seconds.
During job execution, the system checks the job status at the configured interval.
Maximum Wait Time
Yes
Set the timeout interval for the job. If the job is not complete within the timeout interval and retry is enabled, the job is executed again.
NOTE: If the job is in the starting state and does not start successfully, it fails when the timeout interval elapses.
Retry upon Failure
No
Whether to re-execute a node if its execution fails.
- Yes: The node task will be re-executed, and the following parameters must be configured:
- Retry upon Timeout
- Maximum Retries
- Retry Interval (seconds)
- No: The node will not be re-executed. This is the default setting.
NOTE: If both retry and a timeout duration are configured for a job node, the node can be retried when its execution times out.
If a node is not re-executed when it fails upon timeout, you can go to the Default Configuration page to modify this policy.
Retry upon Timeout is displayed only when Retry upon Failure is set to Yes.
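The interaction between the polling interval, the maximum wait time, and the retry settings in Table 2 can be illustrated with a small simulation. The following Python sketch is not the DataArts Studio scheduler. All function names and the status strings are hypothetical, the interval is assumed to be a whole number of seconds, and both the waiting between polls and the retry interval are simulated by a counter rather than by real sleeping.

```python
from typing import Callable, Tuple

# Values allowed in addition to the 30-60 second range.
ALLOWED_FIXED_INTERVALS = {120, 180, 240, 300}


def is_valid_polling_interval(seconds: int) -> bool:
    """Check the polling interval against the allowed values."""
    return 30 <= seconds <= 60 or seconds in ALLOWED_FIXED_INTERVALS


def wait_for_job(check_status: Callable[[], str],
                 interval_s: int, max_wait_s: int) -> str:
    """Poll until the job finishes or the maximum wait time is used up.

    Returns "SUCCESS", "FAILED", or "TIMEOUT" (hypothetical status names).
    """
    if not is_valid_polling_interval(interval_s):
        raise ValueError("unsupported polling interval: %d" % interval_s)
    elapsed = 0
    while elapsed < max_wait_s:
        status = check_status()
        if status in ("SUCCESS", "FAILED"):
            return status
        elapsed += interval_s  # simulated wait; a real client would sleep here
    return "TIMEOUT"


def run_with_retry(start_job: Callable[[], Callable[[], str]],
                   interval_s: int, max_wait_s: int,
                   retry_on_failure: bool, retry_on_timeout: bool,
                   max_retries: int) -> Tuple[str, int]:
    """Run the job and re-execute it according to the retry settings.

    start_job starts one attempt and returns its status-check function.
    Returns the final outcome and the number of attempts made.
    """
    attempts = 0
    while True:
        outcome = wait_for_job(start_job(), interval_s, max_wait_s)
        attempts += 1
        if outcome == "SUCCESS":
            return outcome, attempts
        # Retry upon Timeout only applies when Retry upon Failure is enabled.
        retryable = retry_on_failure and (
            outcome == "FAILED" or (outcome == "TIMEOUT" and retry_on_timeout))
        if not retryable or attempts > max_retries:
            return outcome, attempts
```

In this sketch, a job that never leaves the running state with a 60-second interval and a 120-second maximum wait time is polled twice and then reported as timed out.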
- Configure basic information of the job.
Table 3 Basic job information
Parameter
Description
Owner
The owner configured during job creation is automatically selected. It can be changed.
Executor
This parameter is available when Scheduling Identities is set to Yes.
It specifies the user that executes the job. If you set an executor, the job is executed by the executor. If you do not set an executor, the job is executed by the user who starts the job.
NOTE: You can configure an executor only after you apply for whitelist membership. To use this feature, contact customer service or technical support.
Job Agency
This parameter is available when Scheduling Identities is set to Yes.
After an agency is configured, the job interacts with other services using the agency during job execution.
Priority
The priority configured during job creation is automatically selected. It can be changed.
Execution Timeout
Timeout of the job instance. If this parameter is set to 0 or is not set, this parameter does not take effect. If notifications are enabled for the job and the execution time of the job exceeds the timeout duration, the system sends a notification, and the job keeps running.
Exclude Waiting Time from Instance Timeout Duration
Whether to exclude waiting time from instance timeout duration
If you do not select this option, the time to wait before an instance starts running is included in the timeout duration.
Custom parameter
Custom parameters, including parameter names and values
Job Tag
Tags used to manage jobs by category
Click Add to add a tag to the job. You can also select a tag you have configured in Managing Job Tags.
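The way Execution Timeout and Exclude Waiting Time from Instance Timeout Duration combine can be expressed as a short check. The following Python sketch is an illustration only. The function name should_notify_timeout is hypothetical, and because the table does not state the unit of Execution Timeout, the sketch assumes that all durations are in seconds.

```python
from typing import Optional


def should_notify_timeout(timeout_s: Optional[int],
                          waiting_s: int,
                          running_s: int,
                          exclude_waiting: bool,
                          notifications_enabled: bool) -> bool:
    """Return True if a timeout notification would be expected.

    Assumption: all durations are in seconds (the unit is not stated
    in the source). The job keeps running after a notification is sent.
    """
    # A timeout of 0 or an unset timeout does not take effect.
    if not timeout_s:
        return False
    # Waiting time counts toward the timeout unless it is excluded.
    counted = running_s if exclude_waiting else waiting_s + running_s
    return notifications_enabled and counted > timeout_s
```

With a timeout of 150 seconds, an instance that waited 100 seconds and ran 100 seconds exceeds the timeout only if waiting time is included.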
- (Optional) Configure job parameters, including variables and constants as needed.
Table 4 Job parameters
Function
Description
Variables
Adding parameters
Click Add and enter a parameter name and a parameter value.
- Parameter name
Only letters, digits, hyphens (-), and underscores (_) are allowed.
- Parameter value
- The value of a string parameter is a string, for example, str1.
- The value of a numeric parameter is a number or an expression.
The configured parameters are referenced in ${Parameter name} format in the job.
Editing parameter expressions
Click the icon next to the parameter value text box to edit the parameter expression. For more expressions, see Expression Overview.
Changing parameter names and values
Change a parameter name or value in the corresponding text box.
Masking data
If the parameter value is a key, click the icon to mask the value to ensure security.
Deleting parameters
Click the icon next to the parameter value text box to delete a parameter.
Constants
Adding parameters
Click Add and enter a parameter name and a parameter value.
- Parameter name
Only letters, digits, hyphens (-), and underscores (_) are allowed.
- Parameter value
- The value of a string parameter is a string, for example, str1.
- The value of a numeric parameter is a number or an expression.
The configured parameters are referenced in ${Parameter name} format in the job.
Editing parameter expressions
Click the icon next to the parameter value text box to edit the parameter expression. For more expressions, see Expression Overview.
Changing parameter names and values
Change a parameter name or value in the corresponding text box.
Deleting parameters
Click the icon next to the parameter value text box to delete a parameter.
Workspace Environment Variables
View the variables and constants that have been configured in the workspace.
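The ${Parameter name} reference format described in Table 4 can be illustrated with a simple substitution routine. The following Python sketch is not the DataArts Studio implementation. The function name resolve_references and the sample values are hypothetical, expressions are not evaluated, and leaving an unknown reference unchanged is an assumption made for this sketch.

```python
import re
from typing import Mapping

# Parameter names may contain letters, digits, hyphens (-), and underscores (_).
# string.Template is not used because it does not accept hyphens in names.
_REFERENCE = re.compile(r"\$\{([A-Za-z0-9_-]+)\}")


def resolve_references(text: str, params: Mapping[str, object]) -> str:
    """Replace each ${name} in text with the value of the matching parameter.

    Assumption: references to unknown parameters are left unchanged.
    """
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name in params:
            return str(params[name])
        return match.group(0)

    return _REFERENCE.sub(substitute, text)


# Hypothetical job parameters: one string value and one numeric value.
example_params = {"in-path": "obs://bucket/in", "count": 3}
example = resolve_references("--input ${in-path} --n ${count}", example_params)
```

Here, example evaluates to "--input obs://bucket/in --n 3".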
- Save and submit the job version.
- Click Start to start the job.
- Choose Job Monitoring and click the Real-Time Jobs tab to view the job execution result.