Running a Flink Job
You can submit your own programs to MRS, run them, and obtain the results. This section describes how to submit a Flink job on the MRS management console. A Flink job submits a JAR program that processes streaming data.
Before you begin, upload the program packages and data files required for running the job to OBS or HDFS.
Submitting a Job on the GUI
- Log in to the MRS management console.
- Select a running cluster and click its name to go to the cluster details page.
- If Kerberos authentication is enabled for the cluster, perform the following steps. If Kerberos authentication is not enabled for the cluster, skip this step.
In the Basic Information area on the Dashboard tab page, click Click to synchronize on the right side of IAM User Sync to synchronize IAM users. For details, see Synchronizing IAM Users to MRS.
- In versions earlier than MRS 1.8.7, the job management function is unavailable in a cluster with Kerberos authentication enabled. You need to submit a job in the background.
- When the policy of the user group to which the IAM user belongs changes from MRS ReadOnlyAccess to MRS CommonOperations, MRS FullAccess, or MRS Administrator, wait 5 minutes after the synchronization is complete before you submit a job. The new policy takes effect only after the SSSD (System Security Services Daemon) cache on the cluster nodes is updated. A job submitted before then may fail.
- When the policy of the user group to which the IAM user belongs changes from MRS CommonOperations, MRS FullAccess, or MRS Administrator to MRS ReadOnlyAccess, also wait 5 minutes after the synchronization is complete for the new policy to take effect, because the SSSD cache on the cluster nodes needs time to be updated.
- Click the Jobs tab.
- Click Create. The Create Job page is displayed.
- Set Type to Flink. Configure Flink job information by referring to Table 1.
Table 1 Job configuration information
Job name. It contains 1 to 64 characters. Only letters, digits, hyphens (-), and underscores (_) are allowed.
NOTE:
You are advised to set different names for different jobs.
Path of the program package to be executed. The following requirements must be met:
- The path contains a maximum of 1,023 characters and cannot include special characters such as ;|&><'$. The value cannot be empty or consist only of spaces.
- The program to be executed can be stored in HDFS or OBS. The path format varies depending on the file system.
- OBS: The path must start with s3a://. Example: s3a://wordcount/program/xxx.jar
- OBS: The path must start with obs://. Example: obs://wordcount/program/xxx.jar (supported in MRS 2.1.0 or later)
- HDFS: The path must start with /user. For details about how to import data to HDFS, see Importing Data.
- For SparkScript, the path must end with .sql. For MapReduce and Spark, the path must end with .jar. The .sql and .jar extensions are case-insensitive.
If you use an OBS path starting with s3a://, configure permission for accessing OBS by referring to Accessing OBS.
If you use an OBS path starting with obs://, configure permission for accessing OBS as follows:
- If the OBS permission control function is enabled during cluster creation, you can use the obs:// directory without extra configuration.
- If the OBS permission control function is not enabled or not available during cluster creation, perform the following steps:
- On the cluster details page, click the Nodes tab and expand a node group.
- Click a node name to go to the cloud server console.
- Click on the right of Agency, select MRS_ECS_DEFAULT_AGENCY and add it.
- Repeat the preceding steps to add agencies for all nodes in the cluster.
(Optional) Used to configure optimization parameters such as threads, memory, and vCPUs for the job to optimize resource usage and improve job execution performance.
Table 2 describes the common parameters of a running program.
(Optional) Key parameter for program execution. The parameter is defined by the user's program. MRS is only responsible for loading it. Separate multiple parameters with spaces.
The parameter contains a maximum of 2,047 characters, excluding special characters such as ;|&><'$, and can be left blank.
NOTE:
When entering a parameter containing sensitive information (for example, login password), you can add an at sign (@) before the parameter name to encrypt the parameter value. This prevents the sensitive information from being persisted in plaintext. When you view job information on the MRS management console, the sensitive information is displayed as *.
Example: username=admin @password=admin_123
(Optional) Used to modify service parameters for the job. The modification applies only to the current job. To make the modification take effect permanently for the cluster, follow the instructions in Configuring Service Parameters.
To add multiple parameters, click on the right. To delete a parameter, click Delete on the right.
Table 3 describes the common parameters of a service.
Command submitted to the background for execution when a job is submitted.
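The constraints in Table 1 can be checked locally before a job is created on the console. The following is a minimal sketch in Python, not part of MRS: the function names are hypothetical, the forbidden-character set contains only the examples listed above (the documented list is introduced with "such as", so it may be incomplete), and applying the .jar suffix rule to Flink is an assumption based on the statement that Flink jobs submit JAR programs. The masking helper only illustrates the at sign (@) convention; the exact rendering on the console may differ.

```python
import re

# Characters listed in this section as not allowed in paths and parameters.
# The documented list is introduced with "such as", so it may be incomplete.
FORBIDDEN_CHARS = set(";|&><'$")

# Job name: 1 to 64 letters, digits, hyphens, or underscores.
JOB_NAME_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_job_name(name: str) -> bool:
    """Return True if the name satisfies the job name rule in Table 1."""
    return JOB_NAME_RE.fullmatch(name) is not None


def is_valid_program_path(path: str) -> bool:
    """Return True if the path satisfies the program path rules in Table 1."""
    if not path.strip() or len(path) > 1023:
        return False  # empty, only spaces, or too long
    if any(ch in FORBIDDEN_CHARS for ch in path):
        return False
    if not path.startswith(("s3a://", "obs://", "/user")):
        return False  # neither an OBS path nor an HDFS path under /user
    # Assumption: a Flink program package is a JAR; the suffix is case-insensitive.
    return path.lower().endswith(".jar")


def mask_sensitive(params: str) -> str:
    """Replace the value of every @-prefixed key=value parameter with *."""
    shown = []
    for token in params.split():
        if token.startswith("@") and "=" in token:
            key, _, _ = token.partition("=")
            shown.append(f"{key}=*")
        else:
            shown.append(token)
    return " ".join(shown)


if __name__ == "__main__":
    print(is_valid_job_name("flink-wordcount_01"))
    print(is_valid_program_path("obs://wordcount/program/xxx.jar"))
    print(mask_sensitive("username=admin @password=admin_123"))
```

Running the checks before opening the Create Job page catches malformed names and paths early. The console remains the authority on what is accepted.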
Table 2 Program Parameter parameters
Memory size of each TaskManager container. The unit is optional and defaults to MB.
Memory size of the JobManager container. The unit is optional and defaults to MB.
Number of Yarn containers allocated to applications. The value is the same as the number of TaskManagers.
Number of TaskManager cores.
Custom name of an application on Yarn.
Class of the program entry point (for example, the class that contains the main or getPlan() method). This parameter is required only when the JAR file does not specify the class in its manifest.
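The values described in Table 2 are entered as a single string of options. The sketch below shows one way to assemble such a string in Python. The option names (-ytm, -yjm, -yn, -ys, -ynm, -c) do not appear in this section as given. They are assumed from the YARN options of the Flink command-line client in older Flink 1.x releases, and -yn in particular may not be accepted by newer Flink versions. The function name, the application name, and the class com.example.WordCount are hypothetical.

```python
from typing import List, Optional


def build_program_parameters(
    taskmanager_memory_mb: int,
    jobmanager_memory_mb: int,
    taskmanager_cores: int,
    app_name: str,
    entry_class: Optional[str] = None,
    containers: Optional[int] = None,
) -> str:
    """Join the Table 2 values into one space-separated option string.

    The flag names are assumptions taken from the legacy Flink CLI,
    not from this section.
    """
    parts: List[str] = [
        "-ytm", str(taskmanager_memory_mb),  # memory of each TaskManager container (MB)
        "-yjm", str(jobmanager_memory_mb),   # memory of the JobManager container (MB)
    ]
    if containers is not None:
        # Number of Yarn containers, equal to the number of TaskManagers.
        parts += ["-yn", str(containers)]
    parts += ["-ys", str(taskmanager_cores), "-ynm", app_name]
    if entry_class:
        # Needed only when the JAR manifest does not name the entry class.
        parts += ["-c", entry_class]
    return " ".join(parts)


if __name__ == "__main__":
    print(build_program_parameters(
        1024, 1024, 2, "flink-wordcount",
        entry_class="com.example.WordCount", containers=2,
    ))
```

The sketch performs no quoting, so an application name that contains spaces would need additional handling before it is pasted into the console.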
- Confirm job configuration information and click OK.
After the job is created, you can manage it.