Updated on 2025-03-17 GMT+08:00

MRS Job Types

Category

MRS jobs provide the program execution platform of MRS and are used to process and analyze user data. You can create jobs online on the MRS console or submit them in the background through the cluster client.

MRS jobs typically process data from OBS or HDFS. To create a job, you must first upload the data to be analyzed to OBS. MRS utilizes the data stored in OBS for computing and analysis.

MRS also allows you to import data from OBS into HDFS for computing and analysis. After the analysis is complete, you can store the results in HDFS or export them to OBS. Both HDFS and OBS can store data compressed in bz2 and gz formats.
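As an illustration of the compressed formats mentioned above, the following minimal sketch uses only the Python standard library to produce gz and bz2 copies of a local file before it is uploaded. The helper name `compress_copy` and the naming of the output file are assumptions made for this example; the upload to OBS or HDFS itself is not shown.

```python
import bz2
import gzip
import shutil
from pathlib import Path


def compress_copy(source: Path, fmt: str) -> Path:
    """Write a compressed copy of `source` next to it and return the new path.

    `fmt` must be "gz" or "bz2", the two formats named in this section.
    """
    openers = {"gz": gzip.open, "bz2": bz2.open}
    if fmt not in openers:
        raise ValueError(f"unsupported format: {fmt}")
    # Hypothetical naming choice: append the format as an extra suffix.
    target = source.with_name(f"{source.name}.{fmt}")
    with open(source, "rb") as src, openers[fmt](target, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the file without loading it fully
    return target
```

For example, `compress_copy(Path("input.txt"), "gz")` writes `input.txt.gz` in the same directory.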

You can create the following types of jobs online in an MRS cluster:

- MapReduce can quickly process large-scale data in parallel. It is a distributed data processing model and execution environment. MRS supports the submission of MapReduce JAR programs.
- Spark is a distributed in-memory computing framework. MRS supports SparkSubmit, SparkScript, and Spark SQL jobs.
  - SparkSubmit: You can submit Spark JAR and Spark Python programs to run Spark applications that compute and process user data.
  - SparkScript: You can submit SparkScript scripts to execute Spark SQL statements in batches.
  - Spark SQL: You can use Spark SQL statements (similar to SQL statements) to query and analyze user data in real time.
- Hive is an open-source data warehouse based on Hadoop. MRS allows you to submit HiveScript scripts and directly execute Hive SQL statements.
- Flink is a distributed big data processing engine that can perform stateful computations over both unbounded and bounded data streams.
- HadoopStreaming works similarly to a standard Hadoop job. You define the input and output HDFS paths, as well as the mapper and reducer executable programs.
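To make the mapper and reducer executables of a HadoopStreaming job more concrete, here is a minimal word-count sketch in Python. It relies only on the general Hadoop Streaming contract: each program reads lines from standard input and writes tab-separated key/value lines to standard output, and the reducer receives its input sorted by key. The file name `wordcount.py` and the `map`/`reduce` command-line argument are hypothetical conventions for this example, and how the script is attached to an MRS job is not shown.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts of each word.

    The input must already be sorted by key, which Hadoop Streaming
    guarantees between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def main(argv):
    # Hypothetical convention: `wordcount.py map` or `wordcount.py reduce`.
    steps = {"map": map_lines, "reduce": reduce_lines}
    if len(argv) == 2 and argv[1] in steps:
        for record in steps[argv[1]](sys.stdin):
            print(record)


if __name__ == "__main__":
    main(sys.argv)
```

The same logic can be checked locally without a cluster by piping a text file through the mapper, a sort, and then the reducer, which mimics the shuffle step that Hadoop performs between the two phases.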

Job Execution Permission Description

For a security cluster with Kerberos authentication enabled, IAM users must be synchronized to the cluster before a user can submit jobs on the MRS web UI. After the synchronization is complete, MRS generates a cluster user with the same name as the IAM user. Whether that user has permission to submit jobs depends on the IAM policy bound to the user during IAM synchronization. For details about the job submission policy, see Table 1 in Synchronizing IAM Users to MRS.

If a submitted job uses the resources of a specific component, for example by accessing HDFS directories or Hive tables, user admin (the Manager administrator) must grant the relevant permissions to the user who submits the job:

1. Log in to Manager of the cluster as user admin.
2. Add a role for the component whose permissions the user requires. For details, see Managing MRS Cluster Roles.
3. Modify the user group to which the job submitter belongs and add the new component role to that user group. For details, see Managing MRS Cluster User Groups.

After you modify the component role bound to the user's group, the new role permissions take some time to take effect.
