Updated on 2022-02-22 GMT+08:00

MRS Spark Python

Functions

The MRS Spark Python node is used to execute a predefined Spark Python job on MRS.
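The "predefined Spark Python job" is an ordinary PySpark script stored on the cluster. A minimal sketch of such a job is shown below; the script name, word-count logic, and path arguments are illustrative assumptions, not part of the product documentation. The paths arrive as command-line arguments, which is what the node's Parameter field supplies.

```python
# Illustrative sketch only: a minimal word-count job of the kind an
# MRS Spark Python node might execute. Input/output paths are
# hypothetical and are passed in via the node's Parameter field.
import sys


def parse_paths(argv):
    """Return (input_path, output_path) from the job's arguments."""
    if len(argv) != 3:
        raise SystemExit("usage: wordcount.py <input_path> <output_path>")
    return argv[1], argv[2]


def main(argv):
    # pyspark is available only on the cluster, so it is imported
    # lazily here to keep the sketch self-contained.
    from pyspark.sql import SparkSession

    input_path, output_path = parse_paths(argv)
    spark = SparkSession.builder.appName("wordcount").getOrCreate()
    lines = spark.read.text(input_path).rdd.map(lambda row: row[0])
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.saveAsTextFile(output_path)
    spark.stop()


if __name__ == "__main__":
    main(sys.argv)
```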

Parameters

Table 1 and Table 2 describe the parameters of the MRS Spark Python node.

Table 1 Parameters of MRS Spark Python nodes

Node Name
  Mandatory: Yes
  Description: Name of the node. It must consist of 1 to 128 characters and contain only letters, digits, underscores (_), hyphens (-), slashes (/), less-than signs (<), and greater-than signs (>).

MRS Cluster Name
  Mandatory: Yes
  Description: Select an MRS cluster that supports Spark Python. Only specific MRS versions support Spark Python, so test the cluster first to confirm that it does.
  To create an MRS cluster, use either of the following methods:
    • On the Clusters page, create an MRS cluster.
    • Go to the MRS console to create an MRS cluster.
  For details about how to create an MRS cluster, see Custom Purchase of a Cluster in the MapReduce User Guide.

Job Name
  Mandatory: Yes
  Description: Name of the MRS Spark Python job. It must consist of 1 to 64 characters and contain only letters, digits, and underscores (_).

Parameter
  Mandatory: Yes
  Description: Parameters of the MRS executable program. Press Enter to separate multiple parameters.

Attribute
  Mandatory: No
  Description: Parameters in the key=value format. Press Enter to separate multiple parameters.
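The Attribute field above takes one key=value pair per line. A small sketch of how such pairs map to a configuration dictionary is shown below; the parsing helper and the Spark property names used as sample input are illustrative assumptions, not DataArts internals.

```python
# Illustrative sketch only: parse newline-separated key=value pairs
# of the kind entered in the Attribute field into a dictionary.
# The property names in the sample input are common Spark settings
# chosen as examples.

def parse_attributes(text):
    """Parse newline-separated key=value lines into a dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {line!r}")
        conf[key.strip()] = value.strip()
    return conf


attrs = parse_attributes("spark.executor.memory=2g\nspark.executor.cores=2")
# attrs == {"spark.executor.memory": "2g", "spark.executor.cores": "2"}
```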

Table 2 Advanced parameters

Max. Node Execution Duration
  Mandatory: Yes
  Description: Execution timeout interval for the node. If retry is configured and the node does not finish within this interval, the node is not retried; instead, it is set to the failed state.

Retry upon Failure
  Mandatory: Yes
  Description: Whether to re-execute the node task if it fails. Possible values:
    • Yes: The node task is re-executed, and the following parameters must also be configured:
      • Maximum Retries
      • Retry Interval (seconds)
    • No: The node task is not re-executed. This is the default setting.
  NOTE: If Max. Node Execution Duration is configured for the node, the node is not re-executed after its execution times out; instead, it is set to the failed state.

Failure Policy
  Mandatory: Yes
  Description: Operation to perform if the node task fails to be executed. Possible values:
    • End the current job execution plan
    • Go to the next job
    • Suspend the current job execution plan
    • Suspend execution plans of the current and subsequent nodes
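The retry and timeout behavior described in Table 2 can be sketched as follows. This is an illustrative model only, not DataArts code: `run_node`, its parameter names, and the string results are assumptions made for the example; the point is that a timeout ends the node in the failed state with no further retries.

```python
# Illustrative sketch only (not DataArts code): retry semantics as
# described in Table 2 -- a failed run is retried up to max_retries
# times, but once the timeout is exceeded the node fails immediately
# and no further retries occur.
import time


def run_node(task, max_retries, retry_interval, timeout):
    """Run `task` (a callable returning True on success) with retries."""
    start = time.monotonic()
    for attempt in range(max_retries + 1):
        if time.monotonic() - start > timeout:
            return "failed"  # timed out: set to failed, no retry
        try:
            if task():
                return "succeeded"
        except Exception:
            pass  # treat an exception as a failed attempt
        if attempt < max_retries:
            time.sleep(retry_interval)
    return "failed"  # retries exhausted
```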