Updated on 2024-07-11 GMT+08:00

Python

Before using a Python node, ensure that the host connected to the node has an environment for executing Python scripts.

Functions

The Python node is used to execute Python statements.

For details about how to use the Python node, see Developing a Python Script.

Python nodes support script parameters and job parameters.

Parameters

Table 1 and Table 2 describe the parameters of the Python node.

Table 1 Parameters of the Python node

Parameter

Mandatory

Description

Python Statement or Script

Yes

You can select Python statement or Python script.

  • Python statement

    Click the text box under Python Statement. In the displayed Python Statement dialog box, enter the Python statement to be executed (an illustrative example follows the note below).

  • Python script

    In Python Script, select the Python script to be executed. The Python version is displayed by default, for example, Python3. If no script is available, create and develop a script by referring to Creating a Script and Developing a Python Script.

    NOTE:
    • If you select Python statement, the DataArts Factory module cannot parse the parameters contained in the Python statement.
    • If you select Python script, the system displays the Python version selected during Python script creation by default.
    • For existing jobs, Python2 is used by default.
    • The execution result of a Python node cannot be larger than 30 MB. Otherwise, an error is reported.
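
For example, a minimal statement entered in the Python Statement dialog box might look like the following sketch (illustrative only; as noted above, any parameters contained in a Python statement are not parsed):

    # Minimal illustrative Python statement
    print("Hello from a Python node")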

Host Connection

Yes

Select the host where the Python statement is to be executed. Ensure that the host has an environment for executing Python scripts.

NOTICE:
  • The maximum number of shell or Python scripts that can run concurrently on the ECS is determined by the value of MaxSessions in the /etc/ssh/sshd_config file on the ECS. Set MaxSessions based on the scheduling frequency of shell or Python scripts.
  • Ensure that you have the permission to create and execute files in the /tmp directory on the host.
  • Shell and Python scripts are executed in the /tmp directory on an ECS. Ensure that the disk space of the /tmp directory is not used up.

Script Parameters

No

Parameters transferred to the script when the Python statement is executed. Separate parameters with spaces, for example: a b c. The parameters must be referenced in the Python statement; otherwise, they are invalid.
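
For instance, if Script Parameters is set to a b c, a Python script could reference the values through sys.argv, as in the following illustrative sketch (the variable names are hypothetical):

    # Illustrative sketch: script parameters are passed as command-line arguments.
    import sys

    # sys.argv[0] is the script itself; the configured parameters "a b c" follow in order.
    first, second, third = sys.argv[1:4]
    print(first, second, third)  # prints: a b c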

Interactive Input

No

Interactive information (for example, passwords) provided during Python statement execution. Separate interactive parameters with spaces. The Python statement reads the parameter values in sequence, in the order in which interactive input is requested.
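
For example, if two interactive values are configured, a Python statement could consume them in order with input(), as in this illustrative sketch (the variable names are hypothetical):

    # Illustrative sketch: interactive values are read in the order they are configured.
    password = input()   # first configured interactive value
    confirm = input()    # second configured interactive value
    print("Values match:", password == confirm)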

Node Name

Yes

Name of the node. The value must consist of 1 to 128 characters and contain only letters, digits, and the following special characters: _-/<>.

By default, the node name is the same as that of the selected script. If you want the node name to be different from the script name, disable this function by referring to Disabling Auto Node Name Change.

Table 2 Advanced parameters

Parameter

Mandatory

Description

Node Status Polling Interval (s)

Yes

How often the system checks whether the node execution is complete. The value ranges from 1 to 60 seconds.

Max. Node Execution Duration

Yes

Execution timeout interval for the node. If retry is configured and the execution is not complete within the timeout interval, the node will be executed again.

Retry upon Failure

Yes

Whether to re-execute a node if it fails to be executed. Possible values:

  • Yes: The node will be re-executed, and the following parameters must be configured:
    • Retry upon Timeout
    • Maximum Retries
    • Retry Interval (seconds)
  • No: The node will not be re-executed. This is the default setting.
    NOTE:

    If both retry and a timeout duration are configured for a job node, the node can be retried when its execution times out.

    If a node is not re-executed when it fails upon timeout, you can go to the Default Configuration page to modify this policy.

    Retry upon Timeout is displayed only when Retry upon Failure is set to Yes.

Policy for Handling Subsequent Nodes If the Current Node Fails

Yes

Operation that will be performed if the node fails to be executed. Possible values:

  • Suspend execution plans of the subsequent nodes: stops running subsequent nodes. The job instance status is Failed.
  • End the current job execution plan: stops running the current job. The job instance status is Failed.
  • Go to the next node: ignores the execution failure of the current node. The job instance status is Failure ignored.
  • Suspend the current job execution plan: The current job instance enters an abnormal state, and the subsequent nodes of this node, as well as the subsequent job instances that depend on the current job, enter the waiting state.

Enable Dry Run

No

If you select this option, the node will not be executed, and a success message will be returned.

Task Groups

No

Select a task group. If you select a task group, you can exercise fine-grained control over the maximum number of nodes that run concurrently in the group, which is useful when a job contains multiple nodes, a data patching task is ongoing, or a job is being rerun.