Updated on 2023-04-28 GMT+08:00

schedule-tool Usage Guide

Overview

schedule-tool submits jobs whose data source is SFTP. Before submitting a job, you can modify its input path and file filtering criteria. If the destination is HDFS, you can also modify the output path.

Parameters

Table 1 Configuration parameters of schedule.properties

Configuration parameters

Description

Example Value

server.url

Floating IP address and port for Loader. The default port is 21351.

For compatibility, you can configure multiple IP address and port pairs, separated by commas (,). The first pair must be that of Loader. Configure the others based on service requirements.

10.96.26.111:21351,127.0.0.2:21351

authentication.type

Login authentication mode.

  • kerberos indicates that the security mode is used and Kerberos authentication is performed. Kerberos authentication provides two authentication modes: the password mode and the keytab file mode.
  • simple indicates that the normal mode is used and Kerberos authentication is not performed.

kerberos

authentication.user

User for login when the normal mode or password authentication is used.

In the keytab login mode, this parameter does not need to be set.

bar

authentication.password

User password for login when the password authentication mode is used. In the normal mode or keytab login mode, this parameter does not need to be set.

The password must be encrypted. To encrypt it, perform the following steps:

  1. Go to the directory where encrypt_tool is located. For example, if the Loader client installation directory is /opt/hadoopclient/Loader, run the following command:

    cd /opt/hadoopclient/Loader/loader-tools-1.99.3

  2. Run the following command to encrypt the unencrypted password:

    ./encrypt_tool <Unencrypted password>

    Use the encrypted password that is returned as the value of authentication.password.

    NOTE:

    If the unencrypted password contains special characters, escape them according to the shell escape rules. For example, the dollar sign ($) is a special character, so enclose a password that contains it in single quotation marks ('). If the password contains single quotation marks, enclose it in double quotation marks ("). If the password contains double quotation marks, escape each of them with a backslash (\). For details, see the shell escape character rules.

-

use.keytab

Whether to use the keytab mode to log in.

  • true indicates using the keytab file to log in.
  • false indicates using the password to log in.

true

client.principal

User principal for accessing the Loader service when the keytab authentication mode is used.

In the normal mode or password login mode, this parameter does not need to be set.

loader/hadoop.<System domain name>

NOTE:

You can log in to FusionInsight Manager, choose System > Permission > Domain and Mutual Trust, and view the value of Local Domain, which is the current system domain name.

client.keytab

Path of the keytab file to use when the keytab authentication mode is used.

In the normal mode or password login mode, this parameter does not need to be set.

/opt/client/conf/loader.keytab

krb5.conf.file

Path of the krb5.conf file to use when the keytab authentication mode is used.

In the normal mode or password login mode, this parameter does not need to be set.

/opt/client/conf/krb5.conf
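
To show how the parameters in Table 1 combine, the following is a minimal sketch that writes a schedule.properties file for keytab login. The keys and values are the example values from Table 1. The scratch directory is an assumption made only so that the snippet can run anywhere, and <System domain name> is deliberately left as a placeholder to be replaced with the Local Domain value shown on FusionInsight Manager.

```shell
# Sketch only: write a keytab-mode schedule.properties to a scratch directory.
# The scratch directory is an assumption; put the real file wherever your
# schedule-tool installation expects it.
conf_dir=$(mktemp -d)

# The quoted 'EOF' keeps the shell from expanding anything inside the file body.
cat > "$conf_dir/schedule.properties" <<'EOF'
server.url=10.96.26.111:21351,127.0.0.2:21351
authentication.type=kerberos
use.keytab=true
client.principal=loader/hadoop.<System domain name>
client.keytab=/opt/client/conf/loader.keytab
krb5.conf.file=/opt/client/conf/krb5.conf
EOF

# Show the result for review.
cat "$conf_dir/schedule.properties"
```

For password login, Table 1 implies the opposite combination: set use.keytab to false, set authentication.user and the encrypted authentication.password, and leave client.principal, client.keytab, and krb5.conf.file unset.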

Table 2 Configuration parameters of job.properties

Configuration parameters

Description

Example Value

job.jobName

Job name.

job1

file.fileName.prefix

File name prefix.

table1

file.fileName.posfix

File name suffix.

.txt

file.filter

File filter, which filters files by matching file names.

  • true indicates that the preceding prefix or suffix is used to match all files in the input path. For details, see the example.
  • false indicates that the preceding prefix or suffix is used to match a file in the input path. For details, see the example.

true

date.day

Number of delayed days. This number is added to the input date, and the result is matched against the date in the names of the files to be imported. For example, if the input date is 20160202 and the number of delayed days is 3, files in the input path whose names contain the date field 20160205 are matched. For details, see schedule-tool Usage Example.

3

file.date.format

Date format included in the name of the file to be imported.

yyyyMMdd

parameter.date.format

Format of the date entered when the script is invoked. It is usually the same as file.date.format.

yyyyMMdd

file.format.iscompressed

Whether the file to be imported is a compressed file.

false

storage.type

Storage type, that is, the final destination of the files to be imported. The value can be HDFS, HBase, or Hive.

HDFS
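
The date.day arithmetic in Table 2 can be checked with a short sketch. It assumes GNU date (the -d option) and the yyyyMMdd format used in the example values. The function name target_date is hypothetical and is not part of schedule-tool. It only reproduces the calculation from the description, in which the input date 20160202 with 3 delayed days matches files containing 20160205.

```shell
# Sketch only: compute the date field that schedule-tool would look for.
# Assumes GNU date; -u keeps the calculation independent of local DST rules.
target_date() {
    input_date=$1   # date passed when the script is invoked, for example 20160202
    delay_days=$2   # value of date.day, for example 3
    date -u -d "$input_date + $delay_days days" +%Y%m%d
}

# The example from Table 2: input date 20160202, date.day=3.
target_date 20160202 3
```

With file.fileName.prefix set to table1 and file.fileName.posfix set to .txt, a file name containing the computed date field would then be a candidate for matching. The exact file name layout is described in schedule-tool Usage Example and is not assumed here.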

schedule-tool can configure multiple jobs at the same time. In this case, set job.jobName, file.fileName.prefix, and file.fileName.posfix in Table 2 to multiple values separated by commas (,).
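
As an illustration of a multi-job configuration, the sketch below writes a job.properties file for two jobs and counts the comma-separated values of each multi-value key. The names job2 and table2, the scratch directory, and the helper count_values are hypothetical. The assumption that the three lists should have the same number of entries is also ours, not a documented rule. All other values are the example values from Table 2.

```shell
# Sketch only: a two-job job.properties written to a scratch directory.
work_dir=$(mktemp -d)

cat > "$work_dir/job.properties" <<'EOF'
job.jobName=job1,job2
file.fileName.prefix=table1,table2
file.fileName.posfix=.txt,.txt
file.filter=true
date.day=3
file.date.format=yyyyMMdd
parameter.date.format=yyyyMMdd
file.format.iscompressed=false
storage.type=HDFS
EOF

# Print how many comma-separated values are configured for the given key.
count_values() {
    awk -F= -v key="$1" '$1 == key { print split($2, parts, ",") }' \
        "$work_dir/job.properties"
}

# Report the number of values for each key that takes one value per job.
for key in job.jobName file.fileName.prefix file.fileName.posfix; do
    echo "$key: $(count_values "$key")"
done
```

Comparing the three counts before submitting is a cheap way to catch a missing or extra comma.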

Precautions

server.url must be set to a string that contains two IP address and port pairs separated by a comma (,), for example, 10.96.26.111:21351,127.0.0.2:21351.
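
The required shape of server.url can be checked before a job is submitted. The following is a rough sketch, not a check that schedule-tool itself performs. The function name is_valid_server_url is hypothetical, IPv4 addresses are assumed, and only the overall shape is tested (octet and port ranges are not validated).

```shell
# Sketch only: succeed when the value is exactly two IPv4:port pairs
# separated by a single comma. Numeric ranges are not checked.
is_valid_server_url() {
    pair='([0-9]{1,3}\.){3}[0-9]{1,3}:[0-9]{1,5}'
    printf '%s\n' "$1" | grep -Eq "^${pair},${pair}\$"
}

# The example value from Table 1 has the required shape.
if is_valid_server_url '10.96.26.111:21351,127.0.0.2:21351'; then
    echo "server.url has the expected shape"
else
    echo "server.url must be two ip:port pairs separated by a comma" >&2
fi
```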