Updated on 2024-11-29 GMT+08:00

loader-tool Usage Guide

Overview

loader-tool is a Loader client tool. It consists of three tools: lt-ucc, lt-ucj, and lt-ctl.

Loader supports two modes, parameter mode and job template mode. Either mode can be used to create, update, query, and delete connectors, and to create, update, query, delete, start, and stop Loader jobs.

loader-tool implements an asynchronous interface. After a command is submitted, its result is not returned to the console in real time. You must therefore confirm the result of each connector operation (create, update, query, or delete) and each job operation (create, update, query, delete, start, or stop) on the Loader WebUI or in the server logs.

  • Parameter mode:

    Invoke a script and pass the specific parameters on the command line.

  • Job template mode:

    Change the values of all parameters in a job template and reference the job template when invoking a script.

    After a Loader client is installed, the system automatically generates job templates for various scenarios in the Loader client installation directory/loader-tools-1.99.3/loader-tool/job-config/ directory. The parameters vary according to job templates. Job templates contain information about jobs and associated connectors.

    Job templates are XML files. The file name format is original data location-to-new data location.xml, for example, sftp-to-hdfs.xml. If a job supports conversion steps, a JSON conversion step configuration file with the same base name also exists, for example, sftp-to-hdfs.json.

    Job templates contain the configuration information of connectors. When a connector is created or updated, only the connector information in the job template is used.
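The naming convention described above can be illustrated with a short shell sketch. It splits a template file name into its source and destination parts and derives the name of the companion conversion step file. The variable names are illustrative and are not part of loader-tool.

```shell
# Illustrative only: split a job template name of the form
# <source>-to-<destination>.xml into its parts.
template="sftp-to-hdfs.xml"

base="${template%.xml}"        # strip the .xml suffix
source_part="${base%%-to-*}"   # text before the first "-to-"
dest_part="${base#*-to-}"      # text after the first "-to-"
trans_file="${base}.json"      # companion conversion step file

echo "source=${source_part} destination=${dest_part} conversion=${trans_file}"
```

For sftp-to-hdfs.xml, the source part is sftp, the destination part is hdfs, and the conversion step file is sftp-to-hdfs.json.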

Scenarios

The parameters vary according to connectors or jobs.

  • To modify some parameters, use the parameter mode.
  • To create a connector or job, use the job template mode.

    This tool currently supports the FTP, HDFS, JDBC, MySQL, Oracle, and Oracle dedicated connectors. If other types of connectors are used, you are advised to use the open-source sqoop-shell tool.

Parameters

The following descriptions assume that the Loader client installation directory is /opt/hadoopclient/Loader/.

  • lt-ucc usage description

    lt-ucc (loader-tool user-configuration-connection) is the connector configuration tool of loader-tool. It is used to create, update, and delete connectors.

    Table 1 lt-ucc script parameter description

    | Parameter | Description | Example Value |
    |---|---|---|
    | -help | Help information. | - |
    | -a <arg> | Connector action. The value can be create, update, or delete, for creating, updating, and deleting connectors, respectively. | create |
    | -at <arg> | Login authentication type. The value can be kerberos or simple. | kerberos |
    | -uk <arg> | Whether to use the keytab file. | true |
    | -au <arg> | Login authentication username. | bar |
    | -ap <arg> | Login authentication password. The value must be an encrypted password. Encrypt the password with the command "sh Loader client installation directory/Loader/loader-tools-1.99.3/encrypt_tool non-encrypted user password". NOTE: If a non-encrypted password contains special characters, the special characters must be escaped. For example, enclose a password that contains a dollar sign ($) in single quotation marks ('). If the password contains single quotation marks, enclose it in double quotation marks. If the password contains double quotation marks, escape them with backslashes (\). For details, see the shell escape character rules. | - |
    | -c <arg> | Login authentication principal. | bar |
    | -k <arg> | Login authentication keytab file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config/user.keytab |
    | -h <arg> | Configuration file path of the MRS cluster. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config |
    | -l <arg> | Login template file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml |
    | -s <arg> | Floating IP address and port for Loader, in the format floating IP address:port. The default port is 21351. | 127.0.0.1:21351 |
    | -w <arg> | Job template file path for obtaining job details. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml |
    | -z <arg> | Service IP addresses and port numbers of the ZooKeeper quorum instances, in the format IP address:port number. Use commas (,) to separate multiple entries. | 127.0.0.0:2181,127.0.0.1:2181 |
    | -n <arg> | Connector name. | vt_sftp_test |
    | -t <arg> | Connector type. | sftp-connector |
    | -P <arg> | Updates the value of an attribute. The format is -Pparam1=value1, where param1 is the attribute name of the connector in the job template. Password parameters are required when SFTP or FTP connector information is updated. | -Pconnection.sftpPassword=Encrypted password; -Pconnection.sftpServerIp=10.6.26.11 |

    A complete example is as follows:

    ./bin/lt-ucc -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n vt_sftp_test -t sftp-connector -Pconnection.sftpPassword=Password ciphertext -Pconnection.sftpServerIp=10.6.26.111 -a update
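The quoting rules in the -ap NOTE in Table 1 can be checked without running the encryption tool. In the sketch below, show_arg is a hypothetical stand-in for encrypt_tool that only prints the argument it receives, and the passwords are made-up sample values.

```shell
# show_arg is a hypothetical stand-in for encrypt_tool: it prints its first
# argument exactly as the shell passes it, so the effect of quoting is visible.
show_arg() { printf '%s\n' "$1"; }

# Dollar sign: single quotation marks prevent variable expansion.
pw_dollar=$(show_arg 'Pa$$w0rd')

# Single quotation mark: enclose the password in double quotation marks.
pw_single=$(show_arg "Pa'w0rd")

# Double quotation mark: escape it with a backslash.
pw_double=$(show_arg Pa\"w0rd)

printf '%s\n' "$pw_dollar" "$pw_single" "$pw_double"
```

In each case the stand-in receives the password with its special character intact, which is what the encryption tool needs as input.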

    Configuration description of an lt-ucc script job template:

    Use the operation of saving SFTP data to HDFS as an example. Edit the sftp-to-hdfs.xml file in the Loader client installation directory/loader-tools-1.99.3/loader-tool/job-config/ directory. The connector configuration is as follows:

    <!-- Database connection information -->
    <sqoop.connection name="vt_sftp_test" type="sftp-connector">
    <connection.sftpServerIp>10.96.26.111</connection.sftpServerIp>
    <connection.sftpServerPort>22</connection.sftpServerPort>
    <connection.sftpUser>root</connection.sftpUser>
    <connection.sftpPassword>Password ciphertext</connection.sftpPassword>
    </sqoop.connection>

    • Creation command:

      ./bin/lt-ucc -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a create

    • Update command:

      ./bin/lt-ucc -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a update

    • Deletion command:

      ./bin/lt-ucc -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a delete
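The template edit that precedes the update command above can also be scripted. The following is a minimal sketch, not a loader-tool feature. It works on a temporary file that holds only part of the connector block from the example, and the new IP address 10.96.26.112 is a made-up value.

```shell
# Work on a temporary file so that the real template is not touched.
tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
<sqoop.connection name="vt_sftp_test" type="sftp-connector">
<connection.sftpServerIp>10.96.26.111</connection.sftpServerIp>
<connection.sftpServerPort>22</connection.sftpServerPort>
</sqoop.connection>
EOF

new_ip="10.96.26.112"   # made-up example value

# Replace the text between the sftpServerIp tags. The result goes to a second
# file because in-place editing (sed -i) is not portable across sed versions.
sed "s|<connection.sftpServerIp>[^<]*</connection.sftpServerIp>|<connection.sftpServerIp>${new_ip}</connection.sftpServerIp>|" "$tmpl" > "${tmpl}.new"

# Read the value back to confirm the edit.
current_ip=$(sed -n 's|.*<connection.sftpServerIp>\([^<]*\)</connection.sftpServerIp>.*|\1|p' "${tmpl}.new")
echo "sftpServerIp is now ${current_ip}"

rm -f "$tmpl" "${tmpl}.new"
```

After the real template has been edited in this way, the update command sends the new connector values to Loader.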

  • lt-ucj usage description

    lt-ucj (loader-tool user-configuration-job) is the job configuration tool of loader-tool. It is used to create, update, and delete jobs.

    Table 2 lt-ucj script parameter description

    | Parameter | Description | Example Value |
    |---|---|---|
    | -help | Help information. | - |
    | -a <arg> | Job action. The value can be create, update, or delete, for creating, updating, and deleting jobs, respectively. | create |
    | -at <arg> | Login authentication type. The value can be kerberos or simple. | kerberos |
    | -uk <arg> | Whether to use the keytab file. | true |
    | -au <arg> | Login authentication username. | bar |
    | -ap <arg> | Login authentication password. The value must be an encrypted password. Encrypt the password with the command "sh Loader client installation directory/Loader/loader-tools-1.99.3/encrypt_tool non-encrypted user password". NOTE: If a non-encrypted password contains special characters, the special characters must be escaped. For example, enclose a password that contains a dollar sign ($) in single quotation marks ('). If the password contains single quotation marks, enclose it in double quotation marks. If the password contains double quotation marks, escape them with backslashes (\). For details, see the shell escape character rules. | - |
    | -c <arg> | Login authentication principal. | bar |
    | -k <arg> | Login authentication keytab file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config/user.keytab |
    | -h <arg> | Configuration file path of the MRS cluster. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config |
    | -l <arg> | Login template file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml |
    | -s <arg> | Floating IP address and port for Loader, in the format floating IP address:port. The default port is 21351. | 127.0.0.1:21351 |
    | -w <arg> | Job template file for obtaining job details. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml |
    | -z <arg> | Service IP addresses and port numbers of the ZooKeeper quorum instances, in the format IP address:port number. Use commas (,) to separate multiple entries. | 127.0.0.0:2181,127.0.0.1:2181 |
    | -n <arg> | Job name. | Sftp.to.Hdfs |
    | -cn <arg> | Connector name. | vt_sftp_test |
    | -ct <arg> | Connector type. | sftp-connector |
    | -t <arg> | Job type. The value can be IMPORT or EXPORT. | IMPORT |
    | -trans <arg> | Conversion step file associated with the job. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.json |
    | -priority <arg> | Job priority. The value can be LOW, NORMAL, or HIGH. | NORMAL |
    | -queue <arg> | Queue. | default |
    | -storageType <arg> | Storage type. | HDFS |
    | -P <arg> | Updates the value of an attribute. The format is -Pparam1=value1, where param1 is the attribute name of the connector in the job template. Password parameters are required when SFTP or FTP connector information is updated. | -Pconnection.sftpPassword=Encrypted password; -Pconnection.sftpServerIp=10.6.26.11 |

    A complete example is as follows:

    ./bin/lt-ucj -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n Sftp.to.Hdfs -t IMPORT -ct sftp-connector -Poutput.outputDirectory=/user/loader/sftp-to-hdfs-test8888 -a update
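When several attributes are overridden in parameter mode, the command line grows quickly. The sketch below assembles such a command and prints it instead of running it, so it can be reviewed first. The second override (output.fileOprType=OVERRIDE) is an assumed example that reuses an attribute name from the job template; whether a given attribute can be overridden this way should be confirmed against your Loader version.

```shell
# Dry run: build the lt-ucj argument list and print it instead of running it.
login=/opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml

# Attributes to override, one name=value pair per line (example values,
# none of which may contain spaces in this sketch).
overrides='output.outputDirectory=/user/loader/sftp-to-hdfs-test8888
output.fileOprType=OVERRIDE'

# Start with the fixed part of the command.
set -- ./bin/lt-ucj -l "$login" -n Sftp.to.Hdfs -t IMPORT -ct sftp-connector

# Append one -P option per override.
for kv in $overrides; do
    set -- "$@" "-P${kv}"
done

# The action goes last, as in the example above.
set -- "$@" -a update

cmd="$*"
echo "$cmd"
```

Once the printed command looks correct, it can be run from the loader-tool directory.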

    Configuration description of an lt-ucj script job template:

    Use the operation of saving SFTP data to HDFS as an example. Edit the sftp-to-hdfs.xml file in the Loader client installation directory/loader-tools-1.99.3/loader-tool/job-config/ directory. The job configuration is as follows:

    <!--Job name, globally unique.-->
    <sqoop.job name="Sftp.to.Hdfs" type="IMPORT" queue="default" priority="NORMAL">
    
    <!-- External data source parameter configuration -->
    <data.source connectionName="vt_sftp_test" connectionType="sftp-connector">
    <file.inputPath>/opt/houjt/hive/all</file.inputPath>
    <file.splitType>FILE</file.splitType>
    <file.filterType>WILDCARD</file.filterType>
    <file.pathFilter>*</file.pathFilter>
    <file.fileFilter>*</file.fileFilter>
    <file.encodeType>GBK</file.encodeType>
    <file.suffixName></file.suffixName>
    <file.isCompressive>FALSE</file.isCompressive>
    </data.source>
    
    <!-- MRS cluster parameter configuration -->
    <hadoop.source storageType="HDFS" >
    <output.outputDirectory>/user/loader/sftp-to-hdfs</output.outputDirectory>
    <output.fileOprType>OVERRIDE</output.fileOprType>
    <throttling.extractors>3</throttling.extractors>
    <output.fileType>TEXT_FILE</output.fileType>
    </hadoop.source>
    
    <!-- Job associated conversion step file -->
    <sqoop.job.trans.file>/opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.json</sqoop.job.trans.file>
    </sqoop.job>

    • Creation command:

      ./bin/lt-ucj -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a create

    • Update command:

      ./bin/lt-ucj -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a update

    • Deletion command:

      ./bin/lt-ucj -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -w /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml -a delete
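Because command results are not returned to the console, it can help to check a job template before submitting it. The sketch below is an assumed pre-check, not part of loader-tool. It reads the job name and the conversion step file path from a shortened copy of the template and reports whether that file exists on the local machine.

```shell
# Shortened copy of the job template, written to a temporary file.
tmpl=$(mktemp)
cat > "$tmpl" <<'EOF'
<sqoop.job name="Sftp.to.Hdfs" type="IMPORT" queue="default" priority="NORMAL">
<sqoop.job.trans.file>/opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.json</sqoop.job.trans.file>
</sqoop.job>
EOF

# Extract the job name attribute and the conversion step file path.
job_name=$(sed -n 's|.*<sqoop.job name="\([^"]*\)".*|\1|p' "$tmpl")
trans_file=$(sed -n 's|.*<sqoop.job.trans.file>\([^<]*\)</sqoop.job.trans.file>.*|\1|p' "$tmpl")

echo "job name: ${job_name}"
if [ -f "$trans_file" ]; then
    echo "conversion step file found: ${trans_file}"
else
    echo "conversion step file missing: ${trans_file}"
fi

rm -f "$tmpl"
```

The extracted job name is the value that lt-ucj and lt-ctl expect for the -n parameter.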

  • lt-ctl usage description

    lt-ctl (loader-tool controller) is the job management tool of loader-tool. It is used to start or stop jobs, query job status and progress, and check whether jobs are running.

    Table 3 lt-ctl script parameter description

    | Parameter | Description | Example Value |
    |---|---|---|
    | -help | Help information. | - |
    | -a <arg> | Job action. The value can be status, start, stop, or isrunning, for querying the job status, starting a job, stopping a job, and checking whether a job is running, respectively. | start |
    | -at <arg> | Login authentication type. The value can be kerberos or simple. | kerberos |
    | -uk <arg> | Whether to use the keytab file. | true |
    | -au <arg> | Login authentication username. | bar |
    | -ap <arg> | Login authentication password. The value must be an encrypted password. Encrypt the password with the command "sh Loader client installation directory/Loader/loader-tools-1.99.3/encrypt_tool non-encrypted user password". NOTE: If a non-encrypted password contains special characters, the special characters must be escaped. For example, enclose a password that contains a dollar sign ($) in single quotation marks ('). If the password contains single quotation marks, enclose it in double quotation marks. If the password contains double quotation marks, escape them with backslashes (\). For details, see the shell escape character rules. | - |
    | -c <arg> | Login authentication principal. | bar |
    | -k <arg> | Login authentication keytab file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config/user.keytab |
    | -h <arg> | Configuration file path of the MRS cluster. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/hadoop-config |
    | -l <arg> | Login template file. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml |
    | -n <arg> | Job name. | Sftp.to.Hdfs |
    | -s <arg> | Floating IP address and port for Loader, in the format floating IP address:port. The default port is 21351. | 127.0.0.1:21351 |
    | -w <arg> | Job template file for obtaining job details. | /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/sftp-to-hdfs.xml |
    | -z <arg> | Service IP addresses and port numbers of the ZooKeeper quorum instances, in the format IP address:port number. Use commas (,) to separate multiple entries. | 127.0.0.0:2181,127.0.0.1:2181 |

    • Command for starting jobs:

      ./bin/lt-ctl -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n Sftp.to.Hdfs -a start

    • Command for viewing job status:

      ./bin/lt-ctl -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n Sftp.to.Hdfs -a status

    • Command for checking whether jobs are running:

      ./bin/lt-ctl -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n Sftp.to.Hdfs -a isrunning

    • Command for stopping jobs:

      ./bin/lt-ctl -l /opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml -n Sftp.to.Hdfs -a stop
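The four commands above differ only in the value of -a. A small wrapper can reject a mistyped action before anything is submitted, which is useful because the tool gives no immediate feedback. In the sketch below, build_ctl_cmd is a hypothetical helper that only prints the command line and does not call lt-ctl.

```shell
# build_ctl_cmd is a hypothetical helper: it prints an lt-ctl command line
# for a job, or fails if the action is not one that lt-ctl supports.
build_ctl_cmd() {
    job=$1
    action=$2
    login=/opt/hadoopclient/Loader/loader-tools-1.99.3/loader-tool/job-config/login-info.xml
    case "$action" in
        status|start|stop|isrunning) ;;
        *) echo "unsupported action: $action" >&2; return 1 ;;
    esac
    echo "./bin/lt-ctl -l $login -n $job -a $action"
}

# A supported action produces the command line.
start_cmd=$(build_ctl_cmd Sftp.to.Hdfs start)
echo "$start_cmd"

# An action that belongs to lt-ucj, such as create, is rejected.
if build_ctl_cmd Sftp.to.Hdfs create 2>/dev/null; then
    rejected=no
else
    rejected=yes
fi
echo "create rejected: $rejected"
```

The printed command can then be run from the loader-tool directory, and the result confirmed on the Loader WebUI or in the server logs.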