Using Loader from Scratch

You can use Loader to import data from an SFTP server to HDFS.

This section applies to MRS clusters earlier than 3.x.

1. Access the Loader page.
   1. Go to the cluster details page and choose Services.
   2. Choose Hue. Under Hue Web UI in Hue Summary, click Hue (Active). The Hue web UI is displayed.
   3. Choose Data Browsers > Sqoop.
      The job management tab page is displayed by default on the Loader page.
2. On the Loader page, click Manage links.
3. Click New link and create an sftp-connector link. For details, see File Server Link.
4. Click New link, enter a link name, select hdfs-connector, and create the hdfs-connector link.
5. On the Loader page, click Manage jobs.
6. Click New Job.
7. In Connection, set parameters.
   1. In Name, enter a job name.
   2. Select the source link created in step 3 and the target link created in step 4.
8. In From, configure the job of the source link.
   For details, see ftp-connector or sftp-connector.
9. In To, configure the job of the target link.
   For details, see hdfs-connector.
10. In Task Config, set job running parameters.

**Table 1** Loader job running properties
Parameter	Description
Extractors	Number of Map tasks.
Loaders	Number of Reduce tasks. This parameter is displayed only when the destination is HBase or Hive.
Max. Error Records in a Single Shard	Error record threshold. If the number of error records of a single Map task exceeds the threshold, the task automatically stops and the obtained data is not returned. NOTE: By default, generic-jdbc-connector reads and writes data in batches for MySQL and MPPDB. At most one error is recorded for each batch of data.
Dirty Data Directory	Directory for saving dirty data. If you leave this parameter blank, dirty data will not be saved.
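
The interaction between the error record threshold and the dirty data directory can be easier to follow as code. The following is a minimal Python sketch of the behavior described in Table 1, not Loader's actual implementation. The names `TaskConfig` and `run_shard` are hypothetical, a `ValueError` from the parse function stands in for an "error record", and dirty records are returned to the caller rather than written to a directory.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Optional, Tuple


@dataclass
class TaskConfig:
    """Hypothetical container mirroring the parameters in Table 1."""
    extractors: int                        # Extractors: number of Map tasks
    max_error_records: int                 # Max. Error Records in a Single Shard
    dirty_data_dir: Optional[str] = None   # blank/None: dirty data is not saved
    loaders: Optional[int] = None          # Loaders: only for HBase or Hive


def run_shard(
    records: Iterable[str],
    parse: Callable[[str], Any],
    cfg: TaskConfig,
) -> Tuple[List[Any], List[str]]:
    """Process one shard (one Map task) and return (good_rows, saved_dirty_rows).

    Assumption: a record counts as an error when parse() raises ValueError.
    """
    good: List[Any] = []
    dirty: List[str] = []
    for rec in records:
        try:
            good.append(parse(rec))
        except ValueError:
            dirty.append(rec)
            if len(dirty) > cfg.max_error_records:
                # Threshold exceeded: the task stops and the data
                # obtained so far is not returned.
                good = []
                break
    # Dirty records are kept only when a dirty data directory is configured.
    saved_dirty = dirty if cfg.dirty_data_dir else []
    return good, saved_dirty
```

In this sketch, calling `run_shard(["1", "x", "2"], int, TaskConfig(extractors=2, max_error_records=1))` tolerates the single bad record, while a second bad record in the same shard would discard the shard's output. The directory path passed as `dirty_data_dir` is up to the caller; any value used in an example is illustrative only.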