Using Loader from Scratch
You can use Loader to import data from an SFTP server to HDFS.
This section applies to MRS clusters earlier than 3.x.
Prerequisites
- You have prepared service data.
- You have created an analysis cluster.
Procedure
- Access the Loader page.
- Go to the cluster details page and choose Services.
- In Hue Web UI of Hue Summary, click Hue (Active). The Hue web UI is displayed.
- From the Hue web UI, open the Loader page. The job management tab page is displayed by default on the Loader page.
- On the Loader page, click Manage links.
- Click New link and create sftp-connector. For details, see File Server Link.
- Click New link, enter the link name, select hdfs-connector, and create hdfs-connector.
- On the Loader page, click Manage jobs.
- Click New Job.
- In Connection, set parameters.
- In From, configure the job of the source link.
For details, see ftp-connector or sftp-connector.
- In To, configure the job of the target link.
For details, see hdfs-connector.
- In Task Config, set job running parameters.
Table 1 Loader job running properties

| Parameter | Description |
| --- | --- |
| Extractors | Number of Map tasks |
| Loaders | Number of Reduce tasks. This parameter is displayed only when the destination field is HBase or Hive. |
| Max. Error Records in a Single Shard | Error record threshold. If the number of error records of a single Map task exceeds the threshold, the task automatically stops and the obtained data is not returned. NOTE: By default, data is read and written in batches for MySQL and MPPDB of generic-jdbc-connector. Errors are recorded at most once for each batch of data. |
| Dirty Data Directory | Directory for saving dirty data. If you leave this parameter blank, dirty data will not be saved. |
- Click Save.
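The effect of the error record threshold and the dirty data directory described in Table 1 can be illustrated with a small simulation. This is only a sketch of the documented behavior, not Loader's implementation: the names TaskConfig and run_map_task, the use of ValueError to mark an error record, and the example path /tmp/dirty are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class TaskConfig:
    """Hypothetical model of some Task Config fields from Table 1."""
    extractors: int                        # number of Map tasks
    max_error_records: int                 # error record threshold per Map task
    dirty_data_dir: Optional[str] = None   # blank (None) means dirty data is not saved


def run_map_task(
    records: Iterable[str],
    parse: Callable[[str], object],
    config: TaskConfig,
) -> Tuple[List[object], List[str], bool]:
    """Simulate a single Map task.

    Returns (rows, saved_dirty, completed). A record that fails to parse
    (assumed here to raise ValueError) counts as an error record.
    """
    rows: List[object] = []
    dirty: List[str] = []
    for record in records:
        try:
            rows.append(parse(record))
        except ValueError:
            dirty.append(record)
            if len(dirty) > config.max_error_records:
                # Threshold exceeded: the task stops and the data
                # obtained so far is not returned.
                rows = []
                completed = False
                break
    else:
        completed = True
    # Dirty records are kept only when a dirty data directory is configured.
    saved_dirty = dirty if config.dirty_data_dir else []
    return rows, saved_dirty, completed


if __name__ == "__main__":
    cfg = TaskConfig(extractors=1, max_error_records=1, dirty_data_dir="/tmp/dirty")
    print(run_map_task(["1", "x", "2"], int, cfg))
    print(run_map_task(["1", "x", "y", "2"], int, cfg))
```

In this sketch, one bad record stays within a threshold of 1, so the task finishes and returns the parsed rows. A second bad record exceeds the threshold, so the task stops and returns no rows.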