Updated on 2024-11-20 GMT+08:00

From HTTP

If the source link of a job is an HTTP link, configure the source job parameters based on Table 1. Currently, data can only be exported from HTTP URLs.

Table 1 Parameter description

Parameter

Description

Example Value

File URL

Use the GET method to obtain data from the HTTP/HTTPS URL.

These connectors are used to read files from HTTP/HTTPS URLs, such as public files on third-party object storage systems and web disks.

https://bucket.obs.myhuaweicloud.com/object-key
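As a rough illustration of the GET-based read described for File URL, the following sketch fetches a file with Python's standard library. It is not CDM's implementation: `build_get_request` and `download` are hypothetical helper names, and the URL in the usage note is simply the example value from the table.

```python
import shutil
import urllib.request


def build_get_request(url: str) -> urllib.request.Request:
    """Build an explicit GET request for an HTTP/HTTPS file URL."""
    if not url.lower().startswith(("http://", "https://")):
        raise ValueError("File URL must start with http:// or https://")
    return urllib.request.Request(url, method="GET")


def download(url: str, local_path: str) -> None:
    """Stream the response body to a local file (hypothetical helper)."""
    request = build_get_request(url)
    with urllib.request.urlopen(request) as response, open(local_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, `download("https://bucket.obs.myhuaweicloud.com/object-key", "object-key")` would save the object locally, provided the URL is publicly readable.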

Pull List File

If this parameter is set to Yes, the system reads the URLs recorded in a text file (the list file), pulls the file behind each URL, and stores the files on OBS. The text file records the file paths on HDFS.

Yes
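The list file format is not specified here. The sketch below assumes, purely for illustration, a plain-text file with one URL per line and blank lines ignored; `parse_list_file` is a hypothetical helper, not a CDM function.

```python
from typing import List


def parse_list_file(text: str) -> List[str]:
    """Return the URLs in a list file (assumed format: one URL per line)."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines
            urls.append(line)
    return urls
```

Each returned URL would then be fetched individually and written to OBS.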

OBS Link of List File

Select an existing OBS link.

obs_link

OBS Bucket of entries files

Name of the OBS bucket that stores the text file.

obs-cdm

Path/Directory of entries files

Custom OBS directory that stores the text file. Use slashes (/) to separate directory levels.

test1

File Format

Format used for transmitting data. The CSV and JSON formats are supported for migration to tables, and the binary format is supported for file migration.

Binary

Compression Format

Compression format of the source files. The options are as follows:
  • NONE: Files in all formats can be transferred.
  • GZIP: Only files in GZIP format can be transferred.
  • ZIP: Only files in ZIP format can be transferred.
  • TAR.GZ: Only files in TAR.GZ format can be transferred.

NONE

Compressed File Suffix

This parameter is displayed when Compression Format is not NONE.

This parameter specifies the file name extension of the files to be decompressed. Within a batch of files, only files whose names end with this extension are decompressed. Other files are transferred in their original format. If you enter * or leave the parameter blank, all files are decompressed.

*
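The suffix rule for Compressed File Suffix can be sketched as a small predicate. This is an illustration of the documented behavior only. `should_decompress` is a hypothetical name, and the exact-case match is an assumption, since the description does not say whether the comparison is case-sensitive.

```python
def should_decompress(file_name: str, suffix: str) -> bool:
    """Decide whether a file would be decompressed under the suffix rule."""
    suffix = suffix.strip()
    # "*" or a blank value means every file is decompressed.
    if suffix in ("", "*"):
        return True
    # Otherwise only files whose names end with the extension qualify
    # (assumption: the match is case-sensitive).
    return file_name.endswith(suffix)
```

With the suffix set to `.tar.gz`, a file named `logs.tar.gz` would be decompressed, while `logs.csv` would be transferred as is.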

File Separator

Separator between files. When multiple files are transferred, CDM uses this separator to tell the files apart. The default value is a vertical bar (|). This parameter is not displayed if Pull List File is set to Yes.

|
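A minimal sketch of how a separator-delimited value could be split into individual files. It assumes that several URLs are supplied as one string joined by the File Separator; `split_file_urls` is a hypothetical helper, and the URLs in the usage note are placeholders.

```python
from typing import List


def split_file_urls(file_url: str, separator: str = "|") -> List[str]:
    """Split a multi-file value into individual URLs, dropping empty parts."""
    return [part.strip() for part in file_url.split(separator) if part.strip()]
```

For instance, `split_file_urls("https://example.com/a.csv|https://example.com/b.csv")` yields two URLs. A URL that itself contains the separator character would call for a different separator.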

Query Parameter

  • If this parameter is set to Yes, the names of the objects uploaded to OBS do not include the query parameter.
  • If this parameter is set to No, the names of the objects uploaded to OBS include the query parameter.

No
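The effect of Query Parameter on object names can be illustrated as follows. The sketch assumes, for illustration only, that the object name is taken from the last path segment of the URL; how CDM actually derives names is not described here, and `object_name` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def object_name(url: str, remove_query: bool) -> str:
    """Derive an object name from a URL, with or without its query string."""
    parts = urlsplit(url)
    # Assumption: the name is the last segment of the URL path.
    name = parts.path.rsplit("/", 1)[-1]
    if remove_query or not parts.query:
        return name
    return f"{name}?{parts.query}"
```

For the placeholder URL `https://example.com/data/file.csv?version=2`, the Yes setting corresponds to `file.csv` and the No setting to `file.csv?version=2`.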

Disregard Non-existent Path or File

If this parameter is set to Yes, the job can be executed successfully even if the source path does not exist.

No

MD5 File Extension

This parameter is used to check whether the files extracted by CDM are consistent with source files. For details, see MD5 Verification.

.md5
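The consistency check can be approximated with a standard MD5 comparison. This is a sketch under one assumption: the companion `.md5` file holds the hexadecimal digest as its first whitespace-separated token. The format CDM actually expects is covered in MD5 Verification, and `md5_hex` and `matches_md5` are hypothetical helper names.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the hexadecimal MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def matches_md5(data: bytes, md5_file_text: str) -> bool:
    """Compare the data's digest with the contents of a .md5 file."""
    # Assumption: the digest is the first whitespace-separated token.
    tokens = md5_file_text.split()
    return bool(tokens) and tokens[0].lower() == md5_hex(data)
```

A mismatch would indicate that the extracted file differs from the source file.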
