From FTP/SFTP

Updated on 2024-10-24 GMT+08:00

If the source link of a job is an FTP or SFTP link, configure the source job parameters based on Table 1.

Advanced attributes are optional and not displayed by default. You can click Show Advanced Attributes to display them.

Table 1 Parameter description

The parameters fall into two categories: basic parameters and advanced attributes. Each entry below gives the parameter name, its description, and an example value.

Basic parameters

Source Directory/File

Directory or file path from which data will be extracted. You can enter a maximum of 50 file paths. By default, the file paths are separated by vertical bars (|), but you can also customize the file separator. For details, see Migration of a List of Files.

If a directory is specified, all files in it (including all nested subdirectories and their subfiles) will be migrated.

This parameter can be configured as a macro variable of date and time, and a path name can contain multiple macro variables. When the macro variable of date and time works with a scheduled job, incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time.

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development job + Offset) rather than (Actual start time of the CDM job + Offset).

Example value: /ftp/a.csv|/ftp/b.txt
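
For illustration, the following Python sketch mimics how a date/time macro such as ${dateformat(yyyyMMdd,-1,DAY)} in a source path would resolve against a job start time; the path, the macro, and the start time are hypothetical examples, and this is not CDM's macro engine.

# Illustrative sketch only, not CDM's macro engine.
from datetime import datetime, timedelta

def resolve_dateformat(base_time, offset_days, pattern="%Y%m%d"):
    # Mirrors the idea of ${dateformat(yyyyMMdd,<offset>,DAY)}: shift the base
    # time by the offset in days and format it.
    return (base_time + timedelta(days=offset_days)).strftime(pattern)

# Planned start time of the data development job (see the NOTE above); hypothetical.
planned_start = datetime(2024, 10, 24, 0, 0, 0)

# Hypothetical source directory: /ftp/data/${dateformat(yyyyMMdd,-1,DAY)}/
resolved = "/ftp/data/" + resolve_dateformat(planned_start, -1) + "/"
print(resolved)   # /ftp/data/20241023/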

File Format

Format in which CDM parses data. The options are as follows:
  • CSV: Source files will be migrated to tables after being converted to CSV format.
  • Binary: Files (even those not in binary format) are transferred directly without being parsed. Use this format for file-to-file copy.
  • JSON: Source files will be migrated to tables after being converted to JSON format.

NOTE:

If the destination is OBS, only the binary format is supported.

Example value: CSV

JSON Type

This parameter is displayed only when File Format is set to JSON. Type of a JSON object stored in a JSON file. The options are JSON object and JSON array.

Example value: JSON object

JSON Reference Node

This parameter is used only when File Format is set to JSON and JSON Type is set to JSON object. CDM parses the data under this JSON node. If the node's data is a JSON array, the system extracts records from the array using the same pattern. Use periods (.) to separate multi-layer nested JSON nodes.

Example value: data.list
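
As an illustration of the reference node concept, the following sketch (plain Python, not CDM code) walks a period-separated node path such as data.list and treats each element of the resulting JSON array as one record; the sample JSON is hypothetical.

import json

# Hypothetical JSON object whose records sit under the nested node "data.list".
raw = '{"data": {"list": [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]}}'

def records_under(obj, node_path):
    # Walk the period-separated node names and return the data found there.
    for key in node_path.split("."):
        obj = obj[key]
    # If the node holds a JSON array, each element is treated as one record.
    return obj if isinstance(obj, list) else [obj]

for record in records_under(json.loads(raw), "data.list"):
    print(record)   # {'id': 1, 'name': 'a'} then {'id': 2, 'name': 'b'}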

Advanced attributes

Use rfc4180 Parser

This parameter is displayed only when File Format is set to CSV. It specifies whether to use the rfc4180 parser to parse CSV files.

Example value: No

Line Separator

Line feed character in the file. By default, the system automatically identifies \n, \r, and \r\n. This parameter is displayed only when File Format is set to CSV.

Example value: \n

Field Delimiter

Character used to separate fields in the file. To use the Tab character as the delimiter, set this parameter to \t. This parameter is displayed only when File Format is set to CSV.

Example value: ,

Use Quote Character

If you set this parameter to Yes, field delimiters enclosed in the quote character are treated as part of the string value. Currently, the default quote character of CDM is the double quotation mark (").

Example value: No

Using Escape Char

If you select Yes, the backslash (\) in a data row is used as the escape character. If you select No, the backslash (\) in the CSV file is treated as a literal character rather than an escape character. CSV supports only the backslash (\) as the escape character.

Example value: Yes
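
As a rough illustration of how the field delimiter, quote character, and escape character interact, the following sketch parses one sample line with Python's standard csv module (not CDM's parser), using a comma delimiter, a double quotation mark as the quote character, and a backslash as the escape character.

import csv
import io

# Sample line: the comma inside the quoted field is data, and \, is an escaped
# comma in an unquoted field, so neither is treated as a field delimiter.
line = 'Berlin,"Mitte, Zentrum",10\\,5\n'

reader = csv.reader(io.StringIO(line), delimiter=",", quotechar='"', escapechar="\\")
print(next(reader))   # ['Berlin', 'Mitte, Zentrum', '10,5']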

Use RE to Separate Fields

Whether to use a regular expression to separate fields. If you set this parameter to Yes, Field Delimiter becomes invalid. This parameter is displayed only when File Format is set to CSV.

Example value: Yes

Regular Expression

This parameter is available only when Use RE to Separate Fields is set to Yes.

Regular expression used to separate fields. For details about regular expressions, see Regular Expressions for Separating Semi-structured Text.

Example value: ^(\d.*\d) (\w*) \[(.*)\] ([\w\.]*) (\w.*).*
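
The example expression above is typical for splitting log lines into fields. The following sketch (plain Python, not CDM) applies it to a hypothetical log line and prints the captured fields.

import re

# The example expression from Table 1.
pattern = re.compile(r"^(\d.*\d) (\w*) \[(.*)\] ([\w\.]*) (\w.*).*")

# Hypothetical log line: timestamp, level, thread, logger, message.
line = "2018-09-01 10:00:00,123 INFO [main] org.example.Transfer Started SFTP pull"

match = pattern.match(line)
if match:
    for i, field in enumerate(match.groups(), start=1):
        print(i, field)
# 1 2018-09-01 10:00:00,123
# 2 INFO
# 3 main
# 4 org.example.Transfer
# 5 Started SFTP pull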

Use First Row as Header

This parameter is displayed only when File Format is set to CSV. When you migrate a CSV file to a table, CDM writes all rows to the table by default. If you set this parameter to Yes, CDM uses the first row of the CSV file as the header row and does not write it to the destination table.

Example value: Yes

Encoding Type

Encoding type, for example, UTF-8 or GBK. You can set the encoding type for text files only. This parameter is invalid when File Format is set to Binary.

Example value: UTF-8

Compression Format

The options are as follows:
  • NONE: Files in all formats can be transferred.
  • GZIP: Only files in gzip format can be transferred.
  • ZIP: Only files in zip format can be transferred.
  • TAR.GZ: Only files in tar.gz format can be transferred.

Example value: NONE

Compressed File Suffix

This parameter is displayed when Compression Format is not NONE.

This parameter specifies the extension of the files to be decompressed. In a batch of files, only the files with this extension are decompressed; the other files are transferred in their original format. If you enter * or leave the parameter blank, all files are decompressed.

Example value: *

Start Job by Marker File

Whether to start the job by a marker file. If you set this parameter to Yes, the job is started only when the specified marker file exists in the source path. If there is no marker file, the job is suspended for the period of time specified by Suspension Period.

Example value: Yes

File Separator

File separator. If you enter multiple file paths in Source Directory/File, CDM uses the file separator to identify the individual files. The default value is |.

Example value: |

Marker File

Name of the marker file for starting the job. If you specify a marker file, the migration job is executed only when the marker file exists in the source path. The marker file itself will not be migrated.

Example value: ok.txt
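
If an upstream process produces the marker file, it only needs to create an empty file with the agreed name once the data is ready. The sketch below uses Python's standard ftplib to upload an empty ok.txt to a hypothetical FTP server; the host, credentials, and directory are placeholders, and an SFTP source would need an SFTP client instead.

import io
from ftplib import FTP

# Placeholder connection details for a hypothetical FTP source.
HOST, USER, PASSWORD = "ftp.example.com", "user", "password"

with FTP(HOST) as ftp:
    ftp.login(USER, PASSWORD)
    ftp.cwd("/ftp")                                  # source directory of the CDM job
    ftp.storbinary("STOR ok.txt", io.BytesIO(b""))   # empty marker file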

Suspension Period

Waiting period for the marker file. If you set Start Job by Marker File to Yes and there is no marker file in the source path, the job fails when the suspension period times out.

If you set this parameter to 0 and there is no marker file in the source path, the job fails immediately.

Unit: second

Example value: 10

Filter Type

Only the paths or files that meet the filtering condition are transferred. The options are None, Wildcard, and Regex. For details, see Incremental File Migration.

Example value: None

Directory Filter

If you set Filter Type to Wildcard or Regex, enter a wildcard character or regular expression to filter directories. Only the paths that meet the filtering condition are migrated. You can configure multiple paths separated by commas (,).

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development job + Offset) rather than (Actual start time of the CDM job + Offset).

Example value: *input,*out

File Filter

If you set Filter Type to Wildcard or Regex, enter a wildcard character or regular expression to filter files. Only the files that meet the filtering condition are migrated. You can configure multiple file filters separated by commas (,).

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development job + Offset) rather than (Actual start time of the CDM job + Offset).

Example value: *.csv
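
To get a feel for wildcard filtering, the following sketch uses Python's fnmatch (comparable wildcard semantics, not CDM's matcher) to show which of a few hypothetical file names a File Filter of *.csv would keep.

from fnmatch import fnmatch

# Hypothetical listing of the source directory.
files = ["orders_20240101.csv", "orders_20240101.csv.md5", "readme.txt", "2024.csv"]

# File Filter set to "*.csv": only matching files would be migrated.
print([f for f in files if fnmatch(f, "*.csv")])
# ['orders_20240101.csv', '2024.csv']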

Time Filter

If you select Yes, files are transferred based on their modification time.

Example value: Yes

Minimum Timestamp

If you set Time Filter to Yes, you can specify a point in time for Minimum Timestamp, and only the files modified at or after the specified time are transferred. The time format must be yyyy-MM-dd HH:mm:ss.

This parameter can be set to a macro variable of date and time. For example, ${timestamp(dateformat(yyyy-MM-dd HH:mm:ss,-90,DAY))} indicates that only files generated within the latest 90 days are migrated.

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development job + Offset) rather than (Actual start time of the CDM job + Offset).

Example value: 2019-07-01 00:00:00

Maximum Timestamp

If you set Time Filter to Yes, you can specify a point in time for Maximum Timestamp, and only the files modified before the specified time are transferred. The time format must be yyyy-MM-dd HH:mm:ss.

This parameter can be set to a macro variable of date and time. For example, ${timestamp(dateformat(yyyy-MM-dd HH:mm:ss))} indicates that only the files whose modification time is earlier than the current time are migrated.

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Studio DataArts Factory, the system replaces the macro variable of date and time with (Planned start time of the data development job + Offset) rather than (Actual start time of the CDM job + Offset).

Example value: 2019-07-30 00:00:00
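
For illustration, the sketch below shows what the two macros above conceptually resolve to, assuming a hypothetical job start time of 2019-09-29 00:00:00: the -90 DAY form yields the Minimum Timestamp example value shown above, and the form without an offset yields the start time itself. This is plain Python, not CDM's macro engine.

from datetime import datetime, timedelta

start = datetime(2019, 9, 29, 0, 0, 0)   # hypothetical job start time

minimum = start - timedelta(days=90)     # dateformat(yyyy-MM-dd HH:mm:ss,-90,DAY)
maximum = start                          # dateformat(yyyy-MM-dd HH:mm:ss), no offset

print(minimum.strftime("%Y-%m-%d %H:%M:%S"))   # 2019-07-01 00:00:00
print(maximum.strftime("%Y-%m-%d %H:%M:%S"))   # 2019-09-29 00:00:00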

Disregard Non-existent Path or File

If this parameter is set to Yes, the job can be executed successfully even if the source path does not exist.

Example value: No

Marker File Type

This parameter is available only when Start Job by Marker File is set to Yes.

  • MARK_DONE: The migration job is executed only when the marker file exists in the source path.
  • MARK_DOING: The migration job is executed only when the marker file does not exist in the source path.

Example value: MARK_DOING

Whether to skip empty lines

This parameter is available only when File Format is set to CSV.

If you set this parameter to Yes, empty lines in the file are skipped.

Example value: No

null value

This parameter is invalid when File Format is set to Binary.

Text files have no standard string for representing a null value. This parameter specifies the string that CDM identifies as a null value.

Example value: No

MD5 File Extension

This parameter is displayed only when File Format is set to Binary.

This parameter is used to check whether the files extracted by CDM are consistent with the source files. For details, see MD5 Verification.

Example value: .md5
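
As an illustration of the kind of companion file such a check relies on, the sketch below computes a file's MD5 digest with Python's hashlib and writes it to <file>.md5; the exact content CDM expects in the .md5 file is not specified here, so treat this as a generic example with a hypothetical file name.

import hashlib
from pathlib import Path

def write_md5_sidecar(path):
    # Compute the MD5 of a file and store the hex digest in <path>.md5.
    digest = hashlib.md5(Path(path).read_bytes()).hexdigest()
    Path(path + ".md5").write_text(digest + "\n")
    return digest

# Hypothetical source file; after upload, /ftp/a.csv.md5 would sit next to /ftp/a.csv.
print(write_md5_sidecar("a.csv"))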
