Updated on 2022-09-22 GMT+08:00

To OBS

If the destination link of a job is the Link to OBS, configure the destination job parameters based on Table 1.

Advanced attributes are optional and not displayed by default. You can click Show Advanced Attributes to display them.

Table 1 Parameter description

Category

Parameter

Description

Example Value

Basic parameters

Bucket Name

Name of the OBS bucket that data will be written to

bucket_2

Write Directory

OBS directory to which data will be written. Do not add / in front of the directory name.

This parameter can be configured as a macro variable of date and time and a path name can contain multiple macro variables. When the macro variable of date and time works with a scheduled job, the incremental data can be synchronized periodically. For details, see Incremental Synchronization Using the Macro Variables of Date and Time.

NOTE:

If you have configured a macro variable of date and time and schedule a CDM job through DataArts Factory of DataArts Studio, the system replaces the macro variable with the planned start time of the data development job adjusted by the offset, rather than the actual start time of the CDM job adjusted by the offset.

directory/
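The following Python sketch illustrates how a date and time macro in Write Directory could resolve against a base time. It is not CDM's implementation: the function name expand_date_macros, the supported pattern letters, and the HOUR and MINUTE units are assumptions made for illustration. Only the ${dateformat(pattern, offset, DAY)} form appears in this document.

```python
import re
from datetime import datetime, timedelta

# Subset of Java-style pattern letters mapped to strftime directives (assumed subset).
_TOKENS = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"),
           ("HH", "%H"), ("mm", "%M"), ("ss", "%S")]
# Only DAY appears in the documentation; HOUR and MINUTE are assumptions.
_UNITS = {"DAY": "days", "HOUR": "hours", "MINUTE": "minutes"}
_MACRO = re.compile(r"\$\{dateformat\(([^,]+?),\s*(-?\d+),\s*(\w+)\)\}")


def expand_date_macros(path, base_time):
    """Replace every ${dateformat(...)} macro in path using base_time plus the offset."""
    def _substitute(match):
        pattern, offset, unit = match.group(1), int(match.group(2)), match.group(3)
        for java_token, strftime_token in _TOKENS:
            pattern = pattern.replace(java_token, strftime_token)
        shifted = base_time + timedelta(**{_UNITS[unit]: offset})
        return shifted.strftime(pattern)
    return _MACRO.sub(_substitute, path)


# A job planned for 02:00 that actually starts later still resolves from the
# planned time when it is scheduled through DataArts Factory.
planned_start = datetime(2022, 9, 22, 2, 0, 0)
print(expand_date_macros("logs/${dateformat(yyyy-MM-dd, -1, DAY)}/", planned_start))
```

Passing the planned start time rather than the actual start time as base_time mirrors the behavior described in the note above.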

File Format

Format in which data is written. The options are as follows:
  • CSV: Data is written in CSV format, which is used for migrating data tables to files.
  • Binary: Files will be transferred directly. CDM writes the files without changing their format. This setting is suitable for file migration.

If data is migrated between file-related data sources, such as FTP, SFTP, HDFS, and OBS, the value of File Format must be the same as the source file format.

CSV

Duplicate File Processing Method

Files with the same name and size are identified as duplicate files. If there are duplicate files during data writing, the following methods are available:
  • Replace
  • Skip
  • Stop job

For details, see Incremental File Migration.

Skip

Advanced attributes

Encryption

Whether to encrypt the uploaded data and the encryption mode. The options are as follows:
  • None: Data is written without encryption.
  • KMS: KMS in Data Encryption Workshop (DEW) is used for encryption. If KMS encryption is enabled, MD5 verification for data cannot be performed.
  • AES-256-GCM: The AES 256-bit encryption algorithm is used to encrypt data. Currently, only the AES-256-GCM (NoPadding) encryption algorithm is supported. This parameter is used for encryption at the migration destination and decryption at the migration source.

For details, see Encryption and Decryption During File Migration.

KMS

Key ID

Data encryption key. This parameter is displayed when Encryption is set to KMS. Click the icon next to the text box to select a KMS key that was created in DEW.

  • If the KMS key of the same project as that of the CDM cluster is used, you do not need to modify Project ID.
  • If the KMS key of another project is used, you need to modify Project ID.

53440ccb-3e73-4700-98b5-71ff5476e621

Project ID

ID of the project to which the KMS key belongs. The default value is the ID of the project to which the current CDM cluster belongs.

  • If KMS and the CDM cluster are in the same project, retain the default value of Project ID.
  • If KMS of another project is used, set this parameter to the ID of the project to which KMS belongs.

9bd7c4bd54e5417198f9591bef07ae67

DEK

Data encryption key. This parameter is displayed only when Encryption is set to AES-256-GCM. The key consists of 64 hexadecimal digits (256 bits).

Remember the key configured here because the decryption key must be the same as that configured here. If the encryption and decryption keys are inconsistent, the system does not report an exception, but the decrypted data is incorrect.

DD0AE00DFECD78BF051BCFDA25BD4E320DB0A7AC75A1F3FC3D3C56A457DCDC1B

IV

Initialization vector. This parameter is displayed only when Encryption is set to AES-256-GCM. The initialization vector consists of 32 hexadecimal digits (128 bits).

Remember the initialization vector configured here because the initialization vector used for decryption must be the same as that configured here. If the initialization vectors are inconsistent, the system does not report an exception, but the decrypted data is incorrect.

5C91687BA886EDCD12ACBC3FF19A3C3F
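Because a mismatched key or initialization vector produces incorrect data without any error, it can help to generate and check these values programmatically before entering them. The following Python sketch generates values of the required lengths and validates their format. The function names are illustrative assumptions and are not part of CDM.

```python
import re
import secrets


def generate_dek_and_iv():
    """Return a random DEK (64 hex digits) and IV (32 hex digits)."""
    dek = secrets.token_hex(32).upper()  # 32 random bytes -> 64 hex digits
    iv = secrets.token_hex(16).upper()   # 16 random bytes -> 32 hex digits
    return dek, iv


def is_hex_of_length(value, length):
    """Check that value is exactly `length` hexadecimal digits."""
    return re.fullmatch(r"[0-9A-Fa-f]{%d}" % length, value) is not None


dek, iv = generate_dek_and_iv()
# Validate before pasting the values into the DEK and IV fields.
print(is_hex_of_length(dek, 64), is_hex_of_length(iv, 32))
```

Store the generated values securely, because the same DEK and IV must be configured on the source side when the data is decrypted.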

Copy Content-Type

This parameter is displayed only when File Format is Binary, and both the migration source and destination are object storage.

If you set this parameter to Yes, the Content-Type attribute of the source file is copied during object file migration. This function is mainly used for static website migration.

The Content-Type attribute cannot be written to Archive buckets. Therefore, if you set this parameter to Yes, the migration destination must be a non-Archive bucket.

No

Line Separator

Line feed character in a file. By default, the system automatically identifies \n, \r, and \r\n. This parameter is not used when File Format is set to Binary.

\n

Field Delimiter

Field delimiter in the file. This parameter is not used when File Format is set to Binary.

,

File Size

This parameter is displayed only when the migration source is a database. The exported data is split into multiple files by size so that each file has a manageable size. The unit is MB.

1024
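The following Python sketch shows one way that exported rows could be grouped into files that do not exceed a size limit. It is only an illustration: the function name split_lines_by_size is hypothetical, and splitting on row boundaries is an assumption, because this document does not describe how CDM performs the split.

```python
def split_lines_by_size(lines, max_bytes):
    """Group text lines into parts whose UTF-8 size does not exceed max_bytes.

    A single line larger than max_bytes is kept whole in its own part
    (assumption: rows are never cut in the middle).
    """
    parts, current, current_size = [], [], 0
    for line in lines:
        size = len(line.encode("utf-8"))
        # Start a new part when adding this line would exceed the limit.
        if current and current_size + size > max_bytes:
            parts.append(current)
            current, current_size = [], 0
        current.append(line)
        current_size += size
    if current:
        parts.append(current)
    return parts


# File Size is given in MB, so 1024 corresponds to 1024 * 1024 * 1024 bytes.
rows = ["1,alice\n", "2,bob\n", "3,carol\n"]
print(len(split_lines_by_size(rows, 1024 * 1024 * 1024)))
```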

Validate MD5 Value

The MD5 value can be verified only when files are transferred in Binary format. KMS encryption cannot be used if the MD5 value needs to be verified.

If you set this parameter to Yes, CDM calculates the MD5 value of each source file and compares it with the MD5 value returned by OBS. If an MD5 file exists on the migration source, the system reads the MD5 value from that file instead and compares it with the MD5 value returned by OBS. For details, see MD5 Verification.

Yes
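To check a migrated file independently, you can compute the MD5 value of the local source file and compare it with a value obtained from the destination. The following Python sketch uses only the standard hashlib module. The function names are illustrative assumptions, and how you obtain the expected value from OBS is outside the scope of the sketch.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_matches(path, expected_hex):
    """Compare a file's MD5 digest with an expected value, ignoring case and whitespace."""
    return file_md5(path) == expected_hex.strip().lower()
```

Reading the file in chunks keeps memory usage constant, which matters for large binary files.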

Record MD5 Verification Result

Whether to record the MD5 verification result when Validate MD5 Value is set to Yes

Yes

Record MD5 Link

OBS link to which the MD5 verification result will be written

obslink

Record MD5 Bucket

OBS bucket to which the MD5 verification result will be written

cdm05

Record MD5 Directory

Directory to which the MD5 verification result will be written

/md5/

Encoding Type

Encoding type, for example, UTF-8 or GBK. This parameter is not used when File Format is set to Binary.

GBK

Use Quote Character

This parameter is displayed only when File Format is CSV. It is used when database tables are migrated to file systems.

If you set this parameter to Yes and a field in the source data table contains a field delimiter or line separator, CDM encloses the field content in double quotation marks ("). This prevents a field delimiter from dividing a field into two fields, or a line separator from dividing a field into different lines. For example, if the value hello,world in the database is quoted, it is exported to the CSV file as the single field "hello,world".

No
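The effect of the quote character can be reproduced with the Python standard csv module. The following sketch is not CDM code: the function names write_rows and read_rows are assumptions used to show how a field that contains the delimiter is read back with and without quoting.

```python
import csv
import io


def write_rows(rows, use_quote_character):
    """Serialize rows to CSV text, optionally quoting fields that need it."""
    if not use_quote_character:
        # No quoting: fields are joined as is, so embedded delimiters are ambiguous.
        return "".join(",".join(row) + "\n" for row in rows)
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=",", quotechar='"',
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text)))


rows = [["1", "hello,world"]]
print(read_rows(write_rows(rows, True)))   # the field stays whole
print(read_rows(write_rows(rows, False)))  # the field is split at the comma
```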

Use First Row as Header

This parameter is displayed only when data is exported from a relational database to OBS and File Format is set to CSV.

When a table is migrated to a CSV file, CDM does not migrate the heading line of the table by default. If you set this parameter to Yes, CDM writes the heading line of the table to the file.

No

Job Success Marker File

Name of the marker file to generate in the destination directory after a job is executed successfully. If you do not specify a file name, no marker file is generated.

finish.txt

Customize Hierarchical Directory

If this parameter is set to Yes, the files after migration can be stored in a custom directory. That is, only files are migrated. The directories to which the files belong are not migrated.

Yes

Hierarchical Directory

Custom storage directory for files after migration. The time macro variable is supported.

${dateformat(yyyy-MM-dd HH:mm:ss, -1, DAY)}

Customize File Name

This parameter is displayed only when data is exported from a relational database to OBS and File Format is set to CSV.

This parameter specifies the name of the file generated in OBS. The options are as follows:
  • Character string: Special characters are allowed. For example, if this parameter is set to cdm#, the name of the generated file is cdm#.csv.
  • Macro variable of time: If this parameter is set to ${timestamp()}, the name of the generated file is 1554108737.csv.
  • Macro variable of table name: If this parameter is set to ${tableName}, the name of the generated file is sqltabname.csv.
  • Macro variable of version number: If this parameter is set to ${version}, the name of the generated file is v1.csv.
  • Any combination of the character string and macro variable (macro variable of time, table name, or version number). For example, if this parameter is set to cdm#${timestamp()}_${version}, the name of the generated file is cdm#1554108737_v1.csv.

cdm
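The naming rules above can be sketched in Python as a simple template substitution. This is an illustration only: the function name render_file_name and its arguments are assumptions, and the values for the table name and version are supplied by the caller here, whereas CDM derives them from the job.

```python
import time


def render_file_name(template, table_name, version, now=None):
    """Expand the documented macros in template and append the .csv extension."""
    # ${timestamp()} resolves to a Unix timestamp in seconds, as in the examples above.
    timestamp = int(time.time() if now is None else now)
    name = (template
            .replace("${timestamp()}", str(timestamp))
            .replace("${tableName}", table_name)
            .replace("${version}", version))
    return name + ".csv"


print(render_file_name("cdm#${timestamp()}_${version}", "sqltabname", "v1",
                       now=1554108737))
```

With the fixed timestamp shown, the call reproduces the combined example from the list above, cdm#1554108737_v1.csv.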