Updated on 2022-05-18 GMT+08:00

Synchronously Uploading Incremental Objects

Function

This function synchronizes all content in the local source path to the specified target bucket on OBS, so that the content in the target bucket is consistent with the local path. Incremental synchronization has two aspects:

  • Increment: Each source file is compared with its target object, and only source files that have changed are uploaded.
  • Synchronization: After the command is executed, the local source path is a subset of the target bucket on OBS. That is, every file in the local source path has a corresponding object in the target bucket.

  • Do not change the local file or folder during synchronization. Otherwise, the synchronization may fail or data may be inconsistent.
  • A file is synchronously uploaded only if it does not exist in the bucket, its size differs from that of the object with the same name in the bucket, or its modification time is later than that of the object.
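
The upload condition above can be pictured with a minimal Python sketch. This is only an illustration of the documented rule, not obsutil's actual implementation; the FileInfo type and the needs_upload function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    """Hypothetical summary of a local file or a bucket object."""
    size: int     # size in bytes
    mtime: float  # last modification time, in seconds since the epoch


def needs_upload(local: FileInfo, remote: Optional[FileInfo]) -> bool:
    """Return True if the local file qualifies for upload under the rule above."""
    if remote is None:             # no object with this name exists in the bucket
        return True
    if local.size != remote.size:  # sizes differ
        return True
    return local.mtime > remote.mtime  # the local file was modified more recently


print(needs_upload(FileInfo(10, 100.0), None))                 # object missing
print(needs_upload(FileInfo(10, 100.0), FileInfo(10, 200.0)))  # unchanged file
```

A file that passes none of the three checks is treated as already synchronized and is skipped.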

Command Line Structure

  • In Windows
    • Uploading a file synchronously
      obsutil sync file_url obs://bucket[/key] [-arcDir=xxx] [-dryRun] [-link] [-vlength] [-vmd5] [-p=1] [-threshold=52428800] [-acl=xxx] [-sc=xxx] [-meta=aaa:bbb#ccc:ddd] [-ps=auto] [-o=xxx] [-cpd=xxx] [-fr] [-config=xxx]
    • Uploading a folder synchronously
      obsutil sync folder_url obs://bucket[/key] [-arcDir=xxx] [-dryRun] [-link] [-vlength] [-vmd5] [-j=1] [-p=1] [-threshold=52428800] [-acl=xxx] [-sc=xxx] [-meta=aaa:bbb#ccc:ddd] [-ps=auto] [-include=*.xxx] [-exclude=*.xxx] [-timeRange=time1-time2] [-at] [-mf] [-o=xxx] [-cpd=xxx] [-config=xxx] 
  • In Linux or macOS
    • Uploading a file synchronously
      ./obsutil sync file_url obs://bucket[/key] [-arcDir=xxx] [-dryRun] [-link] [-vlength] [-vmd5] [-p=1] [-threshold=52428800] [-acl=xxx] [-sc=xxx] [-meta=aaa:bbb#ccc:ddd] [-ps=auto] [-o=xxx] [-cpd=xxx] [-fr] [-config=xxx]
    • Uploading a folder synchronously
      ./obsutil sync folder_url obs://bucket[/key] [-arcDir=xxx] [-dryRun] [-link] [-vlength] [-vmd5] [-j=1] [-p=1] [-threshold=52428800] [-acl=xxx] [-sc=xxx] [-meta=aaa:bbb#ccc:ddd] [-ps=auto] [-include=*.xxx] [-exclude=*.xxx] [-timeRange=time1-time2] [-at] [-mf] [-o=xxx] [-cpd=xxx] [-config=xxx] 

Examples

  • In Windows, for example, run the obsutil sync d:\temp\test.txt obs://bucket-test/key command to synchronously upload a file.
    obsutil sync d:\temp\test.txt obs://bucket-test/key
    
    Parallel:      3                   Jobs:          3
    Threshold:     524288000           PartSize:      5242880
    Exclude:                           Include:
    VerifyLength:  false               VerifyMd5:     false
    CheckpointDir: xxxx
    
    [====================================================] 100.00% 1.68 MB/s 5s
    Upload successfully, 8.46MB, d:\temp\test.txt --> obs://bucket-test/key
  • In Windows, for example, run the obsutil sync d:\temp obs://bucket-test/temp command to synchronously upload a folder.
    obsutil sync d:\temp obs://bucket-test/temp
    
    Parallel:      3                   Jobs:          3
    Threshold:     524288000           PartSize:      5242880
    Exclude:                           Include:
    VerifyLength:  false               VerifyMd5:     false
    CheckpointDir: xxxx
    OutputDir: xxxx
    
    [========================================================] 100.00% 2.02 KB/s 0s
    Succeed count is:   5         Failed count is:    0
    Metrics [max cost:90 ms, min cost:45 ms, average cost:63.80 ms, average tps:35.71]
    Task id is: 104786c8-27c2-48fc-bc6a-5886596fb0ed

Parameter Description

Parameter

Optional or Mandatory

Description

file_url

Mandatory for uploading a file synchronously

Local file path

folder_url

Mandatory for uploading a folder synchronously

Local folder path

bucket

Mandatory

Bucket name

key

Optional

Indicates the object name or object name prefix specified when uploading a file synchronously, or the object name prefix specified when uploading a folder synchronously.

The rules are as follows:

  • If this parameter is left blank when synchronously uploading a file, the file is uploaded to the root directory of the bucket and the object name is the file name. If the value ends with a slash (/), the value is used as the object name prefix when the file is uploaded, and the object name is the value plus the file name. If the value does not end with a slash (/), the file is uploaded with the value as the object name.
  • If this parameter is left blank when synchronously uploading a folder, all objects in the root directory of the bucket are the same as the files in the local folder. If this parameter is configured, objects whose name prefix is the configured value are the same as the files in the local folder.
NOTE:
  • If the value of this parameter does not end with a slash (/) when synchronously uploading a folder, the obsutil tool automatically adds a slash (/) at the end of the configured value as the object name prefix.
  • For details about how to use this parameter, see Synchronous Upload.
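
The key rules above can be summarized in a short Python sketch. It only restates the documented rules. The function names are hypothetical, and ntpath is used purely so that the Windows-style paths from the examples are split correctly on any platform.

```python
import ntpath  # splits Windows-style paths such as d:\temp\test.txt on any OS


def object_key_for_file(file_path: str, key: str = "") -> str:
    """Object name that results from synchronously uploading a single file."""
    file_name = ntpath.basename(file_path)
    if key == "":
        return file_name        # root directory of the bucket, named after the file
    if key.endswith("/"):
        return key + file_name  # key is used as the object name prefix
    return key                  # key is used as the object name itself


def prefix_for_folder(key: str = "") -> str:
    """Object name prefix that results from synchronously uploading a folder."""
    if key == "" or key.endswith("/"):
        return key
    return key + "/"            # a slash is appended automatically


print(object_key_for_file(r"d:\temp\test.txt"))           # test.txt
print(object_key_for_file(r"d:\temp\test.txt", "docs/"))  # docs/test.txt
print(object_key_for_file(r"d:\temp\test.txt", "key"))    # key
print(prefix_for_folder("temp"))                          # temp/
```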

fr

Optional for synchronously uploading a file (additional parameter)

Generates an operation result list when synchronously uploading a file.

arcDir

Optional (additional parameter)

Path to which the synchronously uploaded files are archived

dryRun

Optional (additional parameter)

Conducts a dry run.

link

Optional (additional parameter)

Uploads the actual file or folder that a symbolic link points to.

NOTICE:
  • If this parameter is not specified and the file to be uploaded is a symbolic-link file whose target file does not exist, the exception message "The system cannot find the file specified" will be displayed in Windows OS, while the exception message "No such file or directory" will be displayed in macOS or Linux OS.
  • Avoid symbolic link loops in a folder. Otherwise, the upload exits due to a panic. To prevent the panic, set panicForSymbolicLinkCircle to false in the configuration file.

vlength

Optional (additional parameter)

After the synchronous upload is complete, verifies that the sizes of the objects in the bucket are the same as those of the local files.

vmd5

Optional (additional parameter)

After the synchronous upload is complete, verifies that the MD5 values of the objects in the bucket are the same as those of the local files.

NOTE:
  • If the file or folder to be uploaded is large, using this parameter degrades the overall performance because of the MD5 calculation.
  • After the MD5 verification succeeds, the MD5 value is stored in the object metadata x-obs-md5chksum, which is used for later MD5 verification during download or copy.

p

Optional (additional parameter)

Indicates the maximum number of concurrent multipart upload tasks when uploading a file. The default value is the value of defaultParallels in the configuration file.

threshold

Optional (additional parameter)

Indicates the threshold for enabling multipart upload, in bytes. The default value is the value of defaultBigfileThreshold in the configuration file.

NOTE:
  • If the file to be uploaded is smaller than the threshold, it is uploaded directly. Otherwise, a multipart upload is used.
  • If a file is uploaded directly, no part record is generated and resumable transmission is not supported.
  • This value can contain a capacity unit. For example, 1 MB indicates 1048576 bytes.
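
As a rough illustration of the threshold rule, the following Python sketch converts a value with a capacity unit to bytes and decides between a direct upload and a multipart upload. The set of accepted units and the parse_size and use_multipart names are assumptions made for this example. obsutil's own parsing may differ.

```python
import re

# Assumed unit table for this sketch: 1 KB = 1024 bytes, so 1 MB = 1048576 bytes.
_UNITS = {"": 1, "B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text: str) -> int:
    """Convert a value such as '52428800' or '1 MB' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]*)\s*", text)
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def use_multipart(file_size: int, threshold: int) -> bool:
    """Files smaller than the threshold are uploaded directly."""
    return file_size >= threshold


print(parse_size("1 MB"))                               # 1048576
print(use_multipart(8_000_000, parse_size("50MB")))     # False: uploaded directly
print(use_multipart(60_000_000, parse_size("50MB")))    # True: multipart upload
```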

acl

Optional (additional parameter)

Access control policies that can be specified when synchronously uploading files. Possible values are:

  • private
  • public-read
  • public-read-write
  • bucket-owner-full-control
NOTE:

The preceding four values indicate private read and write, public read, public read and write, and bucket owner full control.

sc

Optional (additional parameter)

Indicates the storage classes of objects that can be specified when synchronously uploading files. Possible values are:

  • standard: OBS Standard, which features low access latency and high throughput, and is applicable to storing frequently accessed data (multiple accesses per month) or data that is smaller than 1 MB
  • warm: It is applicable to storing infrequently accessed (less than 12 times a year) data that requires quick response.
  • cold: It is secure, durable, and inexpensive, and applicable to archiving rarely-accessed (once a year) data.

meta

Optional (additional parameter)

Indicates the customized metadata that can be specified when uploading files. The format is key1:value1#key2:value2#key3:value3.

NOTE:

The preceding value indicates that the object in the bucket contains three groups of customized metadata after the file is uploaded: key1:value1, key2:value2, and key3:value3.
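
The meta format can be read as a list of key:value pairs separated by number signs (#). The Python sketch below splits such a string into a dictionary. The parse_meta name and the error handling are illustrative assumptions, not obsutil behavior.

```python
def parse_meta(value: str) -> dict:
    """Split 'key1:value1#key2:value2' into {'key1': 'value1', 'key2': 'value2'}."""
    metadata = {}
    for pair in value.split("#"):
        key, sep, val = pair.partition(":")  # split at the first colon only
        if not sep or not key:
            raise ValueError(f"malformed metadata pair: {pair!r}")
        metadata[key] = val
    return metadata


print(parse_meta("key1:value1#key2:value2#key3:value3"))
```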

ps

Optional (additional parameter)

Indicates the size of each part in a multipart upload task, in bytes. The value ranges from 100 KB to 5 GB. The default value is the value of defaultPartSize in the configuration file.

NOTE:
  • This value can contain a capacity unit. For example, 1 MB indicates 1048576 bytes.
  • The parameter can be set to auto. In this case, obsutil automatically sets the part size for each multipart task based on the source file size.

cpd

Optional (additional parameter)

Indicates the folder where the part records reside. The default value is .obsutil_checkpoint, the subfolder in the home directory of the user who executes obsutil commands.

NOTE:

A part record is generated during a multipart upload and saved to the upload subfolder. After the upload succeeds, its part record is deleted automatically. If the upload fails or is suspended, the system attempts to resume the task according to its part record when you perform the upload the next time.

j

Optional for synchronously uploading a folder (additional parameter)

Indicates the maximum number of concurrent tasks for uploading a folder synchronously. The default value is the value of defaultJobs in the configuration file.

NOTE:

The value must be greater than or equal to 1.

exclude

Optional for synchronously uploading a folder (additional parameter)

Indicates the file matching patterns that are excluded, for example: *.txt.

NOTE:
  • The asterisk (*) represents any group of characters, and the question mark (?) represents any single character. For instance, abc*.txt indicates any file whose name starts with abc and ends with .txt.
  • You can use \* to represent * and \? to represent ?.
  • If the name of the file to be uploaded matches the value of this parameter, the file is skipped.
NOTICE:
  • You are advised to enclose the matching pattern in quotation marks to prevent special characters from being escaped by the OS and leading to unexpected results. Use single quotation marks for Linux or macOS and double quotation marks for Windows.
  • The matching pattern applies to the absolute file path (including the file name and file directory).
  • The matching pattern takes effect only for files in the folder.
  • Multiple exclude parameters can be specified, for example, -exclude=*.xxx -exclude=*.xxx.

include

Optional for synchronously uploading a folder (additional parameter)

Indicates the file matching patterns that are included, for example: *.jpg.

NOTE:
  • The asterisk (*) represents any group of characters, and the question mark (?) represents any single character.
  • You can use \* to represent * and \? to represent ?.
  • The system first checks whether the name of the file to be uploaded matches the value of exclude. Only if it does not, the system checks whether the file name matches the value of this parameter. If it does, the file is uploaded. Otherwise, the file is skipped.
NOTICE:
  • You are advised to enclose the matching pattern in quotation marks to prevent special characters from being escaped by the OS and leading to unexpected results. Use single quotation marks for Linux or macOS and double quotation marks for Windows.
  • The matching pattern applies to the absolute file path (including the file name and file directory).
  • The matching pattern takes effect only for files in the folder.
  • Multiple include parameters can be specified, for example, -include=*.xxx -include=*.xxx.
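
The order in which exclude and include are evaluated can be sketched in Python as follows. This is an approximation that uses the standard fnmatch module, which also treats square brackets as special characters and does not support the \* and \? escapes described above. The is_selected name is hypothetical, and the behavior when no include pattern is given is an assumption.

```python
from fnmatch import fnmatchcase


def is_selected(path: str, include=(), exclude=()) -> bool:
    """Apply exclude patterns first, then include patterns, to an absolute path."""
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False  # a file that matches any exclude pattern is skipped
    if include:
        return any(fnmatchcase(path, pattern) for pattern in include)
    return True       # assumption: without include patterns, every remaining file passes


print(is_selected("/data/a.jpg", include=["*.jpg"], exclude=["*.txt"]))
print(is_selected("/data/a.txt", include=["*.jpg"], exclude=["*.txt"]))
```

Because the patterns are matched against the whole path, a pattern such as *.jpg also matches files in subfolders.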

at

Optional for synchronously uploading a folder (additional parameter)

Indicates that when synchronously uploading a folder, only the files whose latest access time is within the value of timeRange are uploaded.

NOTE:
  • This parameter must be used together with timeRange.

disableDirObject

Optional for synchronously uploading a folder (additional parameter)

Indicates that folders themselves are not uploaded as objects. Configuring this parameter prevents empty folders from being uploaded to a bucket. If a folder contains files, the files are uploaded and the original path format is retained.

timeRange

Optional for synchronously uploading a folder (additional parameter)

Indicates the time range matching pattern when synchronously uploading files. Only files whose latest modification time is within the configured time range are uploaded.

This pattern has a lower priority than the file matching patterns (exclude/include). That is, the time range matching pattern is executed after the configured file matching patterns.

NOTE:
  • The matching time range is represented in time1-time2, where time1 must be earlier than or the same as time2. The time format is yyyyMMddHHmmss.
  • Automatic formatting is supported. For example, yyyyMMdd is equivalent to yyyyMMdd000000, and yyyyMM is equivalent to yyyyMM01000000.
  • If this parameter is set to *-time2, all files whose latest modification time is earlier than time2 are matched. If it is set to time1-*, all files whose latest modification time is later than time1 are matched.
NOTICE:

Time in the matching pattern is the UTC time.
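
The timeRange format, including the automatic formatting and the asterisk (*) wildcard, can be illustrated with the Python sketch below. The function names are hypothetical, and whether the boundaries time1 and time2 themselves are included is an assumption of this example.

```python
from datetime import datetime, timezone


def parse_time(text: str):
    """Parse yyyyMMddHHmmss, yyyyMMdd, or yyyyMM as UTC. '*' means no bound."""
    if text == "*":
        return None
    if len(text) == 6:    # yyyyMM is equivalent to yyyyMM01000000
        text += "01000000"
    elif len(text) == 8:  # yyyyMMdd is equivalent to yyyyMMdd000000
        text += "000000"
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def parse_time_range(value: str):
    """Split 'time1-time2' into a (start, end) pair of UTC datetimes or None."""
    start_text, sep, end_text = value.partition("-")
    if not sep:
        raise ValueError("expected the format time1-time2")
    start, end = parse_time(start_text), parse_time(end_text)
    if start is not None and end is not None and start > end:
        raise ValueError("time1 must be earlier than or the same as time2")
    return start, end


def in_range(mtime: datetime, start, end) -> bool:
    """Assumption: both boundaries are treated as inclusive."""
    return (start is None or mtime >= start) and (end is None or mtime <= end)


start, end = parse_time_range("202201-20220518")
print(start, end)
print(in_range(datetime(2022, 3, 1, tzinfo=timezone.utc), start, end))
```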

mf

Optional (additional parameter)

Indicates that the name matching pattern (include or exclude) and the time matching pattern (timeRange) also take effect on folders.

o

Optional (additional parameter)

Indicates the folder where operation result lists reside. After the command is executed, result lists (possibly including success, failure, and warning files) are generated in the folder. The default value is .obsutil_output, the subfolder in the home directory of the user who executes obsutil commands.

NOTE:
  • The naming rule for result lists is as follows: sync_{succeed | failed | warning}_report_time_TaskId.txt
  • By default, the maximum size of a single result list is 30 MB and the maximum number of result lists that can be retained is 1024. You can set the maximum size and number by configuring recordMaxLogSize and recordBackups in the configuration file.
  • If there are multiple folders and files and you need to confirm the detailed error information about a failed task, refer to the failure list sync_failed_report_time_TaskId.txt in the result list folder and the log files in the log path.

config

Optional (additional parameter)

User-defined configuration file for executing a command. For details about parameters that can be configured, see Parameter Description.

Response

Refer to Response for uploading an object.