Multipart Upload

Multipart upload lets you upload a single object as a set of separate parts. Each part is a contiguous portion of the object's data. You can upload parts independently and in any order. If a part fails to upload, you can upload it again without affecting the other parts. After all parts are uploaded, OBS merges them to create the object. Multipart upload is generally recommended for objects of 100 MB or larger. For example, to upload a 500 MB object to an OBS bucket, you can use OBS Browser+, which automatically divides the object into parts and uploads them. Alternatively, you can call the multipart upload API directly. Either approach improves upload efficiency and reduces the impact of failures.
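The way an object is divided into consecutive parts can be sketched as follows. This is a minimal illustration only: the function name plan_parts and the 64 MB part size are assumptions made for the example, not values used by OBS or OBS Browser+.

```python
from typing import List, Tuple

MB = 1024 * 1024


def plan_parts(object_size: int, part_size: int) -> List[Tuple[int, int, int]]:
    """Split an object into (part_number, offset, length) ranges.

    Part numbers start at 1; only the last part may be shorter than part_size.
    """
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")
    parts = []
    offset = 0
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    return parts


# A 500 MB object with an assumed part size of 64 MB.
plan = plan_parts(500 * MB, 64 * MB)
print(len(plan))                 # 8 parts
print(plan[-1][2] // MB, "MB")   # the last part holds the remaining 52 MB
```

Because each range is independent, the parts can be uploaded in parallel or retried one at a time.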

Multipart upload provides the following benefits:

  • Improved throughput: You can upload parts in parallel.
  • Quick recovery from network failures: With small parts, a failed upload caused by a network error affects only the part being uploaded.
  • Convenient pausing and resuming of uploads: You can upload parts at any time. A multipart upload does not expire. You must explicitly complete or cancel it.
  • Uploading before the object size is known: You can start uploading an object while it is still being created.

The multipart upload API lets you upload a large object in multiple parts. You can use it to upload a new large object or to create a copy of an existing object.

A multipart upload has three steps: initiating the upload (initializing the upload task), uploading the parts, and completing the upload (merging the uploaded parts). Upon receiving a part merging request, OBS merges the uploaded parts to create an object. The object can then be accessed like any other object.
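The three steps can be illustrated with an in-memory stand-in. The class InMemoryMultipartStore below is hypothetical and is not the OBS API or SDK; it only mimics the sequence of initiating, uploading parts (in any order), and merging. Using an MD5 digest as the ETag is an assumption made for the sketch.

```python
import hashlib
import uuid
from typing import Dict, List, Tuple


class InMemoryMultipartStore:
    """Hypothetical stand-in for an object store; not the OBS API."""

    def __init__(self) -> None:
        self.uploads: Dict[str, Dict[int, bytes]] = {}
        self.objects: Dict[str, bytes] = {}

    def initiate(self) -> str:
        upload_id = uuid.uuid4().hex
        self.uploads[upload_id] = {}
        return upload_id

    def upload_part(self, upload_id: str, part_number: int, data: bytes) -> str:
        self.uploads[upload_id][part_number] = data
        return hashlib.md5(data).hexdigest()  # stand-in ETag (assumption)

    def complete(self, upload_id: str, key: str, parts: List[Tuple[int, str]]) -> None:
        stored = self.uploads.pop(upload_id)
        # Merge the listed parts in ascending part-number order.
        self.objects[key] = b"".join(stored[number] for number, _ in sorted(parts))


store = InMemoryMultipartStore()
data = b"0123456789" * 10  # a 100-byte "object"
chunks = [data[i:i + 30] for i in range(0, len(data), 30)]  # 4 parts

upload_id = store.initiate()                      # step 1: initiate
recorded = []
for number in (3, 1, 4, 2):                       # step 2: upload parts, out of order
    etag = store.upload_part(upload_id, number, chunks[number - 1])
    recorded.append((number, etag))
store.complete(upload_id, "example-object", recorded)  # step 3: merge

print(store.objects["example-object"] == data)    # True
```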

You can list all the ongoing multipart upload tasks or obtain the list of uploaded parts of a specified multipart upload task. The following describes the detailed operations.

Initiating a multipart upload task

When you send a request to start multipart upload, OBS returns a response with the upload ID, which is the unique identifier of the multipart upload. This ID must be included in the request for uploading parts, listing uploaded parts, completing a multipart upload, or canceling a multipart upload.

Uploading parts

When uploading a part, you must specify the upload ID and a part number. You can choose any part number from 1 to 10 000. A part number uniquely identifies a part and its position in the object you are uploading. If you upload a new part with the number of a previously uploaded part, the previous part is overwritten. Each time you upload a part, OBS returns an ETag header in the response. For each part you upload, record the part number and the ETag value. They are required later to complete the multipart upload task.
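One way to keep the required record on the client side is sketched below. The class PartLedger and the sample upload ID and ETag strings are hypothetical; only the 1 to 10 000 range and the overwrite behavior come from the description above.

```python
from typing import Dict, List, Tuple

MAX_PART_NUMBER = 10000  # upper bound stated in the text


class PartLedger:
    """Client-side record of part number -> ETag for one upload ID."""

    def __init__(self, upload_id: str) -> None:
        self.upload_id = upload_id
        self._etags: Dict[int, str] = {}

    def record(self, part_number: int, etag: str) -> None:
        if not 1 <= part_number <= MAX_PART_NUMBER:
            raise ValueError(f"part number {part_number} is outside 1..{MAX_PART_NUMBER}")
        # Uploading a part number again replaces the earlier part,
        # so the earlier ETag is replaced as well.
        self._etags[part_number] = etag

    def for_completion(self) -> List[Tuple[int, str]]:
        """Return (part_number, etag) pairs in ascending part-number order."""
        return sorted(self._etags.items())


ledger = PartLedger("example-upload-id")  # hypothetical upload ID
ledger.record(2, "etag-b")
ledger.record(1, "etag-a")
ledger.record(2, "etag-b2")               # part 2 uploaded again
print(ledger.for_completion())            # [(1, 'etag-a'), (2, 'etag-b2')]
```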

After a multipart upload task is initialized and one or more parts are uploaded, you must either merge the parts or cancel the task. Until then, you are charged for storing the uploaded parts. OBS releases the storage and stops charging for it only after the uploaded parts are merged or the multipart upload task is canceled.

When multiple upload operations are performed concurrently for the same part of an object, the server applies a Last Write Wins policy, where the time of the last write is the time when the part metadata was created. To ensure data accuracy, the client must apply a lock during concurrent uploads of the same part. Concurrent uploads of different parts of an object do not require locking.
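A minimal sketch of client-side locking within a single process is shown below, assuming the uploads are issued from threads. The names PartLocks and upload_part are hypothetical, and the function body only stands in for a real request. Clients running in separate processes or on separate hosts would need a different coordination mechanism.

```python
import threading
from collections import defaultdict
from typing import Dict, List, Tuple


class PartLocks:
    """One lock per (upload_id, part_number); different parts never block each other."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[Tuple[str, int], threading.Lock] = defaultdict(threading.Lock)

    def lock_for(self, upload_id: str, part_number: int) -> threading.Lock:
        with self._guard:  # protects the dictionary itself
            return self._locks[(upload_id, part_number)]


locks = PartLocks()
stored: Dict[int, bytes] = {}
log: List[Tuple[str, int]] = []


def upload_part(upload_id: str, part_number: int, data: bytes) -> None:
    """Hypothetical upload; the body stands in for the real request."""
    with locks.lock_for(upload_id, part_number):
        log.append(("start", part_number))
        stored[part_number] = data
        log.append(("end", part_number))


threads = [
    threading.Thread(target=upload_part, args=("example-upload-id", 1, b"first")),
    threading.Thread(target=upload_part, args=("example-upload-id", 1, b"second")),
    threading.Thread(target=upload_part, args=("example-upload-id", 2, b"other")),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Uploads of part 1 are serialized: their start/end events never interleave.
print([event for event, number in log if number == 1])
```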

Copying parts

After initiating a multipart upload task, you can upload parts to it by specifying its upload ID. You can also call the part copy API to add parts. Part of an existing object, or the whole object, can be copied as a part.

The HTTP status code alone does not tell you whether a part copy request succeeded. A status code of 200 only means that the server received the request and started processing it. The response body shows the actual result: the request succeeded only if the body contains an ETag. Otherwise, the request failed.
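The check described above can be sketched as follows. The function name copy_part_succeeded and the two sample bodies are illustrative assumptions; the exact element names and XML namespace of a real response should be taken from the API reference. The sketch matches on the local element name so that it works with or without a namespace.

```python
import xml.etree.ElementTree as ET


def copy_part_succeeded(status_code: int, body: str) -> bool:
    """Treat a copy as successful only if the body carries a non-empty ETag element."""
    if status_code != 200:
        return False
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    for element in root.iter():
        # Strip any "{namespace}" prefix and compare the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name.lower() == "etag" and (element.text or "").strip():
            return True
    return False


# Illustrative bodies only; real responses may differ.
ok_body = (
    "<CopyPartResult><LastModified>2024-01-01T00:00:00Z</LastModified>"
    '<ETag>"abc123"</ETag></CopyPartResult>'
)
error_body = "<Error><Code>InternalError</Code><Message>copy failed</Message></Error>"

print(copy_part_succeeded(200, ok_body))     # True
print(copy_part_succeeded(200, error_body))  # False: 200 returned, but no ETag in the body
```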

If you copy a source object as a part called part1 and a part1 already exists, the existing part1 is overwritten by the copy. After the copy succeeds, only the new part1 is displayed and the data of the old part1 is deleted. Therefore, before calling the part copy API, make sure that the target part does not exist or holds no data you need. Otherwise, data may be deleted by mistake. The source object is not changed by the copy.

Merging parts and canceling a multipart upload task

When merging parts, OBS creates the object by combining the parts in ascending order of part number. If object metadata was provided when the multipart upload task was initialized, OBS associates that metadata with the object. After the multipart upload is complete, the individual parts no longer exist. A part merging request must contain the upload ID and a list of part numbers with their corresponding ETag values. The OBS response includes an ETag that uniquely identifies the combined object data. This ETag is not the MD5 hash value of the object data.

You can cancel a multipart upload task. After the task is canceled, its upload ID can no longer be used to upload parts, and OBS releases the storage of all uploaded parts. If part uploads are still in progress when you cancel the task, those uploads may still succeed or fail. To make sure that the storage occupied by all uploaded parts is released, cancel the multipart upload only after all part uploads have finished.
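The list of part numbers and ETag values in a merging request could be assembled as sketched below. The element names CompleteMultipartUpload, Part, PartNumber, and ETag are an assumption based on the common S3-style request layout, and build_completion_body is a hypothetical helper; check the CompleteMultipartUpload section of the API reference for the exact format.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_completion_body(parts: Iterable[Tuple[int, str]]) -> str:
    """Build an XML body listing parts in ascending part-number order."""
    ordered = sorted(parts)
    if not ordered:
        raise ValueError("at least one part is required")
    numbers = [number for number, _ in ordered]
    if len(set(numbers)) != len(numbers):
        raise ValueError("duplicate part number")

    root = ET.Element("CompleteMultipartUpload")  # assumed element name
    for number, etag in ordered:
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = etag
    return ET.tostring(root, encoding="unicode")


# Parts recorded out of order are emitted in ascending order.
body = build_completion_body([(2, "etag-b"), (1, "etag-a")])
print(body)
```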

For example, if 10 parts are uploaded but only nine are merged, the part that was not merged is automatically deleted by the system. Parts that are not merged cannot be restored after they are deleted. Before merging, call the API for listing uploaded parts and check the result to make sure that no part is missing.
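The pre-merge check can be sketched as a comparison between the client's own record and the listing result. The function name and the sample ETag strings below are hypothetical; the listing is represented as plain (part number, ETag) pairs rather than a real API response.

```python
from typing import Dict, Iterable, List, Tuple


def missing_or_mismatched(local: Dict[int, str], listed: Iterable[Tuple[int, str]]) -> List[int]:
    """Return locally recorded part numbers that the listing lacks or reports with a different ETag."""
    remote = dict(listed)
    return sorted(number for number, etag in local.items() if remote.get(number) != etag)


# Ten parts were recorded locally, but the listing shows only nine (part 7 is absent).
local_record = {number: f"etag-{number}" for number in range(1, 11)}
listing = [(number, f"etag-{number}") for number in range(1, 11) if number != 7]

problems = missing_or_mismatched(local_record, listing)
print(problems)  # [7]
```

Merging only when the returned list is empty avoids leaving a part out of the final object.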

Listing uploaded parts

You can list the parts of a specified multipart upload task or list all multipart upload tasks in progress. A request to list uploaded parts returns information about the parts uploaded in the specified multipart upload. Each request returns information about a maximum of 1000 parts. If more than 1000 parts have been uploaded in a multipart upload, you need to send multiple requests to list all of them. The list of uploaded parts does not include parts that have already been merged.
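Listing more than 1000 parts therefore takes a loop over several requests. The sketch below assumes a marker-style continuation, where each page returns the marker to pass to the next request. The callable list_page and the simulated fake_list_page are hypothetical and do not reflect the actual request parameters.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Page = Tuple[List[Dict[str, int]], Optional[int]]  # (parts, next marker or None)


def iter_all_parts(list_page: Callable[[Optional[int]], Page]) -> Iterator[Dict[str, int]]:
    """Yield every part by requesting pages until no next marker is returned."""
    marker = None
    while True:
        parts, marker = list_page(marker)
        yield from parts
        if marker is None:
            return


def fake_list_page(marker: Optional[int], total: int = 2500, page_size: int = 1000) -> Page:
    """Simulated listing: returns at most page_size parts after the given marker."""
    start = (marker or 0) + 1
    end = min(start + page_size - 1, total)
    parts = [{"part_number": number} for number in range(start, end + 1)]
    return parts, (end if end < total else None)


calls: List[Optional[int]] = []


def counting_page(marker: Optional[int]) -> Page:
    calls.append(marker)
    return fake_list_page(marker)


all_parts = list(iter_all_parts(counting_page))
print(len(all_parts), len(calls))  # 2500 parts retrieved in 3 requests
```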

Use a returned list only for verification. After a multipart upload is complete, the result in the list is no longer valid. When you upload parts, keep your own record of the part numbers you specified and the ETag values returned by OBS.

Listing multipart upload tasks

You can list all ongoing multipart upload tasks. Ongoing multipart upload tasks are tasks that have been started but not yet completed or canceled. Each request returns a maximum of 1000 multipart upload tasks. If there are more than 1000 ongoing multipart upload tasks, you need to send multiple requests to list all of them.

Item                                            Specifications
Maximum object size                             48.8 TB
Maximum number of parts for each upload task    10 000
Part number                                     1 to 10 000 (inclusive)
Part size                                       100 KB to 5 GB. The last part can be 0 bytes to 5 GB.
Maximum number of parts in a returned list      1 000
Maximum number of parts in a request            1 000
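These limits are related: 10 000 parts of 5 GB each amount to 50 000 GB, which is about 48.8 TB when 1 TB is counted as 1024 GB. The sketch below uses the limits from the table to pick the smallest part size that keeps an object within 10 000 parts. The function name smallest_part_size is hypothetical, and binary units (1 KB = 1024 bytes) are an assumption.

```python
KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3

MIN_PART_SIZE = 100 * KB   # from the table
MAX_PART_SIZE = 5 * GB     # from the table
MAX_PARTS = 10000          # from the table


def smallest_part_size(object_size: int) -> int:
    """Return the smallest allowed part size that fits the object into MAX_PARTS parts."""
    if object_size < 0:
        raise ValueError("object_size must not be negative")
    needed = -(-object_size // MAX_PARTS)  # ceiling division
    part_size = max(MIN_PART_SIZE, needed)
    if part_size > MAX_PART_SIZE:
        raise ValueError("object is too large for 10 000 parts of at most 5 GB")
    return part_size


size = smallest_part_size(500 * MB)
print(size // KB, "KB")              # 100 KB: the minimum part size is enough
print((500 * MB) // size, "parts")   # 5120 parts
```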

Multipart upload operations and permissions

You can perform multipart upload operations only after being granted the required permissions. Permissions can be granted through ACLs, bucket policies, or user policies. The following table lists the multipart upload operations and the permissions that each requires.

Operation: Initiating a multipart upload task
Required permission: PutObject. The bucket owner can allow trustees to perform the PutObject operation.

Operation: Uploading parts
Required permission: PutObject. Only the initiator of a multipart upload can upload parts. The bucket owner must grant the initiator the PutObject permission so that the initiator can upload parts of the object.

Operation: Copying parts
Required permission: PutObject, and GetObject on the object to be copied. Only the initiator of a multipart upload can copy parts. The bucket owner must grant the initiator the PutObject permission so that the initiator can upload parts of the object.

Operation: Merging parts
Required permission: PutObject. Only the initiator of a multipart upload can merge parts. The bucket owner must grant the initiator the PutObject permission so that the initiator can complete the multipart upload.

Operation: Canceling a multipart upload task
Required permission: AbortMultipartUpload. By default, only the bucket owner and the multipart upload initiator have this permission. In addition to the default configuration, the bucket owner can allow trustees to perform this operation or deny trustees from performing it.

Operation: Listing uploaded parts
Required permission: ListMultipartUploadParts. By default, the bucket owner can list the uploaded parts of any multipart upload to the bucket, and the multipart upload initiator can list the uploaded parts of a specific multipart upload. In addition to the default configuration, the bucket owner can allow trustees to perform this operation or deny trustees from performing it.

Operation: Listing multipart upload tasks
Required permission: ListBucketMultipartUploads. In addition to the default configuration, the bucket owner can allow trustees to perform this operation.

REST APIs applicable to multipart upload

The following sections in the Object Storage Service API Reference describe REST API operations relevant to multipart upload.

  • ListBucketMultipartUpload
  • InitiateMultipartUpload
  • UploadPart
  • UploadPart-Copy
  • ListParts
  • CompleteMultipartUpload
  • AbortMultipartUpload