Updated on 2023-07-14 GMT+08:00

Uploading Objects Using a Multipart Upload

A multipart upload lets you upload a single object as a set of parts. Each part is a contiguous portion of the object data. You can upload the parts independently and in any order. If a part fails to upload, you can upload it again without affecting the other parts. After all parts are uploaded, OBS merges them to create the object.

Multipart upload is generally recommended for objects of 100 MB or larger. For example, to upload a 500 MB object to an OBS bucket, you can use OBS Browser+, which divides the object into smaller parts and then uploads the parts. Alternatively, you can call the multipart upload API directly. Uploading in parts improves upload efficiency and reduces failures.
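As a rough illustration of how an object can be divided into parts, the following sketch computes the byte range of each part. The function name `plan_parts` and the 100 MB part size are assumptions chosen for this example; OBS Browser+ may select part sizes differently.

```python
MB = 1024 * 1024


def plan_parts(object_size, part_size):
    """Return (part_number, offset, length) tuples that cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    part_number = 1  # part numbers start at 1
    while offset < object_size:
        # The last part may be shorter than part_size.
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


if __name__ == "__main__":
    # A 500 MB object split into 100 MB parts (hypothetical part size).
    for number, offset, length in plan_parts(500 * MB, 100 * MB):
        print(number, offset, length)
```

Each tuple identifies a range that can be read from the local file and uploaded as one part, in any order.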

Multipart upload provides the following benefits:

  • Improving throughput: You can upload parts in parallel to improve throughput.
  • Quick recovery from network failures: Smaller parts minimize the impact of an upload that fails because of a network error.
  • Convenient pausing and resuming of uploads: You can upload parts at any time. A multipart upload does not have a validity period. You must explicitly complete or cancel the multipart upload.
  • Starting an upload before the object size is known: You can upload an object while it is still being created.

The multipart upload API allows uploading a large object in multiple parts. You can upload a new large object or create a copy of an existing object using this API.

A multipart upload consists of three steps: initiating the upload (initializing the upload task), uploading the parts, and completing the upload (merging the uploaded parts). Upon receiving a part merging request, OBS merges the uploaded parts to create a new object, which can then be accessed like any other object.
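The three steps can be pictured with a small in-memory model. The sketch below is not the OBS API or SDK: the class `ToyMultipartStore`, its method names, and the use of a part's MD5 digest as its ETag are all assumptions made only to show the order of operations.

```python
import hashlib
import uuid


class ToyMultipartStore:
    """In-memory stand-in for a bucket. Hypothetical; NOT the OBS API."""

    def __init__(self):
        self.uploads = {}  # upload_id -> {"key": str, "parts": {part_number: bytes}}
        self.objects = {}  # key -> bytes

    def initiate(self, key):
        """Step 1: start the task and return its upload ID."""
        upload_id = uuid.uuid4().hex
        self.uploads[upload_id] = {"key": key, "parts": {}}
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        """Step 2: store one part and return its ETag."""
        if not 1 <= part_number <= 10000:
            raise ValueError("part number must be between 1 and 10,000")
        # Reusing a part number overwrites the earlier part.
        self.uploads[upload_id]["parts"][part_number] = data
        # Toy ETag: MD5 of the part data (an assumption of this model).
        return hashlib.md5(data).hexdigest()

    def complete(self, upload_id, parts):
        """Step 3: merge the listed (part_number, etag) pairs in ascending order."""
        upload = self.uploads[upload_id]
        merged = b""
        for part_number, etag in sorted(parts):
            data = upload["parts"][part_number]
            if hashlib.md5(data).hexdigest() != etag:
                raise ValueError(f"ETag mismatch for part {part_number}")
            merged += data
        # After merging, the task and its parts no longer exist.
        del self.uploads[upload_id]
        self.objects[upload["key"]] = merged
        return merged


if __name__ == "__main__":
    store = ToyMultipartStore()
    upload_id = store.initiate("example-object")
    recorded = []
    for number, chunk in [(2, b"world"), (1, b"hello ")]:  # parts sent out of order
        recorded.append((number, store.upload_part(upload_id, number, chunk)))
    store.complete(upload_id, recorded)
    print(store.objects["example-object"])  # b'hello world'
```

The client keeps its own record of part numbers and ETags while uploading, and passes that record to the final step.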

You can list all the ongoing multipart upload tasks or obtain the list of uploaded parts of a specified multipart upload task. The following describes the detailed operations.

Initiating a Multipart Upload

When you send a request to initiate a multipart upload, OBS returns a response with the upload ID, which is the unique identifier of the multipart upload. This ID must be included in every request to upload parts, list uploaded parts, complete the multipart upload, or cancel the multipart upload.

Uploading a Part

When uploading parts, you must specify the upload ID and a part number. You can select any part number between 1 and 10,000. A part number uniquely identifies a part and its position in the object you are uploading. If you upload a new part with a part number that has already been used, the previously uploaded part is overwritten. Whenever you upload a part, OBS returns the ETag header in the response. For each part upload, you must record the part number and the ETag value. These part numbers and ETag values are required later when you complete the multipart upload task.

After the multipart upload task is initialized and one or more parts are uploaded, you must merge the parts or cancel the multipart upload task. Otherwise, you continue to be charged for storing the uploaded parts. OBS releases the storage and stops charging the storage fee only after the uploaded parts are merged or the multipart upload task is canceled.

When the same part of an object is uploaded by multiple concurrent operations, the server applies a last-write-wins policy, where the time of the last write is the time when the part metadata is created. To ensure data accuracy, the client must use locking to serialize concurrent uploads of the same part. Concurrent uploads of different parts of an object do not require locking.
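One possible way to apply client-side locking is to keep one lock per part number, so that uploads of the same part are serialized while uploads of different parts still run in parallel. This is a minimal sketch: the class `PartUploadGuard` is hypothetical, and `upload_part(part_number, data)` stands for whatever function in your client actually sends the part and returns its ETag.

```python
import threading
from collections import defaultdict


class PartUploadGuard:
    """Serializes uploads of the same part number within one client process."""

    def __init__(self, upload_part):
        # upload_part(part_number, data) -> etag is a hypothetical callable.
        self._upload_part = upload_part
        self._locks = defaultdict(threading.Lock)  # one lock per part number
        self._guard = threading.Lock()  # protects _locks and etags
        self.etags = {}

    def upload(self, part_number, data):
        with self._guard:
            lock = self._locks[part_number]
        # Only uploads of the SAME part number wait for each other here.
        with lock:
            etag = self._upload_part(part_number, data)
            with self._guard:
                self.etags[part_number] = etag
            return etag
```

Because the ETag is recorded while the per-part lock is held, `etags` always reflects the upload of that part that finished last on this client. This covers only threads in a single process; coordinating several client machines would need a shared lock.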

Copying a Part

After initiating a multipart upload, you can upload parts to the task by specifying its upload ID. You can also add parts by calling the part copying API, which copies all or part of an existing object as a part.

Do not determine whether a copy request succeeded based only on the status_code in the returned HTTP header. A status_code of 200 means only that the server has received the request and started to process it. The copy is successful only if the body of the response contains an ETag.
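A client-side check along these lines might look like the following sketch. The function name `copy_part_etag` and the sample response bodies in the usage lines are assumptions for illustration; see the API Reference for the exact response format.

```python
import xml.etree.ElementTree as ET


def copy_part_etag(status_code, body):
    """Return the ETag from a copy-part response, or None if the copy failed."""
    if status_code != 200:
        return None
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # An empty or malformed body cannot confirm success.
        return None
    for element in root.iter():
        # Compare local names so the check works with or without an XML namespace.
        if element.tag.rsplit("}", 1)[-1] == "ETag":
            etag = (element.text or "").strip().strip('"')
            if etag:
                return etag
    return None


if __name__ == "__main__":
    # Hypothetical response bodies, for illustration only.
    ok_body = '<CopyPartResult><ETag>"abc123"</ETag></CopyPartResult>'
    error_body = "<Error><Code>InternalError</Code></Error>"
    print(copy_part_etag(200, ok_body))     # abc123
    print(copy_part_etag(200, error_body))  # None
```

A return value of None means the part should not be recorded as copied, even though the HTTP status was 200.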

Suppose you copy a source object as a part named part1 and a part1 already exists. The copy overwrites the existing part1: after the copy succeeds, only the new part1 is displayed, and the data of the old part1 is deleted. Therefore, before copying a part, ensure that the target part does not exist or has no value. Otherwise, data may be deleted by mistake. The source object does not change during the copy.

Merging Parts and Canceling a Multipart Upload Task

When merging parts, OBS creates the object by concatenating the parts in ascending order of part number. If any object metadata was provided when the multipart upload task was initialized, OBS associates the metadata with the object. After the multipart upload is complete, the parts no longer exist. A part merging request must contain the upload ID and a list of the part numbers with their corresponding ETag values. The OBS response includes an ETag that uniquely identifies the combined object data. This ETag is not the MD5 hash value of the object data.

You can cancel a multipart upload task. After a multipart upload task is canceled, its upload ID can no longer be used to upload any part, and OBS releases the storage of all uploaded parts. If part uploads are still in progress when you cancel the task, those uploads still run to completion (the result can be successful or failed). To release the storage capacity occupied by all uploaded parts, cancel the multipart upload after all part uploads have finished.

If 10 parts are uploaded but only nine are selected for merging, the system automatically deletes the part that is not merged. Deleted parts cannot be restored. Before merging the parts, use the API for listing uploaded parts to check that no part has been missed.
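The check before merging, and the assembly of the merge request, could be sketched as follows. The helper names are hypothetical, and the XML element names (`CompleteMultipartUpload`, `Part`, `PartNumber`, `ETag`) are assumptions about the request layout; confirm them against the Completing a Multipart Upload section of the API Reference.

```python
import xml.etree.ElementTree as ET


def find_unrecorded_parts(recorded, listed_numbers):
    """Return part numbers that the server lists but that would not be merged.

    recorded: {part_number: etag} kept by the client while uploading.
    listed_numbers: part numbers returned by listing the uploaded parts.
    """
    return sorted(set(listed_numbers) - set(recorded))


def build_complete_body(recorded):
    """Build a merge request body with parts in ascending part-number order."""
    root = ET.Element("CompleteMultipartUpload")  # assumed element name
    for part_number in sorted(recorded):
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(part_number)
        ET.SubElement(part, "ETag").text = recorded[part_number]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    recorded = {2: "etag-2", 1: "etag-1"}  # hypothetical ETag values
    listed = [1, 2, 3]
    left_out = find_unrecorded_parts(recorded, listed)
    if left_out:
        # These parts would be deleted by the merge and cannot be restored.
        print("parts that would be lost:", left_out)
    else:
        print(build_complete_body(recorded))
```

In this run, part 3 was uploaded but not recorded, so the sketch reports it instead of building the request.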

Listing Uploaded Parts

You can list the parts of a specified multipart upload task or the parts of all the multipart upload tasks in progress. A request to list uploaded parts returns information about the parts uploaded in the specified multipart upload. Each request returns information about a maximum of 1000 parts. If more than 1000 parts have been uploaded in a multipart upload, you need to send multiple requests to list all uploaded parts. The list of uploaded parts does not include merged parts.
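Listing more than 1000 parts therefore requires a loop that repeats the request until the listing is no longer truncated. The sketch below shows only the loop structure: `list_page(marker)` is a hypothetical callable that sends one listing request and returns `(parts, next_marker, is_truncated)`, and `make_fake_lister` is a stand-in used to run the example without a server.

```python
def list_all_parts(list_page):
    """Collect every uploaded part by following markers page by page."""
    all_parts = []
    marker = None  # no marker on the first request
    while True:
        parts, next_marker, is_truncated = list_page(marker)
        all_parts.extend(parts)
        if not is_truncated:
            return all_parts
        marker = next_marker  # continue after the last part returned


def make_fake_lister(total_parts, page_size=1000):
    """Simulate a listing over part numbers 1..total_parts (demo only)."""
    def list_page(marker):
        start = (marker or 0) + 1
        end = min(start + page_size - 1, total_parts)
        parts = list(range(start, end + 1))
        return parts, end, end < total_parts
    return list_page


if __name__ == "__main__":
    # 2500 parts need three requests of at most 1000 parts each.
    print(len(list_all_parts(make_fake_lister(2500))))  # 2500
```

The same loop shape applies to listing multipart upload tasks in a bucket, which is also limited to 1000 results per request.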

Use a returned list only for verification. After a multipart upload is complete, the result in the list is no longer valid. When you complete a multipart upload, rely on your own record of the part numbers you specified and the ETag values that OBS returned when the parts were uploaded, rather than on this list.

Listing Multipart Uploads

You can obtain the list of initialized multipart upload tasks by listing the multipart upload tasks in the bucket. Initialized multipart upload tasks refer to the multipart upload tasks that are not merged or canceled after initialization. A maximum of 1000 multipart upload tasks can be returned for each request. If the number of ongoing multipart upload tasks exceeds 1000, you need to send more requests to query the remaining tasks.

Table 1 lists the restrictions on listing multipart uploads.

Table 1 Restrictions on listing multipart uploads

| Item | Restriction |
| --- | --- |
| Object size | Up to 48.8 TB |
| Maximum number of parts for each upload task | 10,000 |
| Part number | 1 to 10,000 (inclusive) |
| Part size | 5 MB to 5 GB. The last part can be 0 bytes to 5 GB. |
| Maximum number of uploaded parts returned in response to a request for listing uploaded parts | 1000 |
| Maximum number of initialized multipart upload tasks returned in response to a request for listing initialized multipart upload tasks | 1000 |
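The limits in Table 1 are related: 10,000 parts of 5 GB each give the 48.8 TB maximum object size, assuming binary units (1 GB = 1024 MB). The following sketch works through that arithmetic and derives the smallest uniform part size that keeps an object within 10,000 parts. The constant and function names are assumptions for this example.

```python
MB = 1024 ** 2
GB = 1024 ** 3
TB = 1024 ** 4

MAX_PARTS = 10_000
MIN_PART_SIZE = 5 * MB
MAX_PART_SIZE = 5 * GB
MAX_OBJECT_SIZE = MAX_PARTS * MAX_PART_SIZE


def smallest_part_size(object_size):
    """Smallest uniform part size that fits object_size into 10,000 parts."""
    if object_size > MAX_OBJECT_SIZE:
        raise ValueError("object exceeds the multipart upload size limit")
    # Integer ceiling division avoids floating-point rounding.
    needed = -(-object_size // MAX_PARTS)
    return max(MIN_PART_SIZE, needed)


if __name__ == "__main__":
    print(round(MAX_OBJECT_SIZE / TB, 1))      # 48.8
    print(smallest_part_size(500 * MB) // MB)  # 5
    print(smallest_part_size(1 * TB))          # 109951163 (bytes, about 105 MB)
```

For objects up to roughly 48.8 GB, the 5 MB minimum part size already stays within 10,000 parts; larger objects need proportionally larger parts.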

Multipart Upload Operations and Permissions

You can perform multipart upload operations only after being granted the required permissions. ACLs, bucket policies, or user policies can be used to grant these permissions. The following table lists multipart upload operations and the required permissions that can be granted by ACLs, bucket policies, or user policies.

| Operation | Required Permission |
| --- | --- |
| Initiating a multipart upload | To perform this operation, you need to have the PutObject permission. A bucket owner can allow trustees to perform the PutObject operation. |
| Uploading parts | To perform this operation, you need to have the PutObject permission. Only the initiator of a multipart upload can upload parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can upload parts of the object. |
| Copying parts | To perform this operation, you need to have the PutObject permission as well as the GetObject permission on the object to be copied. Only the initiator of a multipart upload can copy parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can upload parts of the object. |
| Assembling parts | To perform this operation, you need to have the PutObject permission. Only the initiator of a multipart upload can assemble parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can complete the multipart upload. |
| Canceling a multipart upload | To perform this operation, you need to have the AbortMultipartUpload permission. By default, only the bucket owner and multipart upload initiator have this permission. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. The bucket owner can also deny any trustee permission to perform this operation. |
| Listing uploaded parts | To perform this operation, you need to have the ListMultipartUploadParts permission. By default, the bucket owner can list the uploaded parts of any multipart upload to the bucket. The multipart upload initiator can list the uploaded parts of a specific multipart upload. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. The bucket owner can also deny any trustee permission to perform this operation. |
| Listing multipart uploads | To list multipart upload tasks to the bucket, you need to have the ListBucketMultipartUploads permission. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. |
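If you want a client to check its granted permissions before starting a task, the table can be expressed as a simple lookup. The permission names below come from the table; the operation keys and the function `missing_permissions` are hypothetical labels for this sketch, not API names.

```python
REQUIRED_PERMISSIONS = {
    "initiate_multipart_upload": {"PutObject"},
    "upload_part": {"PutObject"},
    # GetObject is required on the source object being copied.
    "copy_part": {"PutObject", "GetObject"},
    "complete_multipart_upload": {"PutObject"},
    "abort_multipart_upload": {"AbortMultipartUpload"},
    "list_parts": {"ListMultipartUploadParts"},
    "list_multipart_uploads": {"ListBucketMultipartUploads"},
}


def missing_permissions(operation, granted):
    """Return the permissions still needed for an operation, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS[operation] - set(granted))


if __name__ == "__main__":
    print(missing_permissions("copy_part", ["PutObject"]))  # ['GetObject']
```

This lookup covers only the permission names. It does not model the additional rule that only the initiator of a multipart upload can upload, copy, or assemble its parts.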

REST APIs Applicable to Multipart Upload

The following sections in the Object Storage Service API Reference describe REST APIs relevant to multipart upload.

  • Listing Initiated Multipart Uploads in a Bucket
  • Initiating a Multipart Upload
  • Multipart Upload
  • Uploading a Part of an Object - Copy
  • Listing Uploaded Parts of an Object
  • Completing a Multipart Upload
  • Canceling a Multipart Upload Task