Multipart Upload
Multipart upload allows you to upload a single object as a set of separate parts. Each part is a contiguous portion of the object data. You can upload the parts independently and in any order. If the upload of a part fails, you can upload that part again without affecting the other parts. After all parts are uploaded, OBS merges them to create the object. Multipart upload is generally recommended for objects of 100 MB or larger.
For example, to upload a 500 MB object to an OBS bucket, you can use a tool such as OBS Browser or OBS Browser+, which automatically divides the object into multiple parts for uploading. Alternatively, you can call the multipart upload API directly. Either approach improves upload efficiency and reduces the impact of failures.
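To illustrate how an object can be divided into parts, the following minimal sketch computes the byte range of each part. The function name `split_into_parts` and the 100 MB part size are assumptions made for this illustration only; they are not part of OBS or its tools.

```python
def split_into_parts(object_size, part_size):
    """Return (part_number, start_offset, length) tuples covering object_size bytes."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    parts = []
    offset = 0
    part_number = 1          # part numbers start at 1
    while offset < object_size:
        # The last part may be shorter than part_size.
        length = min(part_size, object_size - offset)
        parts.append((part_number, offset, length))
        offset += length
        part_number += 1
    return parts


MB = 1024 * 1024
plan = split_into_parts(500 * MB, 100 * MB)
print(len(plan))   # 5
print(plan[-1])    # (5, 419430400, 104857600)
```

Each tuple tells the client which bytes to read from the local file for a given part number, so the parts can be sent independently.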
Multipart upload provides the following benefits:
- Improving throughput: You can upload parts in parallel to improve throughput.
- Quick recovery from network failures: Smaller parts minimize the amount of data that must be uploaded again after a network error.
- Pausing and resuming uploads: You can upload parts at any time. A multipart upload does not expire. You must explicitly complete or cancel it.
- Starting an upload before the final object size is known: You can upload an object while it is still being created.
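The first two benefits can be sketched together: parts are uploaded in parallel, and only a failed part is retried. This is a hedged illustration, not an OBS client. `upload_part` is a hypothetical callable supplied by the caller, and treating `OSError` as a transient network failure is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_all(parts, upload_part, max_workers=4, max_attempts=3):
    """Upload parts in parallel and retry only the parts that fail.

    parts: dict mapping part number -> bytes
    upload_part: hypothetical callable (part_number, data) -> ETag string
                 that raises OSError on a network failure.
    Returns a dict mapping part number -> ETag.
    """
    def attempt(item):
        part_number, data = item
        last_error = None
        for _ in range(max_attempts):
            try:
                return part_number, upload_part(part_number, data)
            except OSError as err:   # assumed to be a transient failure
                last_error = err
        raise last_error             # give up after max_attempts tries

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each part is handled by one worker; other parts are unaffected
        # when one of them has to be retried.
        return dict(pool.map(attempt, parts.items()))
```

Because each part is retried on its own, a network error costs at most one part's worth of data rather than the whole object.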
The multipart upload API allows uploading a large-size object in multiple parts. You can upload a new large-size object or create a copy of an existing object using this API.
A multipart upload has three steps: initiating the upload (initializing the upload task), uploading parts, and completing the upload (merging the uploaded parts). Upon receiving a part merging request, OBS merges the uploaded parts to create an object, which can then be accessed like any other object.
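The three steps can be pictured with a small in-memory model. The class `FakeMultipartStore` and its method names are hypothetical and exist only for this sketch; they are not the OBS API. The MD5 digest stands in for an ETag, and the model's `complete` step takes part numbers only, whereas a real merge request also carries the ETag values.

```python
import hashlib
import uuid


class FakeMultipartStore:
    """In-memory stand-in for an object store (NOT the OBS API)."""

    def __init__(self):
        self.uploads = {}   # upload ID -> {part number: bytes}
        self.keys = {}      # upload ID -> object key
        self.objects = {}   # object key -> merged bytes

    def initiate(self, key):
        upload_id = uuid.uuid4().hex
        self.uploads[upload_id] = {}
        self.keys[upload_id] = key
        return upload_id

    def upload_part(self, upload_id, part_number, data):
        self.uploads[upload_id][part_number] = data
        return hashlib.md5(data).hexdigest()   # stand-in ETag

    def complete(self, upload_id, part_numbers):
        parts = self.uploads.pop(upload_id)    # parts no longer exist afterwards
        key = self.keys.pop(upload_id)
        # Merge the listed parts in ascending part-number order.
        self.objects[key] = b"".join(parts[n] for n in sorted(part_numbers))
        return key


store = FakeMultipartStore()
upload_id = store.initiate("demo.bin")
store.upload_part(upload_id, 2, b"world")    # parts may arrive in any order
store.upload_part(upload_id, 1, b"hello ")
store.complete(upload_id, [1, 2])
print(store.objects["demo.bin"])             # b'hello world'
```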
You can list all the ongoing multipart upload tasks or obtain the list of uploaded parts of a specified multipart upload task. The following describes the detailed operations.
Initiating a multipart upload task
When you send a request to start multipart upload, OBS returns a response with the upload ID, which is the unique identifier of the multipart upload. This ID must be included in the request for uploading parts, listing uploaded parts, completing a multipart upload, or canceling a multipart upload.
Uploading parts
When uploading parts, you must specify the upload ID and part numbers. You can select any part number between 1 and 10 000. A part number uniquely identifies a part and its location in the object you are uploading. If you upload a new part with a part number that has already been used, the previously uploaded part is overwritten. Whenever you upload a part, OBS returns the ETag header in the response. For each part you upload, you must record the part number and the ETag value. These part numbers and ETag values are required later, when you complete the multipart upload task.
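One way to keep the required record on the client side is a small ledger keyed by part number. The class `PartLedger` and the sample ETag strings are hypothetical and used only for illustration.

```python
class PartLedger:
    """Client-side record of the ETag returned for each uploaded part."""

    MAX_PART_NUMBER = 10_000

    def __init__(self, upload_id):
        self.upload_id = upload_id
        self._etags = {}

    def record(self, part_number, etag):
        if not 1 <= part_number <= self.MAX_PART_NUMBER:
            raise ValueError("part number must be between 1 and 10000")
        # Uploading the same part number again overwrites the earlier part,
        # so the earlier ETag is replaced as well.
        self._etags[part_number] = etag

    def completion_list(self):
        """Return (part_number, etag) pairs in ascending part-number order."""
        return sorted(self._etags.items())


ledger = PartLedger("example-upload-id")
ledger.record(2, "etag-b")
ledger.record(1, "etag-a")
ledger.record(2, "etag-b2")        # part 2 uploaded again
print(ledger.completion_list())    # [(1, 'etag-a'), (2, 'etag-b2')]
```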
After the multipart upload task is initialized and one or more parts are uploaded, you must either merge the parts or cancel the multipart upload task. Otherwise, you continue to be charged for storing the uploaded parts. OBS releases the storage and stops charging the storage fee only after the uploaded parts are merged or the multipart upload task is canceled.
When multiple concurrent uploads target the same part of an object, the server applies a Last Write Wins policy, where the time of the last write is the time when the part metadata is created. To ensure data accuracy, the client must use locking to serialize concurrent uploads of the same part. Concurrent uploads of different parts of an object do not require client-side locking.
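The following sketch shows one possible reading of the client-side locking requirement: a separate lock per part number, so that uploads of the same part run one at a time while uploads of different parts still run concurrently. `PartUploadGuard` and `upload_part` are hypothetical names, and this applies only to threads within a single client process.

```python
import threading
from collections import defaultdict


class PartUploadGuard:
    """Serialize uploads that target the same part number."""

    def __init__(self):
        self._registry_lock = threading.Lock()
        self._part_locks = defaultdict(threading.Lock)

    def _lock_for(self, part_number):
        # Protect the dictionary itself so that two threads never create
        # two different locks for the same part number.
        with self._registry_lock:
            return self._part_locks[part_number]

    def upload(self, part_number, data, upload_part):
        # upload_part is a hypothetical callable (part_number, data) -> ETag.
        with self._lock_for(part_number):
            return upload_part(part_number, data)
```

Clients that run as several processes or on several machines would need a shared coordination mechanism instead, which this sketch does not cover.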
Copying parts
After creating a multipart upload task, you can upload parts to it by specifying its upload ID. You can also add parts by calling the part copying API, which copies all or part of an existing object as a part.
The HTTP status code in the response header alone does not indicate whether a part copying request succeeded. A status code of 200 means only that the server has received the request and started processing it. The response body shows whether the request succeeded: the request succeeded only if the body contains an ETag. Otherwise, the request failed.
If you copy a source object as a part named part1 and a part1 already exists, the existing part1 is overwritten. After the copy succeeds, only the new part1 is listed, and the data of the old part1 is deleted. Therefore, before calling the part copying API, make sure that the target part does not exist or holds no data that you need. Otherwise, data may be deleted by mistake. The source object is not changed by the copy operation.
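A client-side check for the rule about the response body might look like the sketch below. It assumes the body is XML and that a successful response carries an `ETag` element; the element names in the sample bodies are assumptions for illustration, so consult the API reference for the actual response format.

```python
import xml.etree.ElementTree as ET


def copy_part_etag(status_code, body):
    """Return the ETag found in a part copying response body, or None on failure."""
    if status_code != 200:
        return None
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None              # empty or malformed body: treat as failure
    for element in root.iter():
        # Strip any XML namespace prefix, e.g. "{ns}ETag" -> "ETag".
        if element.tag.rsplit("}", 1)[-1] == "ETag" and element.text:
            return element.text.strip()
    return None                  # 200 without an ETag is still a failure


# Assumed sample bodies, for illustration only.
ok_body = "<CopyPartResult><ETag>etag-123</ETag></CopyPartResult>"
error_body = "<Error><Code>InternalError</Code></Error>"
print(copy_part_etag(200, ok_body))      # etag-123
print(copy_part_etag(200, error_body))   # None
```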
Merging parts and canceling a multipart upload task
When merging parts, OBS creates the object by concatenating the parts in ascending order of part number. If any object metadata was provided when the multipart upload task was initiated, OBS associates that metadata with the object. After the multipart upload is complete, the individual parts no longer exist. A part merging request must contain the upload ID and a list of part numbers with their corresponding ETag values. The OBS response includes an ETag that uniquely identifies the merged object data. This ETag is not the MD5 hash value of the object data.
You can also cancel a multipart upload task. After a task is canceled, its upload ID can no longer be used to upload parts, and OBS releases the storage of all uploaded parts. If part uploads are still in progress when you cancel the task, those uploads may still succeed or fail. To make sure that the storage occupied by all uploaded parts is released, cancel the multipart upload only after all in-progress part uploads have finished.
If 10 parts are uploaded but only nine of them are merged, the part that is not merged is automatically deleted by the system. Parts that are deleted in this way cannot be restored. Before merging, call the API for listing uploaded parts and check that no part is missing from the merge request.
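The pre-merge check can be reduced to a set difference between the part numbers returned by the listing and the part numbers about to be merged. The function name `parts_left_out` is hypothetical.

```python
def parts_left_out(listed_part_numbers, merge_part_numbers):
    """Return uploaded part numbers that are absent from the merge request."""
    return sorted(set(listed_part_numbers) - set(merge_part_numbers))


listed = range(1, 11)                       # 10 parts reported as uploaded
to_merge = [1, 2, 3, 4, 5, 6, 7, 8, 9]      # merge request omits part 10
print(parts_left_out(listed, to_merge))     # [10]
```

An empty result means every uploaded part is included, so nothing would be discarded by the merge.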
Listing uploaded parts
You can list the uploaded parts of a specific multipart upload task, or list all multipart upload tasks in progress. A request to list uploaded parts returns information about the parts uploaded in the specified multipart upload. Each request returns information about at most 1000 parts. If a multipart upload contains more than 1000 uploaded parts, you need to send multiple requests to list them all. The list of uploaded parts does not include parts that have already been merged.
A returned list should be used only for verification. After a multipart upload is complete, the result in the list is no longer valid. When you upload parts, keep your own record of the part numbers you specified and the ETag values that OBS returned.
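Listing more than 1000 parts requires a loop over several requests. The sketch below assumes a hypothetical `list_page(marker, limit)` callable that returns one page of parts plus a marker for the next page. The marker mechanism is an assumption for illustration; the real request parameters are described in the API reference.

```python
def list_all_parts(list_page, page_size=1000):
    """Collect every uploaded part by requesting one page after another.

    list_page: hypothetical callable (marker, limit) -> (parts, next_marker),
               where next_marker is None when no parts remain.
    """
    parts = []
    marker = None
    while True:
        page, marker = list_page(marker, page_size)
        parts.extend(page)
        if marker is None:
            return parts


# Simulated service holding 2500 uploaded parts.
uploaded = list(range(1, 2501))
requests_sent = []

def fake_list_page(marker, limit):
    start = marker or 0
    requests_sent.append(start)
    end = start + limit
    return uploaded[start:end], (end if end < len(uploaded) else None)

all_parts = list_all_parts(fake_list_page)
print(len(all_parts), len(requests_sent))   # 2500 3
```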
Listing multipart upload tasks
You can list all the ongoing multipart upload tasks. Ongoing multipart upload tasks are tasks that have been started but not yet completed or canceled. Each request returns a maximum of 1000 multipart upload tasks. If there are more than 1000 ongoing multipart upload tasks, you need to send multiple requests to list all of them.
| Item | Specifications |
|---|---|
| Maximum object size | 48.8 TB |
| Maximum number of parts for each upload task | 10 000 |
| Part number | 1 to 10 000 (inclusive) |
| Part size | 100 KB to 5 GB. The last part can be 0 bytes to 5 GB. |
| Maximum number of parts in a returned list | 1 000 |
| Maximum number of parts in a request | 1 000 |
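The limits in the table can be checked before an upload starts. The following sketch assumes binary units (1 KB = 1024 bytes), which the table does not state explicitly. The constant and function names are hypothetical.

```python
KB = 1024
GB = 1024 ** 3

MIN_PART_SIZE = 100 * KB     # applies to every part except the last
MAX_PART_SIZE = 5 * GB
MAX_PARTS = 10_000


def check_part_sizes(part_sizes):
    """Return a list of problems; an empty list means the plan fits the limits."""
    problems = []
    if len(part_sizes) > MAX_PARTS:
        problems.append(f"{len(part_sizes)} parts exceeds the limit of {MAX_PARTS}")
    last_index = len(part_sizes) - 1
    for index, size in enumerate(part_sizes):
        # Only the last part may be smaller than the minimum part size.
        minimum = 0 if index == last_index else MIN_PART_SIZE
        if not minimum <= size <= MAX_PART_SIZE:
            problems.append(f"part {index + 1} has invalid size {size}")
    return problems


print(check_part_sizes([5 * GB, 5 * GB, 0]))    # []
print(check_part_sizes([50 * KB, 1 * GB]))      # ['part 1 has invalid size 51200']
print(MAX_PARTS * MAX_PART_SIZE / 1024 ** 4)    # 48.828125
```

The last line shows where the maximum object size comes from under this assumption: 10 000 parts of 5 GB each is about 48.8 TB.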
Multipart upload operations and permissions
You can perform multipart upload operations only after being granted the required permissions. You can use ACLs, bucket policies, or user policies to grant users these permissions. The following table lists the multipart upload operations and the permissions they require, which can be granted by ACLs, bucket policies, or user policies.
| Operation | Required Permission |
|---|---|
| Initiating a multipart upload task | You need the PutObject permission. The bucket owner can allow other users (trustees) to perform the PutObject operation. |
| Uploading parts | You need the PutObject permission. Only the initiator of a multipart upload can upload parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can upload parts of the object. |
| Copying parts | You need the PutObject permission as well as the GetObject permission on the object to be copied. Only the initiator of a multipart upload can copy parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can upload parts of the object. |
| Merging parts | You need the PutObject permission. Only the initiator of a multipart upload can merge parts. The bucket owner must grant the multipart upload initiator the PutObject permission so that the initiator can complete the multipart upload. |
| Canceling a multipart upload task | You need the AbortMultipartUpload permission. By default, only the bucket owner and the multipart upload initiator have this permission. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. The bucket owner can also deny any trustee permission to perform this operation. |
| Listing uploaded parts | You need the ListMultipartUploadParts permission. By default, the bucket owner can list the uploaded parts of any multipart upload to the bucket. The multipart upload initiator can list the uploaded parts of a specific multipart upload. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. The bucket owner can also deny any trustee permission to perform this operation. |
| Listing multipart upload tasks | To list the multipart upload tasks of a bucket, you need the ListBucketMultipartUploads permission. In addition to the default configuration, the bucket owner can allow trustees to perform this operation. |
REST APIs applicable to multipart upload
The following sections in the Object Storage Service API Reference describe REST API operations relevant to multipart upload.
- ListBucketMultipartUpload
- InitiateMultipartUpload
- UploadPart
- UploadPart-Copy
- ListParts
- CompleteMultipartUpload
- AbortMultipartUpload