Downloading an Object - Resumable (SDK for Python)
Function
Large downloads often fail because of an unstable network or a program crash. Restarting from scratch wastes resources, and the new attempt may fail for the same reason. The resumable download API addresses this by splitting the object into multiple parts and downloading them separately. The result of each part is recorded in a checkpoint file in real time. A success message is returned only after all parts have been downloaded. If any part fails, the returned message tells you to call the API again to download the failed parts. Because the checkpoint file records the progress of every part, a repeated call downloads only the parts that are still missing, which saves time and bandwidth.
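The mechanism described above can be illustrated with a small, self-contained sketch. It is a conceptual model only: the function names, the JSON checkpoint layout, and the simulated `flaky_fetch` are assumptions made for illustration and do not reflect the SDK's internal implementation or its real checkpoint file format.

```python
import json
import os
import tempfile


def split_into_parts(object_size, part_size):
    """Return inclusive (start, end) byte offsets that cover the whole object."""
    parts = []
    start = 0
    while start < object_size:
        end = min(start + part_size, object_size) - 1
        parts.append((start, end))
        start = end + 1
    return parts


def load_done(checkpoint_path):
    """Read the finished part indexes, or an empty set if no checkpoint exists."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f))


def run_download(parts, checkpoint_path, fetch_part):
    """Fetch every part not yet recorded as done; return the indexes that failed."""
    done = load_done(checkpoint_path)
    failed = []
    for index, (start, end) in enumerate(parts):
        if index in done:
            continue  # downloaded by an earlier call, skip it
        try:
            fetch_part(index, start, end)
        except OSError:
            failed.append(index)
            continue
        done.add(index)
        # Record progress after each part so a crash loses at most one part.
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)
    return failed


# Simulated transfer (hypothetical): part 1 fails the first time it is requested.
calls = []


def flaky_fetch(index, start, end):
    calls.append(index)
    if index == 1 and calls.count(1) == 1:
        raise OSError("simulated network failure")


parts = split_into_parts(25, 10)
checkpoint = os.path.join(tempfile.mkdtemp(), "download.ckpt.json")
first_failed = run_download(parts, checkpoint, flaky_fetch)   # part 1 fails
second_failed = run_download(parts, checkpoint, flaky_fetch)  # only part 1 is retried
```

In the second call, parts 0 and 2 are skipped because the checkpoint already lists them, which is the saving the resumable API is designed to provide.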
Restrictions
- To download an object, you must be the bucket owner or have the required permission (obs:object:GetObject in IAM or GetObject in a bucket policy). For details, see Introduction to OBS Access Control, IAM Custom Policies, and Configuring an Object Policy.
- Resumable download is an encapsulated and enhanced version of partial download.
- This API saves resources and improves efficiency when a download is retried, and speeds up downloads by fetching parts concurrently. You do not need to handle internal details such as creating and deleting checkpoint files, dividing objects into parts, or downloading parts concurrently.
- enableCheckpoint: The default value is False, indicating that resumable download is disabled. In this case, the resumable download API is a simple encapsulation of the partial download API, and no checkpoint file is generated.
- checkpointFile: This parameter is valid only when enableCheckpoint is True.
Method
ObsClient.downloadFile(bucketName, objectKey, downloadFile, partSize, taskNum, enableCheckpoint, checkpointFile, header, versionId, progressCallback, extensionHeaders)
Request Parameters
Parameter | Type | Mandatory (Yes/No) | Description |
---|---|---|---|
bucketName | str | Yes | Explanation: Bucket name. Restrictions: Default value: None |
objectKey | str | Yes | Explanation: Object name. An object is uniquely identified by an object name in a bucket. An object name is a complete path that does not contain the bucket name. For example, if the address for accessing the object is examplebucket.obs.eu-west-101.myhuaweicloud.eu/folder/test.txt, the object name is folder/test.txt. Value range: The value must contain 1 to 1,024 characters. Default value: None |
downloadFile | str | Yes | Explanation: Full local path for saving the file to be downloaded. Default value: None |
partSize | int | No | Explanation: Part size. Value range: The value must be greater than 0 but less than the object size, in bytes. Default value: 5 MB |
taskNum | int | No | Explanation: Maximum number of parts that can be downloaded concurrently in a multipart download. Value range: The value must be greater than 0 but not exceed the result of the file size divided by the part size (rounded up). Default value: 1, indicating concurrent downloads are not used. |
enableCheckpoint | bool | No | Explanation: Whether to enable the resumable download mode. Value range: True: The resumable download mode is enabled. False: The resumable download mode is disabled. Default value: False |
checkpointFile | str | No | Explanation: Path of a file generated for recording the progress of a resumable download. The file contains the information about parts and progress. Restrictions: This parameter is valid only for resumable downloads, that is, when enableCheckpoint is True. Default value: If this parameter is left blank, the checkpoint file will be saved in the current directory. |
header | | No | Explanation: Headers in the request used for obtaining the storage class, redundancy policy, and other basic information about the object. Value range: See Table 2. Default value: None |
versionId | str | No | Explanation: Object version ID, for example, G001117FCE89978B0000401205D5DC9. Value range: The value must contain 32 characters. Default value: None. If this parameter is left blank, the latest version of the object is obtained. |
progressCallback | callable | No | Explanation: Callback function for obtaining the download progress. Default value: None. NOTE: The function receives the following parameters in sequence: number of downloaded bytes, total number of bytes, and used time (in seconds). For details about the sample code, see Downloading an Object - Obtaining the Download Progress (SDK for Python). |
extensionHeaders | dict | No | Explanation: Extension headers. Value range: See User-defined Header (SDK for Python). Default value: None |
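The value ranges of partSize and taskNum are linked: taskNum may not exceed the number of parts, which is the object size divided by the part size, rounded up. The helper below is a minimal sketch of that arithmetic. The function names are hypothetical, and treating the 5 MB default as 5 * 1024 * 1024 bytes is an assumption.

```python
import math

# Assumption: the 5 MB default part size means 5 * 1024 * 1024 bytes.
DEFAULT_PART_SIZE = 5 * 1024 * 1024


def max_task_num(object_size, part_size=DEFAULT_PART_SIZE):
    """Upper bound for taskNum: the number of parts, ceil(object_size / part_size)."""
    if part_size <= 0:
        raise ValueError("partSize must be greater than 0")
    return math.ceil(object_size / part_size)


def clamp_task_num(requested, object_size, part_size=DEFAULT_PART_SIZE):
    """Keep a requested taskNum between 1 and the number of parts."""
    return max(1, min(requested, max_task_num(object_size, part_size)))


MB = 1024 * 1024
parts_for_100mb = max_task_num(100 * MB, 10 * MB)   # 10 parts of 10 MB each
capped = clamp_task_num(16, 100 * MB, 10 * MB)      # more tasks than parts is pointless
```

Checking the bound before calling downloadFile avoids requesting more concurrent tasks than there are parts to download.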
Parameter | Type | Mandatory (Yes/No) | Description |
---|---|---|---|
range | str | No | Explanation: Download range. For example, 0-999 indicates the download range is from byte 1 to byte 1,000. Value range: 0 to the object length minus 1. Format: x-y, indicating the range is from byte x+1 to byte y+1. Restrictions: The upper limit of range is the length of the object minus 1. If the specified value exceeds this limit, the length of the object minus 1 is used. Default value: None |
if_match | str | No | Explanation: Preset ETag. If the ETag of the object to be downloaded is the same as the preset ETag, the object is returned. Otherwise, an error is returned. Value range: The value must contain 32 characters. Default value: None |
if_none_match | str | No | Explanation: Preset ETag. If the ETag of the object to be downloaded is different from the preset ETag, the object is returned. Otherwise, an error is returned. Value range: The value must contain 32 characters. Default value: None |
if_modified_since | str or | No | Explanation: The object is returned if it has been modified since the specified time; otherwise, an error is returned. Restrictions: The value must be in the GMT format, for example, Wed, 25 Mar 2020 02:39:52 GMT. You can refer to Table 3 to specify the time, for example, DateTime(year=2023, month=9, day=12). Default value: None |
if_unmodified_since | str or | No | Explanation: The object is returned if it has not been modified since the specified time; otherwise, an error is returned. Restrictions: The value must be in the GMT format, for example, Wed, 25 Mar 2020 02:39:52 GMT. You can refer to Table 3 to specify the time, for example, DateTime(year=2023, month=9, day=12). Default value: None |
origin | str | No | Explanation: Origin of the cross-domain request specified by the preflight request. Generally, it is a domain name. Restrictions: Each origin can contain only one wildcard character (*). Default value: None |
requestHeaders | str | No | Explanation: HTTP headers in a cross-origin request. Only CORS requests matching the allowed headers are valid. Restrictions: Each header can contain only one wildcard character (*). Spaces, ampersands (&), colons (:), and less-than signs (<) are not allowed. Default value: None |
sseHeader | | No | Explanation: Server-side decryption headers. For details, see Table 4. Restrictions: If the object uploaded to the server is encrypted on the server using the encryption key provided by the client, downloading the object requires including the encryption key in the message. Default value: None |
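The range semantics in the table (zero-based x-y offsets, with the upper bound capped at the object length minus 1) can be checked with a small helper. This is an illustrative sketch: `resolve_range` is a hypothetical name, and raising ValueError for a start offset beyond the object is an assumption, since the table does not describe that case.

```python
def resolve_range(range_value, object_length):
    """Turn an 'x-y' range into (first_offset, last_offset, byte_count)."""
    start_text, end_text = range_value.split("-", 1)
    start, end = int(start_text), int(end_text)
    if start < 0 or start > end:
        raise ValueError("range must be x-y with 0 <= x <= y")
    if start > object_length - 1:
        # Assumption: the source does not say what happens in this case.
        raise ValueError("range starts beyond the end of the object")
    # An upper bound past the end is replaced by the object length minus 1.
    end = min(end, object_length - 1)
    return start, end, end - start + 1


first_kb = resolve_range("0-999", 5000)      # bytes 1 to 1,000 of the object
tail = resolve_range("4000-9999", 5000)      # upper bound capped at offset 4999
```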
Parameter | Type | Description |
---|---|---|
year | int | Explanation: Year in UTC. Default value: None |
month | int | Explanation: Month in UTC. Default value: None |
day | int | Explanation: Day in UTC. Default value: None |
hour | int | Explanation: Hour in UTC. Restrictions: The value is in 24-hour format. Default value: 0 |
min | int | Explanation: Minute in UTC. Default value: 0 |
sec | int | Explanation: Second in UTC. Default value: 0 |
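When if_modified_since or if_unmodified_since is passed as a string, it must use the GMT format shown earlier (for example, Wed, 25 Mar 2020 02:39:52 GMT). The sketch below builds such a string from the same UTC components using only the Python standard library. `to_gmt_string` is a hypothetical helper, not part of the SDK.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def to_gmt_string(year, month, day, hour=0, minute=0, second=0):
    """Format UTC date components as an HTTP-style GMT timestamp."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    # usegmt=True requires a UTC-aware datetime and emits the literal zone 'GMT'.
    return format_datetime(moment, usegmt=True)


example = to_gmt_string(2020, 3, 25, 2, 39, 52)
midnight = to_gmt_string(2023, 9, 12)  # hour, minute, and second default to 0
```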
Parameter | Type | Mandatory (Yes/No) | Description |
---|---|---|---|
encryption | str | Yes | Explanation: SSE-C used for encrypting objects. Value range: AES256. Default value: None |
key | str | Yes | Explanation: Key used in SSE-C encryption. It corresponds to the encryption method. For example, if encryption is set to AES256, the key is calculated using the AES-256 algorithm. Value range: The value must contain 32 characters. Default value: None |
Responses
Type | Description |
---|---|
 | Explanation: SDK common results |
Parameter | Type | Description |
---|---|---|
status | int | Explanation: HTTP status code. Value range: A status code is a group of digits ranging from 2xx (indicating successes) to 4xx or 5xx (indicating errors). It indicates the status of a response. For more information, see Status Code. Default value: None |
reason | str | Explanation: Reason description. Default value: None |
errorCode | str | Explanation: Error code returned by the OBS server. If the value of status is less than 300, this parameter is left blank. Default value: None |
errorMessage | str | Explanation: Error message returned by the OBS server. If the value of status is less than 300, this parameter is left blank. Default value: None |
requestId | str | Explanation: Request ID returned by the OBS server. Default value: None |
indicator | str | Explanation: Error indicator returned by the OBS server. Default value: None |
hostId | str | Explanation: Requested server ID. If the value of status is less than 300, this parameter is left blank. Default value: None |
resource | str | Explanation: Error source (a bucket or an object). If the value of status is less than 300, this parameter is left blank. Default value: None |
header | list | Explanation: Response header list, composed of tuples. Each tuple consists of two elements, respectively corresponding to the key and value of a response header. Default value: None |
body | object | Explanation: Result content returned after the operation is successful. If the value of status is larger than 300, the value of body is null. The value varies with the API being called. For details, see Bucket-Related APIs (SDK for Python) and Object-Related APIs (SDK for Python). Default value: None |
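The fields above are typically consumed by branching on status and, when needed, looking up a response header in the list of tuples. The sketch below shows one way to do this. It uses `SimpleNamespace` objects as stand-ins for a real result so that it runs without the SDK; the helper names and all sample values (request IDs, error code, header) are illustrative assumptions.

```python
from types import SimpleNamespace


def summarize_result(resp):
    """Return a one-line summary of a result object with the fields listed above."""
    if resp.status < 300:
        return "succeeded (requestId=%s)" % resp.requestId
    return "failed with HTTP %d: %s - %s (requestId=%s)" % (
        resp.status, resp.errorCode, resp.errorMessage, resp.requestId)


def header_value(resp, name):
    """Look up a response header by name (case-insensitive) in the tuple list."""
    for key, value in resp.header:
        if key.lower() == name.lower():
            return value
    return None


# Stand-in results with made-up values, used only to exercise the helpers.
ok = SimpleNamespace(status=200, requestId="req-1", errorCode=None,
                     errorMessage=None, header=[("content-length", "1024")])
denied = SimpleNamespace(status=403, requestId="req-2", errorCode="AccessDenied",
                         errorMessage="Access Denied", header=[])

ok_summary = summarize_result(ok)
denied_summary = summarize_result(denied)
```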
GetResult.body Type | Description |
---|---|
 | Explanation: For details, see Table 8. |
Parameter | Type | Description |
---|---|---|
storageClass | str | Explanation: Object storage class. Value range: Default value: None |
accessContorlAllowOrigin | str | Explanation: If Origin in the request meets the CORS rules of the bucket, AllowedOrigin specified in the CORS rules is returned. AllowedOrigin indicates the origin from which the requests can access the bucket. Restrictions: Domain name of the origin. Each origin can contain only one wildcard character (*), for example, https://*.vbs.example.com. Default value: None |
accessContorlAllowHeaders | str | Explanation: If RequestHeader in the request meets the CORS rules of the bucket, AllowedHeader specified in the CORS rules is returned. AllowedHeader indicates the allowed headers for cross-origin requests. Only CORS requests matching the allowed headers are valid. Restrictions: Each header can contain only one wildcard character (*). Spaces, ampersands (&), colons (:), and less-than signs (<) are not allowed. Default value: None |
accessContorlAllowMethods | str | Explanation: AllowedMethod in the CORS rules of the bucket. It specifies the HTTP method of cross-origin requests, that is, the operation type of buckets and objects. Value range: The following HTTP methods are supported: Default value: None |
accessContorlExposeHeaders | str | Explanation: ExposeHeader in the CORS rules of the bucket. It specifies the CORS-allowed additional headers in the response. These headers provide additional information to clients. By default, your browser can only access headers Content-Length and Content-Type. If your browser needs to access other headers, add them to a list of the allowed additional headers. Restrictions: Spaces, wildcard characters (*), ampersands (&), colons (:), and less-than signs (<) are not allowed. Default value: None |
accessContorlMaxAge | int | Explanation: MaxAgeSeconds in the CORS rules of the bucket. It specifies the time your client can cache the response for a cross-origin request. Restrictions: Each CORS rule can contain only one MaxAgeSeconds. Value range: An integer greater than or equal to 0, in seconds. Default value: 100 |
contentLength | int | Explanation: Object size. Value range: The value ranges from 0 TB to 48.8 TB, in bytes. Default value: None |
contentType | str | Explanation: MIME type of the object to be downloaded. MIME type is a standard way of describing a data type and is used by the browser to decide how to display data. Value range: See What Is Content-Type (MIME)? (Python SDK). Default value: None |
lastModified | str | Explanation: Time when the last modification was made to the object. Restrictions: The time must be in the GMT format, for example, Wed, 25 Mar 2020 02:39:52 GMT. Default value: None |
etag | str | Explanation: Base64-encoded, 128-bit MD5 value of an object. ETag is the unique identifier of the object contents and is used to determine whether the contents of an object are changed. For example, if the ETag value is A when an object is uploaded and is B when the object is downloaded, this indicates the contents of the object are changed. The ETag reflects changes only to the contents of an object, not its metadata. Objects created by the upload and copy operations have unique ETags after being encrypted using MD5. Restrictions: If an object is encrypted using server-side encryption, the ETag is not the MD5 value of the object. Value range: The value must contain 32 characters. Default value: None |
versionId | str | Explanation: Object version ID. Value range: The value must contain 32 characters. Default value: None |
restore | str | Explanation: Restore status of an object. This header is returned when an Archive object is being restored or has been restored. For example, ongoing-request="true" indicates that the object is being restored. ongoing-request="false", expiry-date="Wed, 7 Nov 2012 00:00:00 GMT" indicates that the object has been restored. expiry-date indicates when the restored object expires. Restrictions: This parameter is only available for Archive objects. Default value: None |
expiration | str | Explanation: Expiration details. Example: "expiry-date=\"Mon, 11 Sep 2023 00:00:00 GMT\"" Default value: None |
sseKms | str | Explanation: SSE-KMS is used for encrypting objects on the server side. Value range: kms. Default value: None |
sseKmsKey | str | Explanation: ID of the KMS master key when SSE-KMS is used. Value range: Valid value formats are as follows: In the preceding formats: Default value: |
sseC | str | Explanation: SSE-C algorithm. Value range: AES256. Default value: None |
sseCKeyMd5 | str | Explanation: MD5 value of the key for encrypting objects when SSE-C is used. This value is used to check whether any error occurs during the transmission of the key. Restrictions: The value is encrypted by MD5 and then encoded by Base64, for example, 4XvB3tbNTN+tIEVa0/fGaQ==. Default value: None |
websiteRedirectLocation | str | Explanation: If the bucket is configured with website hosting, the request for obtaining the object can be redirected to another object in the bucket or an external URL. This parameter specifies the address the request for the object is redirected to. The request is redirected to object anotherPage.html in the same bucket: WebsiteRedirectLocation:/anotherPage.html. The request is redirected to an external URL http://www.example.com/: WebsiteRedirectLocation:http://www.example.com/. OBS obtains the specified value from the header and stores it in the object metadata WebsiteRedirectLocation. Restrictions: Default value: None |
isAppendable | bool | Explanation: Whether the object is appendable. Value range: True: The object is appendable. False: The object is not appendable. Default value: None |
nextPosition | int | Explanation: Start position for next appending. Value range: 0 to the object length, in bytes. Default value: None |
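The sseCKeyMd5 field is described as an MD5 digest of the SSE-C key, encoded with Base64. A client can compute the same value locally and compare it with the returned one to confirm that the key was transmitted intact. The sketch below assumes the digest is taken over the raw key bytes; the function name and the sample key are hypothetical.

```python
import base64
import hashlib


def sse_c_key_md5(key_bytes):
    """Base64-encode the 16-byte MD5 digest of an SSE-C key."""
    digest = hashlib.md5(key_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical 32-byte key, for illustration only. Never hard-code real keys.
sample_key = bytes(range(32))
local_md5 = sse_c_key_md5(sample_key)

# A 16-byte digest always encodes to 24 Base64 characters, the same length as
# the example value 4XvB3tbNTN+tIEVa0/fGaQ== in the table.
encoded_length = len(local_md5)
```

If the locally computed string differs from the sseCKeyMd5 value in the response, the key used by the server is not the key the client intended to send.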
Parameter | Type | Description |
---|---|---|
STANDARD | Standard storage class | Explanation: Features low access latency and high throughput and is used for storing massive, frequently accessed (multiple times a month) or small objects (< 1 MB) requiring quick response. |
WARM | Infrequent Access storage class | Explanation: Used for storing data that is semi-frequently accessed (fewer than 12 times a year) but is instantly available when needed. |
COLD | Archive storage class | Explanation: Used for storing rarely accessed (once a year) data. |
Code Examples
This example downloads object objectname from bucket examplebucket using resumable download.
    from obs import ObsClient
    import os
    import traceback

    # Obtain an AK and SK pair using environment variables or import the AK and SK pair in other ways. Using hard coding may result in leakage.
    # Obtain an AK and SK pair on the management console. For details, see https://support.huaweicloud.com/eu/usermanual-ca/ca_01_0003.html.
    ak = os.getenv("AccessKeyID")
    sk = os.getenv("SecretAccessKey")
    # (Optional) If you use a temporary AK and SK pair and a security token to access OBS, obtain them from environment variables.
    # security_token = os.getenv("SecurityToken")
    # Set server to the endpoint corresponding to the bucket. EU-Dublin is used here as an example. Replace it with the one in use.
    server = "https://obs.eu-west-101.myhuaweicloud.eu"

    # Create an obsClient instance.
    # If you use a temporary AK and SK pair and a security token to access OBS, you must specify security_token when creating an instance.
    obsClient = ObsClient(access_key_id=ak, secret_access_key=sk, server=server)
    try:
        bucketName = "examplebucket"
        objectKey = "objectname"
        # Specify the full path to which objects are downloaded. The full path contains the local file name.
        downloadFile = 'localfile'
        # Specify the number of parts that can be concurrently downloaded.
        taskNum = 5
        # Specify the part size (10 MB here), in bytes.
        partSize = 10 * 1024 * 1024
        # Enable the resumable download by setting enableCheckpoint to True.
        enableCheckpoint = True
        # Download the object using resumable download.
        resp = obsClient.downloadFile(bucketName, objectKey, downloadFile, partSize, taskNum, enableCheckpoint)

        # If status code 2xx is returned, the API is called successfully. Otherwise, the API call fails.
        if resp.status < 300:
            print('Download File Succeeded')
            print('requestId:', resp.requestId)
        else:
            print('Download File Failed')
            print('requestId:', resp.requestId)
            print('errorCode:', resp.errorCode)
            print('errorMessage:', resp.errorMessage)
    except Exception:
        # Catch errors raised before a response is returned, such as connection failures.
        print('Download File Failed')
        print(traceback.format_exc())
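The progressCallback parameter is described as a function that receives, in order, the number of downloaded bytes, the total number of bytes, and the elapsed time in seconds. The sketch below shows a callback with that shape. The function names and the output format are illustrative choices, not part of the SDK.

```python
def format_progress(transferred_bytes, total_bytes, elapsed_seconds):
    """Build a human-readable progress line from the three callback arguments."""
    percent = transferred_bytes * 100.0 / total_bytes if total_bytes else 100.0
    rate_kb = transferred_bytes / 1024.0 / elapsed_seconds if elapsed_seconds else 0.0
    return "%.1f%% downloaded, average %.1f KB/s" % (percent, rate_kb)


def progress_callback(transferred_bytes, total_bytes, elapsed_seconds):
    """Callback with the argument order described for progressCallback."""
    print(format_progress(transferred_bytes, total_bytes, elapsed_seconds))


# Half of a 1 MB object transferred after 2 seconds.
halfway = format_progress(512 * 1024, 1024 * 1024, 2)
```

Based on the method signature in this document, such a function would be supplied as progressCallback=progress_callback when calling downloadFile. The guards against zero total size and zero elapsed time are defensive additions for the first invocation.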