Updated on 2024-09-05 GMT+08:00

Downloading an Object - Resumable (SDK for Python)

Function

Downloads of large files often fail because of an unstable network or a program crash. Restarting the download from scratch wastes resources, and the retry may fail again for the same reason. To address this, the resumable download API splits the object to be downloaded into multiple parts and downloads them separately. The download result of each part is recorded in a checkpoint file in real time. A success message is returned only after all parts have been downloaded. If any part fails to be downloaded, the returned message tells you to call the API again to download the failed parts. Because the checkpoint file records the progress of every part, a repeated call does not need to download all parts again, which saves time and traffic.
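
The part splitting described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the SDK's internal implementation: the helper name split_into_parts is hypothetical, and the byte ranges are inclusive and zero-based.

```python
def split_into_parts(object_size, part_size):
    """Return inclusive (first_byte, last_byte) ranges that cover the object.

    Hypothetical helper for illustration only; the SDK performs its own
    splitting internally.
    """
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")
    ranges = []
    offset = 0
    while offset < object_size:
        # The last part may be shorter than part_size.
        last = min(offset + part_size, object_size) - 1
        ranges.append((offset, last))
        offset = last + 1
    return ranges


# A 25-byte object split into 10-byte parts yields three ranges.
print(split_into_parts(25, 10))  # [(0, 9), (10, 19), (20, 24)]
```

The number of ranges returned is the file size divided by the part size, rounded up, which is also the upper bound documented for taskNum.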

Restrictions

  • To download an object, you must be the bucket owner or have the required permission (obs:object:GetObject in IAM or GetObject in a bucket policy). For details, see Introduction to OBS Access Control, IAM Custom Policies, and Configuring an Object Policy.
  • The mapping between OBS regions and endpoints must comply with what is listed in Regions and Endpoints.
  • Resumable download is an encapsulated and enhanced version of partial download.
  • This API avoids repeating completed work when a download is retried, and speeds up the download by fetching parts concurrently. You do not need to manage internal details, such as creating and deleting checkpoint files, dividing objects into parts, or downloading parts concurrently.
  • enableCheckpoint: The default value is False, indicating that resumable download is disabled. In this case, the resumable download API is a simple encapsulation of the partial download API, and no checkpoint file is generated.
  • checkpointFile: This parameter is valid only when enableCheckpoint is True.

Method

ObsClient.downloadFile(bucketName, objectKey, downloadFile, partSize, taskNum, enableCheckpoint, checkpointFile, header, versionId, progressCallback)
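
Because a failed resumable download is meant to be continued by calling the API again, a small retry wrapper is one way to use this method. The sketch below is an assumption-laden illustration: download_with_retries, max_attempts, and wait_seconds are hypothetical names, and client is expected to be an ObsClient instance (or any object with a compatible downloadFile method). It handles both a raised exception and a response whose status is 300 or higher.

```python
import time


def download_with_retries(client, bucket_name, object_key, local_path,
                          max_attempts=3, wait_seconds=1.0, **options):
    """Call client.downloadFile until it succeeds or attempts run out.

    Hypothetical helper. Extra keyword arguments (partSize, taskNum,
    checkpointFile, ...) are passed through to downloadFile unchanged.
    """
    # Resuming only works when the checkpoint file is enabled.
    options.setdefault("enableCheckpoint", True)
    last_error = None
    resp = None
    for attempt in range(1, max_attempts + 1):
        try:
            resp = client.downloadFile(bucketName=bucket_name,
                                       objectKey=object_key,
                                       downloadFile=local_path,
                                       **options)
            if resp.status < 300:
                return resp
            last_error = None
        except Exception as exc:  # for example, a dropped connection
            last_error = exc
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    if last_error is not None:
        raise last_error
    return resp
```

A call might look like download_with_retries(obsClient, "examplebucket", "objectname", "localfile", partSize=10 * 1024 * 1024, taskNum=5). Keeping the same checkpointFile across attempts is what lets a later call skip the parts recorded as finished.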

Request Parameters

Table 1 List of request parameters

Parameter

Type

Mandatory (Yes/No)

Description

bucketName

str

Yes

Explanation:

Bucket name

Restrictions:

  • A bucket name must be unique across all accounts and regions.
  • A bucket name:
    • Must be 3 to 63 characters long and start with a digit or letter. Lowercase letters, digits, hyphens (-), and periods (.) are allowed.
    • Cannot be formatted as an IP address.
    • Cannot start or end with a hyphen (-) or period (.).
    • Cannot contain two consecutive periods (..), for example, my..bucket.
    • Cannot contain periods (.) and hyphens (-) adjacent to each other, for example, my-.bucket or my.-bucket.
  • If you repeatedly create buckets of the same name in the same region, no error will be reported and the bucket properties comply with those set in the first creation request.

Default value:

None

objectKey

str

Yes

Explanation:

Object name. An object is uniquely identified by an object name in a bucket. An object name is a complete path that does not contain the bucket name.

For example, if the address for accessing the object is examplebucket.obs.ap-southeast-1.myhuaweicloud.com/folder/test.txt, the object name is folder/test.txt.

Value range:

The value must contain 1 to 1,024 characters.

Default value:

None

downloadFile

str

Yes

Explanation:

Full local path for saving the file to be downloaded

Default value:

None

partSize

int

No

Explanation:

Part size

Value range:

The value must be greater than 0 but less than the object size, in bytes.

Default value:

5 MB

taskNum

int

No

Explanation:

Maximum number of parts that can be downloaded concurrently in a multipart download

Value range:

The value must be greater than 0 but not exceed the result of the file size divided by the part size (rounded up).

Default value:

1, indicating concurrent downloads are not used.

enableCheckpoint

bool

No

Explanation:

Whether to enable the resumable download mode

Value range:

True: The resumable download mode is enabled.

False: The resumable download mode is disabled.

Default value:

False

checkpointFile

str

No

Explanation:

Path of the file generated for recording the progress of a resumable download. The file contains the information about parts and the download progress.

Restrictions:

This parameter is valid only for resumable downloads, that is, when enableCheckpoint is True.

Default value:

If this parameter is left blank, the checkpoint file will be saved in the current directory.

header

GetObjectHeader

No

Explanation:

Headers in the request used for obtaining the storage class, redundancy policy, and other basic information about the object

Value range:

See Table 2.

Default value:

None

versionId

str

No

Explanation:

Object version ID, for example, G001117FCE89978B0000401205D5DC9

Value range:

The value must contain 32 characters.

Default value:

None. If this parameter is left blank, the latest version of the object is obtained.

progressCallback

callable

No

Explanation:

Callback function for obtaining the download progress

Default value:

None

NOTE:

This function contains the following parameters in sequence: number of downloaded bytes, total number of bytes, and used time (in seconds). For details about the sample code, see Downloading an Object - Obtaining the Download Progress (SDK for Python).
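
As a minimal sketch of a progressCallback, the function below accepts the three positional arguments described in the note above (downloaded bytes, total bytes, and used time in seconds) and prints a one-line summary. The names format_progress and progress_callback are hypothetical, and the sketch assumes the SDK passes plain numbers.

```python
def format_progress(transferred_bytes, total_bytes, seconds_used):
    """Build a human-readable progress line from the callback arguments."""
    percent = transferred_bytes * 100.0 / total_bytes if total_bytes else 0.0
    speed_kib = transferred_bytes / 1024.0 / seconds_used if seconds_used else 0.0
    return "%.1f%% downloaded, %.1f KiB/s" % (percent, speed_kib)


def progress_callback(transferred_bytes, total_bytes, seconds_used):
    """Callback with the argument order described for progressCallback."""
    print(format_progress(transferred_bytes, total_bytes, seconds_used))


# Simulated invocation: 512 KiB of a 1 MiB object after 2 seconds.
progress_callback(512 * 1024, 1024 * 1024, 2)  # 50.0% downloaded, 256.0 KiB/s
```

The function would be passed as progressCallback=progress_callback when calling downloadFile.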

Table 2 GetObjectHeader

Parameter

Type

Mandatory (Yes/No)

Description

range

str

No

Explanation:

Download range. For example, 0-999 indicates the first 1,000 bytes of the object (byte offsets 0 through 999).

Value range:

0 to the object length minus 1. Format: x-y, indicating the range from byte x+1 to byte y+1

Restrictions:

The upper limit of range is the length of the object minus 1. If the specified value exceeds this limit, the length of the object minus 1 is used.

Default value:

None

if_match

str

No

Explanation:

Preset ETag. If the ETag of the object to be downloaded is the same as the preset ETag, the object is returned. Otherwise, an error is returned.

Value range:

The value must contain 32 characters.

Default value:

None

if_none_match

str

No

Explanation:

Preset ETag. If the ETag of the object to be downloaded is different from the preset ETag, the object is returned. Otherwise, an error is returned.

Value range:

The value must contain 32 characters.

Default value:

None

if_modified_since

str

or

DateTime

No

Explanation:

The object is returned if it has been modified since the specified time; otherwise, an error is returned.

Restrictions:

The value must be in the GMT format. For example, Wed, 25 Mar 2020 02:39:52 GMT. You can refer to Table 3 to specify time.

For example, DateTime(year=2023, month=9, day=12)

Default value:

None

if_unmodified_since

str

or

DateTime

No

Explanation:

The object is returned if it has not been modified since the specified time; otherwise, an error is returned.

Restrictions:

The value must be in the GMT format. For example, Wed, 25 Mar 2020 02:39:52 GMT. You can refer to Table 3 to specify time.

For example, DateTime(year=2023, month=9, day=12)

Default value:

None

origin

str

No

Explanation:

Origin of the cross-domain request specified by the preflight request. Generally, it is a domain name.

Restrictions:

Each origin can contain only one wildcard character (*).

Default value:

None

requestHeaders

str

No

Explanation:

HTTP headers in a cross-origin request. Only CORS requests matching the allowed headers are valid.

Restrictions:

Each header can contain only one wildcard character (*). Spaces, ampersands (&), colons (:), and less-than signs (<) are not allowed.

Default value:

None

sseHeader

SseCHeader

No

Explanation:

Server-side decryption headers. For details, see Table 4.

Restrictions:

If the object uploaded to the server is encrypted on the server using the encryption key provided by the client, downloading the object requires including the encryption key in the message.

Default value:

None
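
The range and time formats in Table 2 can be produced with the standard library. The sketch below is illustrative only: byte_range and gmt_string are hypothetical helpers, and the clamping in byte_range simply mirrors the documented rule that an upper limit beyond the object length minus 1 is replaced by that limit.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def byte_range(first, last, object_length):
    """Return an 'x-y' range string, clamping y to object_length - 1."""
    if not 0 <= first < object_length:
        raise ValueError("first must be within the object")
    if last < first:
        raise ValueError("last must not be smaller than first")
    return "%d-%d" % (first, min(last, object_length - 1))


def gmt_string(moment):
    """Format a timezone-aware datetime in the GMT format used by the headers."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


print(byte_range(0, 999, 5000))  # 0-999
print(byte_range(0, 999, 500))   # 0-499
print(gmt_string(datetime(2020, 3, 25, 2, 39, 52, tzinfo=timezone.utc)))
# Wed, 25 Mar 2020 02:39:52 GMT
```

The resulting strings are the kind of values that range, if_modified_since, and if_unmodified_since accept as str.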

Table 3 DateTime

Parameter

Type

Description

year

int

Explanation:

Year in UTC

Default value:

None

month

int

Explanation:

Month in UTC

Default value:

None

day

int

Explanation:

Day in UTC

Default value:

None

hour

int

Explanation:

Hour in UTC

Restrictions:

The value is in 24-hour format.

Default value:

0

min

int

Explanation:

Minute in UTC

Default value:

0

sec

int

Explanation:

Second in UTC

Default value:

0

Table 4 SseCHeader

Parameter

Type

Mandatory (Yes/No)

Description

encryption

str

Yes

Explanation:

SSE-C used for encrypting objects

Value range:

AES256

Default value:

None

key

str

Yes

Explanation:

Key used in SSE-C encryption. It corresponds to the encryption method. For example, if encryption is set to AES256, the key is calculated using the AES-256 algorithm.

Value range:

The value must contain 32 characters.

Default value:

None
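
When SSE-C is used, the response reports an MD5 value of the key (sseCKeyMd5 in Table 8) that is described as MD5-processed and then Base64-encoded. The sketch below shows one way to compute such a value locally for comparison. It assumes the digest is taken over the raw key bytes, and sse_c_key_md5 is a hypothetical helper, not an SDK function.

```python
import base64
import hashlib


def sse_c_key_md5(key):
    """Return the Base64-encoded MD5 digest of a 32-character SSE-C key."""
    raw = key.encode("utf-8") if isinstance(key, str) else bytes(key)
    if len(raw) != 32:
        raise ValueError("an AES256 SSE-C key must be 32 bytes long")
    return base64.b64encode(hashlib.md5(raw).digest()).decode("ascii")


# Placeholder key for demonstration only; never hard-code a real key.
demo_key = "k" * 32
print(len(sse_c_key_md5(demo_key)))  # 24
```

A 16-byte MD5 digest always encodes to 24 Base64 characters, which matches the shape of the example value given for sseCKeyMd5.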

Responses

Table 5 List of returned results

Type

Description

GetResult

Explanation:

SDK common results

Table 6 GetResult

Parameter

Type

Description

status

int

Explanation:

HTTP status code

Value range:

A status code is a group of digits ranging from 2xx (indicating successes) to 4xx or 5xx (indicating errors). It indicates the status of a response. For more information, see Status Code.

Default value:

None

reason

str

Explanation:

Reason description.

Default value:

None

errorCode

str

Explanation:

Error code returned by the OBS server. If the value of status is less than 300, this parameter is left blank.

Default value:

None

errorMessage

str

Explanation:

Error message returned by the OBS server. If the value of status is less than 300, this parameter is left blank.

Default value:

None

requestId

str

Explanation:

Request ID returned by the OBS server

Default value:

None

indicator

str

Explanation:

Error indicator returned by the OBS server.

Default value:

None

hostId

str

Explanation:

Requested server ID. If the value of status is less than 300, this parameter is left blank.

Default value:

None

resource

str

Explanation:

Error source (a bucket or an object). If the value of status is less than 300, this parameter is left blank.

Default value:

None

header

list

Explanation:

Response header list, composed of tuples. Each tuple consists of two elements, respectively corresponding to the key and value of a response header.

Default value:

None

body

object

Explanation:

Result content returned after the operation is successful. If the value of status is 300 or larger, this parameter is left blank. The value varies with the API being called. For details, see Bucket-Related APIs (SDK for Python) and Object-Related APIs (SDK for Python).

Default value:

None
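
The fields in Table 6 are typically read in two branches, depending on whether status is less than 300. The following sketch condenses a result into a dictionary. summarize_result is a hypothetical helper, and the stand-in response object is built with SimpleNamespace using made-up values purely so that the example runs without contacting OBS.

```python
from types import SimpleNamespace


def summarize_result(resp):
    """Condense a GetResult-like object into a small dictionary."""
    # header is a list of (key, value) tuples, so dict() converts it directly.
    headers = dict(resp.header or [])
    if resp.status < 300:
        return {"ok": True, "requestId": resp.requestId, "headers": headers}
    return {
        "ok": False,
        "requestId": resp.requestId,
        "errorCode": resp.errorCode,
        "errorMessage": resp.errorMessage,
    }


# Stand-in for a failed response; all values are invented for illustration.
fake_resp = SimpleNamespace(
    status=404,
    reason="Not Found",
    errorCode="NoSuchKey",
    errorMessage="The specified key does not exist.",
    requestId="example-request-id",
    header=[("content-type", "application/xml")],
)
print(summarize_result(fake_resp)["errorCode"])  # NoSuchKey
```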

Table 7 GetResult.body

GetResult.body Type

Description

GetObjectMetadataResponse

Explanation:

For details, see Table 8.

Table 8 GetObjectMetadataResponse

Parameter

Type

Description

storageClass

str

Explanation:

Object storage class.

Value range:

  • If the storage class is Standard, leave this parameter blank.
  • For details about the available storage classes, see Table 9.

Default value:

None

accessContorlAllowOrigin

str

Explanation:

If Origin in the request meets the CORS rules of the bucket, AllowedOrigin specified in the CORS rules is returned. AllowedOrigin indicates the origin from which the requests can access the bucket.

Restrictions:

Domain name of the origin. Each origin can contain only one wildcard character (*), for example, https://*.vbs.example.com.

Default value:

None

accessContorlAllowHeaders

str

Explanation:

If RequestHeader in the request meets the CORS rules of the bucket, AllowedHeader specified in the CORS rules is returned. AllowedHeader indicates the allowed headers for cross-origin requests. Only CORS requests matching the allowed headers are valid.

Restrictions:

Each header can contain only one wildcard character (*). Spaces, ampersands (&), colons (:), and less-than signs (<) are not allowed.

Default value:

None

accessContorlAllowMethods

str

Explanation:

AllowedMethod in the CORS rules of the bucket. It specifies the HTTP method of cross-origin requests, that is, the operation type of buckets and objects.

Value range:

The following HTTP methods are supported:

  • GET
  • PUT
  • HEAD
  • POST
  • DELETE

Default value:

None

accessContorlExposeHeaders

str

Explanation:

ExposeHeader in the CORS rules of the bucket. It specifies the CORS-allowed additional headers in the response. These headers provide additional information to clients. By default, your browser can only access headers Content-Length and Content-Type. If your browser needs to access other headers, add them to a list of the allowed additional headers.

Restrictions:

Spaces, wildcard characters (*), ampersands (&), colons (:), and less-than signs (<) are not allowed.

Default value:

None

accessContorlMaxAge

int

Explanation:

MaxAgeSeconds in the CORS rules of the bucket. It specifies the time your client can cache the response for a cross-origin request.

Restrictions:

Each CORS rule can contain only one MaxAgeSeconds.

Value range:

An integer greater than or equal to 0, in seconds

Default value:

100

contentLength

int

Explanation:

Object size

Value range:

The value ranges from 0 TB to 48.8 TB, in bytes.

Default value:

None

contentType

str

Explanation:

MIME type of the object. MIME type is a standard way of describing a data type and is used by the browser to decide how to display data.

Value range:

See What Is Content-Type (MIME)? (Python SDK)

Default value:

None

lastModified

str

Explanation:

Time when the last modification was made to the object

Restrictions:

The time must be in the GMT format, for example, Wed, 25 Mar 2020 02:39:52 GMT.

Default value:

None

etag

str

Explanation:

Base64-encoded, 128-bit MD5 value of an object. ETag is the unique identifier of the object contents and is used to determine whether the contents of an object are changed. For example, if the ETag value is A when an object is uploaded and is B when the object is downloaded, this indicates the contents of the object are changed. The ETag reflects changes only to the contents of an object, not its metadata. Objects created by the upload and copy operations have unique ETags after being encrypted using MD5.

Restrictions:

If an object is encrypted using server-side encryption, the ETag is not the MD5 value of the object.

Value range:

The value must contain 32 characters.

Default value:

None

crc64

str

Explanation:

A 64-bit CRC value calculated based on the ECMA-182 standard. It uniquely identifies an object and can be used to check the integrity of the object content. If an object has different CRC64 values when being uploaded and downloaded, its content has been changed. CRC64 reflects changes to the contents of the object, not its metadata.

Restrictions:

  • This parameter is returned when the CRC64 value was verified for the uploaded object, or when the CRC64 feature has been enabled for the bucket.
  • This parameter is not supported for POSIX or SFS objects.

Value range:

A 64-bit CRC value calculated based on the ECMA-182 standard.

Default value:

None

versionId

str

Explanation:

Object version ID.

Value range:

The value must contain 32 characters.

Default value:

None

restore

str

Explanation:

Restore status of an object. This header is returned when an Archive object is being restored or has been restored.

For example, ongoing-request="true" indicates that the object is being restored. ongoing-request="false", expiry-date="Wed, 7 Nov 2012 00:00:00 GMT" indicates that the object has been restored. expiry-date indicates when the restored object expires.

Restrictions:

This parameter is only available for Archive objects.

Default value:

None

expiration

str

Explanation:

Expiration details. Example: "expiry-date=\"Mon, 11 Sep 2023 00:00:00 GMT\""

Default value:

None

sseKms

str

Explanation:

SSE-KMS is used for encrypting objects on the server side.

Value range:

kms

Default value:

None

sseKmsKey

str

Explanation:

ID of the KMS master key when SSE-KMS is used

Value range:

Valid value formats are as follows:

  1. regionID:domainID:key/key_id
  2. key_id

In the preceding formats:

Default value:

  • If this parameter is not specified, the default master key will be used.
  • If there is no such a default master key, the system will create one and use it by default.

sseC

str

Explanation:

SSE-C algorithm

Value range:

AES256

Default value:

None

sseCKeyMd5

str

Explanation:

MD5 value of the key for encrypting objects when SSE-C is used. This value is used to check whether any error occurs during the transmission of the key.

Restrictions:

The value is encrypted by MD5 and then encoded by Base64, for example, 4XvB3tbNTN+tIEVa0/fGaQ==.

Default value:

None

websiteRedirectLocation

str

Explanation:

If the bucket is configured with website hosting, the request for obtaining the object can be redirected to another object in the bucket or an external URL. This parameter specifies the address the request for the object is redirected to.

The request is redirected to object anotherPage.html in the same bucket:

WebsiteRedirectLocation:/anotherPage.html

The request is redirected to an external URL http://www.example.com/:

WebsiteRedirectLocation:http://www.example.com/

OBS obtains the specified value from the header and stores it in the object metadata WebsiteRedirectLocation.

Restrictions:

  • The value must start with a slash (/), http://, or https:// and cannot exceed 2 KB.
  • OBS only supports redirection for objects in the root directory of a bucket.

Default value:

None

isAppendable

bool

Explanation:

Whether the object is appendable

Value range:

True: The object is appendable.

False: The object is not appendable.

Default value:

None

nextPosition

int

Explanation:

Start position for next appending

Value range:

0 to the object length, in bytes.

Default value:

None
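
The restore value in Table 8 is a string of key="value" pairs, so it can be split with a short regular expression. This is a minimal sketch that assumes the format shown in the restore examples above; parse_restore is a hypothetical helper.

```python
import re

# Matches pairs such as ongoing-request="false" or expiry-date="...".
_FIELD = re.compile(r'([\w-]+)="([^"]*)"')


def parse_restore(value):
    """Split a restore string into an ongoing flag and an expiry date."""
    fields = dict(_FIELD.findall(value or ""))
    return {
        "ongoing": fields.get("ongoing-request") == "true",
        "expiry_date": fields.get("expiry-date"),
    }


sample = 'ongoing-request="false", expiry-date="Wed, 7 Nov 2012 00:00:00 GMT"'
print(parse_restore(sample))
# {'ongoing': False, 'expiry_date': 'Wed, 7 Nov 2012 00:00:00 GMT'}
```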

Table 9 StorageClass

Parameter

Type

Description

STANDARD

Standard storage class

Explanation:

Features low access latency and high throughput and is used for storing massive, frequently accessed (multiple times a month) or small objects (< 1 MB) requiring quick response.

WARM

Infrequent Access storage class

Explanation:

Used for storing data that is semi-frequently accessed (fewer than 12 times a year) but is instantly available when needed.

COLD

Archive storage class

Explanation:

Used for storing rarely accessed (once a year) data.

Code Examples

This example downloads object objectname from bucket examplebucket using resumable download.

from obs import ObsClient
import os
import traceback

# Obtain an AK and SK pair using environment variables or import the AK and SK pair in other ways. Using hard coding may result in leakage.
# Obtain an AK and SK pair on the management console. For details, see https://support.huaweicloud.com/intl/en-us/usermanual-ca/ca_01_0003.html.
ak = os.getenv("AccessKeyID")
sk = os.getenv("SecretAccessKey")
# (Optional) If you use a temporary AK and SK pair and a security token to access OBS, obtain them from environment variables.
security_token = os.getenv("SecurityToken")
# Set server to the endpoint corresponding to the bucket. CN-Hong Kong is used here as an example. Replace it with the one in use.
server = "https://obs.ap-southeast-1.myhuaweicloud.com"

# Create an obsClient instance.
# If you use a temporary AK and SK pair and a security token to access OBS, you must specify security_token when creating an instance.
# security_token is None when the SecurityToken environment variable is not set, which corresponds to access with a permanent AK and SK pair.
obsClient = ObsClient(access_key_id=ak, secret_access_key=sk, security_token=security_token, server=server)
try:
    bucketName = "examplebucket"
    objectKey = "objectname"
    # Specify the full path to which objects are downloaded. The full path contains the local file name.
    downloadFile = 'localfile'
    # Specify the number of parts that can be concurrently downloaded.
    taskNum = 5
    # Specify the part size.
    partSize = 10 * 1024 * 1024
    # Enable the resumable download by setting enableCheckpoint to True.
    enableCheckpoint = True
    # Download the object using resumable download.
    resp = obsClient.downloadFile(bucketName, objectKey, downloadFile, partSize, taskNum, enableCheckpoint)

    # If status code 2xx is returned, the API is called successfully. Otherwise, the API call fails.
    if resp.status < 300:
        print('Download File Succeeded')
        print('requestId:', resp.requestId)
    else:
        print('Download File Failed')
        print('requestId:', resp.requestId)
        print('errorCode:', resp.errorCode)
        print('errorMessage:', resp.errorMessage)
except Exception:
    print('Download File Failed')
    print(traceback.format_exc())