Bucket Inventory

Overview

OBS provides bucket inventories to help you manage the objects in a bucket. A bucket inventory periodically lists the objects in a bucket, saves their metadata in CSV files, and uploads those files to a bucket that you specify.

When configuring a bucket inventory, you can filter objects by object name prefix, set the inventory generation interval (daily or weekly), and choose whether to list all versions of objects. You can also specify which object metadata the inventory lists, including the object size, modification time, and storage class.

You can configure multiple inventories for a bucket, but the object name prefixes specified in these inventories cannot overlap. For example, if one inventory filters objects by the name prefix a, another inventory for the same bucket cannot use the name prefix ab as its filter criterion.
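The overlap rule can be checked before a new inventory is submitted. The following is a minimal sketch, assuming that two prefixes overlap when one of them is a prefix of the other, as in the a and ab example. The function names are hypothetical and are not part of any OBS SDK.

```python
def prefixes_overlap(a: str, b: str) -> bool:
    """Return True if one prefix is a prefix of the other (for example, a and ab)."""
    return a.startswith(b) or b.startswith(a)


def find_conflicts(existing: list, candidate: str) -> list:
    """Return the already-configured prefixes that overlap with the candidate."""
    return [p for p in existing if prefixes_overlap(p, candidate)]


if __name__ == "__main__":
    # "a" is already configured, so "ab" is rejected while "logs/" is fine.
    print(find_conflicts(["a", "data/"], "ab"))
    print(find_conflicts(["a", "data/"], "logs/"))
```

Under this assumption, an empty prefix overlaps with every other prefix, because it matches all objects in the bucket.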

Use Restrictions

  • Only OBS 3.0 buckets support this function. However, this restriction does not apply to the destination bucket where inventory files are saved.
  • The source bucket for which an inventory is configured and the destination bucket where inventory files are saved must belong to the same tenant and reside in the same region.
  • Inventory files are in CSV format.
  • A bucket with KMS encryption enabled cannot be used as the destination bucket for saving inventory files.
  • Inventory files are delivered to the destination bucket by an OBS system user. Therefore, you need to grant the system user the permission to write to the destination bucket.

How to Configure a Bucket Inventory

Before configuring an inventory, you need to understand what a source bucket and a destination bucket are.
  • Source bucket: A source bucket is the bucket for which an inventory is configured. The inventory lists objects stored in the source bucket.
  • Destination bucket: A destination bucket is where generated inventory files are stored. A source bucket can also be the destination bucket. You can specify a name prefix for an inventory. Generated inventory files are then named with the prefix and saved in the directory with that prefix. If you do not specify a name prefix for the inventory, the generated inventory files are stored in the root directory of the bucket.
    • Restrictions on the destination bucket
      • The destination bucket and source bucket must belong to the same tenant.
      • The destination bucket and source bucket must be in the same region.
      • The policy of the destination bucket must grant the OBS system users the permission to write objects to the bucket. For details about how to grant this permission, see step 1 in Configuring a Bucket Inventory.
    • The destination bucket contains the following files:
      • A list of inventory files
      • The Manifest file, which contains the list of all inventory files under a certain inventory configuration. For details about the Manifest file, see Manifest File.

Configuring a Bucket Inventory

You can use OBS Console or call the API to configure a bucket inventory. If you configure a bucket inventory on OBS Console, a bucket policy with the required permissions is automatically generated for the destination bucket. If you call the API to configure the bucket inventory, you need to manually configure the bucket policy for the destination bucket.

  1. Add a bucket policy for the destination bucket.

    A bucket policy must be configured for the destination bucket, to grant the OBS system users the permission to write objects to the destination bucket. The format of the bucket policy is as follows. Replace destbucket with the actual name of the destination bucket.

    {
        "Statement": [
            {
                "Effect": "Allow",
                "Sid": "1",
                "Principal": {"Service": "obs"},
                "Resource": ["destbucket/*"],
                "Action": ["PutObject"]
            }
        ]
    }
  2. Configure a bucket inventory.

    For details about how to configure a bucket inventory by calling the API, see the descriptions about configuring bucket inventories in the OBS API Reference.
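When the API is used, the policy in step 1 has to be produced for each destination bucket. The following is a minimal sketch that fills the destination bucket name into the policy format shown above. The function name build_inventory_policy is hypothetical, and submitting the policy to OBS is out of scope here.

```python
import json


def build_inventory_policy(dest_bucket: str) -> str:
    """Return the bucket policy JSON that lets the OBS service write inventory files."""
    policy = {
        "Statement": [
            {
                "Effect": "Allow",
                "Sid": "1",
                "Principal": {"Service": "obs"},
                # Every object under the destination bucket.
                "Resource": [f"{dest_bucket}/*"],
                "Action": ["PutObject"],
            }
        ]
    }
    return json.dumps(policy, indent=4)


if __name__ == "__main__":
    print(build_inventory_policy("bucket001"))
```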

Content in an Inventory File

The content of an inventory file can be configured when you create the inventory. For details about all possible fields, see Table 1.

Table 1 Object metadata listed in an inventory file

Metadata | Description
Bucket | Name of the source bucket.
Key | Name of an object. Each object in a bucket has a unique key. Object names in the inventory file are URL-encoded using the UTF-8 character set and must be decoded before use.
VersionId | Version ID of an object. If the value of IncludedObjectVersions in the inventory configuration is Current, this field is not included in the inventory file.
IsLatest | If the object version is the latest, this field is True. If the value of IncludedObjectVersions in the inventory configuration is Current, this field is not included in the inventory file.
IsDeleteMarker | When versioning is enabled for the source bucket and an object is deleted, a new metadata record is generated for the object with IsDeleteMarker set to true. If the value of IncludedObjectVersions in the inventory configuration is Current, this field is not included in the inventory file.
Size | Object size, in bytes.
LastModifiedDate | Object creation date or last modification date.
ETag | Hexadecimal MD5 digest of the object. The ETag uniquely identifies the object content and can be used to check whether the content has changed. For example, if the ETag is A when an object is uploaded and B when the object is downloaded, the object content has changed.
StorageClass | Storage class of an object.
IsMultipartUploaded | Indicates whether an object was uploaded in multipart mode.
ReplicationStatus | Cross-region replication status of an object.
EncryptionStatus | Encryption status of an object.
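Because the Key field is URL-encoded, inventory rows need to be decoded before the object names are used. The following is a minimal sketch, assuming that the CSV file has no header row, that its columns follow the order of the fileSchema string in the manifest file, and that percent-encoding is used. The function name read_inventory is hypothetical.

```python
import csv
import io
from urllib.parse import unquote


def read_inventory(csv_text: str, file_schema: str):
    """Yield one dict per inventory row, with Key decoded and Size as an int."""
    # Assumption: columns appear in the same order as in fileSchema.
    columns = [c.strip() for c in file_schema.split(",")]
    for row in csv.reader(io.StringIO(csv_text)):
        record = dict(zip(columns, row))
        # Object names are URL-encoded with UTF-8 and must be decoded.
        record["Key"] = unquote(record["Key"], encoding="utf-8")
        if record.get("Size"):
            record["Size"] = int(record["Size"])
        yield record


if __name__ == "__main__":
    sample = "user001,photos/caf%C3%A9.jpg,1024\n"
    for rec in read_inventory(sample, "Bucket,Key,Size"):
        print(rec)
```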

Inventory File Name

The name of an inventory file is in the following format:

destinationPrefix/sourceBucketName/inventoryId/yyyy-MM-dd'T'HH-mm'Z'/files/UUID_index.csv
  • destinationPrefix: The inventory file name prefix configured when the inventory rule is created. Inventory files generated under the rule are named with this prefix, which makes them easier to classify. If no prefix is specified, the default prefix is BucketInventory.
  • sourceBucketName: Name of the source bucket for which an inventory is configured. This field can be used to differentiate inventory files of different source buckets, if those inventory files are saved in the same destination bucket.
  • inventoryId: If a source bucket has multiple inventory rules whose inventory files are saved in the same destination bucket, this field can be used to identify different inventory rules.
  • yyyy-MM-dd'T'HH-mm'Z': Date and time when the scan of the source bucket started for generating the inventory file. Objects uploaded to the source bucket after this time may not be listed in the inventory file.
  • UUID_index.csv: One of the inventory files.
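The name format above can be split back into its fields when inventory files are processed. The following is a minimal sketch, assuming that the trailing Z denotes UTC and that destinationPrefix may itself contain slashes, so the name is parsed from the right. The function name parse_inventory_key is hypothetical.

```python
from datetime import datetime, timezone


def parse_inventory_key(key: str) -> dict:
    """Split an inventory file name into the fields described above."""
    parts = key.split("/")
    if len(parts) < 6 or parts[-2] != "files":
        raise ValueError(f"not an inventory file name: {key}")
    # Assumption: the 'Z' suffix means the scan start time is in UTC.
    scan_start = datetime.strptime(parts[-3], "%Y-%m-%dT%H-%MZ")
    return {
        "destination_prefix": "/".join(parts[:-5]),
        "source_bucket": parts[-5],
        "inventory_id": parts[-4],
        "scan_start": scan_start.replace(tzinfo=timezone.utc),
        "file_name": parts[-1],
    }


if __name__ == "__main__":
    name = ("inventory/user001/test_id/2019-01-03T12-28Z/files/"
            "0000016813AF58E66806C1E2D7F15155_1.csv")
    print(parse_inventory_key(name))
```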

Manifest File

If a source bucket contains a large number of objects, multiple inventory files may be generated under one inventory rule. After all inventory files are generated, a manifest.json file is generated that summarizes information about them. The manifest file contains the following fields:

  • sourceBucket: name of the source bucket
  • destinationBucket: name of the destination bucket
  • version: version of the inventory
  • fileFormat: format of inventory files
  • fileSchema: object metadata fields contained in the inventory files
  • files: list of all inventory files
    • key: inventory file name
    • size: size of an inventory file, in bytes
    • inventoriedRecord: number of records contained in an inventory file
The following is an example of a simple manifest.json file.
{
        "sourceBucket":"user001",
        "destinationBucket":"bucket001",
        "version":"2019-01-03",
        "fileFormat":"CSV",
        "fileSchema":"Bucket,Key,Size,LastModifiedDate,ETag,StorageClass,IsMultipartUploaded,ReplicationStatus,EncryptionStatus",
        "files":[
                {
                        "key":"inventory/user001/test_id/2019-01-03T12-28Z/files/0000016813AF58E66806C1E2D7F15155_1.csv",
                        "size":6705647390,
                        "inventoriedRecord":70585762
                }
        ]
}
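A consumer of the inventory typically reads the manifest first to find out which inventory files to download. The following is a minimal sketch that extracts the column names, the inventory file names, and the totals from a manifest with the fields listed above. The function name summarize_manifest is hypothetical, and downloading the files is not shown.

```python
import json


def summarize_manifest(manifest_text: str) -> dict:
    """Collect the column names, file keys, and totals from a manifest.json string."""
    manifest = json.loads(manifest_text)
    files = manifest["files"]
    return {
        "columns": manifest["fileSchema"].split(","),
        "file_keys": [f["key"] for f in files],
        "total_bytes": sum(f["size"] for f in files),
        "total_records": sum(f["inventoriedRecord"] for f in files),
    }


if __name__ == "__main__":
    text = json.dumps({
        "sourceBucket": "user001",
        "destinationBucket": "bucket001",
        "version": "2019-01-03",
        "fileFormat": "CSV",
        "fileSchema": "Bucket,Key,Size",
        "files": [{"key": "inventory/user001/test_id/2019-01-03T12-28Z/files/x_1.csv",
                   "size": 100, "inventoriedRecord": 3}],
    })
    print(summarize_manifest(text))
```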

The name of a manifest file is in the following format. The fields have the same meanings as in the inventory file name.

destinationPrefix/sourceBucketName/inventoryId/yyyy-MM-dd'T'HH-mm'Z'/manifest.json

How Is a User Notified When the Generation of Inventory Files Is Complete?

You can enable SMN (the message notification service) for the destination bucket. You then receive an SMS message or email each time inventory files and the manifest file are generated. For more information about SMN, see Event notification.

The following is a simple example of SMN configuration. The prefix filter takes the form destinationPrefix/sourceBucketName (shown as destination-prefix/source-bucket in the example), where destinationPrefix is the name prefix configured for inventory files and sourceBucketName is the name of the source bucket for which the inventory is configured. The suffix filter manifest.json matches the name of the manifest file.

<NotificationConfiguration>
  <TopicConfiguration>
    <Id>01</Id>
    <Filter>
      <Object>
        <FilterRule>
          <Name>prefix</Name>
          <Value>destination-prefix/source-bucket</Value>
        </FilterRule>
        <FilterRule>
          <Name>suffix</Name>
          <Value>manifest.json</Value>
        </FilterRule>
      </Object>
    </Filter>
    <Topic>urn:smn:southchina:11aa22bb:notification</Topic>
    <Event>ObjectCreated:Put</Event>
  </TopicConfiguration>
</NotificationConfiguration>
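To see which objects such a configuration would select, the two filter rules can be simulated locally. The following is a minimal sketch, assuming that a notification is triggered only when an object name matches both the prefix rule and the suffix rule. The function name matches_manifest_filter is hypothetical.

```python
def matches_manifest_filter(key: str, destination_prefix: str, source_bucket: str) -> bool:
    """Return True if the object name passes both the prefix and the suffix rule."""
    prefix = f"{destination_prefix}/{source_bucket}"
    return key.startswith(prefix) and key.endswith("manifest.json")


if __name__ == "__main__":
    base = "inventory/user001/test_id/2019-01-03T12-28Z"
    # The manifest file matches; an individual inventory file does not.
    print(matches_manifest_filter(f"{base}/manifest.json", "inventory", "user001"))
    print(matches_manifest_filter(f"{base}/files/x_1.csv", "inventory", "user001"))
```

Note that a prefix without a trailing slash, such as inventory/user001, also matches names that begin with inventory/user0012. Appending a slash to the prefix value avoids this if source bucket names share a common beginning.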