You can use the bucket inventory feature to export information about specific objects in a bucket, such as the number, size, storage class, and encryption status of the objects. To list a large number of objects, we recommend that you use the bucket inventory feature instead of calling the GetBucket (ListObjects) operation.
To maintain OSS-HDFS availability and prevent data contamination, do not set Inventory Path to .dlsdata/ when you create an inventory for a bucket for which OSS-HDFS is enabled.
Billing rules
The bucket inventory feature is billable. During the public preview, however, you are charged only for the storage of inventory lists, for calls to the PutBucketInventory, GetBucketInventory, ListBucketInventory, and DeleteBucketInventory operations, and for the traffic and requests that are generated when you access the inventory lists.
You are charged for the storage of the inventory lists. Object Storage Service (OSS) generates inventory lists based on the inventory. To prevent unnecessary costs, delete inventory lists that you no longer need.
Limits
You can configure up to 1,000 inventories for a bucket by using OSS SDKs or ossutil. You can configure up to 10 inventories for a bucket by using the OSS console.
The bucket for which you want to configure an inventory can be different from the bucket in which you want to store the generated inventory lists. However, the two buckets must belong to the same Alibaba Cloud account and be located in the same region.
Permissions
When you use the bucket inventory feature, you must configure an inventory for a bucket. Then, OSS assumes the RAM role that you created to write the generated inventory lists to the inventory storage bucket.
If you want to use the bucket inventory feature by using an Alibaba Cloud account, you must create a RAM role and grant permissions to the RAM role.
If you want to use the bucket inventory feature as a RAM user, you must grant the RAM user role-related permissions and the permissions to configure inventories, create a RAM role, and grant permissions to the RAM role.
You can perform the following steps to grant a RAM user role-related permissions and the permissions to configure inventories, create a RAM role, and grant permissions to the RAM role. Then, you can create an inventory. For more information, see Create an inventory.
High risks may arise when you grant a RAM user role-related permissions, such as the CreateRole and GetRoles permissions. We recommend that you create a RAM role and grant permissions to the RAM role by using the Alibaba Cloud account to which the RAM user belongs. Then, the RAM user can assume the RAM role that is created by the Alibaba Cloud account.
Grant a RAM user role-related permissions and the permissions to configure inventories
Perform the following steps to grant a RAM user role-related permissions and the permissions to configure inventories:
Create the following custom policy on the JSON tab. For more information, see Create custom policies.
Note: If you access OSS by using the OSS console, you must add oss:ListBuckets to the Action element in the policy. If you access OSS by using OSS SDKs or ossutil, you do not need to add oss:ListBuckets to the Action element in the policy.
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "oss:PutBucketInventory",
        "oss:GetBucketInventory",
        "oss:DeleteBucketInventory",
        "oss:ListBuckets",
        "ram:CreateRole",
        "ram:AttachPolicyToRole",
        "ram:GetRole",
        "ram:ListPoliciesForRole"
      ],
      "Resource": "*"
    }
  ],
  "Version": "1"
}
Important: The AliyunOSSFullAccess policy allows you to perform all operations on OSS resources. If the AliyunOSSFullAccess policy is attached to a RAM user that you created in your Alibaba Cloud account, you only need to attach the following policy to the RAM user to grant role-related permissions to the RAM user.
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ram:CreateRole",
        "ram:AttachPolicyToRole",
        "ram:GetRole",
        "ram:ListPoliciesForRole"
      ],
      "Resource": "*"
    }
  ],
  "Version": "1"
}
Attach the created custom policy to the RAM user. For more information, see Grant permissions to a RAM user.
After the RAM user is granted the required permissions, the RAM user can create a RAM role and grant permissions to the RAM role. For more information, see What to do next.
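The choice between the two policies above, and the rule about oss:ListBuckets, can be sketched as a small helper. This is an illustration only: the function name build_inventory_user_policy and its parameters are hypothetical and not part of any OSS or RAM SDK. The action names are copied from the policies in this topic.

```python
import json

# Actions copied from the custom policies shown in this topic.
OSS_INVENTORY_ACTIONS = [
    "oss:PutBucketInventory",
    "oss:GetBucketInventory",
    "oss:DeleteBucketInventory",
]
RAM_ROLE_ACTIONS = [
    "ram:CreateRole",
    "ram:AttachPolicyToRole",
    "ram:GetRole",
    "ram:ListPoliciesForRole",
]


def build_inventory_user_policy(use_console, has_oss_full_access=False):
    """Hypothetical helper: build the policy document for a RAM user.

    use_console: True if the RAM user works in the OSS console, which
        requires oss:ListBuckets.
    has_oss_full_access: True if AliyunOSSFullAccess is already attached,
        in which case only the role-related actions are needed.
    """
    actions = []
    if not has_oss_full_access:
        actions.extend(OSS_INVENTORY_ACTIONS)
        if use_console:
            actions.append("oss:ListBuckets")
    actions.extend(RAM_ROLE_ACTIONS)
    return {
        "Version": "1",
        "Statement": [{"Effect": "Allow", "Action": actions, "Resource": "*"}],
    }


if __name__ == "__main__":
    # Policy for a RAM user who only uses OSS SDKs or ossutil.
    print(json.dumps(build_inventory_user_policy(use_console=False), indent=2))
```

The printed JSON can be pasted on the JSON tab when you create the custom policy.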
Create a RAM role and grant permissions to the RAM role
Important: If you want to use a key that is managed by Key Management Service (KMS) to encrypt inventory lists, you must attach the AliyunKMSFullAccess policy to the created RAM role to grant the RAM role the permissions to manage KMS.
Grant permissions to the AliyunOSSRole RAM role that is automatically created
When you configure an inventory in the OSS console, the RAM console automatically creates the AliyunOSSRole RAM role. By default, the AliyunOSSRole RAM role has the permissions to write inventory lists to the inventory storage bucket.
Important: If you use the AliyunOSSRole RAM role, you do not need to grant permissions to the RAM role. However, security risks may arise because the RAM role has the permissions to manage OSS by default. You can create a custom RAM role and grant the least permissions to the role to comply with the principle of least privilege.
Create a custom RAM role and grant permissions to the role
If you want OSS to assume a custom RAM role to write inventory lists to the inventory storage bucket, perform the following steps to create a custom RAM role and attach a policy to the role:
Create a normal service role.
When you create a normal service role, select OSS for the Select Trusted Service parameter. For information about how to configure other parameters, see Create a RAM role for a trusted Alibaba Cloud service.
Create a custom policy on the Visual editor tab.
When you create a custom policy, configure the parameters. The following table describes the parameters. For information about how to configure other parameters, see Create custom policies.
Parameter | Description |
Effect | Select Allow. |
Service | Select Object Storage Service. |
Action | Select Select action(s) and then select oss:PutObject in Write actions. |
Resource | Select All resource(s)(*). |
The following code shows the policy in JSON format:
{ "Version": "1", "Statement": [ { "Effect": "Allow", "Action": "oss:PutObject", "Resource": "*" } ] }
Attach the custom policy to the RAM role.
For more information, see Grant permissions to a RAM role.
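If you want to go further toward least privilege, the same policy can be generated with a narrower Resource. The sketch below is an assumption-laden illustration: the helper name build_inventory_role_policy is hypothetical, and the resource pattern acs:oss:*:*:bucket/path follows the general RAM resource format for OSS rather than anything stated in this topic, so verify it against your own setup before use. With no arguments, the helper reproduces the policy shown above.

```python
import json


def build_inventory_role_policy(dest_bucket=None, prefix=""):
    """Hypothetical helper: policy that lets the role write inventory lists.

    dest_bucket: name of the inventory storage bucket. If omitted, the
        Resource is "*", which matches the policy in this topic.
    prefix: inventory list prefix configured for the inventory, if any.
    """
    if dest_bucket is None:
        resource = "*"
    else:
        normalized = prefix.strip("/")
        path = f"{normalized}/*" if normalized else "*"
        # Assumed ARN pattern: acs:oss:<region>:<account>:<bucket>/<object>
        resource = f"acs:oss:*:*:{dest_bucket}/{path}"
    return {
        "Version": "1",
        "Statement": [
            {"Effect": "Allow", "Action": "oss:PutObject", "Resource": resource}
        ],
    }


if __name__ == "__main__":
    # "destbucket" and "inventory/" are placeholder values.
    print(json.dumps(build_inventory_role_policy("destbucket", "inventory/"), indent=2))
```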
Procedure
Use the OSS console
Use OSS SDKs
Use ossutil
Use the OSS API
Inventory lists
After you configure an inventory for a bucket, OSS generates inventory lists at the specified time interval. The following structure shows the directories in which generated inventory lists are stored.
dest_bucket
└── destination-prefix/
    └── src_bucket/
        └── inventory_id/
            ├── YYYY-MM-DDTHH-MMZ/
            │   ├── manifest.json
            │   └── manifest.checksum
            └── data/
                └── 745a29e3-bfaa-490d-9109-47086afcc8f2.csv.gz
Directory | Description |
destination-prefix/ | This directory is generated based on the prefix specified when you configure an inventory. If no prefix is specified for inventory lists, this directory is omitted. |
src_bucket/ | This directory is generated based on the name of the bucket for which inventory lists are generated. |
inventory_id/ | This directory is generated based on the name of the inventory. |
YYYY-MM-DDTHH-MMZ/ | This directory indicates the start time when the bucket is scanned. The name of this directory is a timestamp in UTC. Example: 2020-05-17T16-00Z. The manifest.json and manifest.checksum objects are stored in this directory. |
data/ | This directory stores the inventory lists, which contain the list of objects in the source bucket and the metadata of the exported objects. Inventory lists are CSV objects that are compressed by using Gzip. |
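The directory layout described above can be turned into object keys with a few lines of code. The following Python sketch is illustrative: the function names are hypothetical, and it assumes that the timestamp directory is named with the pattern YYYY-MM-DDTHH-MMZ in UTC, as in the example 2020-05-17T16-00Z.

```python
from datetime import datetime, timezone


def inventory_root(src_bucket, inventory_id, prefix=""):
    """Return the key prefix under which one inventory writes its output.

    The destination prefix is omitted when no prefix is configured.
    """
    parts = [p for p in (prefix.strip("/"), src_bucket, inventory_id) if p]
    return "/".join(parts) + "/"


def data_prefix(src_bucket, inventory_id, prefix=""):
    """Return the key prefix of the data/ directory (the .csv.gz objects)."""
    return inventory_root(src_bucket, inventory_id, prefix) + "data/"


def manifest_keys(src_bucket, inventory_id, scan_start, prefix=""):
    """Return the keys of manifest.json and manifest.checksum for one scan.

    scan_start must be a timezone-aware datetime; it is converted to UTC.
    """
    stamp = scan_start.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%MZ")
    base = f"{inventory_root(src_bucket, inventory_id, prefix)}{stamp}/"
    return base + "manifest.json", base + "manifest.checksum"


if __name__ == "__main__":
    start = datetime(2020, 5, 17, 16, 0, tzinfo=timezone.utc)
    # "srcbucket", "inventory0124", and "reports/" are placeholder values.
    print(manifest_keys("srcbucket", "inventory0124", start, prefix="reports/"))
    print(data_prefix("srcbucket", "inventory0124", prefix="reports/"))
```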
After you configure an inventory for a bucket, the following objects are generated based on the inventory:
manifest objects
manifest objects include the manifest.json and manifest.checksum objects.
manifest.json: stores the metadata of inventory lists and related information.
{ "creationTimestamp": "1642994594", "destinationBucket": "destbucket", "fileFormat": "CSV", "fileSchema": "Bucket, Key, VersionId, IsLatest, IsDeleteMarker, Size, StorageClass, LastModifiedDate, ETag, IsMultipartUploaded, EncryptionStatus, ObjectAcl, TaggingCount, ObjectType, Crc64", "files": [{ "MD5checksum": "F77449179760C3B13F1E76110F07****", "key": "destbucket/inventory0124/data/a1574226-b5e5-40ee-91df-356845777c04.csv.gz", "size": 2046}], "sourceBucket": "srcbucket", "version": "2019-09-01"}
The following table describes the fields in the manifest.json object.
Field | Description |
creationTimestamp | The start time when the source bucket is scanned. The value of this field is a UNIX timestamp. |
destinationBucket | The bucket in which the inventory lists are stored. |
fileFormat | The format of the inventory lists. |
fileSchema | The fields in each inventory list. The fields are divided into fixed fields and optional fields. The sequence of the fixed fields is fixed. The sequence of the optional fields is determined by your selection sequence when you configure an inventory. We recommend that you parse the data columns in csv.gz based on the sequence of the fields in fileSchema. This prevents a mismatch between columns and attributes. If you select the current object version when you configure an inventory, the fixed fields Bucket, Key in fileSchema are listed first, followed by the optional fields in fileSchema. If you select all object versions when you configure an inventory, the fixed fields Bucket, Key, VersionId, IsLatest, IsDeleteMarker in fileSchema are listed first, followed by the optional fields in fileSchema. |
files | The name, size, and MD5 hash of each inventory list. The name of an inventory list contains the full path. |
sourceBucket | The source bucket for which the inventory is configured. |
version | The version of the inventory list. |
manifest.checksum: stores the MD5 hash of the manifest.json object. Example: 8420A430CBD6B659A1C0DFC1C11A****.
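The two manifest objects can be consumed together: first verify manifest.json against manifest.checksum, then read the column order from fileSchema. The following Python sketch works on bytes that you have already downloaded. The function names are hypothetical, and the comparison assumes that manifest.checksum holds the hexadecimal MD5 hash of the raw manifest.json bytes, as the uppercase example above suggests.

```python
import hashlib
import json


def md5_hex_upper(data):
    """Return the MD5 hash of data as an uppercase hexadecimal string."""
    return hashlib.md5(data).hexdigest().upper()


def manifest_matches_checksum(manifest_bytes, checksum_text):
    """Compare the MD5 of manifest.json with the content of manifest.checksum.

    Assumption: the checksum object contains the hex MD5 of the raw bytes.
    """
    return md5_hex_upper(manifest_bytes) == checksum_text.strip().upper()


def schema_columns(manifest):
    """Split fileSchema into an ordered list of column names."""
    return [name.strip() for name in manifest["fileSchema"].split(",")]


def load_manifest(manifest_bytes, checksum_text):
    """Parse manifest.json after checking it against manifest.checksum."""
    if not manifest_matches_checksum(manifest_bytes, checksum_text):
        raise ValueError("manifest.json does not match manifest.checksum")
    return json.loads(manifest_bytes)
```

Reading the column names from fileSchema, rather than hard-coding them, follows the recommendation in the table above and avoids a mismatch between columns and attributes.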
Inventory lists
Inventory lists contain the exported object information and are stored in the data/ directory. The following figure provides an example of an inventory list.
The sequence of the fields in the inventory list is determined by the sequence in which the fields are selected when you configure the inventory. The following table describes the fields in the preceding figure from left to right.
Field | Description |
Bucket | The name of the source bucket for which the inventory is configured. |
Key | The name of the object in the bucket. The object name is URL-encoded. You must decode the object name before you can view the name. |
VersionId | The version ID of the object. This field exists only when versioning is enabled for the bucket and the inventory specifies that all versions of the object are included in the inventory list. |
IsLatest | Specifies whether the version is the latest version. If the version is the latest version, this field is set to True. Otherwise, this field is set to False. This field exists only when versioning is enabled for the bucket and the inventory specifies that all versions of data are exported. |
IsDeleteMarker | Specifies whether the version is a delete marker. If the version is a delete marker, this field is set to True. Otherwise, this field is set to False. This field exists only when versioning is enabled for the bucket and the inventory specifies that all versions of the object are included in the inventory list. |
Size | The size of the object. |
StorageClass | The storage class of the object. |
LastModifiedDate | The time when the object was last modified. |
ETag | The ETag of the object. An ETag is generated when an object is created. The ETag is used to identify the content of the object. |
IsMultipartUploaded | Specifies whether the object is created by using multipart upload. If you create the object by using multipart upload, the value of this field is True. Otherwise, the value is False. |
EncryptionStatus | Specifies whether the object is encrypted. If the object is encrypted, the value of this field is True. Otherwise, the value is False. |
ObjectAcl | The access control list (ACL) of the object. For more information, see Object ACLs. |
TaggingCount | The number of tags of the object. |
ObjectType | The type of the object. For more information, see Object ACLs. |
Crc64 | The CRC-64 of the object. |
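To read an inventory list, decompress the .csv.gz object, map each value to its column name, and URL-decode the Key field. The following Python sketch shows one way to do this with the standard library. It is an illustration under stated assumptions: the function name is hypothetical, the list is assumed to be UTF-8 encoded without a header row, and the column names are expected to come from fileSchema in manifest.json.

```python
import csv
import gzip
import io
from urllib.parse import unquote


def read_inventory_rows(gz_bytes, columns):
    """Yield one dict per object described in a downloaded .csv.gz list.

    gz_bytes: raw content of an inventory list in the data/ directory.
    columns: ordered column names, for example parsed from fileSchema.
    """
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8", newline="") as fh:
        for values in csv.reader(fh):
            row = dict(zip(columns, values))
            if "Key" in row:
                # Object names are URL-encoded in inventory lists.
                row["Key"] = unquote(row["Key"])
            yield row


if __name__ == "__main__":
    # Made-up sample data, not taken from a real inventory list.
    sample = gzip.compress(b'"srcbucket","dir%2Ffile%20a.txt","1024","Standard"\n')
    for row in read_inventory_rows(sample, ["Bucket", "Key", "Size", "StorageClass"]):
        print(row)
```

All values are returned as strings, so convert fields such as Size to numbers where you need them.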
Usage notes
Suggestions on exporting inventory lists
You can export inventory lists on a daily or weekly basis.
Number of objects in a bucket | Export suggestions |
< 10 billion | Export bucket inventory lists on a daily or weekly basis based on your business requirements. |
10 billion to 50 billion | Export bucket inventory lists on a weekly basis. |
≥ 50 billion |
|
Traffic and bandwidth
To increase the speed at which inventory lists are exported, bucket-level and user-level bandwidth may be occupied when the inventory lists are exported to the inventory storage bucket. If the bucket for which you want to configure the inventory is frequently accessed and the available bandwidth of the bucket is limited, we recommend that you create a bucket to store the inventory lists.
Exceptions
If no objects are stored in the bucket for which the inventory is configured or no objects match the specified prefix in the inventory, no inventory lists are generated.
When you export the inventory lists, the exported inventory lists may not contain all objects in the source bucket because objects can be created, deleted, or overwritten during the export. If the time when an object was last modified is earlier than the time specified by the creationTimestamp field in the manifest.json object, the inventory lists contain information about the object. Otherwise, the inventory lists may not contain information about the object. We recommend that you check the object attributes by calling the HeadObject operation before you export information about an object. For more information, see HeadObject.
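The timestamp rule above can be expressed as a simple comparison. The following Python sketch is illustrative: the function names are hypothetical, and the caller is assumed to supply the last modified time of the object as a timezone-aware datetime, for example from a HeadObject response.

```python
from datetime import datetime, timezone


def scan_start_time(manifest):
    """Convert creationTimestamp (a UNIX timestamp string) to a UTC datetime."""
    return datetime.fromtimestamp(int(manifest["creationTimestamp"]), tz=timezone.utc)


def is_guaranteed_in_inventory(last_modified, manifest):
    """Return True if the object was last modified before the scan started.

    True means the inventory lists contain the object. False means the
    object may or may not be listed and should be checked separately.
    """
    return last_modified < scan_start_time(manifest)


if __name__ == "__main__":
    manifest = {"creationTimestamp": "1642994594"}  # value from the example manifest.json
    modified = datetime(2022, 1, 1, tzinfo=timezone.utc)  # made-up last modified time
    print(is_guaranteed_in_inventory(modified, manifest))
```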
Deletion of inventory lists
OSS continuously generates inventory lists based on the frequency specified by an inventory until the inventory is deleted. To prevent OSS from generating unnecessary inventory lists, delete inventories that you no longer need as soon as possible. You can also delete exported historical inventory lists that you no longer need.