Export a snapshot of all file and directory metadata from an OSS-HDFS-enabled bucket to a JSON file. Use the inventory for:
Storage auditing: Review file ownership, permissions, and storage policies across your data lake after a migration.
Cost analysis: Identify large or stale files and analyze storage policy distribution to reduce unnecessary costs.
Pipeline acceleration: Replace slow, synchronous List API calls with a pre-generated inventory file in batch processing jobs.
Prerequisites
Before you begin, make sure you have:
The latest version of JindoSDK. See JindoSDK download on GitHub.
An AccessKey pair configured to access the OSS-HDFS-enabled bucket. See Use Jindo command-line interface (CLI) commands to access OSS or OSS-HDFS.
Export the inventory
Log on to the OSS console.
In the left-side navigation pane, click Buckets. On the Buckets page, find and click the desired bucket.
In the left-side navigation pane, choose Data Lake > OSS-HDFS.
In the Object Metadata section, click Export.
Depending on the volume of metadata, the export can take several minutes to several hours. The inventory file is saved to:
oss://<hdfs_bucket>.<dls_endpoint>/.sysinfo/inventory/<unix_ms_timestamp>.<uuid>
For example:
oss://<hdfs_bucket>.<dls_endpoint>/.sysinfo/inventory/1666584461201.2ce40fba-5704-45c4-8720-d92a891d****
The filename consists of a Unix millisecond timestamp and a UUID, separated by a dot. Custom output paths are not supported.
Important: The .sysinfo/inventory/ directory cannot be deleted. Individual inventory files within the directory can be accessed and deleted.
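Because the filename encodes the export time, a script can work out when an inventory was generated without opening the file. The following is a minimal Python sketch, assuming the `<unix_ms_timestamp>.<uuid>` layout described above. The function name `parse_inventory_name` and the UUID in the example are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from uuid import UUID


def parse_inventory_name(name: str) -> tuple[datetime, UUID]:
    """Split '<unix_ms_timestamp>.<uuid>' into an export time (UTC) and a UUID."""
    ts_part, _, uuid_part = name.partition(".")
    if not ts_part.isdigit() or not uuid_part:
        raise ValueError(f"not an inventory filename: {name!r}")
    # Integer arithmetic keeps the millisecond part exact (no float rounding).
    seconds, millis = divmod(int(ts_part), 1000)
    exported_at = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)
    # UUID() raises ValueError if the second part is not a well-formed UUID.
    return exported_at, UUID(uuid_part)


# Hypothetical filename: the timestamp comes from the example path, the UUID is made up.
when, file_id = parse_inventory_name("1666584461201.123e4567-e89b-12d3-a456-426614174000")
print(when.isoformat(), file_id)
```

Sorting several inventory files by the parsed time is one way to pick the most recent export.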
Download and view the inventory file
Download the inventory file using the Jindo CLI:
jindo fs -get oss://<hdfs_bucket>.<dls_endpoint>/.sysinfo/inventory/1666584461201.2ce40fba-5704-45c4-8720-d92a891d**** /tmp/
Open the file with vi or vim. The following is a sample record:
{
  "id": 6289437267661375236,
  "path": "/user/spark/.sparkStaging/application_1767273535967_0001/__spark_libs__5660687526220176997.zip",
  "type": "file",
  "size": 315253339,
  "user": "spark",
  "group": "supergroup",
  "ctime": 1767273541200,
  "atime": 1767273541321,
  "mtime": 1767273541897,
  "storagePolicy": "UNSPECIFIED",
  "permission": 420,
  "state": 0,
  "storageState": "STD"
}
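For large inventories, a script is usually more practical than a text editor. The following Python sketch lists the largest files in an inventory. It assumes the file stores one JSON object per line, which this page does not confirm. If your file is a single JSON array, load it with `json.load` instead. The helper names and sample paths are hypothetical.

```python
import heapq
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line (ASSUMPTION: one JSON object per line)."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def largest_files(records: Iterable[dict], n: int = 10) -> list[dict]:
    """Return the n largest entries whose type is 'file', biggest first."""
    files = (r for r in records if r.get("type") == "file")
    return heapq.nlargest(n, files, key=lambda r: r["size"])


# Inline sample data with hypothetical paths. In practice, pass an open file
# handle for the downloaded inventory instead of this list.
sample = [
    '{"path": "/data/a.parquet", "type": "file", "size": 315253339}',
    '{"path": "/data", "type": "directory", "size": 0}',
    '{"path": "/data/b.parquet", "type": "file", "size": 1024}',
]
for rec in largest_files(iter_records(sample), n=2):
    print(rec["size"], rec["path"])
```

Reading line by line keeps memory use flat, which matters when the inventory covers millions of entries.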
Field reference
Every inventory record contains the following fields:
| Field | Description |
|---|---|
| id | The ID of the file or directory. |
| path | The path of the file or directory. |
| type | The entry type: file or directory. |
| size | The size in bytes. For directories, the value is 0. |
| user | The owner of the file or directory. |
| group | The user group of the file or directory. |
| ctime | The time when the file or directory was created. The value is a UNIX timestamp (in milliseconds in the sample record). |
| atime | The time when the file or directory was last accessed. The value is a UNIX timestamp (in milliseconds in the sample record). |
| mtime | The time when the file or directory was last modified. The value is a UNIX timestamp (in milliseconds in the sample record). |
| permission | The permissions on the file or directory, as an integer. The sample value 420 is the decimal form of octal 644 (rw-r--r--). |
| storagePolicy | The storage policy applied to the file or directory. See Storage policy values. |
The following fields are reserved and not enabled in the current version:
| Field | Description |
|---|---|
| state | The actual tiering state of the file or directory. |
| storageState | The progress of the current tiering operation. |
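The numeric permission and time fields are hard to read in raw form. The following Python sketch converts them into familiar representations. It rests on two assumptions inferred from the sample record rather than stated on this page: permission holds POSIX mode bits as a decimal integer (420 is octal 644), and the time fields are in milliseconds. The function names are hypothetical.

```python
import stat
from datetime import datetime, timedelta, timezone


def format_permission(permission: int, entry_type: str) -> str:
    """Render decimal mode bits as an ls-style string (ASSUMPTION: POSIX mode bits)."""
    # stat.filemode needs file-type bits to print the leading '-' or 'd'.
    type_bits = stat.S_IFDIR if entry_type == "directory" else stat.S_IFREG
    return stat.filemode(type_bits | permission)


def ms_to_utc(ms: int) -> datetime:
    """Convert a millisecond UNIX timestamp to an aware UTC datetime (ASSUMPTION: ms)."""
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Values taken from the sample record.
print(format_permission(420, "file"))
print(ms_to_utc(1767273541200).isoformat())
```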
Storage policy values
| Value | Storage class |
|---|---|
| UNSPECIFIED | Default (equivalent to Standard) |
| CLOUD_STD | Standard |
| CLOUD_IA | Infrequent Access |
| CLOUD_AR | Archive |
| CLOUD_COLD_AR | Cold Archive |
| CLOUD_DEEP_COLD_AR | Deep Cold Archive |
| CLOUD_AR_RESTORED | Archive restored |
| CLOUD_COLD_AR_RESTORED | Cold Archive restored |
| CLOUD_DEEP_COLD_AR_RESTORED | Deep Cold Archive restored |
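To analyze storage policy distribution for cost analysis, you can total file sizes per storagePolicy value. The following Python sketch shows one way to do this over records that are already parsed into dictionaries. The function name and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


def bytes_by_policy(records: Iterable[dict]) -> dict[str, int]:
    """Sum the size of all file entries, grouped by their storagePolicy value."""
    totals: defaultdict[str, int] = defaultdict(int)
    for rec in records:
        # Directories report size 0, so only files contribute to the totals.
        if rec["type"] == "file":
            totals[rec["storagePolicy"]] += rec["size"]
    return dict(totals)


# Hypothetical records that use only fields from the field reference.
records = [
    {"type": "file", "size": 100, "storagePolicy": "CLOUD_STD"},
    {"type": "file", "size": 250, "storagePolicy": "CLOUD_IA"},
    {"type": "file", "size": 50, "storagePolicy": "CLOUD_STD"},
    {"type": "directory", "size": 0, "storagePolicy": "UNSPECIFIED"},
]
for policy, total in sorted(bytes_by_policy(records).items()):
    print(policy, total)
```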
Delete the inventory file
The inventory file consumes storage space and incurs storage fees. After you finish using it, delete the file with the Jindo CLI:
jindo fs -rm -skipTrash oss://<hdfs_bucket>.<dls_endpoint>/.sysinfo/inventory/1666584461201.2ce40fba-5704-45c4-8720-d92a891d****
Make sure the file path in the command matches the Data Location path that is generated during export. This prevents accidental deletion of system data under .dlsdata and .sysinfo.
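If you script the cleanup, a pattern check before the delete command can catch a mistyped path. The following Python sketch only validates the path string and does not call the Jindo CLI. It assumes the `<unix_ms_timestamp>.<uuid>` filename layout and a UUID in the usual 8-4-4-4-12 hexadecimal form. The function name, bucket, and endpoint are hypothetical.

```python
import re

# ASSUMPTION: an inventory file sits directly under .sysinfo/inventory/ and is
# named '<digits>.<uuid>' with the UUID in canonical 8-4-4-4-12 hex form.
INVENTORY_PATH = re.compile(
    r"oss://[^/]+/\.sysinfo/inventory/"
    r"\d+\.[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def is_inventory_file(path: str) -> bool:
    """Return True only if path points at a single inventory file."""
    return INVENTORY_PATH.fullmatch(path) is not None


# Hypothetical bucket and endpoint names.
base = "oss://examplebucket.example-endpoint"
print(is_inventory_file(f"{base}/.sysinfo/inventory/1666584461201.123e4567-e89b-12d3-a456-426614174000"))
print(is_inventory_file(f"{base}/.sysinfo/inventory/"))
print(is_inventory_file(f"{base}/.dlsdata/"))
```

In this sketch only the first path passes the check, because the other two point at directories rather than an inventory file.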