OSS-HDFS stores data and auxiliary data in the .dlsdata/ directory of your OSS bucket, and you are charged for the storage capacity that this data uses.
If your billed storage exceeds what hdfs du reports, the difference comes from one or more of the categories below.
Storage categories at a glance
| Category | Billed as OSS storage | Visible via hdfs du | Typical size |
|---|---|---|---|
| Data blocks | Yes | Yes | Largest portion |
| Checksum data | Yes | No | ~0.78% of data for large files; higher for small files |
| File holes | Yes | No | Varies; only when using JindoFuse with delta writes |
| Audit logs | Yes | Yes | Depends on read/write activity |
| Inventory lists | Yes | Yes | Depends on export frequency |
| Trash bin | Yes | Yes | Depends on retention period |
| Internal data | Yes | No | Less than 1 GB |
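As a rough model of the table above, the gap between billed storage and what hdfs du reports comes from the categories that hdfs du cannot see. The sketch below uses illustrative placeholder numbers (they are assumptions, not real measurements) to show how that gap adds up:

```python
# Rough model of the table above: billed OSS storage vs. what hdfs du can see.
# All sizes are illustrative placeholders in GB, not real measurements.
visible = {                      # categories hdfs du reports
    "data_blocks": 1000.0,
    "audit_logs": 2.0,
    "inventory_lists": 0.5,
    "trash_bin": 10.0,
}
hidden = {                       # billed, but not reported by hdfs du
    "checksum_data": 1000.0 * 0.0078,   # ~0.78% of large-file data
    "file_holes": 0.0,                  # only with JindoFuse delta writes
    "internal_data": 1.0,               # less than 1 GB
}
billed = sum(visible.values()) + sum(hidden.values())
reported = sum(visible.values())
print(round(billed - reported, 2))     # the part hdfs du cannot explain
```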
Data blocks
Data blocks are the primary component of OSS-HDFS storage. All data blocks are stored in your OSS bucket and count toward your storage capacity.
Run hdfs du to check data block storage.
Checksum data
OSS-HDFS supports the Hadoop Distributed File System (HDFS) checksum feature. Each write generates additional checksum data automatically.
By default, OSS-HDFS generates 4 bytes of checksum for every 512 bytes of data, an overhead of about 0.78%. For small files or small data blocks, the checksum-to-data ratio is higher because even a partial 512-byte chunk still requires a full 4-byte checksum; this is expected behavior.
Unlike open source HDFS, where checksum data occupies disk capacity on data nodes, OSS-HDFS checksum data occupies OSS bucket storage capacity.
Checksum data is counted in the storage capacity usage of Standard buckets. hdfs du does not report checksum storage.
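The default ratio described above (4 checksum bytes per 512 data bytes) can be used to estimate checksum overhead for a file of a given size. The helper function below is an illustrative sketch, not an OSS-HDFS API, and assumes one checksum per started 512-byte chunk:

```python
import math

CHECKSUM_BYTES = 4   # checksum bytes per chunk (default described above)
CHUNK_BYTES = 512    # data bytes covered by each checksum

def checksum_overhead(file_size_bytes: int) -> int:
    """Estimated checksum bytes: 4 bytes per started 512-byte chunk."""
    return math.ceil(file_size_bytes / CHUNK_BYTES) * CHECKSUM_BYTES

large = 10 * 1024**3                # a 10 GiB file
small = 100                         # a 100-byte file
print(checksum_overhead(large))     # 83886080 bytes, ~0.78% of 10 GiB
print(checksum_overhead(small))     # 4 bytes, 4% of the file size
```

The small-file case shows why the ratio rises for small files: a 100-byte file still pays for a full 4-byte checksum.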
File holes
When you use the JindoFuse client to randomly overwrite or modify objects, OSS-HDFS writes the changes in delta mode because data blocks cannot be modified in place. Delta writes may produce file holes, regions of zeros that were never explicitly written to the file, and these regions occupy additional storage capacity.
File holes only occur with delta file writes via JindoFuse.
File hole storage is counted in the storage capacity usage of Standard buckets. hdfs du does not report file hole storage.
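To see how a hole forms, the snippet below uses a local filesystem as an analogy (it does not call any OSS-HDFS or JindoFuse API): seeking past the end of a file and then writing leaves a run of zeros that was never explicitly written. On OSS-HDFS, such zero regions are billed as storage.

```python
import os
import tempfile

# Local-filesystem analogy of a file hole: seek past the end of the file,
# then write. The skipped region becomes zeros that were never written.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(1024 * 1024)   # jump 1 MiB past the start without writing
    f.write(b"x")         # this write leaves a 1 MiB hole before it
    path = f.name

size = os.path.getsize(path)   # apparent size includes the hole
print(size)                    # 1048577: 1 MiB of zeros + 1 written byte
os.remove(path)
```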
Audit logs
OSS-HDFS records all read and write operations on objects as audit logs. Audit logs are stored in the /.sysinfo directory and occupy bucket storage capacity.
Run hdfs du /.sysinfo to check audit log storage.
Inventory lists
OSS-HDFS supports exporting inventory lists. Exported lists are stored in the /.sysinfo directory and occupy bucket storage capacity.
Run hdfs du /.sysinfo to check inventory list storage.
Trash bin
Objects moved to the OSS-HDFS trash bin remain in the bucket until the retention period ends. Trash bin objects occupy storage capacity during this period.
Run hdfs du on the trash bin directory to check its storage usage.
Internal data
OSS-HDFS temporarily stores internal data in your OSS bucket — for example, asynchronous task information. This data occupies less than 1 GB of storage capacity.
Internal data is counted in the storage capacity usage of Standard buckets. hdfs du does not report internal data storage.