
Object Storage Service:Storage capacity usage of OSS-HDFS

Last Updated:Dec 21, 2023

OSS-HDFS uses Object Storage Service (OSS) buckets to store OSS-HDFS data and its auxiliary data. This data is stored in the .dlsdata/ directory of the buckets. You are charged for the used storage capacity.

OSS-HDFS data blocks

OSS-HDFS data blocks are stored in OSS buckets and typically account for most of the used storage capacity. You can run the hdfs du command to view the storage capacity occupied by OSS-HDFS data blocks.
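As a usage sketch, the standard HDFS du command can report the capacity used under a directory of a bucket that has OSS-HDFS enabled. The bucket name and endpoint below are hypothetical placeholders; replace them with your own:

```shell
# -s sums the whole subtree; -h prints human-readable sizes.
# "examplebucket" and the cn-hangzhou endpoint are placeholders, not real values.
hdfs dfs -du -s -h oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/user/hive/warehouse
```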

OSS-HDFS checksum data

OSS-HDFS supports the Hadoop Distributed File System (HDFS) checksum feature. Additional checksum data is generated when data is written. Open source HDFS checksum data occupies the disk capacity of data nodes, while OSS-HDFS checksum data occupies the storage capacity of OSS buckets.

By default, a normal data write generates a 4-byte checksum for every 512 bytes of data. When small files or small data blocks are written, checksum data may account for a larger proportion of the total, which is expected behavior.
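The overhead of the default ratio (4 bytes of checksum per 512 bytes of data, about 0.78%) can be estimated with a short sketch. The helper function below is illustrative, not part of any OSS-HDFS API, and assumes one checksum is generated per started 512-byte chunk:

```python
def checksum_overhead(data_bytes, bytes_per_checksum=512, checksum_size=4):
    """Estimate the checksum bytes generated for a write of data_bytes.

    Assumes one 4-byte checksum per started 512-byte chunk, which matches
    the HDFS defaults described above.
    """
    chunks = -(-data_bytes // bytes_per_checksum)  # ceiling division
    return chunks * checksum_size

# A 1 GiB file generates about 8 MiB of checksum data (~0.78% overhead).
print(checksum_overhead(1 << 30))  # 8388608

# A 100-byte file still gets a full 4-byte checksum, so the relative
# overhead (4%) is much higher. This is why many small files inflate
# the checksum share of storage usage.
print(checksum_overhead(100))  # 4
```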

Important

The storage capacity usage of checksum data is calculated into the storage capacity usage of Standard buckets. You cannot run the hdfs du command to calculate and query the storage capacity usage of OSS-HDFS checksum data.

OSS-HDFS file holes

In specific scenarios, OSS-HDFS allows you to randomly overwrite and modify objects by using the JindoFuse client. Because a data block cannot be modified in place, the system writes or modifies the object in delta mode. In this case, a file hole, which is a region of zeros that was never explicitly written to the file, may be created. Writing or modifying an object in delta mode therefore occupies additional storage capacity.

Important

The storage capacity usage of file holes is calculated into the storage capacity usage of Standard buckets. You cannot run the hdfs du command to calculate and query the storage capacity usage of OSS-HDFS file holes.

OSS-HDFS audit logs

OSS-HDFS read and write operations on objects are recorded in audit logs. Audit logs are stored in buckets and occupy storage capacity. Audit logs are stored in the /.sysinfo directory in OSS-HDFS. You can run the hdfs du command to query the storage capacity usage of OSS-HDFS audit logs.

OSS-HDFS inventory lists

OSS-HDFS supports the export of inventory lists. Inventory lists are stored in buckets and occupy storage capacity. The inventory lists are stored in the /.sysinfo directory in OSS-HDFS. You can run the hdfs du command to query the storage capacity usage of OSS-HDFS inventory lists.
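Because audit logs and inventory lists are both stored under the /.sysinfo directory, a single command can report the usage of either. The bucket name and endpoint below are hypothetical placeholders:

```shell
# Reports per-subdirectory usage under /.sysinfo (audit logs, inventory
# lists, and so on). Bucket and endpoint are placeholders.
hdfs dfs -du -h oss://examplebucket.cn-hangzhou.oss-dls.aliyuncs.com/.sysinfo
```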

OSS-HDFS trash bin

Objects in the OSS-HDFS trash bin are not deleted from the OSS-HDFS system until the specified retention period ends, and they occupy storage capacity during that period. You can run the hdfs du command to query the storage capacity usage of the OSS-HDFS trash bin.

OSS-HDFS internal data

OSS-HDFS uses OSS buckets to temporarily store data, such as asynchronous task information. The temporarily stored data occupies less than 1 GB of the storage capacity.

Important

The storage capacity usage of OSS-HDFS internal data is calculated into the storage capacity usage of Standard buckets. You cannot run the hdfs du command to calculate and query the storage capacity usage of OSS-HDFS internal data.
