Object Storage Service: Metadata cache

Last Updated: Jan 23, 2025

ossfs metadata cache is suitable for scenarios in which a server has high read/write IOPS on Object Storage Service (OSS) data. After you enable ossfs metadata cache, the overall efficiency of object operations and the response time of requests are improved. This topic describes how to configure and effectively use ossfs metadata cache.

Important

Cached metadata can lag behind the actual state of objects in OSS, which affects data consistency and timeliness. We recommend that you do not enable ossfs metadata cache in scenarios that require up-to-date data.

Background information

Metadata refers to information that describes data, such as object size, creation time, modification time, and user and group IDs. User and group IDs are attributes that are not supported by OSS, but file systems rely on these attributes for permission checks. ossfs allows you to obtain additional information from the custom headers of objects in OSS. This way, you can perform operations on objects based on their attributes in Linux.

After you enable ossfs metadata cache, performance, resource usage, and user experience are improved.

  • Performance: ossfs metadata cache reduces the latency of metadata reads, especially in I/O-intensive scenarios, which improves the overall efficiency of object operations.

  • Resource usage: ossfs metadata cache reduces the number of calls to OSS and the queries per second (QPS) when you frequently access hot data.

  • User experience: The response time of the requests is improved.

Scenarios

  • ossfs metadata cache is suitable for scenarios in which you use a single server to access OSS data.

  • In a distributed environment, ossfs metadata cache is suitable for scenarios in which data that does not frequently change is read from OSS. Examples include reading AI training datasets and AI model files from OSS and running big data queries.

How it works

ossfs uses the client memory to cache OSS metadata to reduce the latency of remote storage operations.

  • Metadata cache for first data access:

    The first time you access an object or a directory under the ossfs mount point, the ossfs client obtains the metadata of the object from OSS and stores it in the local cache.

  • Subsequent access acceleration:

    If the cache deletion policy is disabled or the cached entry has not expired, subsequent requests for the metadata of the object are served directly from the local cache without sending requests to OSS, which greatly reduces the latency.

  • Cache update policies:

    ossfs updates the local cache based on specific policies, such as cache expiration and cache upper limit.

  • Multi-client cache synchronization:

    ossfs metadata cache is a single-server local cache that uses the client memory. The cache is not shared across multiple mount points on the same server, and metadata changes are not synchronized between servers.
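
The first-access and subsequent-access behavior described above can be sketched as a small in-memory cache. This is an illustrative model only, not the ossfs implementation: the class name StatCache, the fetch callback (which stands in for a HeadObject request), and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class StatCache:
    """Toy in-memory metadata cache with an optional validity period."""

    def __init__(self,
                 fetch: Callable[[str], dict],
                 expire_seconds: float = 900,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch            # hypothetical stand-in for a HeadObject request
        self._expire = expire_seconds  # a negative value disables expiration
        self._clock = clock            # injectable so the example can control time
        self._entries: Dict[str, Tuple[dict, float]] = {}
        self.remote_calls = 0          # number of simulated requests to the remote store

    def stat(self, key: str) -> dict:
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None:
            meta, stored_at = cached
            if self._expire < 0 or now - stored_at < self._expire:
                return meta            # served from local memory, no remote request
        # First access or expired entry: fetch remotely and (re)cache the result.
        self.remote_calls += 1
        meta = self._fetch(key)
        self._entries[key] = (meta, now)
        return meta


if __name__ == "__main__":
    fake_now = [0.0]
    cache = StatCache(lambda key: {"size": 42}, expire_seconds=900,
                      clock=lambda: fake_now[0])
    cache.stat("data/a.bin")    # first access: one remote request
    cache.stat("data/a.bin")    # cache hit: no remote request
    fake_now[0] = 901.0
    cache.stat("data/a.bin")    # entry expired: one more remote request
    print(cache.remote_calls)
```

Because each StatCache instance holds its own dictionary, two instances never see each other's entries, which mirrors the single-server, unsynchronized nature of the cache.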

Mode comparison

The following table describes the differences before and after you enable ossfs metadata cache.

| ossfs metadata cache | Command | Request method | Operation | Performance |
|---|---|---|---|---|
| Disabled | stat | ossfs sends HeadObject requests to a bucket to obtain object metadata. | ossfs sends a HeadObject request to obtain only one object from the bucket. | The metadata of objects is obtained from the bucket, which is slower than reading from the memory. |
| Disabled | ls | ossfs sends a ListObject request to a bucket to obtain objects in directories and sends HeadObject requests to obtain object metadata. | After ossfs sends a ListObject request, ossfs sends a HeadObject request to obtain only one object in a directory from the bucket. | The metadata of objects is obtained from the bucket, which is slower than reading from the memory. |
| Enabled (metadata cache does not expire) | stat | ossfs obtains object metadata from the local memory. | ossfs obtains objects from the memory. | Object metadata is read from the local memory, which is faster. |
| Enabled (metadata cache does not expire) | ls | ossfs sends a ListObject request to a bucket to obtain objects in directories and obtains object metadata from the local memory. | After ossfs sends a ListObject request, ossfs obtains objects in directories. If you want to access a specific object in a directory, ossfs obtains the object from the local memory. | Object metadata is read from the local memory, which is faster. |
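
To make the difference concrete, the request pattern in the comparison can be modeled as a simple counter. This is a rough sketch under stated assumptions: the function count_requests is hypothetical, a directory listing is assumed to fit in a single ListObject request, and every cached entry is assumed to be unexpired.

```python
from typing import Dict, Iterable, Set


def count_requests(command: str, names: Iterable[str], cached: Set[str]) -> Dict[str, int]:
    """Count simulated OSS requests for `stat` (one name) or `ls` (all names in a directory).

    `cached` holds the names whose metadata is already in the local cache.
    """
    entries = list(names)
    # Each name that is missing from the cache costs one HeadObject request.
    misses = sum(1 for name in entries if name not in cached)
    if command == "stat":
        if len(entries) != 1:
            raise ValueError("stat takes exactly one object name")
        return {"ListObject": 0, "HeadObject": misses}
    if command == "ls":
        # Assumption: the whole directory is returned by one ListObject request.
        return {"ListObject": 1, "HeadObject": misses}
    raise ValueError(f"unsupported command: {command}")


if __name__ == "__main__":
    directory = [f"train/{i:05d}.jpg" for i in range(1000)]
    print(count_requests("ls", directory, cached=set()))           # cache disabled or cold
    print(count_requests("ls", directory, cached=set(directory)))  # cache enabled and warm
```

In this model, listing a directory of 1,000 objects costs 1,000 HeadObject requests without the cache and none with a warm cache, while the ListObject request is sent in both cases.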

Parameters

The following table describes the parameters that you can configure for ossfs metadata cache.

| Parameter | Description | Value |
|---|---|---|
| max_stat_cache_size | Specifies whether to enable metadata cache and the maximum number of entries in the metadata cache. Specify the parameter based on the number of frequently accessed objects in OSS. If the memory is sufficient, we recommend that you set the parameter to a larger value to improve the operation performance. | -omax_stat_cache_size=1000000 sets the upper limit to 1,000,000 entries, which consumes approximately 400 MB of memory. If you set the parameter to 0, metadata cache is disabled. If you do not specify this parameter, the default value is used. Default value: 100,000 (approximately 40 MB of memory). Unit: object or directory. |
| stat_cache_expire | Specifies whether to enable the cache deletion policy for the metadata cache and sets the validity period of the metadata cache. We recommend that you specify the validity period of the metadata cache based on your business requirements. | -ostat_cache_expire=1800 specifies that the validity period is 30 minutes. If you set the parameter to -1, the cache deletion policy is disabled. If you do not specify this parameter, the default value is used. We recommend that you do not disable the cache deletion policy unless no updates are performed on objects after the bucket is mounted to the local file system. Default value: 900. Unit: seconds. |
| readdir_optimize | Specifies whether to use cache optimization. After you specify the parameter, ossfs does not send a HeadObject request to obtain the object metadata, such as gid and uid, when you run the ls command. The HeadObject request is sent only when the size of the accessed object is 0. However, a specific number of HeadObject requests may still be sent due to reasons such as permission checks. Specify the parameter based on the application characteristics. | To use cache optimization, specify -oreaddir_optimize. Default value: false. |

Note

By default, ossfs enables the cache deletion policy and sets the upper limit for the metadata cache to 100,000, which consumes approximately 40 MB of memory.
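
The sizing guidance above implies roughly 400 bytes per cached entry (40 MB for 100,000 entries). The following sketch uses that ratio to estimate memory for a given max_stat_cache_size and assembles the corresponding option strings. The helper names estimate_cache_memory_mb and metadata_cache_options are hypothetical, the per-entry figure is only the approximation given in this topic, and the bucket name, mount point, and endpoint that a full mount command also needs are intentionally left out.

```python
from typing import List, Optional


def estimate_cache_memory_mb(max_stat_cache_size: int) -> float:
    """Approximate memory use in MB, scaled from the documented 100,000 entries ~ 40 MB."""
    return max_stat_cache_size * 40 / 100_000


def metadata_cache_options(max_stat_cache_size: Optional[int] = None,
                           stat_cache_expire: Optional[int] = None,
                           readdir_optimize: bool = False) -> List[str]:
    """Build the -o options described in this topic; unspecified options keep their defaults."""
    options: List[str] = []
    if max_stat_cache_size is not None:
        if max_stat_cache_size < 0:
            raise ValueError("max_stat_cache_size must be 0 (disabled) or a positive count")
        options.append(f"-omax_stat_cache_size={max_stat_cache_size}")
    if stat_cache_expire is not None:
        if stat_cache_expire < -1:
            raise ValueError("stat_cache_expire must be -1 (no expiration) or a number of seconds")
        options.append(f"-ostat_cache_expire={stat_cache_expire}")
    if readdir_optimize:
        options.append("-oreaddir_optimize")
    return options


if __name__ == "__main__":
    print(estimate_cache_memory_mb(1_000_000))
    print(" ".join(metadata_cache_options(1_000_000, 1800, readdir_optimize=True)))
```

Treat the estimate as a planning aid only; actual memory use depends on the ossfs version and on the metadata stored for each object.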

Metadata cache management mechanism

The following table describes the metadata cache management mechanism of ossfs.

| Cache status | Operation |
|---|---|
| Object metadata is cached and the cache deletion policy is disabled or the cache does not expire. | Read object metadata directly from the cache. |
| Object metadata is cached and the cache expires. | Update the cache. |
| Object metadata is not cached and the cache capacity is available. | Cache objects. |
| Object metadata is not cached, the cache capacity is fully consumed, and the cache deletion policy is enabled. | Traverse the cache and delete expired objects. |
| Object metadata is not cached, the cache capacity is fully consumed, and the cache deletion policy is disabled. | Delete cached objects that have not been accessed for an extended period of time based on the least recently used (LRU) policy. |
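
The eviction rules for a full cache can be expressed as a small decision function. This is a sketch of the rules as stated in this topic, not the ossfs source code. The Entry type and the keys_to_evict function are hypothetical, and one case that the rules do not cover (the cache is full, the deletion policy is enabled, but no entry has expired) is handled here by falling back to LRU, which is an assumption.

```python
from typing import Dict, List, NamedTuple


class Entry(NamedTuple):
    cached_at: float     # time at which the metadata was stored
    last_access: float   # time at which the metadata was last read


def keys_to_evict(entries: Dict[str, Entry], capacity: int,
                  expire_seconds: float, now: float) -> List[str]:
    """Return the keys to delete before one new, uncached entry is stored."""
    if not entries or len(entries) < capacity:
        return []  # capacity is available: cache the new entry directly
    if expire_seconds >= 0:
        # Deletion policy enabled: traverse the cache and delete expired entries.
        expired = [key for key, entry in entries.items()
                   if now - entry.cached_at >= expire_seconds]
        if expired:
            return expired
        # Assumption: nothing has expired, so fall back to LRU eviction below.
    # Deletion policy disabled (or fallback): delete the least recently used entry.
    lru_key = min(entries, key=lambda key: entries[key].last_access)
    return [lru_key]


if __name__ == "__main__":
    cache = {
        "a": Entry(cached_at=0, last_access=50),
        "b": Entry(cached_at=10, last_access=20),
        "c": Entry(cached_at=100, last_access=100),
    }
    print(keys_to_evict(cache, capacity=3, expire_seconds=60, now=120))  # expired entries
    print(keys_to_evict(cache, capacity=3, expire_seconds=-1, now=120))  # LRU entry
```

With a larger max_stat_cache_size the first branch is taken more often, so fewer entries are evicted and refetched.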

Suggestions

  • If your workload does not rely on the metadata of objects, we recommend that you specify the readdir_optimize parameter to improve the performance of the list and find operations. After you enable ossfs metadata cache, symbolic links are no longer supported.

  • In multi-client scenarios that have high real-time requirements for data updates, proceed with caution if you enable ossfs metadata cache. To maintain data consistency, you can disable ossfs metadata cache by specifying -omax_stat_cache_size=0. However, every metadata lookup then sends a request to OSS, so performance may degrade and you may be charged additional fees.

    If your application requires strong data consistency across multiple clients, we recommend that you use Cloud Storage Gateway (CSG) or Cloud Parallel File Storage (CPFS). We recommend that you do not use ossfs.

  • If the number of frequently accessed objects in OSS is large, increase the value of the max_stat_cache_size parameter appropriately to prevent cached entries from being evicted frequently.

    If you want to perform operations on a large number of objects and the server memory is insufficient, we recommend that you use ossutil or OSS SDKs to perform the operations. If you want to mount a bucket to a local file system, we recommend that you use CSG or CPFS.

  • If the number of frequently accessed objects in OSS is large, you can create multiple ossfs mount points, each dedicated to a separate subdirectory. This setup distributes the workload across the mount points and improves performance.