This topic describes how the cache disk for a file gateway works.
How the cache disk works
The cache disk for a file gateway is a cloud disk configured in Cloud Storage Gateway. It aggregates and caches user data and metadata. The main features are as follows:
It caches data written through the mount target. After a file is closed, the gateway uploads the data to Object Storage Service (OSS) in a single operation. This allows the gateway to support both sequential and random write operations. After the data is uploaded, the corresponding cache space is automatically reclaimed based on a policy.
It caches read data to reduce latency for repeated file access. For large files, a pre-read mechanism loads data into the cache disk. This improves the read bandwidth for files.
It caches metadata, such as the directory structure, to accelerate file system metadata operations such as ls and stat. The gateway also provides flexible metadata synchronization mechanisms, including express synchronization, periodic remote sync, and one-time remote sync, to meet data consistency requirements in different scenarios.
The gateway automatically evicts file data that has not been accessed for a long time based on cache disk usage. Only the data is evicted. The metadata, such as filenames, directory structure, and permissions, is retained. This keeps the cache disk usage at a healthy level, which is typically around 60%.
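The eviction behavior described above can be pictured with a small model. This is only an illustrative sketch, not the gateway's actual implementation: the class name CacheSketch, the least-recently-used ordering, and the use of 60% as an eviction target are all assumptions made for the example.

```python
from collections import OrderedDict


class CacheSketch:
    """Toy model (hypothetical): file data is evicted, metadata is kept."""

    def __init__(self, data_capacity, target_usage=0.60):
        self.data_capacity = data_capacity  # size of the data cache, any unit
        self.target_usage = target_usage    # assumed eviction target (60%)
        self.metadata = {}                  # path -> file size; never evicted
        self.data = OrderedDict()           # path -> cached size, oldest access first

    def access(self, path, size):
        """Record a read or write: cache the data and refresh its recency."""
        self.metadata[path] = size
        self.data[path] = size
        self.data.move_to_end(path)

    def data_usage(self):
        """Fraction of the data cache currently in use."""
        return sum(self.data.values()) / self.data_capacity

    def evict(self):
        """Drop least recently accessed data until usage is at or below the target."""
        evicted = []
        while self.data and self.data_usage() > self.target_usage:
            path, _ = self.data.popitem(last=False)
            evicted.append(path)
        return evicted


cache = CacheSketch(data_capacity=100)
cache.access("a.bin", 40)
cache.access("b.bin", 30)
cache.access("c.bin", 20)
evicted = cache.evict()
```

In this model, a.bin is removed from the data cache because it was accessed least recently, but its entry in metadata remains, which mirrors how a file stays visible in a directory listing after its data is evicted.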
The cache disk stores both data and metadata. The space is allocated proportionally: 80% is used for the data cache and 20% is used for the metadata cache.
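As a quick worked example of the 80/20 split, the following sketch computes how much of a cache disk is available to each cache. The function name split_cache_disk and the 500 GB disk size are hypothetical and are used only for illustration.

```python
def split_cache_disk(total_gb, data_percent=80):
    """Return (data_cache_gb, metadata_cache_gb) for a cache disk of total_gb."""
    data_gb = total_gb * data_percent / 100
    metadata_gb = total_gb - data_gb
    return data_gb, metadata_gb


# Hypothetical 500 GB cache disk.
data_gb, metadata_gb = split_cache_disk(500)
```

For a 500 GB disk, this gives 400 GB for the data cache and 100 GB for the metadata cache.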
Important notes
Choose a suitable cache disk type.
When you select a cache disk type, match it with your business requirements for bandwidth and input/output operations per second (IOPS). Also, make sure the cache disk performance matches the gateway specifications to achieve optimal efficiency. For example, use an ESSD PL1 cache disk for Basic and Medium gateways, and an ESSD PL2 cache disk for Enhanced and Compute-optimized gateways.
Choose a suitable cache disk capacity.
When you select the cache disk capacity, consider the requirements for both data cache and metadata cache.
The data cache capacity depends on the concurrency and the maximum file size. To prevent write failures or performance degradation, the free space in the data cache must be greater than concurrency × maximum file size. For better performance, reserve about 30% free space.
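The sizing rule above can be sketched as a small calculation. The function names are hypothetical, and the way the 30% reserve is applied (the concurrent working set should fit in the remaining 70% of the data cache) is one possible reading of the guidance, not a documented formula.

```python
def required_free_gb(concurrency, max_file_gb):
    """Free data cache space that concurrent writes of the largest file may need."""
    return concurrency * max_file_gb


def has_enough_free_space(free_data_gb, concurrency, max_file_gb):
    """True if free space exceeds concurrency x maximum file size."""
    return free_data_gb > required_free_gb(concurrency, max_file_gb)


def suggested_data_cache_gb(concurrency, max_file_gb, reserve_percent=30):
    """Data cache size that leaves reserve_percent free (assumed interpretation)."""
    working_set = required_free_gb(concurrency, max_file_gb)
    return working_set * 100 / (100 - reserve_percent)


# Hypothetical workload: 10 concurrent writers, files up to 7 GB each.
needed = required_free_gb(10, 7)
suggested = suggested_data_cache_gb(10, 7)
```

With these example numbers, the concurrent working set is 70 GB, and a data cache of about 100 GB would leave roughly 30% free. Because the data cache is only part of the cache disk, the disk itself must be larger than this figure.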
The metadata cache stores the structural information of the file system, including the following:
Directory structure
File properties, such as size and creation or modification time
Stub files: Even if file data is evicted, the file record is retained in the metadata. This ensures a consistent view when users browse folders. When a user accesses an evicted file again, the gateway automatically reloads the data from OSS to the cache.
The metadata cache capacity is closely related to the total number of files. Typically, a 100 GB cache disk can manage the metadata for about 10 million files.
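The rule of thumb above can be turned into a rough estimate. The sketch below assumes that the ratio scales linearly with the number of files, which the guidance does not state explicitly. The function name and the file count are hypothetical.

```python
FILES_PER_100_GB = 10_000_000  # rule of thumb: about 10 million files per 100 GB disk


def disk_gb_for_files(file_count):
    """Rough cache disk size (GB) needed to hold metadata for file_count files."""
    return file_count / FILES_PER_100_GB * 100


# Hypothetical bucket with 30 million files.
estimate_gb = disk_gb_for_files(30_000_000)
```

For 30 million files, this estimate suggests a cache disk of about 300 GB. Treat the result as a lower bound and compare it with the capacity that the data cache requires, then choose the larger value.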
If the data cache or metadata cache runs out of space, the system triggers throttling and insufficient metadata space alerts. To avoid business interruptions, expand the cache disk promptly after you receive an alert so that the gateway continues to run stably and efficiently.