ossfs 1.91 and later introduces three performance improvements over 1.88.x: POSIX operation optimizations, readdir optimization, and direct read. If your cluster uses Container Storage Interface (CSI) 1.30.1 or later, enable the corresponding feature gates to upgrade ossfs.
ossfs features are available only on Elastic Compute Service (ECS) nodes.
To upgrade, see Switch to ossfs 1.91 or later.
What changed from 1.88.x to 1.91
The following sections describe the feature changes in ossfs 1.91 and later. For the full release notes, see the ossfs changelog.
POSIX operation fixes and parameter defaults
ossfs 1.91 includes several fixes and default-value updates:
OSS volumes can now be mounted to subpaths that do not exist in OSS buckets.
Zero-byte files are no longer uploaded when you create an object. The EntityTooSmall error that occasionally occurred during multipart upload is fixed. Append operations are improved.
Default parameter values are updated based on upstream ossfs and performance benchmarking results.
The following table shows the parameter defaults that changed between 1.88.x and 1.91:
| Parameter | Description | Default in 1.88.x | Default in 1.91+ |
|---|---|---|---|
| stat_cache_expire | Metadata cache TTL. Unit: seconds. | -1 (never expires) | 900 |
| multipart_threshold | File size threshold for multipart upload. Unit: MB. | 5 x 1024 | 25 |
| max_dirty_data | Dirty data size threshold for forced flush to disk. Unit: MB. | -1 (never flushed) | 5120 |
The following parameters retain their 1.88.x defaults, which differ from the defaults in open-source ossfs 1.91, to keep performance on par:
| Parameter | Description | Default in open-source 1.91+ | Default in Alibaba Cloud 1.91+ |
|---|---|---|---|
| multipart_size | Part size for multipart upload. Unit: MB. | 10 | 30 |
| parallel_count | Number of parts uploaded concurrently. | 5 | 20 |
To modify any of these parameters, update the otherOpts field in your PV.
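For reference, a statically provisioned OSS PV that overrides some of these defaults might look like the following sketch. The PV name, bucket, and endpoint are placeholder values, and the specific option values are illustrative, not recommendations:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: oss-pv                        # placeholder PV name
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadOnlyMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ossplugin.csi.alibabacloud.com
    volumeHandle: oss-pv              # must match metadata.name
    volumeAttributes:
      bucket: "examplebucket"                       # placeholder bucket
      url: "oss-cn-hangzhou-internal.aliyuncs.com"  # placeholder endpoint
      # Override defaults via otherOpts; the values here are illustrative.
      otherOpts: "-o stat_cache_expire=600 -o max_dirty_data=1024"
```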
Readdir optimization
When mounting an OSS volume, ossfs calls HeadObject for every object in the mounted path to retrieve metadata such as permissions, modification time, UIDs, and GIDs. On paths with many objects, these HeadObject calls can significantly slow down directory traversal operations like ls and find.
The readdir optimization feature skips those HeadObject calls, which reduces latency for directory operations. Understand the following trade-offs before enabling it:
- `chmod` and `chown` commands have no effect.
- Symbolic links may not behave as expected.
- Hard links are not supported.
The following table describes the parameters for the readdir optimization feature:
| Parameter | Description | Default |
|---|---|---|
| readdir_optimize | Enables the readdir optimization feature. Enable with -o readdir_optimize (no value required). | Disabled |
| symlink_in_meta | Records symbolic link metadata so that symbolic links display correctly. Enable with -o symlink_in_meta (no value required). | Disabled |
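To turn on readdir optimization in a PV, both options go into `otherOpts`. The following sketch shows only the `volumeAttributes` fragment, with placeholder bucket and endpoint values:

```yaml
    volumeAttributes:
      bucket: "examplebucket"                       # placeholder
      url: "oss-cn-hangzhou-internal.aliyuncs.com"  # placeholder
      # readdir_optimize skips per-object HeadObject calls;
      # symlink_in_meta keeps symbolic links displaying correctly.
      otherOpts: "-o readdir_optimize -o symlink_in_meta"
```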
Direct read
The direct read feature is designed for sequential read workloads on large files. Without direct read, ossfs downloads files from OSS to disk before reading them, so read throughput is limited by disk I/O. With direct read, ossfs prefetches data from OSS directly into memory and reads from there, removing the disk I/O bottleneck.
Understand the following limits before enabling direct read:
Use for sequential reads only. Random reads cause ossfs to restart the prefetch window, which degrades throughput.
Writes flush memory to disk to maintain data consistency.
After you enable direct read, the `use_cache` parameter has no effect.
The following table describes the parameters for the direct read feature:
| Parameter | Description | Default |
|---|---|---|
| direct_read | Enables the direct read feature. Enable with -o direct_read (no value required). | Disabled |
| direct_read_prefetch_limit | Maximum memory for prefetched data across all ossfs processes. Unit: MB. | 1024 (minimum: 128) |
When the prefetched data reaches the direct_read_prefetch_limit, ossfs stops prefetching and read throughput falls back to network I/O speed. To disable memory prefetching entirely and read directly from OSS, set -o direct_read_prefetch_chunks=0.
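As a sketch, a PV that enables direct read and raises the prefetch memory cap could carry the following `volumeAttributes` fragment. The bucket and endpoint are placeholders, and the limit value is illustrative:

```yaml
    volumeAttributes:
      bucket: "examplebucket"                       # placeholder
      url: "oss-cn-hangzhou-internal.aliyuncs.com"  # placeholder
      # direct_read prefetches into memory instead of staging files on disk;
      # the 2048 MB prefetch cap is an illustrative value, not a recommendation.
      otherOpts: "-o direct_read -o direct_read_prefetch_limit=2048"
```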
Choose a read configuration
Use the following table to select the right configuration for your workload:
| Workload | Recommended configuration | Reason |
|---|---|---|
| Sequential reads on large files, accessed once | Enable direct_read (-o direct_read) | Prefetches data into memory, eliminates disk I/O |
| Small or medium files read repeatedly from the same node | Enable kernel page cache (-o kernel_cache) | Reuses the OS page cache across reads |
| Files read repeatedly where cache must survive process restarts | Enable disk cache (-o use_cache=/path/to/cache) | Persists across restarts; larger capacity than page cache |
| Large number of objects, services do not need object metadata | Enable readdir_optimize (-o readdir_optimize) | Removes per-object HeadObject calls from directory traversal |
`direct_read` and `use_cache` are mutually exclusive. When `direct_read` is enabled, `use_cache` has no effect.
Best practices
Read/write scenarios
Split reads and writes across separate OSS endpoints for best performance. See Best practices for OSS read/write splitting.
If splitting is not possible, upgrade to ossfs 1.91 or later to fix the EntityTooSmall multipart upload error. To ensure data consistency, add -o max_stat_cache_size=0 to the otherOpts field.
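For example, the consistency setting described above would appear in the PV as follows (a sketch with placeholder bucket and endpoint values):

```yaml
    volumeAttributes:
      bucket: "examplebucket"                       # placeholder
      url: "oss-cn-hangzhou-internal.aliyuncs.com"  # placeholder
      # Disable the metadata cache so reads always fetch the latest object state.
      otherOpts: "-o max_stat_cache_size=0"
```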
Read-only scenarios
Use the following guidance to choose a caching strategy:
- Direct read (`-o direct_read`): Use for one-pass sequential reads on large files where the data is not accessed again. Eliminates disk I/O by prefetching into memory.
- Kernel page cache (`-o kernel_cache`): Use for files read repeatedly from the same node. Reuses cached data across reads with no disk write overhead.
- Disk cache (`-o use_cache=/path/to/cache`): Use for files read repeatedly where the cache must persist across process restarts or outlast available memory.
Directory-heavy workloads
If a large number of objects exist in the OSS bucket and your services do not require object metadata, enable -o readdir_optimize. If versioning is enabled for the OSS bucket, also add -o listobjectsv2.
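A `volumeAttributes` fragment for this directory-heavy case might look like the following sketch. The bucket and endpoint are placeholders; include `-o listobjectsv2` only if versioning is enabled on the bucket:

```yaml
    volumeAttributes:
      bucket: "examplebucket"                       # placeholder
      url: "oss-cn-hangzhou-internal.aliyuncs.com"  # placeholder
      # listobjectsv2 is needed only when bucket versioning is enabled.
      otherOpts: "-o readdir_optimize -o listobjectsv2"
```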
Performance benchmarks
The results below use Sysbench or custom scripts. Results vary by tool and environment.
All benchmarks use an ecs.g7.xlarge node with a PL0 system disk.
Throughput (readdir optimization and direct read disabled)
The Sysbench test uses 128 files of 8 MiB each and covers sequential reads, sequential writes, random reads, and random writes. Compared with 1.88.x:
- ossfs 1.88.x produces higher throughput for file creation and sequential writes.
- ossfs 1.91 and later produces higher throughput for sequential reads, random reads, and random writes.
Directory traversal latency after enabling readdir optimization
The test runs ls and find on 1,000 files and records latency per execution. Compared with ossfs 1.88.x and ossfs 1.91 with readdir optimization disabled:
- `ls` latency is 74.8% lower than 1.88.x and 74.3% lower than 1.91 with readdir optimization disabled (a 4.0x and 3.9x improvement, respectively).
- `find` latency is 58.8% lower than both 1.88.x and 1.91 with readdir optimization disabled (a 2.4x improvement in both cases).
Large file sequential read latency after enabling direct read
The test concurrently reads 10 files x 10 GiB and records latency, peak disk usage, and peak memory usage.
Peak memory usage covers all ossfs processes, including prefetched data and other direct read overhead.
Compared with 1.88.x and 1.91 with direct read disabled:
Latency is 85.3% lower than 1.88.x and 79% lower than 1.91 with direct read disabled.
Peak disk usage is 0 — no temporary files written to disk.
Peak memory usage is slightly higher, which enables the zero disk usage above.
Run your own benchmarks
Benchmark ossfs in containers or directly on ECS instances. The following steps use a containerized Sysbench environment.
Prerequisites
Before you begin, ensure that you have:
An OSS bucket and persistent volume claim (PVC). For setup, see Mount a statically provisioned OSS volume.
What's next
You can benchmark various versions of ossfs by using the file I/O tests provided by Sysbench.
- Test readdir optimization by running `ls` and `find` in the mount path.
- Test direct read by adding `-o direct_read` to your PV's `otherOpts` and running concurrent sequential reads.