
Container Service for Kubernetes:New features of ossfs 1.0 and later and ossfs performance benchmarking

Last Updated: Mar 26, 2026

ossfs 1.91 and later introduces three performance improvements over 1.88.x: POSIX operation optimizations, readdir optimization, and direct read. If your cluster uses Container Storage Interface (CSI) 1.30.1 or later, enable the corresponding feature gates to upgrade ossfs.

ossfs features are available only on Elastic Compute Service (ECS) nodes.

To upgrade, see Switch to ossfs 1.91 or later.

What changed from 1.88.x to 1.91

The following sections describe the feature changes in ossfs 1.91 and later. For the full release notes, see the ossfs changelog.

POSIX operation fixes and parameter defaults

ossfs 1.91 includes several fixes and default-value updates:

  • OSS volumes can now be mounted to subpaths that do not exist in OSS buckets.

  • Zero-byte files are no longer uploaded when you create an object. The EntityTooSmall error that occasionally occurred during multipart upload is fixed. Append operations are improved.

  • Default parameter values are updated based on upstream ossfs and performance benchmarking results.

The following table shows the parameter defaults that changed between 1.88.x and 1.91:

| Parameter | Description | Default in 1.88.x | Default in 1.91+ |
| --- | --- | --- | --- |
| stat_cache_expire | Metadata cache TTL. Unit: seconds. | -1 (never expires) | 900 |
| multipart_threshold | File size threshold for multipart upload. Unit: MB. | 5 x 1024 | 25 |
| max_dirty_data | Dirty data size threshold for forced flush to disk. Unit: MB. | -1 (never flushed) | 5120 |

The following parameters retain their 1.88.x defaults, which differ from the defaults in open-source ossfs 1.91, to keep performance consistent with 1.88.x:

| Parameter | Description | Default in open-source 1.91+ | Default in Alibaba Cloud 1.91+ |
| --- | --- | --- | --- |
| multipart_size | Part size for multipart upload. Unit: MB. | 10 | 30 |
| parallel_count | Number of parts uploaded concurrently. | 5 | 20 |

To modify any of these parameters, update the otherOpts field in your PV.
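For example, a statically provisioned OSS PV that overrides two of these defaults could look like the following sketch. The bucket name, endpoint URL, and option values are illustrative placeholders; adapt them to your environment:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-oss
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadOnlyMany
  csi:
    driver: ossplugin.csi.alibabacloud.com
    volumeHandle: pv-oss
    volumeAttributes:
      bucket: "example-bucket"                # placeholder bucket name
      url: "oss-cn-hangzhou.aliyuncs.com"     # placeholder endpoint
      # Override defaults via ossfs mount options:
      otherOpts: "-o stat_cache_expire=900 -o multipart_size=30"
```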

Readdir optimization

When mounting an OSS volume, ossfs calls HeadObject for every object in the mounted path to retrieve metadata such as permissions, modification time, UIDs, and GIDs. On paths with many objects, these HeadObject calls can significantly slow down directory traversal operations like ls and find.

The readdir optimization feature skips those HeadObject calls, which reduces latency for directory operations. Understand the following trade-offs before enabling it:

  • chmod and chown commands have no effect.

  • Symbolic links may not behave as expected. Hard links are not supported.

The following table describes the parameters for the readdir optimization feature:

| Parameter | Description | Default |
| --- | --- | --- |
| readdir_optimize | Enables the readdir optimization feature. Enable with -o readdir_optimize (no value required). | Disabled |
| symlink_in_meta | Records symbolic link metadata so that symbolic links display correctly. Enable with -o symlink_in_meta (no value required). | Disabled |

Direct read

The direct read feature is designed for sequential read workloads on large files. Without direct read, ossfs downloads files from OSS to disk before reading them, so read throughput is limited by disk I/O. With direct read, ossfs prefetches data from OSS directly into memory and reads from there, removing the disk I/O bottleneck.

Understand the following limits before enabling direct read:

  • Use for sequential reads only. Random reads cause ossfs to restart the prefetch window, which degrades throughput.

  • Writes flush memory to disk to maintain data consistency.

  • After you enable direct read, the use_cache parameter has no effect.

The following table describes the parameters for the direct read feature:

| Parameter | Description | Default |
| --- | --- | --- |
| direct_read | Enables the direct read feature. Enable with -o direct_read (no value required). | Disabled |
| direct_read_prefetch_limit | Maximum memory for prefetched data across all ossfs processes. Unit: MB. | 1024 (minimum: 128) |

When the prefetched data reaches the direct_read_prefetch_limit, ossfs stops prefetching and read throughput falls back to network I/O speed. To disable memory prefetching entirely and read directly from OSS, set -o direct_read_prefetch_chunks=0.
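A sketch of the relevant volumeAttributes fragment with direct read enabled and an enlarged prefetch budget (the bucket name and the 2048 MB value are illustrative):

```yaml
    volumeAttributes:
      bucket: "example-bucket"  # placeholder
      # Enable direct read and raise the prefetch memory cap to 2 GiB:
      otherOpts: "-o direct_read -o direct_read_prefetch_limit=2048"
```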

Choose a read configuration

Use the following table to select the right configuration for your workload:

| Workload | Recommended configuration | Reason |
| --- | --- | --- |
| Sequential reads on large files, accessed once | Enable direct_read (-o direct_read) | Prefetches data into memory, eliminating disk I/O |
| Small or medium files read repeatedly from the same node | Enable kernel page cache (-o kernel_cache) | Reuses the OS page cache across reads |
| Files read repeatedly where the cache must survive process restarts | Enable disk cache (-o use_cache=/path/to/cache) | Persists across restarts; larger capacity than the page cache |
| Large number of objects, services do not need object metadata | Enable readdir_optimize (-o readdir_optimize) | Removes per-object HeadObject calls from directory traversal |

Note: direct_read and use_cache are mutually exclusive. When direct_read is enabled, use_cache has no effect.

Best practices

Read/write scenarios

Split reads and writes across separate OSS endpoints for best performance. See Best practices for OSS read/write splitting.

If splitting is not possible, upgrade to ossfs 1.91 or later to fix the EntityTooSmall multipart upload error. To ensure data consistency, add -o max_stat_cache_size=0 to the otherOpts field.
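A sketch of the volumeAttributes fragment with metadata caching disabled for consistency (the bucket name is a placeholder):

```yaml
    volumeAttributes:
      bucket: "example-bucket"  # placeholder
      # Disable the metadata cache so reads always see the latest writes:
      otherOpts: "-o max_stat_cache_size=0"
```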

Read-only scenarios

Use the following guidance to choose a caching strategy:

  • Direct read (-o direct_read): Use for one-pass sequential reads on large files where the data is not accessed again. Eliminates disk I/O by prefetching into memory.

  • Kernel page cache (-o kernel_cache): Use for files read repeatedly from the same node. Reuses cached data across reads with no disk write overhead.

  • Disk cache (-o use_cache=/path/to/cache): Use for files read repeatedly where the cache must persist across process restarts or outlast available memory.

Directory-heavy workloads

If a large number of objects exist in the OSS bucket and your services do not require object metadata, enable -o readdir_optimize. If versioning is enabled for the OSS bucket, also add -o listobjectsv2.
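A sketch of the volumeAttributes fragment for a directory-heavy, metadata-free workload on a versioned bucket (the bucket name is a placeholder):

```yaml
    volumeAttributes:
      bucket: "example-bucket"  # placeholder
      # readdir_optimize skips per-object HeadObject calls during traversal;
      # listobjectsv2 is required when versioning is enabled for the bucket.
      otherOpts: "-o readdir_optimize -o listobjectsv2"
```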

Performance benchmarks

The results below use Sysbench or custom scripts. Results vary by tool and environment.

Important

All benchmarks use an ecs.g7.xlarge node with a PL0 system disk.

Throughput (readdir optimization and direct read disabled)

Sysbench tests 128 files x 8 MiB with sequential reads, sequential writes, random reads, and random writes. Compared with 1.88.x:

  • ossfs 1.88.x produces higher throughput for file creates and sequential reads.

  • ossfs 1.91 and later produces higher throughput for sequential writes, random reads, and random writes.


Directory traversal latency after enabling readdir optimization

The test runs ls and find on 1,000 files and records latency per execution. Compared with ossfs 1.88.x and ossfs 1.91 with readdir optimization disabled:

  • ls latency is 74.8% lower than 1.88.x and 74.3% lower than 1.91 with readdir optimization disabled — 4.0x and 3.9x improvement, respectively.

  • find latency is 58.8% lower than both 1.88.x and 1.91 with readdir optimization disabled — 2.4x improvement in both cases.


Large file sequential read latency after enabling direct read

The test concurrently reads 10 files x 10 GiB and records latency, peak disk usage, and peak memory usage.

Peak memory usage covers all ossfs processes, including prefetched data and other direct read overhead.

Compared with 1.88.x and 1.91 with direct read disabled:

  • Latency is 85.3% lower than 1.88.x and 79% lower than 1.91 with direct read disabled.

  • Peak disk usage is 0 — no temporary files written to disk.

  • Peak memory usage is slightly higher, which enables the zero disk usage above.


Run your own benchmarks

Benchmark ossfs in containers or directly on ECS instances. The following steps use a containerized Sysbench environment.

Prerequisites

Before you begin, ensure that you have:

  • A cluster that runs the CSI plugin.

  • An OSS volume bound to a PersistentVolumeClaim named pvc-oss, which the Deployment below references.

Deploy the Sysbench test environment

  1. Create a file named sysbench.yaml with the following content. It deploys a Sysbench container with the PVC mounted at /data.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sysbench
      labels:
        app: sysbench
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sysbench
      template:
        metadata:
          labels:
            app: sysbench
        spec:
          containers:
          - name: sysbench
            image: registry.cn-beijing.aliyuncs.com/tool-sys/tf-train-demo:sysbench-sleep
            ports:
            - containerPort: 80
            volumeMounts:
              - name: pvc-oss
                mountPath: "/data"
            livenessProbe:
              exec:
                command:
                - sh
                - -c
                - cd /data
              initialDelaySeconds: 30
              periodSeconds: 30
          volumes:
            - name: pvc-oss
              persistentVolumeClaim:
                claimName: pvc-oss
  2. Deploy the Sysbench application:

    kubectl apply -f sysbench.yaml
  3. Log on to the Sysbench container, then run the following commands in the mount path to benchmark read/write throughput.

    Adjust parameter values to match your node specifications. For consecutive tests, prepare new test files each time to avoid cache interference.
    | Operation | Command |
    | --- | --- |
    | Prepare test files | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=rndrw prepare |
    | Test sequential write I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=seqwr --file-fsync-freq=0 run |
    | Test sequential read I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=seqrd --file-fsync-freq=0 run |
    | Test random read/write I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=rndrw --file-fsync-freq=0 run |
    | Delete test files | sysbench --test=fileio --file-total-size=1G cleanup |
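The whole sequence above can be wrapped in a small script. This is a sketch, not part of the product: it assumes the mount path is /data (from sysbench.yaml) and, for safety, only prints the commands by default; set DRY_RUN=0 to execute them inside the container.

```shell
#!/bin/sh
# Sketch: run the Sysbench fileio sequence from the table above.
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
COMMON="--num-threads=2 --max-requests=0 --max-time=120 \
--file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "sysbench $*"
  else
    sysbench "$@"
  fi
}

if [ "$DRY_RUN" != "1" ]; then
  cd "${MOUNT_PATH:-/data}" || exit 1     # benchmark inside the OSS mount
fi

run $COMMON --file-test-mode=rndrw prepare                   # create test files
run $COMMON --file-test-mode=seqwr --file-fsync-freq=0 run   # sequential write
run $COMMON --file-test-mode=seqrd --file-fsync-freq=0 run   # sequential read
run $COMMON --file-test-mode=rndrw --file-fsync-freq=0 run   # random read/write
run --test=fileio --file-total-size=1G cleanup               # delete test files
```

As noted above, prepare fresh test files before each run to avoid cache interference between consecutive tests.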

What's next

  • You can benchmark various versions of ossfs by using the MySQL benchmarking tool provided by Sysbench.

  • Test readdir optimization by running ls and find in the mount path.

  • Test direct read by adding -o direct_read to your PV's otherOpts and running concurrent sequential reads.