Container Service for Kubernetes: New features and performance benchmarking of ossfs 1.91 and later

Last Updated: Jan 11, 2025

As the CSI plugin is iteratively upgraded, ossfs must be upgraded to the corresponding version before new features can be used. In CSI plugin version 1.30.1 and later, you can enable feature gates to switch ossfs to version 1.91 or later to improve file operation performance. This topic introduces the new features of ossfs 1.91 and later, including POSIX operation optimizations, readdir optimization, and the new direct read feature, and provides performance benchmarking comparisons.
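To check which CSI plugin version a cluster runs, you can inspect the component image. The following is a hedged example that assumes the default ACK deployment, where the node-side component runs as the csi-plugin DaemonSet in the kube-system namespace:

    # List the images used by the CSI plugin DaemonSet (default ACK layout).
    kubectl -n kube-system get ds csi-plugin \
      -o jsonpath='{.spec.template.spec.containers[*].image}'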

Note

If you have high requirements on file operation performance, we recommend that you switch ossfs to version 1.91 or later.

New features in ossfs 1.91 and later

The following sections describe the main changes in ossfs 1.91 and later compared with version 1.88.x. Only the important configuration items are listed here. For the complete configuration items and more version change information, see the ossfs changelog.

Important

ossfs-related features are only supported on ECS nodes.

Basic POSIX operation optimization and bug fixes

  • OSS volumes can be mounted to subpaths that do not exist in OSS buckets.

  • Creating an object in OSS no longer uploads an extra zero-byte file. The issue that the EntityTooSmall error occasionally occurs during multipart upload is fixed. Append operations are improved.

  • Based on the open-source ossfs version and actual benchmarking results, the default values of some configuration items are modified.

    | Configuration item | Description | Default value in version 1.88.x | Default value in version 1.91 and later |
    | --- | --- | --- | --- |
    | stat_cache_expire | The validity period of metadata. Unit: seconds. | -1 (the metadata never expires) | 900 |
    | multipart_threshold | The size threshold for files that can be uploaded by using multipart upload. Unit: MB. | 5 × 1024 | 25 |
    | max_dirty_data | The size threshold for forcefully flushing dirty data to disks. Unit: MB. | -1 (dirty data is not forcefully flushed) | 5120 |

    To optimize the performance of large file processing, the following configuration items remain compatible with the previous 1.88.x version and differ from the defaults of the open-source ossfs version.

    | Configuration item | Description | Default value in open-source version 1.91 and later | Default value in version 1.91 and later |
    | --- | --- | --- | --- |
    | multipart_size | The part size when multipart upload is used. Unit: MB. | 10 | 30 |
    | parallel_count | The number of parts that can be concurrently uploaded. | 5 | 20 |

    When you use ossfs 1.91 or later, if you need to roll back or adjust the configuration, you can modify the optional otherOpts parameter in the persistent volume (PV), as shown in the example below.
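    The following PV snippet is a minimal, hedged sketch of where otherOpts is set. The bucket name and endpoint are placeholders, and the option values shown simply restore the 1.88.x defaults listed above.

      # A minimal sketch of a statically provisioned OSS PV (values are placeholders).
      apiVersion: v1
      kind: PersistentVolume
      metadata:
        name: pv-oss
      spec:
        capacity:
          storage: 5Gi
        accessModes:
          - ReadWriteMany
        csi:
          driver: ossplugin.csi.alibabacloud.com
          volumeHandle: pv-oss    # must match the PV name
          volumeAttributes:
            bucket: "example-bucket"
            url: "oss-cn-hangzhou.aliyuncs.com"
            # Roll back the defaults that changed in 1.91 to their 1.88.x values.
            otherOpts: "-o stat_cache_expire=-1 -o multipart_threshold=5120 -o max_dirty_data=-1"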

New readdir optimization feature

The readdir optimization feature is added to improve the efficiency of file system traversal.

To support authentication and POSIX operations, such as running the chmod command, when an OSS volume is mounted, the system calls the HeadObject operation many times to query the metadata of all objects in the mounted path of the OSS bucket, such as the permissions, modification time, user identifiers (UIDs), and group identifiers (GIDs) of the objects. If a path contains many files, the performance of ossfs may be adversely affected.

After you enable the readdir optimization feature, the system ignores the preceding metadata to optimize the readdir performance. Take note of the following items:

  • The chmod or chown command does not take effect.

  • Errors may occur when you use symbolic links to access objects. ossfs does not support hard links.

The following table describes the parameters that are required for enabling the readdir optimization feature:

| Configuration item | Description | How to enable | Default value in version 1.91 and later |
| --- | --- | --- | --- |
| readdir_optimize | Specifies whether to enable the readdir optimization feature. | Specify -o readdir_optimize. No value is required. | disable |
| symlink_in_meta | Specifies whether to record the metadata of symbolic links to ensure that the symbolic links can be displayed as expected. | Specify -o symlink_in_meta. No value is required. | disable |
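For example, both options could be enabled together through the otherOpts parameter of the PV. This is a hedged fragment; whether symlink_in_meta is needed depends on whether your workload uses symbolic links:

    # Enable readdir optimization and metadata recording for symbolic links.
    otherOpts: "-o readdir_optimize -o symlink_in_meta"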

New direct read feature

The direct read feature is introduced to improve the performance of sequential reads (read-only scenarios) performed on large files.

To support writes and random reads when mounting OSS volumes, ossfs downloads files from the OSS server to disks and then reads the data on the disks. In this case, the read performance of ossfs is limited by the disk I/O.

The direct read feature prefetches data from OSS into memory and the prefetched data is not immediately flushed to disks. This way, ossfs can directly read data from memory, which improves the performance of sequential reads. Take note of the following items:

  • The feature is recommended for sequential read (read-only) scenarios. After a file is opened:

    • If you perform random reads, ossfs prefetches data again. Many random reads may compromise the read performance.

    • If you perform writes, data is flushed from memory to disks to ensure data consistency.

  • The use_cache parameter becomes invalid when the direct read feature is enabled.

  • When data is prefetched from OSS to memory, the memory usage may increase. You can refer to the following table to configure the direct_read_prefetch_limit parameter to limit the memory usage of ossfs. When the memory usage of ossfs reaches the upper limit, ossfs stops prefetching data. In this case, the read performance of ossfs is limited by the network I/O.

The following table describes the parameters that are required for enabling the direct read feature:

| Configuration item | Description | Default value in version 1.91 and later |
| --- | --- | --- |
| direct_read | Specifies whether to enable the direct read feature. Specify -o direct_read to enable it. No value is required. | disable |
| direct_read_prefetch_limit | The maximum memory size that can be used to store data prefetched by ossfs processes. Unit: MB. | 1024 (minimum: 128) |

If you do not want to improve the performance of sequential reads by prefetching data when you use the direct read feature, you can add the -o direct_read_prefetch_chunks=0 parameter. In this case, ossfs directly reads data from the OSS server and the read performance is limited by the network I/O.
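As an illustration, a read-only PV intended for sequential reads on large files might carry options like the following. This is a hedged example; the 2048 MB prefetch limit is an arbitrary value:

    # Enable direct read and cap prefetch memory at 2 GiB (example value).
    otherOpts: "-o direct_read -o direct_read_prefetch_limit=2048"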

Best practices for updating ossfs to 1.91 or later

  • If many files exist on the OSS server and you do not need the metadata of the files, we recommend that you switch ossfs to version 1.91 or later and add the -o readdir_optimize parameter. If versioning is enabled for your bucket, we recommend that you also add the -o listobjectsv2 parameter.

  • In read/write scenarios, we recommend that you first consider Best practices for read/write splitting of OSS storage to split reads and writes. If you do not split reads and writes, we recommend that you update ossfs to version 1.91 or later to fix the issue that the EntityTooSmall error occasionally occurs during multipart upload. To ensure data consistency, we recommend that you also add the -o max_stat_cache_size=0 parameter.

  • Read-only scenarios

    • If you do not need to use cache in large file sequential read scenarios, we recommend that you add the -o direct_read parameter to enable the direct read feature.

    • If files are read frequently, we recommend that you configure the following parameters to use the local cache to accelerate the reads:

      • Add the -o kernel_cache parameter to use the page cache.

      • Add the -o use_cache=/path/to/cache parameter to use the disk cache.
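To make these recommendations concrete, the following hedged otherOpts fragments map each scenario to a mount option string. Option values and the cache path are placeholders:

    # Listing-heavy paths where object metadata is not needed (versioned bucket):
    otherOpts: "-o readdir_optimize -o listobjectsv2"

    # Mixed read/write workloads without read/write splitting:
    otherOpts: "-o max_stat_cache_size=0"

    # Read-only, large-file sequential reads:
    otherOpts: "-o direct_read"

    # Read-only with frequent re-reads, using the page cache and a disk cache:
    otherOpts: "-o kernel_cache -o use_cache=/path/to/cache"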

Performance comparison between ossfs 1.88.x and ossfs 1.91 and later

Important

The test results vary depending on the tools used. The data in this topic is obtained by using sysbench or custom scripts.

Throughput comparison

When the readdir optimization and direct read features are disabled, the following table compares the results of file creation, sequential read, sequential write, random read, and random write tests that are performed by using the sysbench benchmarking tool on an ecs.g7.xlarge node (with a PL0 system disk) on 128 files, each 8 MiB in size.

The preceding throughput comparison shows that when the readdir optimization feature and direct read feature are disabled:

  • ossfs version 1.88.x has advantages in file creation and sequential read.

  • ossfs version 1.91 and later has advantages in sequential write, random read, and random write.

Performance comparison of ls and find commands after readdir optimization is enabled

The following table compares the time consumed to run the ls and find commands on 1,000 files before and after the readdir optimization feature is enabled.
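A comparison of this kind can be reproduced with simple shell timing on the mounted path. The following is a sketch; /data is the example mount path used later in this topic:

    # Time directory traversal on the mounted OSS path (example path).
    time ls -l /data > /dev/null
    time find /data -type f > /dev/null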

The preceding performance comparison shows that for ossfs version 1.91 and later with -o readdir_optimize, that is, with the readdir optimization feature enabled:

  • The time consumed to run the ls command is reduced by 74.8% compared with ossfs version 1.88.x, a 4.0-fold performance improvement, and by 74.3% compared with ossfs version 1.91 and later with the feature disabled, a 3.9-fold improvement.

  • The time consumed to run the find command is reduced by 58.8% compared with ossfs version 1.88.x, a 2.4-fold performance improvement, and likewise by 58.8% compared with ossfs version 1.91 and later with the feature disabled, also a 2.4-fold improvement.

Performance comparison of sequential reads on large files after the direct read feature is enabled

The following table compares the time consumed, the maximum disk space usage, and the maximum memory usage when 10 files, each 10 GB in size, are concurrently read in sequence before and after the direct read feature is enabled.

Note

The maximum memory usage refers to the amount of memory used by all ossfs processes, including the memory used by prefetched data and the memory used by the direct read feature for other purposes.
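A workload of this shape could be reproduced with a script along the following lines. This is a sketch; the file names and the mount path are illustrative, and the files are assumed to already exist in the bucket:

    # Sequentially read ten 10 GB files in parallel (example paths).
    for i in $(seq 1 10); do
      cat /data/file_${i} > /dev/null &
    done
    wait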

The preceding performance comparison shows that for ossfs version 1.91 and later with -o direct_read, that is, with the direct read feature enabled:

  • The time consumed to read large files is reduced by 85.3% compared with ossfs version 1.88.x, and by 79% compared with ossfs version 1.91 and later with the direct read feature disabled.

  • The maximum disk space usage is 0, which is optimal.

  • The maximum memory usage increases slightly: a small amount of memory is traded for zero disk usage.

How to benchmark ossfs

You can benchmark ossfs in a containerized environment or directly in an ECS environment. The preceding performance data is obtained by using the open-source benchmarking tool sysbench or custom scripts. You can also switch to ossfs version 1.91 or later in your test environment to run the same comparisons. This section describes how to benchmark ossfs in a containerized test environment.

Procedure

  1. Create an OSS volume and a persistent volume claim (PVC). We recommend that you perform tests in a newly created bucket or subpath. For more information, see Use OSS static volumes.
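    A minimal PVC that binds to a statically provisioned PV named pv-oss might look as follows. This is a hedged sketch; the claim name matches the claimName used in the sysbench Deployment in the next step:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pvc-oss
    spec:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 5Gi
      volumeName: pv-oss   # binds the claim to the statically provisioned PV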

  2. Create the sysbench application by using the following sysbench.yaml content and mount the PVC that you created in the previous step.


    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: sysbench
      labels:
        app: sysbench
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: sysbench
      template:
        metadata:
          labels:
            app: sysbench
        spec:
          containers:
          - name: sysbench
            image: registry.cn-beijing.aliyuncs.com/tool-sys/tf-train-demo:sysbench-sleep
            ports:
            - containerPort: 80
            volumeMounts:
              - name: pvc-oss
                mountPath: "/data"
            livenessProbe:
              exec:
                command:
                - sh
                - -c
                - cd /data
              initialDelaySeconds: 30
              periodSeconds: 30
          volumes:
            - name: pvc-oss
              persistentVolumeClaim:
                claimName: pvc-oss
  3. Run the following command to deploy the sysbench application.

    kubectl apply -f sysbench.yaml
  4. Log on to the sysbench container and run the following commands in the mount path to perform read/write throughput tests.

    Note
    • Modify the parameter values in the commands based on the actual node specifications or your business requirements.

    • If you want to perform consecutive tests, we recommend that you prepare new test files for new tests to eliminate the influence that data cache imposes on the test results.

    | Function | Command |
    | --- | --- |
    | Prepare test files | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=rndrw prepare |
    | Test the sequential write I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=seqwr --file-fsync-freq=0 run |
    | Test the sequential read I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=seqrd --file-fsync-freq=0 run |
    | Test the random read/write I/O | sysbench --num-threads=2 --max-requests=0 --max-time=120 --file-num=128 --file-block-size=16384 --test=fileio --file-total-size=1G --file-test-mode=rndrw --file-fsync-freq=0 run |
    | Delete test files | sysbench --test=fileio --file-total-size=1G cleanup |
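    These commands can also be run from outside the pod with kubectl exec. For example, the prepare step (assuming the Deployment name from the sample above):

    # Run the prepare step inside the sysbench pod from your workstation.
    kubectl exec -it deploy/sysbench -- sh -c \
      "cd /data && sysbench --num-threads=2 --max-requests=0 --max-time=120 \
      --file-num=128 --file-block-size=16384 --test=fileio \
      --file-total-size=1G --file-test-mode=rndrw prepare"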

What to do next

  • You can benchmark various versions of ossfs by using the MySQL benchmarking tool provided by sysbench.

  • You can also test the readdir optimization and direct read features in the preceding test environment by running the ls and find commands or by concurrently performing sequential reads.