What is the read and write performance of a file system related to?

The peak read and write performance of a file system is linearly proportional to its used capacity. The larger the used capacity, the higher the peak throughput.

What is IOPS? What are the relationships between IOPS and throughput, read and write block size, and latency?

IOPS refers to the number of input/output operations per second.

The following formulas indicate the relationships between IOPS and read and write block size, throughput, number of reads and writes, and latency:

Throughput = IOPS × Read and write block size

IOPS = Number of reads and writes / Overall latency

For example, a Capacity NAS file system has a write latency of 100 ms per 1 MiB, 15 ms per 8 KiB, and 10 ms per 4 KiB. You can send a maximum of 128 concurrent requests to a Capacity NAS file system. If you want to write 1 MiB of data per second to a file system, you can use the solutions in the following table.
| No. | Read and write block size | Concurrency | Number of write operations | Overall latency | IOPS | Throughput | Description |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Solution 1 | 4 KiB | 1 | 250 | 10 ms × 250 = 2.5 s | 250 / 2.5 s = 100 | 4 KiB × 100 = 400 KiB/s | Small read and write block size and low concurrency result in poor throughput and high latency. In this case, the throughput cannot reach 1 MiB/s. |
| Solution 2 | 1 MiB | 1 | 1 | 100 ms | 1 / 0.1 s = 10 | 1 MiB × 10 = 10 MiB/s | Compared with Solution 1, the read and write block size of Solution 2 is increased, and the throughput and latency performance are improved. The throughput can reach 1 MiB/s. However, the overall latency is high. |
| Solution 3 | 4 KiB | 125 | 250 | 10 ms × (250 / 125) = 20 ms | 250 / 0.02 s = 12500 | 4 KiB × 12500 ≈ 49 MiB/s | Compared with Solution 1, the concurrency of Solution 3 is increased, and the throughput and latency performance are improved. The throughput can reach 1 MiB/s. The overall latency is low, but the IOPS reaches the upper limit of the file system. |
| Solution 4 | 8 KiB | 125 | 125 | 15 ms × (125 / 125) = 15 ms | 125 / 0.015 s ≈ 8333 | 8 KiB × 8333 ≈ 65 MiB/s | Compared with Solution 1, the read and write block size and concurrency of Solution 4 are increased, and the throughput and latency performance are improved. The throughput can reach 1 MiB/s. The overall latency is the lowest among the four solutions and the IOPS is low. |
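The arithmetic behind the four solutions can be checked with a short script. This is a minimal sketch, not an official sizing tool: the function name `plan` is hypothetical, and it assumes the same simplified model as the table, in which requests are issued in batches of `concurrency` and each batch takes one per-request latency.

```python
KIB = 1024
MIB = 1024 * KIB

def plan(block_bytes, concurrency, num_writes, latency_s):
    """Return (overall latency in s, IOPS, throughput in bytes/s).

    Assumed model: writes are issued in batches of `concurrency`, and every
    batch completes in one per-request latency.
    """
    overall = latency_s * (num_writes / concurrency)
    iops = num_writes / overall
    return overall, iops, iops * block_bytes

# Inputs taken from the four solutions above.
solutions = {
    "Solution 1": plan(4 * KIB, 1, 250, 0.010),
    "Solution 2": plan(1 * MIB, 1, 1, 0.100),
    "Solution 3": plan(4 * KIB, 125, 250, 0.010),
    "Solution 4": plan(8 * KIB, 125, 125, 0.015),
}

for name, (overall, iops, tput) in solutions.items():
    print(f"{name}: overall latency={overall:.3f} s, "
          f"IOPS={iops:.0f}, throughput={tput / MIB:.2f} MiB/s")
```

Changing the block size or the concurrency in the calls to `plan` shows how each factor moves throughput independently of the other.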

What happens if the read and write throughput of a request exceeds the bandwidth threshold?

If the read and write throughput of a request sent by you or your application exceeds the bandwidth threshold, NAS throttles the request, which increases the latency. For more information, see Performance metrics.

Why does NGINX take a long time to write logs to a file system?

  • Background information

    Two directives control NGINX logging. The log_format directive specifies the format of the logs. The access_log directive specifies the log storage path, the format name, and the buffer size.

  • Issue

    NGINX requires a long time to write logs to the file system and the performance of the file system is compromised.

  • Cause

    The path that is specified in the access_log directive contains variables. In this case, NGINX opens the destination file each time it writes a log entry and closes the file after the write is complete. To ensure data visibility, NAS writes the data back to the NAS server each time a file is closed. The repeated open, write-back, and close cycle compromises the performance of the file system.

  • Solution
    • Solution 1: Delete the variables in the access_log directive and store the logs in a fixed path.
    • Solution 2: Use the open_log_file_cache directive to cache the file descriptors of frequently used logs. This improves the performance of log storage to the path that contains the variables. For more information, see open_log_file_cache.
      Recommended configurations:
      open_log_file_cache max=1000 inactive=1m valid=3m min_uses=2;
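The cost pattern described in the cause can be reproduced outside NGINX. The following sketch is an illustration only, not NGINX code: the function names are hypothetical, and the demo writes to a temporary local directory. To see the effect described above, point `target_dir` at a directory on a mounted NAS file system, where every close triggers a write-back.

```python
import os
import tempfile
import time

def write_reopening(path, lines):
    """Open, append, and close the file for every line (variable-path behavior)."""
    for line in lines:
        with open(path, "a") as f:
            f.write(line)

def write_keeping_open(path, lines):
    """Open the file once and reuse the descriptor for all lines (cached descriptor)."""
    with open(path, "a") as f:
        for line in lines:
            f.write(line)

def timed(fn, path, lines):
    """Return the wall-clock seconds that fn takes to write lines to path."""
    start = time.perf_counter()
    fn(path, lines)
    return time.perf_counter() - start

if __name__ == "__main__":
    lines = [f"GET /item/{i} 200\n" for i in range(2000)]
    with tempfile.TemporaryDirectory() as target_dir:
        t_reopen = timed(write_reopening, os.path.join(target_dir, "reopen.log"), lines)
        t_keep = timed(write_keeping_open, os.path.join(target_dir, "keep.log"), lines)
        print(f"reopen per line: {t_reopen:.4f} s, keep open: {t_keep:.4f} s")
```

On a local disk the gap is usually small. The difference is expected to grow on a network file system because each close forces data back to the server.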

Why does an SMB file system have I/O latency?

  • Issue

    If you access a Server Message Block (SMB) file system by using a mount target, you must wait for several minutes before you can perform I/O operations on the file system.

  • Cause
    • You must wait for several minutes because a Network File System (NFS) client is installed but not used.
    • The file server cannot access the SMB file system because the WebClient service is enabled.
    • The files in the file system cannot be opened because Nfsnp is included in the value of the ProviderOrder key.
  • Solution
    1. When you access an SMB file system for the first time, ping the domain name of the mount target to check the network connectivity between the compute node and the file system. You can also check whether the latency is within the allowed range.
      • If the ping command fails, check your network settings and make sure that the network is connected.
      • If the latency is high, run the ping command to ping the IP address of the mount target. If the latency of accessing the IP address is lower than the latency of accessing the domain name, check the configurations of the Domain Name System (DNS) server.
    2. If an NFS client is installed but not used, we recommend that you delete the NFS client.
    3. Disable the WebClient service.
    4. Check the Registry key in the following path: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\NetworkProvider\Order\ProviderOrder. If the value of the ProviderOrder key contains Nfsnp, remove Nfsnp and restart the ECS instance on which the file system is mounted.
Note
  • You can use the fio tool to check the performance of the file system.
    fio.exe --name=./iotest1 --direct=1 --rwmixread=0 --rw=write --bs=4K --numjobs=1 --thread --iodepth=128 --runtime=300 --group_reporting --size=5G --verify=md5 --randrepeat=0 --norandommap --refill_buffers --filename=\\<mount point dns>\myshare\testfio1
  • We recommend that you read and write data in large blocks. Small blocks consume more network resources for the same amount of data. If you cannot modify the block size, you can wrap the output stream in a BufferedOutputStream with a specified buffer size so that small writes are combined before they are sent.
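The buffering idea in the note can be sketched as follows. The example uses Python's `io.BufferedWriter` as an analogue of BufferedOutputStream. The `CountingRaw` class and the `write_records` function are hypothetical helpers: `CountingRaw` stands in for the network path and only counts how many write calls reach it.

```python
import io

class CountingRaw(io.RawIOBase):
    """Raw sink that counts write calls (a stand-in for writes sent over the network)."""

    def __init__(self):
        super().__init__()
        self.calls = 0
        self.total = 0

    def writable(self):
        return True

    def write(self, b):
        self.calls += 1
        self.total += len(b)
        return len(b)

def write_records(raw, records, buffer_size=None):
    """Write records directly, or through a buffer of buffer_size bytes."""
    if buffer_size is None:
        for record in records:
            raw.write(record)
        return
    writer = io.BufferedWriter(raw, buffer_size=buffer_size)
    for record in records:
        writer.write(record)
    writer.flush()
    writer.detach()  # keep the raw sink open for inspection

records = [b"x" * 4096] * 256  # 256 small writes of 4 KiB each

direct = CountingRaw()
write_records(direct, records)

buffered = CountingRaw()
write_records(buffered, records, buffer_size=1024 * 1024)

print(f"direct: {direct.calls} calls, buffered: {buffered.calls} calls")
```

For a real file, the same effect is available through the `buffering` argument of the built-in `open`, for example `open(path, "wb", buffering=1024 * 1024)`.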

Why am I unable to improve the I/O performance of Windows SMB clients?

  • Cause

    By default, the large MTU feature is disabled on Windows SMB clients. This limits the I/O performance of Windows SMB clients.

  • Solution

    You can enable the large MTU feature by modifying the Windows Registry. The Registry key is stored in the following path: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkstation\Parameters.

    Create a key of the DWORD data type named DisableLargeMtu and set its value to 0. Restart the ECS instance on which the file system is mounted for the change to take effect.

How can I improve the performance for access from IIS to NAS?

  • Cause

    When Internet Information Services (IIS) accesses a file in the shared directory of a NAS file system, the IIS backend may access the shared directory multiple times. Unlike access to a local disk, each access to the NAS file system requires at least one network round trip. Although a single request does not take long, the client may take a long time to respond when many requests accumulate.

  • Solution
    1. Use the SMB Redirector component to optimize the performance of SMB file systems. For more information, see SMB2 Client Redirector Caches Explained.
      Modify the Registry keys in the following path: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\LanmanWorkstation\Parameters. Modify the values of the following three keys to 600 or higher:
      • FileInfoCacheLifetime
      • FileNotFoundCacheLifetime
      • DirectoryCacheLifetime
      Note If none of the preceding keys exists, perform the following operations:
      1. Make sure that the file system uses the SMB protocol.
      2. Check whether the Windows version supports the keys. If the Windows version supports the keys but the keys do not exist, create the keys. For more information, see Performance tuning for file servers.
    2. If IIS frequently accesses web-related files such as JS and CSS files, we recommend that you store these files on local disks.
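The three cache lifetime keys from step 1 can be prepared as a .reg file instead of being edited by hand. The following is a minimal sketch: the function name `build_reg_file` is hypothetical, and the script only generates the text. Review the output before you import it with the Registry Editor on the Windows client.

```python
# Registry path and key names as listed in step 1 above.
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\LanmanWorkstation\Parameters"
CACHE_KEYS = (
    "FileInfoCacheLifetime",
    "FileNotFoundCacheLifetime",
    "DirectoryCacheLifetime",
)

def build_reg_file(lifetime=600):
    """Return .reg file text that sets every cache lifetime key to `lifetime`."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{KEY_PATH}]"]
    for name in CACHE_KEYS:
        # DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{lifetime:08x}')
    return "\r\n".join(lines) + "\r\n"

print(build_reg_file(600))
```

Pass a larger value to `build_reg_file` if 600 is not enough for your workload.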

If the read and write performance of IIS cannot meet your business requirements, submit a ticket.

Why is the performance of an NFS client poor on Linux?

  • Issue

    The read and write speed of an NFS client on Linux is only a few MB/s.

  • Cause

    By default, a maximum of 2 concurrent NFS requests can be sent from a Linux-based NFS client. This limits the performance of the NFS client.

  • Solution

    After an NFS client is installed, increase the maximum number of concurrent NFS requests to improve the performance of the NFS client. For more information, see "How can I modify the maximum number of concurrent NFS requests?"
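Before you change the setting, you can check the current limit on the client. The sketch below rests on an assumption that this section does not state: that the limit is the sunrpc kernel parameter tcp_slot_table_entries, exposed at /proc/sys/sunrpc/tcp_slot_table_entries once the sunrpc module is loaded. The function name is hypothetical.

```python
from pathlib import Path

# Assumption: the concurrency limit is exposed by the sunrpc module at this path.
SLOT_TABLE_PATH = "/proc/sys/sunrpc/tcp_slot_table_entries"

def read_nfs_slot_entries(path=SLOT_TABLE_PATH):
    """Return the current limit as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    value = read_nfs_slot_entries()
    if value is None:
        print("Limit not available (is the NFS client installed and loaded?)")
    elif value <= 2:
        print(f"Limit is {value}: concurrency is likely the bottleneck")
    else:
        print(f"Limit is {value}")
```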

How do I increase the read and write bandwidth of a General-purpose NAS file system?

The read and write bandwidth of a General-purpose NAS file system linearly increases with the capacity of the file system. For more information, see General-purpose NAS file systems.

You can increase the used capacity of the file system by writing hole files (sparse files) to it, which in turn increases the read and write bandwidth. File holes occupy storage space in NAS. Therefore, you are charged for the file holes in your NAS file system. For more information, see Billing of General-purpose NAS file systems.

For example, if you write a 1,000 GiB hole file to a Capacity NAS file system, you can increase the read and write bandwidth of the file system by 150 MB/s. If you write a 1,000 GiB hole file to a Performance NAS file system, you can increase the read and write bandwidth of the file system by 600 MB/s.
  • Linux
    dd if=/dev/zero of=/mnt/sparse_file.txt bs=1 count=0 seek=1073741824000
    In the preceding command, /mnt is the mount path of the file system on the compute node.
  • Windows
    fsutil file createnew Z:\sparse_file.txt 1073741824000
    In the preceding command, Z:\ is the mount path of the file system on the compute node.
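The sizing and the file creation can also be scripted. This is a sketch under stated assumptions: both function names are hypothetical, the sizing helper assumes that bandwidth scales linearly at the per-1,000-GiB rates quoted in the example above, and the demo creates a small file in a temporary local directory rather than on a NAS mount.

```python
import os
import tempfile

def hole_size_for_gain(target_gain_mb_s, gain_per_1000_gib_mb_s):
    """Return the hole file size in GiB needed for a target bandwidth gain.

    Assumes linear scaling, for example 150 MB/s per 1,000 GiB for Capacity NAS.
    """
    return 1000 * target_gain_mb_s / gain_per_1000_gib_mb_s

def create_hole_file(path, size_bytes):
    """Create a file of size_bytes without writing data (overwrites an existing file)."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)

# Size needed to add 300 MB/s on a Capacity NAS file system.
print(hole_size_for_gain(300, 150), "GiB")

# Demo on a temporary local directory with a small 10 MiB file.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "sparse_file.txt")
    create_hole_file(demo_path, 10 * 1024 * 1024)
    print(os.path.getsize(demo_path), "bytes")
```

To apply this to a real file system, replace the temporary directory with the mount path and pass the size in bytes.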

Why does a file system respond slowly or not respond when I run the ls command?

  • Issue

    When you traverse a directory of a file system, the file system responds slowly or does not respond. This may occur when you run the ls command or commands that contain wildcards such as * and ?. It may also occur when you run the rm -rf command, or when the getdents system call is invoked.

  • Cause
    • The directory is being modified. For example, a directory is being created or deleted, or a file in the directory is being renamed. This leads to slow responses due to frequent cache invalidations.
    • The data size of the directory is too large. This leads to slow responses due to cache eviction.
  • Solution
    • Limit the number of files stored in the directory. Store fewer than 10,000 files in a single directory.
    • Do not frequently modify the directory when you traverse the directory.
    • If you store more than 10,000 files in the directory and the directory does not need to be frequently modified, you can accelerate the traversal to some extent by mounting the file system with the NFSv3 protocol and the nordirplus parameter. For more information, see Mount parameters.
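To find out whether a directory exceeds the recommended size, you can count its entries without listing or sorting all of them. The following is a minimal sketch: the function name `count_entries` is hypothetical, and the 10,000 limit comes from the recommendation above. It uses `os.scandir`, which yields entries lazily, and stops as soon as the limit is exceeded.

```python
import os
import tempfile

def count_entries(directory, limit=10_000):
    """Count directory entries lazily and stop once the count exceeds limit.

    A return value greater than limit means the directory is over the limit.
    """
    count = 0
    with os.scandir(directory) as entries:
        for _ in entries:
            count += 1
            if count > limit:
                break
    return count

# Demo on a temporary local directory with five files.
with tempfile.TemporaryDirectory() as demo_dir:
    for i in range(5):
        open(os.path.join(demo_dir, f"file{i}"), "w").close()
    print(count_entries(demo_dir))            # all entries counted
    print(count_entries(demo_dir, limit=3))   # stops early, over the limit
```

The traversal still reads directory entries from the server, so run it when the directory is not being modified.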