
How do I avoid exceptions when multiple processes and multiple clients write to the same log file at the same time?

Last Updated: Oct 30, 2017

NAS enables multiple clients to share and write files in the same namespace using the NFS protocol. However, the NFS protocol does not support Atomic Append semantics.

Problem

When multiple processes or clients write to the same file (for example, the same log) at the same time, and each process independently maintains its own context (such as the file descriptor and the write offset), their writes can overwrite each other, interleave, or land out of order.
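The overwrite case can be reproduced on a single machine. The sketch below is only an illustration, not part of the original article: `demo_independent_offsets` is a hypothetical helper that opens one file through two descriptors without O_APPEND, so each descriptor keeps its own offset, much like two NFS clients that each track the write location themselves. The second write lands on top of the first.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes to `path` through two independent
 * descriptors, then reads the file back into `out`.
 * Returns the number of bytes read back, or -1 on error.
 */
ssize_t demo_independent_offsets(const char *path, char *out, size_t out_size)
{
    int fd1 = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd1 < 0)
        return -1;
    int fd2 = open(path, O_WRONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    /* Both descriptors start at offset 0 and do not see each other's offset. */
    int ok = write(fd1, "AAAA\n", 5) == 5   /* file holds "AAAA\n"            */
          && write(fd2, "BB\n", 3) == 3;    /* overwrites the first 3 bytes   */
    close(fd1);
    close(fd2);
    if (!ok)
        return -1;

    int rfd = open(path, O_RDONLY);
    if (rfd < 0)
        return -1;
    ssize_t n = read(rfd, out, out_size - 1);
    close(rfd);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

After the call, the file contains "BB", a newline, "A", and a newline: the second writer destroyed most of the first writer's line. On a local file system O_APPEND would prevent this, but on NFS the client has to emulate O_APPEND, so the same race remains between clients.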

Solution

Two solutions are provided:

  • (Recommended) Have each process or client write to its own file in the same file system, and then merge these files when you analyze them. This solution is preferable because it avoids concurrent-write problems without any file locking and without a performance penalty.

  • Use the flock with seek method to make each write atomic and consistent. Because every write has to acquire and release a lock, this method is relatively slow and may significantly affect performance. The following steps detail how to implement the flock with seek method in Linux.
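For the recommended solution, one possible way to give every writer its own file is to put the host name and process ID into the file name. This is a minimal sketch under that assumption; the helper names `build_private_log_path` and `open_private_log` and the naming scheme are hypothetical and not prescribed by NAS.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/utsname.h>
#include <unistd.h>

/*
 * Hypothetical helper: builds "<dir>/<base>.<hostname>.<pid>.log" in `out`.
 * Returns 0 on success, -1 if the host name is unavailable or the
 * result does not fit into `out`.
 */
int build_private_log_path(char *out, size_t out_size,
                           const char *dir, const char *base)
{
    struct utsname u;
    if (uname(&u) != 0)
        return -1;

    int n = snprintf(out, out_size, "%s/%s.%s.%ld.log",
                     dir, base, u.nodename, (long)getpid());
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Opens (and creates if needed) this process's private log file. */
int open_private_log(const char *dir, const char *base)
{
    char path[1024];
    if (build_private_log_path(path, sizeof path, dir, base) != 0)
        return -1;
    /* Only this process writes to the file, so no lock is needed. */
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}
```

A process would call `open_private_log("/mnt", "blog")` once at startup and write to the returned descriptor as usual. The per-writer files can then be concatenated or merged by timestamp during analysis.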

Flock with seek method

Note: To use flock() on the NAS file system, your Linux kernel version must be 2.6.12 or later. If your Linux kernel uses an earlier version, use fcntl().
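For kernels where flock() cannot be used, the fcntl() alternative mentioned in the note could look roughly like the sketch below. It is an assumption about how to apply the same lock, seek, write, unlock sequence with POSIX record locks; `set_whole_file_lock` and `locked_append` are hypothetical helper names.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: applies F_WRLCK or F_UNLCK to the whole file. */
static int set_whole_file_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;  /* 0 = through end of file, including future growth */

    int ret;
    do {
        ret = fcntl(fd, F_SETLKW, &fl);  /* blocks until the lock is granted */
    } while (ret == -1 && errno == EINTR);
    return ret;
}

/*
 * Hypothetical helper: appends `len` bytes under an exclusive fcntl() lock.
 * `fd` must be open for writing. Returns 0 on success, -1 on error.
 */
int locked_append(int fd, const void *buf, size_t len)
{
    if (set_whole_file_lock(fd, F_WRLCK) != 0)
        return -1;

    int result = -1;
    if (lseek(fd, 0, SEEK_END) >= 0) {          /* move to the current end */
        ssize_t n = write(fd, buf, len);
        if (n == (ssize_t)len)
            result = 0;
    }

    if (set_whole_file_lock(fd, F_UNLCK) != 0)  /* always release the lock */
        result = -1;
    return result;
}
```

Unlike the non-blocking flock() call used in the steps below, F_SETLKW waits for the lock, so no retry loop is needed. Note that fcntl() locks belong to the process: closing any descriptor for the file releases them.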

  1. Call fd = open(filename, O_WRONLY | O_APPEND | O_DIRECT) to open the file in append mode and obtain the file descriptor fd. O_DIRECT bypasses the page cache so that data is written directly.

  2. Call flock(fd, LOCK_EX|LOCK_NB) to acquire the file lock. If the lock cannot be acquired (for example, because another process holds it), flock() returns -1 immediately and sets errno to EWOULDBLOCK. Retry or perform error handling.

  3. Call lseek(fd, 0, SEEK_END) to point the current file offset (cfo) of the fd to the end of the file.

  4. Perform normal write operations. The data is written at the end of the file, and the file lock prevents other writers from overwriting it.

  5. Call flock(fd, LOCK_UN) to release the filelock after the write operation.

    The following is a simple C language sample program.

    #define _GNU_SOURCE   /* required for O_DIRECT */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/file.h>
    #include <unistd.h>

    const char *OUTPUT_FILE = "/mnt/blog";
    int WRITE_COUNT = 50000;

    /* Acquire an exclusive lock, backing off briefly while another writer holds it. */
    int do_lock(int fd)
    {
        while (flock(fd, LOCK_EX | LOCK_NB) != 0)
        {
            if (errno != EWOULDBLOCK && errno != EINTR)
            {
                return -1;  /* real failure: retrying would loop forever */
            }
            usleep((rand() % 10) * 1000);  /* wait 0 to 9 ms, then retry */
        }
        return 0;
    }

    int do_unlock(int fd)
    {
        return flock(fd, LOCK_UN);
    }

    int main(void)
    {
        /* The file must already exist because O_CREAT is not specified.
           This sample targets an NFS mount; local file systems typically
           require aligned buffers when O_DIRECT is used. */
        int fd = open(OUTPUT_FILE, O_WRONLY | O_APPEND | O_DIRECT);
        if (fd < 0)
        {
            perror("Error Open");
            exit(EXIT_FAILURE);
        }

        /* Seed per process so that competing writers do not back off in lockstep. */
        srand((unsigned)getpid());

        const char *buf = "one line\n";
        size_t len = strlen(buf);

        for (int i = 0; i < WRITE_COUNT; ++i)
        {
            /* Lock file */
            if (do_lock(fd) != 0)
            {
                perror("Lock Error");
                exit(EXIT_FAILURE);
            }

            /* Seek to the end */
            off_t off = lseek(fd, 0, SEEK_END);
            if (off < 0)
            {
                perror("Seek Error");
                exit(EXIT_FAILURE);
            }

            /* Write to file; a short write is treated as an error */
            ssize_t n = write(fd, buf, len);
            if (n != (ssize_t)len)
            {
                perror("Write Error");
                exit(EXIT_FAILURE);
            }

            /* Unlock file */
            if (do_unlock(fd) != 0)
            {
                perror("UnLock Error");
                exit(EXIT_FAILURE);
            }
        }

        close(fd);
        return 0;
    }

    For more information, see Linux file locking mechanisms - Flock, Lockf, and Fcntl.
