ZABBIX BUGS AND ISSUES
ZBX-19348

Don't use MNT_WAIT with getmntinfo(3)


    • Problem report
    • Resolution: Fixed
    • Trivial
    • 5.4.1rc1, 6.0.0alpha1, 6.0 (plan)
    • 5.4.0rc1
    • Agent (G)
    • None
    • FreeBSD, and probably NetBSD, OpenBSD, and OSX too.
    • Team A
    • Sprint 76 (May 2021)
    • 0.5

      Steps to reproduce:

      1. Use VFS_FS_DISCOVERY on a host with many ZFS file systems and heavy control path activity (`zfs snapshot`, `zfs destroy`, `zfs recv`, etc.).


      VFS discovery will be very slow.  It can easily exceed the maximum timeout period.  This leads to plentiful false alarms about "Zabbix agent down on host ..."

      The VFS_FS_DISCOVERY function tries to discover every mounted file system.  On the BSDs, it uses `getmntinfo`, but it sets the mode argument to `MNT_WAIT`.  That means that the kernel effectively calls `statfs` on every single file system in order to ensure that fields like `f_bfree` are up to date.  Not only is that expensive in general, but on ZFS such calls can block for a long time if operations like a `zfs destroy` are in progress.
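
      A rough way to observe the cost described above is to time a single `getmntinfo` call in each mode.  The sketch below is only an illustration, not Zabbix code: `time_getmntinfo`, `elapsed_seconds`, and the `HAVE_BSD_GETMNTINFO` macro are hypothetical names, and the platform guard assumes the `struct statfs` flavor of `getmntinfo` found on FreeBSD, OpenBSD, DragonFly, and macOS.  On any other platform the probe simply reports failure.

```c
#include <assert.h>
#include <time.h>

/* Assumption: these platforms provide getmntinfo(3) with struct statfs. */
#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__DragonFly__) || defined(__APPLE__)
#define HAVE_BSD_GETMNTINFO 1
#include <sys/param.h>
#include <sys/mount.h>
#endif

/* Seconds elapsed between two timestamps; start must not be later than end. */
static double
elapsed_seconds(struct timespec start, struct timespec end)
{
    assert(end.tv_sec >= start.tv_sec);
    return (double)(end.tv_sec - start.tv_sec) +
        (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/*
 * Hypothetical probe: time one getmntinfo() call.
 * wait != 0 selects MNT_WAIT, wait == 0 selects MNT_NOWAIT.
 * Returns the elapsed seconds, or -1.0 on error or on platforms
 * outside the guard above.
 */
static double
time_getmntinfo(int wait)
{
#ifdef HAVE_BSD_GETMNTINFO
    struct statfs *mntbuf;
    struct timespec start, end;
    int n;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    /* getmntinfo() returns 0 on failure; its buffer must not be freed. */
    n = getmntinfo(&mntbuf, wait ? MNT_WAIT : MNT_NOWAIT);
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0 || n == 0)
        return -1.0;
    return elapsed_seconds(start, end);
#else
    (void)wait;
    return -1.0;
#endif
}
```

      Comparing `time_getmntinfo(1)` with `time_getmntinfo(0)` while a `zfs destroy` is running should show the difference.  Because a waiting call refreshes the cached statistics, the order of the two calls can influence the numbers, so treat the result as a coarse indication only.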

      In fact, Zabbix doesn't even use `f_bfree` or any of the other fields that require frequent updates.  The only fields that VFS_FS_DISCOVERY uses are `f_mntonname` and `f_fstypename`.  Those two will always be up-to-date even without `MNT_WAIT`, except temporarily while a file system is being unmounted.  So there's no reason to use `MNT_WAIT`.

      On Solaris, Zabbix simply reads `/etc/mnttab`, and on Linux it reads `/proc/mounts`.  Neither of those does anything like what `getmntinfo` does with `MNT_WAIT`.  In fact, they don't even report the used space of each file system.

      In conclusion, Zabbix should replace all uses of `MNT_WAIT` with `MNT_NOWAIT`.
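
      The proposed change can be sketched as follows.  This is a minimal illustration under stated assumptions rather than the actual Zabbix patch: `for_each_mount_nowait`, `print_mount`, the `mount_cb` type, and the `HAVE_BSD_GETMNTINFO` macro are hypothetical names, and the guard covers only platforms where `getmntinfo` fills a `struct statfs` array (FreeBSD, OpenBSD, DragonFly, macOS).  Only `f_mntonname` and `f_fstypename` are read, which is why `MNT_NOWAIT` is sufficient.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: these platforms provide getmntinfo(3) with struct statfs. */
#if defined(__FreeBSD__) || defined(__OpenBSD__) || defined(__DragonFly__) || defined(__APPLE__)
#define HAVE_BSD_GETMNTINFO 1
#include <sys/param.h>
#include <sys/mount.h>
#endif

/* Hypothetical callback: receives one mount point and its file system type. */
typedef void (*mount_cb)(const char *mntonname, const char *fstypename, void *arg);

/* Example callback: writes "mountpoint<TAB>fstype" to the FILE passed in arg. */
static void
print_mount(const char *mntonname, const char *fstypename, void *arg)
{
    fprintf((FILE *)arg, "%s\t%s\n", mntonname, fstypename);
}

/*
 * Visit every mounted file system without forcing the kernel to
 * refresh per-file-system statistics.
 * Returns the number of mounts visited, or -1 on error or on
 * platforms outside the guard above.
 */
static int
for_each_mount_nowait(mount_cb cb, void *arg)
{
    assert(cb != NULL);
#ifdef HAVE_BSD_GETMNTINFO
    struct statfs *mntbuf;
    int i, n;

    /* MNT_NOWAIT returns cached data; getmntinfo() returns 0 on failure. */
    n = getmntinfo(&mntbuf, MNT_NOWAIT);
    if (n == 0)
        return -1;
    for (i = 0; i < n; i++)
        cb(mntbuf[i].f_mntonname, mntbuf[i].f_fstypename, arg);
    /* The array is owned by the library and must not be freed. */
    return n;
#else
    (void)arg;
    return -1;
#endif
}
```

      Calling `for_each_mount_nowait(print_mount, stdout)` from `main` prints one line per mounted file system on a supported platform.  NetBSD is left out of the guard because its `getmntinfo` uses `struct statvfs` and would need a separate branch.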

            dgoloscapov Dmitrijs Goloscapovs
            asomers Alan Somers
            Team A