  1. ZABBIX BUGS AND ISSUES
  2. ZBX-21466

Unavailable mount error in zabbix-agent2

Details

    • Team C
    • Sprint 92 (Sep 2022)
    • 0.5

    Description

      With zabbix-agent2 running on virtual machines, file system monitoring items sometimes become unsupported.

      In such cases, random file systems on random hosts become unavailable with the following error message:

      ZBX_NOTSUPPORTED: 'mount '***' is unavailable' to '***'
      

      The debug log shows something like this:

      Aug 11 11:59:13 zabbix_agent2[14758]: received passive check request: 'vfs.fs.size[/opt,total]' from '10.9.49.35'
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] processing update request (1 requests)
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] adding new request for key: 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] created direct exporter task for plugin 'VfsFs' itemid:0 key 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: executing direct exporter task for key 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: failed to execute direct exporter task for key 'vfs.fs.size[/opt,total]' error: 'mount '/opt' is unavailable'
      Aug 11 11:59:13 zabbix_agent2[14758]: sending passive check response: ZBX_NOTSUPPORTED: 'mount '/opt' is unavailable' to '10.9.49.35'
      

      This behavior starts after the following log message:

      $ sudo journalctl -u zabbix-agent2 --since '3 days ago' | grep 'timed out'
      Aug 08 15:33:26 nl-build17.local.profee.com zabbix_agent2[14758]: check 'vfs.fs.size[/opt,free]' is not supported: operation on mount '/opt' timed out
      

      After that, the file system never becomes available again. Only restarting the zabbix-agent2 service helps.

      We looked through the source code of the `VfsFs` plugin. It seems that the branch of code that marks a file system as available again will never be executed.

      https://github.com/zabbix/zabbix/blob/master/src/go/plugins/vfs/fs/fscaller.go#L64-L70

      func (f *fsCaller) execute(path string) {
      	stats, err := f.fsFunc(path)
      
      	if isStuck(path) {
      		f.p.Debugf("mount '%s' has become available", path)
      		stuckMux.Lock()
      		stuckMounts[path] = false
      		stuckMux.Unlock()
      		return
      	}
      	// ...
      

      The only call to 'execute' happens inside the 'run' function.

      https://github.com/zabbix/zabbix/blob/master/src/go/plugins/vfs/fs/fscaller.go#L41-L46

      func (f *fsCaller) run(path string) (stat *FsStats, err error) {
      	if isStuck(path) {
      		return nil, fmt.Errorf("mount '%s' is unavailable", path)
      	}
      
      	go f.execute(path)
      	// ...
      

      These pieces of code look very strange. Both check the same condition, yet in one case the file system becomes available again and in the other it remains unavailable.
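
      The suspected control flow can be reproduced with a small standalone model. This is only a sketch, not the plugin's code: `tracker`, `setStuck`, `errTimeout` and the `probe` callback are hypothetical names, and the model is synchronous, so it leaves out the goroutine and the real timeout handling. It assumes that a timed-out call marks the mount as stuck, and shows only that once the flag is set, the early return in 'run' prevents any further probe of the mount.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errTimeout is a hypothetical error standing in for a timed-out fs call.
var errTimeout = errors.New("timeout")

// tracker is a simplified stand-in for the stuckMounts map and stuckMux mutex.
type tracker struct {
	mu    sync.Mutex
	stuck map[string]bool
}

func newTracker() *tracker {
	return &tracker{stuck: make(map[string]bool)}
}

func (t *tracker) isStuck(path string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.stuck[path]
}

func (t *tracker) setStuck(path string, v bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.stuck[path] = v
}

// run models the early return quoted from fscaller.go: when the mount is
// flagged as stuck, probe (the only code that touches the mount) is skipped.
func (t *tracker) run(path string, probe func(string) error) error {
	if t.isStuck(path) {
		return fmt.Errorf("mount '%s' is unavailable", path)
	}
	if err := probe(path); err != nil {
		// Assumption: a timed-out call is what flags the mount as stuck.
		t.setStuck(path, true)
		return fmt.Errorf("operation on mount '%s' timed out", path)
	}
	return nil
}

func main() {
	tr := newTracker()
	calls := 0
	fail := true
	probe := func(string) error {
		calls++
		if fail {
			return errTimeout
		}
		return nil
	}

	fmt.Println(tr.run("/opt", probe)) // operation on mount '/opt' timed out
	fail = false                       // the mount is healthy again
	fmt.Println(tr.run("/opt", probe)) // mount '/opt' is unavailable
	fmt.Println("probe calls:", calls) // probe calls: 1
}
```

      In this model the second request fails even though the mount has recovered, because nothing probes the mount after the flag is set. That matches the symptom described above, where only a restart of the agent clears the state.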

            People

              arimdjonoks Artjoms Rimdjonoks
              skokhanovskiy Stepan Kokhanovskiy