  1. ZABBIX BUGS AND ISSUES
  2. ZBX-21466

Unavailable mount error in zabbix-agent2


    • Sprint 92 (Sep 2022)
    • 0.5

      When zabbix-agent2 runs on virtual machines, file system monitoring items sometimes become unsupported.

      In such cases, random file systems on random hosts become unavailable with the following error message:

      ZBX_NOTSUPPORTED: 'mount '***' is unavailable' to '***'
      

      The debug log shows something like this:

      Aug 11 11:59:13 zabbix_agent2[14758]: received passive check request: 'vfs.fs.size[/opt,total]' from '10.9.49.35'
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] processing update request (1 requests)
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] adding new request for key: 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: [1] created direct exporter task for plugin 'VfsFs' itemid:0 key 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: executing direct exporter task for key 'vfs.fs.size[/opt,total]'
      Aug 11 11:59:13 zabbix_agent2[14758]: failed to execute direct exporter task for key 'vfs.fs.size[/opt,total]' error: 'mount '/opt' is unavailable'
      Aug 11 11:59:13 zabbix_agent2[14758]: sending passive check response: ZBX_NOTSUPPORTED: 'mount '/opt' is unavailable' to '10.9.49.35'
      

      This behavior starts after the following log message:

      $ sudo journalctl -u zabbix-agent2 --since '3 days ago' | grep 'timed out'
      Aug 08 15:33:26 nl-build17.local.profee.com zabbix_agent2[14758]: check 'vfs.fs.size[/opt,free]' is not supported: operation on mount '/opt' timed out
      

      After that, the file system never becomes available again. Only a restart of the zabbix-agent2 service helps.

      We looked through the source code of the `VfsFs` module. It seems that the branch of code that makes the file system available again will never be executed.

      https://github.com/zabbix/zabbix/blob/master/src/go/plugins/vfs/fs/fscaller.go#L64-L70

      func (f *fsCaller) execute(path string) {
      	stats, err := f.fsFunc(path)
      
      	if isStuck(path) {
      		f.p.Debugf("mount '%s' has become available", path)
      		stuckMux.Lock()
      		stuckMounts[path] = false
      		stuckMux.Unlock()
      		return
      	}
      	// ...
      

      The only call to 'execute' happens inside the 'run' function.

      https://github.com/zabbix/zabbix/blob/master/src/go/plugins/vfs/fs/fscaller.go#L41-L46

      func (f *fsCaller) run(path string) (stat *FsStats, err error) {
      	if isStuck(path) {
      		return nil, fmt.Errorf("mount '%s' is unavailable", path)
      	}
      
      	go f.execute(path)
      	// ...
      

      These pieces of code look very strange. Both check the same condition, but in one case the file system becomes available, while in the other it remains unavailable.

            arimdjonoks Artjoms Rimdjonoks
            skokhanovskiy Stepan Kokhanovskiy
            Team C