Uploaded image for project: 'ZABBIX BUGS AND ISSUES'
  1. ZABBIX BUGS AND ISSUES
  2. ZBX-21829

active agent may start collect wrong value for key "vfs.file.size". agent restart fixes it.

XMLWordPrintable

    • Icon: Problem report Problem report
    • Resolution: Cannot Reproduce
    • Icon: Critical Critical
    • None
    • 6.0.10, 6.2.4, 6.4.0beta3
    • Agent (G)
    • None

      We found that an active agent returns wrong value - 45 instead of 4348391.

      Here is a debug capture for trapper processes on zabbix server side:

      319660:20221026:110909.805 trapper got '{"request":"agent data","session":"935a94f2c514ecc8a16846c035fa7007","data":[
      {"host":"somename.example.com","key":"log.count[E:\\SiteScope\\logs\\RunMonitor.log,\"(?i)timeout|timed out\",,,skip]","value":"0","lastlogsize":469578,"id":16523,"clock":1666796944,"ns":735725500},
      {"host":"somename.example.com","key":"vfs.fs.size[C:,pfree]","value":"71.231177","id":16524,"clock":1666796944,"ns":736378500},
      {"host":"somename.example.com","key":"vfs.fs.size[D:,pfree]","value":"99.826274","id":16525,"clock":1666796944,"ns":736779600},
      {"host":"somename.example.com","key":"vfs.fs.size[E:,pfree]","value":"96.167602","id":16526,"clock":1666796944,"ns":737179400},
      {"host":"somename.example.com","key":"vfs.file.size[E:\\some{HOST.HOST}_SIS_Config.log]","value":"45","id":16527,"clock":1666796945,"ns":752000000}],"clock":1666796949,"ns":806334100}'
      

      We see that returned for the key was "value":"45"

      a minute later we see how agent is updating list of active checks, where is our key, this is fine:

      319660:20221026:111008.715 trapper got '{"request":"active checks","host":"somename.example.com","host_metadata":"OS:Windows;Apps:SIS43264;Type:Primary"}'
      319660:20221026:111008.715 In get_hostid_by_host() host:'somename.example.com' metadata:'OS:Windows;Apps:SIS43264;Type:Primary'
      319660:20221026:111008.715 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,h.tls_issuer,h.tls_subject,h.tls_psk_identity,a.host_metadata,a.listen_ip,a.listen_dns,a.listen_port,a.flags from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='somename.example.com' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
      319660:20221026:111008.719 End substitute_simple_macros_impl() data:'E:\some\somename.example.com_SIS_Config.log'
      319660:20221026:111008.719 End of substitute_key_macros_impl():SUCCEED data:'vfs.file.size[E:\some\somename.example.com_SIS_Config.log]'
      
      319660:20221026:111008.720 send_list_of_active_checks_json() sending [{"response":"success","data":[
      {"key":"system.uptime","delay":1800,"lastlogsize":0,"mtime":0},
      {"key":"agent.version","delay":1800,"lastlogsize":0,"mtime":0},{"key":"vfs.fs.discovery","delay":3600,"lastlogsize":0,"mtime":0},
      ... cut 6 log.count items ...
      {"key":"vfs.file.regexp[E:\\some.log,\"Total monitors: ([\\d]+)\",,,,\\1]","delay":200,"lastlogsize":0,"mtime":0},
      
      {"key":"vfs.file.size[E:\\some\\somename.example.com_SIS_Config.log]","key_orig":"vfs.file.size[E:\\some{HOST.HOST}_SIS_Config.log]","delay":600,"lastlogsize":0,"mtime":0},
      
      {"key":"service.discovery","delay":300,"lastlogsize":0,"mtime":0},
      {"key":"vfs.fs.size[C:,pfree]","delay":1800,"lastlogsize":0,"mtime":0},
      {"key":"vfs.fs.size[D:,pfree]","delay":1800,"lastlogsize":0,"mtime":0},
      {"key":"vfs.fs.size[E:,pfree]","delay":1800,"lastlogsize":0,"mtime":0},
      {"key":"agent.ping","delay":200,"lastlogsize":0,"mtime":0},
      {"key":"service.info[SiteScope,state]","delay":200,"lastlogsize":0,"mtime":0}
      ]}]
      

      In the same time, when testing the key locally using passive checks way, returned value is correct:

      C:\Zabbix\bin>zabbix_agentd.exe -t vfs.file.size[E:\some\somename.example.com_SIS_Config.log]
      vfs.file.size[E:\some\somename.example.com_SIS_Config.log] [u|4348391]
      
      C:\Zabbix\bin>zabbix_get.exe -s 127.0.0.1 -k vfs.file.size[E:\some\somename.example.com_SIS_Config.log]
      4348391
      

      Agent version:

      C:\Zabbix\bin>zabbix_agentd.exe -V
      zabbix_agentd Win64 (service) (Zabbix) 6.0.0
      Revision 5203d2ea7d 14 February 2022, compilation time: Feb 14 2022 12:57:08
      

      After we restarted zabbix agent, the active check item started to return correct value - 4348391.

      I could not find related fixes in 6.0 branch, so this issue looks fresh one to me.

      It happens sometimes/rarely and on a few different Windows servers.
      It's not possible to predict when it happens again.

            dimir dimir
            zalex_ua Oleksii Zagorskyi
            Team A
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Created:
              Updated:
              Resolved: