ZABBIX BUGS AND ISSUES / ZBX-17303

timeout in "net.tcp.service" blocks agent


    • Type: Problem report
    • Resolution: Cannot Reproduce
    • Priority: Trivial
    • None
    • 4.0.16
    • None
    • None
    • Debian

      I'm investigating regular gaps (5 to 10 minutes long) in all graphs for a particular VM (passive agent, standard "Template OS Linux" template plus a few custom checks).
      For example, "CPU iowait time" receives data for a moment, then nothing for about 5 minutes, then some data again, then another 5 to 10 minute gap, and so on.

      Here is what I found in the Zabbix server log, logged repeatedly:

      ```
      Zabbix agent item "net.tcp.service[tcp,b2btest.internal,8880]" on host "web31.vm" failed: first network error, wait for 15 seconds
      resuming Zabbix agent checks on host "web31.vm": connection restored
      ```

      The interval for the check is 300s, and the problem appears to be caused by a firewall dropping connections:

      ```
      $ time telnet b2btest.internal 8880
      Trying xx.xx.xxx.xx...
      telnet: Unable to connect to remote host: Connection timed out

      real 2m11.103s
      user 0m0.003s
      sys 0m0.000s
      ```

      The problem is that a timeout in "net.tcp.service" makes the Zabbix agent unresponsive, which affects all other checks. The problem is exacerbated when more than one "net.tcp.service" check is timing out.
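
To illustrate the difference between a connect attempt that is silently dropped and one that is bounded by an explicit timeout, here is a minimal Python sketch. It is not how the Zabbix agent implements "net.tcp.service"; the function names `tcp_service_check` and `timed_check` and the 3-second default are assumptions made for this example only.

```python
import socket
import time


def tcp_service_check(host: str, port: int, timeout: float = 3.0) -> int:
    """Return 1 if a TCP connection succeeds within `timeout` seconds, else 0."""
    try:
        # create_connection applies `timeout` to each connect attempt, so a
        # firewall that silently drops SYN packets cannot hold the caller for
        # the full OS-level connect timeout (about 2 minutes in the telnet run).
        with socket.create_connection((host, port), timeout=timeout):
            return 1
    except OSError:
        # Covers refused connections, name resolution failures, and timeouts.
        return 0


def timed_check(host: str, port: int, timeout: float = 3.0):
    """Run the check and also report how long it took, in seconds."""
    start = time.monotonic()
    result = tcp_service_check(host, port, timeout)
    return result, time.monotonic() - start
```

Calling `timed_check("b2btest.internal", 8880)` from the affected VM would be expected to return 0 after roughly the configured timeout rather than after two minutes. If the host name resolves to several addresses, each address is tried in turn, so the total time can exceed a single timeout.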

      I isolated the problem by disabling the offending "net.tcp.service" item, which immediately restored the graphs to normal.

            aigars.kadikis Aigars Kadikis
            onlyjob Onlyjob