ZABBIX BUGS AND ISSUES: ZBX-25177

Unavailability mechanism is too sensitive to momentary network issues

      Zabbix applies its unavailability / unreachability logic on every single network error (at least for SNMP items), even while other SNMP items are being checked normally at the same time. In this case the interval between "first network error" and "connection restored" can be very short (1-5 seconds). It would probably be better to take multiple errors over some period of time into account before the logic is triggered. Log excerpt:

       282:20240819:193202.450 resuming SNMP agent checks on host "hostA": connection restored
         282:20240819:193203.419 SNMP agent item "walk.ifHCInOctets" on host "hostA" failed: first network error, wait for 15 seconds
         282:20240819:193204.568 SNMP agent item "walk.ifInDiscards" on host "hostB" failed: first network error, wait for 15 seconds
         282:20240819:193215.875 resuming SNMP agent checks on host "hostD": connection restored
         282:20240819:193218.677 resuming SNMP agent checks on host "hostA": connection restored
         282:20240819:193219.431 resuming SNMP agent checks on host "hostC": connection restored
         282:20240819:193223.238 resuming SNMP agent checks on host "hostB": connection restored
         282:20240819:193232.457 SNMP agent item "walk.ifHCInUcastPkts" on host "hostA" failed: first network error, wait for 15 seconds
         282:20240819:193248.274 SNMP agent item "walk.ifHighSpeed" on host "hostD" failed: first network error, wait for 15 seconds
         282:20240819:193250.318 resuming SNMP agent checks on host "hostA": connection restored
         282:20240819:193251.074 SNMP agent item "walk.ifInDiscards" on host "hostA" failed: first network error, wait for 15 seconds
         282:20240819:193252.569 SNMP agent item "walk.ifInDiscards" on host "hostC" failed: first network error, wait for 15 seconds
         282:20240819:193255.192 SNMP agent item "walk.ifHCOutUcastPkts" on host "hostB" failed: first network error, wait for 15 seconds
         282:20240819:193305.121 resuming SNMP agent checks on host "hostD": connection restored
         282:20240819:193309.523 resuming SNMP agent checks on host "hostA": connection restored
         282:20240819:193310.711 resuming SNMP agent checks on host "hostC": connection restored
         282:20240819:193314.285 resuming SNMP agent checks on host "hostB": connection restored
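The error-to-recovery intervals in a log like the one above can be extracted with a short script. This is only a sketch for analysing the excerpt: the function name `outage_windows` and the regular expression are assumptions written for this report, and only the two message types shown above are handled. It pairs each host's first unmatched "first network error" with that host's next "connection restored".

```python
import re
from datetime import datetime

# Matches only the two message types that appear in the excerpt above.
LINE_RE = re.compile(
    r'^\s*\d+:(?P<ts>\d{8}:\d{6}\.\d{3}) '
    r'(?:SNMP agent item "[^"]+" on host "(?P<err_host>[^"]+)" failed: first network error'
    r'|resuming SNMP agent checks on host "(?P<ok_host>[^"]+)": connection restored)'
)

def outage_windows(lines):
    """Return (host, error_time, recovery_time, seconds) per error/recovery pair."""
    pending = {}   # host -> time of the still unmatched "first network error"
    windows = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # ignore lines of any other type
        ts = datetime.strptime(m.group("ts"), "%Y%m%d:%H%M%S.%f")
        if m.group("err_host"):
            # Keep the earliest error if several arrive before a recovery.
            pending.setdefault(m.group("err_host"), ts)
        else:
            start = pending.pop(m.group("ok_host"), None)
            if start is not None:
                windows.append((m.group("ok_host"), start, ts,
                                (ts - start).total_seconds()))
    return windows

sample = [
    '282:20240819:193203.419 SNMP agent item "walk.ifHCInOctets" on host "hostA" failed: first network error, wait for 15 seconds',
    '282:20240819:193204.568 SNMP agent item "walk.ifInDiscards" on host "hostB" failed: first network error, wait for 15 seconds',
    '282:20240819:193218.677 resuming SNMP agent checks on host "hostA": connection restored',
    '282:20240819:193223.238 resuming SNMP agent checks on host "hostB": connection restored',
]
for host, start, end, secs in outage_windows(sample):
    print(f"{host}: {secs:.3f} s")
```

A recovery line with no preceding error for that host (such as the first line of the excerpt) is skipped, because the matching error lies outside the captured log.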
      

      hostA - error at 193203.419, recovery at 193218.677
      hostB - error at 193204.568, recovery at 193223.238
      hostC - error at 193252.569, recovery at 193310.711
      hostD - error at 193248.274, recovery at 193305.121
      During the period between error and recovery, other items are still polled normally.
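One possible reading of "consider multiple issues during some time" is a sliding-window threshold: a host is treated as unreachable only after several network errors fall within a given period. The sketch below is purely illustrative and is not how Zabbix implements availability. The class name `FailureWindow` and the parameters `threshold` and `window_seconds` are hypothetical.

```python
from collections import deque

class FailureWindow:
    """Hypothetical per-host tracker: flag the host only after `threshold`
    network errors have occurred within the last `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=30.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._errors = deque()  # timestamps (seconds) of recent errors

    def record_error(self, now):
        """Record an error at time `now`; return True if the host should
        now be treated as unreachable."""
        self._errors.append(now)
        self._expire(now)
        return len(self._errors) >= self.threshold

    def record_success(self, now):
        # A successful check does not wipe the history; old errors age out.
        self._expire(now)

    def _expire(self, now):
        # Drop errors that are older than the window.
        while self._errors and now - self._errors[0] > self.window_seconds:
            self._errors.popleft()

fw = FailureWindow(threshold=3, window_seconds=30.0)
print(fw.record_error(0.0))   # False: a single isolated error
print(fw.record_error(10.0))  # False: two errors within the window
print(fw.record_error(20.0))  # True: three errors within 30 seconds
```

With such a scheme a single failed item, like the isolated errors in the log above, would not pause checks for the whole host, while a sustained outage would still be detected once enough errors accumulate.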

            zabbix.dev Zabbix Development Team
            dotneft Alexey Pustovalov