-
Problem report
-
Resolution: Declined
-
Trivial
-
RHEL6/7
Over the last few months we have experienced false positives from Zabbix, where the proxies in the PSN were unable to receive data from the remote agents. Yesterday we had two false positives (INC5107700 and INC5107703), and today four more tickets. Below are some logs showing the errors seen on the proxy. Zabbix apparently recovered connectivity, but it did not query the servers again until I forced a check at 11:11, when it should have checked them immediately or within the next 5 minutes. Could we please investigate what can be done to avoid these false positives?
5924:20200130:110308.586 Zabbix agent item "proc.cpu.util[sshd,root]" on host "lhr5ukrpsfelp01" failed: first network error, wait for 15 seconds
5921:20200130:110308.843 Zabbix agent item "net.if.out[eth0]" on host "lhr5ukrpscalp02" failed: first network error, wait for 15 seconds
5949:20200130:110308.888 Zabbix agent item "proc.cpu.util[syslog-ng,root]" on host "lhr5ukrpsfelp04" failed: first network error, wait for 15 seconds
5935:20200130:110308.969 Zabbix agent item "custom.file.stat["ls -At",/var/log/service-sep-dnstap-streamer,log]" on host "lhr5ukrpsfelp03" failed: first network error, wait for 15 seconds
5922:20200130:110309.975 Zabbix agent item "proc.num[ntpd,ntp,,]" on host "lhr5ukrpscalp01" failed: first network error, wait for 15 seconds
5934:20200130:110309.979 Zabbix agent item "system.cpu.util[,iowait]" on host "lhr5ukrbbwklp01" failed: first network error, wait for 15 seconds
5960:20200130:110310.018 Zabbix agent item "proc.num[,dstreamer,,service-sep-dnstap-streamer]" on host "lhr5ukrpsfelp02" failed: first network error, wait for 15 seconds
5982:20200130:110359.362 resuming Zabbix agent checks on host "lhr5ukrpsfelp01": connection restored
5982:20200130:110359.367 resuming Zabbix agent checks on host "lhr5ukrbbwklp01": connection restored
5970:20200130:110359.390 resuming Zabbix agent checks on host "lhr5ukrpsfelp03": connection restored
5982:20200130:110400.393 resuming Zabbix agent checks on host "lhr5ukrpsfelp02": connection restored
5970:20200130:110400.396 resuming Zabbix agent checks on host "lhr5ukrpscalp01": connection restored
5970:20200130:110404.407 resuming Zabbix agent checks on host "lhr5ukrpscalp02": connection restored
5982:20200130:110415.413 resuming Zabbix agent checks on host "lhr5ukrpsfelp04": connection restore
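To quantify how long each host stayed unreachable, the excerpt above can be reduced to a per-host gap between the "first network error" entry and the matching "resuming Zabbix agent checks" entry. The following is a minimal sketch and not part of Zabbix: the function names `to_datetime` and `outage_durations` are hypothetical, and it assumes each log entry sits on a single line in the `PID:YYYYMMDD:HHMMSS.mmm message` layout shown above.

```python
import re
from datetime import datetime

# Assumed layout of a proxy log entry: "PID:YYYYMMDD:HHMMSS.mmm message".
LINE_RE = re.compile(r'^(\d+):(\d{8}):(\d{6})\.(\d{3}) (.*)$')
# Message fragments copied from the log excerpt in this report.
FAIL_RE = re.compile(r'on host "([^"]+)" failed: first network error')
RESUME_RE = re.compile(r'^resuming Zabbix agent checks on host "([^"]+)"')


def to_datetime(date, time, millis):
    """Build a datetime from the date, time, and millisecond log fields."""
    return datetime(
        int(date[:4]), int(date[4:6]), int(date[6:8]),
        int(time[:2]), int(time[2:4]), int(time[4:6]),
        int(millis) * 1000,
    )


def outage_durations(lines):
    """Return {host: seconds from first network error to resumed checks}."""
    first_error = {}
    durations = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a timestamped entry
        _pid, date, time, millis, message = match.groups()
        stamp = to_datetime(date, time, millis)
        fail = FAIL_RE.search(message)
        if fail:
            # Keep only the earliest error seen for each host.
            first_error.setdefault(fail.group(1), stamp)
            continue
        resume = RESUME_RE.match(message)
        if resume and resume.group(1) in first_error:
            host = resume.group(1)
            durations[host] = (stamp - first_error.pop(host)).total_seconds()
    return durations


sample = [
    '5924:20200130:110308.586 Zabbix agent item "proc.cpu.util[sshd,root]" '
    'on host "lhr5ukrpsfelp01" failed: first network error, wait for 15 seconds',
    '5982:20200130:110359.362 resuming Zabbix agent checks on host '
    '"lhr5ukrpsfelp01": connection restored',
]
print(outage_durations(sample))  # {'lhr5ukrpsfelp01': 50.776}
```

For the two entries for lhr5ukrpsfelp01 this gives a gap of roughly 51 seconds between the first failure and the resumed checks. Running it over the full proxy log would show whether the other hosts follow the same pattern.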