Type: Incident report
Affects Version/s: 3.0.4
Fix Version/s: None
I have configured one host with 40 SNMP-based items. I use SNMPv3 authPriv to query the devices. Everything works well as long as there is only one SNMPv3-monitored host. When I added a second host from the same template, my zabbix-server log started filling up every 5 seconds with messages like:
SNMP agent item "snmp.host.cpuLoad" on host "router 2" failed: first network error, wait for 1 seconds
resuming SNMP agent checks on host "router 2": connection restored
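To quantify how often each host flaps, the two messages above can be counted per host. The following is a minimal sketch, assuming the log lines contain the message text exactly as quoted here; the function name `count_flaps` and the regular expressions are illustrative and not part of Zabbix.

```python
import re
from collections import Counter

# Patterns built from the two log messages quoted in this report.
FAILED = re.compile(
    r'SNMP agent item "(?P<item>[^"]+)" on host "(?P<host>[^"]+)" '
    r'failed: first network error'
)
RESTORED = re.compile(
    r'resuming SNMP agent checks on host "(?P<host>[^"]+)": connection restored'
)


def count_flaps(lines):
    """Return (failures, restores) counters keyed by host name."""
    failures = Counter()
    restores = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            failures[match.group("host")] += 1
            continue
        match = RESTORED.search(line)
        if match:
            restores[match.group("host")] += 1
    return failures, restores


# Sample input: the two lines from this report.
sample = [
    'SNMP agent item "snmp.host.cpuLoad" on host "router 2" failed: '
    'first network error, wait for 1 seconds',
    'resuming SNMP agent checks on host "router 2": connection restored',
]
failures, restores = count_flaps(sample)
print(dict(failures), dict(restores))
```

In practice the `lines` argument could be an open file handle on the server log; the path to that file depends on the installation.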
I thought the problem could be too few pre-forked pollers, but even after I raised all possible poller counts, the problem still occurs.
My next thought was that something could be wrong with my network, but I captured the SNMP traffic and it looks correct: every Zabbix request has a corresponding response.
When I tried to add a third SNMPv3 host, I found in the logs that SNMP checks had been turned off for the new host because the host was considered unavailable. I captured the traffic in the cap2 file from the moment I added the third device until Zabbix considered the host unavailable. During this time Zabbix sent a few requests and received a response to every one of them, yet the log contains entries claiming that the timeout was reached.
I then cloned the template with the SNMPv3 checks and edited the clone to use SNMPv1 queries. I linked it to the hosts that caused problems, and it works well.
The cap1 file contains traffic captured while Zabbix had only one SNMPv3 device enabled.
The cap2 file contains traffic captured between Zabbix and the freshly added third SNMPv3 device. The capture ends when the Zabbix server considered the new host unavailable.
I am able to snmpget all of these devices via SNMPv3 without any problem. It is not a problem with bulk requests, because all devices handle them correctly, and two of my devices are identical, one of which works well with Zabbix SNMPv3 checks.
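For anyone repeating the manual check, a sketch of how the equivalent Net-SNMP `snmpget` command line could be assembled is shown below. The host name, user name, passphrases, the SHA/AES protocol choice and the sysUpTime OID are all placeholder assumptions; the report does not state the actual values used.

```python
import shlex


def build_snmpget_command(host, user, auth_pass, priv_pass,
                          oid="1.3.6.1.2.1.1.3.0",  # sysUpTime.0, placeholder OID
                          auth_proto="SHA",          # assumed, not stated in the report
                          priv_proto="AES"):         # assumed, not stated in the report
    """Build an snmpget argument list for an SNMPv3 authPriv query."""
    return [
        "snmpget", "-v3",
        "-l", "authPriv",
        "-u", user,
        "-a", auth_proto, "-A", auth_pass,
        "-x", priv_proto, "-X", priv_pass,
        host, oid,
    ]


# Hypothetical values, for illustration only.
command = build_snmpget_command("router2.example", "monitor",
                                "auth-passphrase", "priv-passphrase")
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run` on a machine that has the Net-SNMP tools installed.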