Steps to reproduce:
- Create a template and link it to 10 hosts.
- Inside the template, create 10 items like net.tcp.port[<ipN>,<portN>]. Important: the TCP connection to <ipN>:<portN> must not get any response; it should end with a connect timeout.
- Inside the template, create triggers for all items; otherwise the items are not updated.
It may be useful to use XML export/import to create all these items and triggers.
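Before creating the items, it may help to confirm that each target really ends in a connect timeout rather than a refused connection. The following is only a minimal sketch, not part of Zabbix: the function name `probe_tcp` and the 3-second default timeout are assumptions made for illustration.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connect attempt.

    Returns 'open', 'refused', 'timeout' or 'error'.
    The 3-second default is an assumption; adjust it to the Timeout in use.
    """
    try:
        # create_connection raises on failure and returns a connected socket on success
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No answer at all: this is the case needed to reproduce the problem
        return "timeout"
    except ConnectionRefusedError:
        # The remote side answered with a reset, so this target does not qualify
        return "refused"
    except OSError:
        # Anything else (no route, DNS failure, ...)
        return "error"
```

A target is suitable for the reproduction only if `probe_tcp(ip, port)` returns `"timeout"`.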
Multiple events like "Zabbix agent on web13 is unreachable for 2 min" start appearing randomly on the 10 hosts where the template is applied. When the created items are disabled, the problem disappears.
The server and agents do not suffer from performance problems: disk I/O and CPU load are low, and there is plenty of free memory. The MySQL server is running fine, with no long queries.
Server and agent configs are standard, with default values for StartPollers, StartPollersUnreachable, Timeout, etc.
Increasing innodb_buffer_pool_size on the MySQL server didn't help.
Administration/Queue shows 0 in every cell of the table.
The value of zabbix[wcache,values] on the graph "Zabbix Server Performance" drops when the items are enabled and the problem starts. It goes back up when the items are disabled. The value of zabbix[queue] on the same graph stays at 0.
After enabling the items, these messages often appear in the server log:
2126:20160601:185801.820 Zabbix agent item "net.tcp.port[18.104.22.168,3136]" on host "web13" failed: first network error, wait for 15 seconds
2126:20160601:185816.870 resuming Zabbix agent checks on host "web13": connection restored
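To see how often each host flaps, the two message types above can be counted per host. This is a rough sketch that assumes the log lines have exactly the shape shown in this report; the function name `count_flaps` and the regular expressions are illustrative, not taken from Zabbix.

```python
import re
from collections import Counter

# Patterns based only on the two sample lines in this report (an assumption)
ERROR_RE = re.compile(r'on host "(?P<host>[^"]+)" failed: first network error')
RESTORED_RE = re.compile(
    r'resuming Zabbix agent checks on host "(?P<host>[^"]+)": connection restored'
)


def count_flaps(lines):
    """Return (errors, restores): per-host counters for the two message types."""
    errors, restores = Counter(), Counter()
    for line in lines:
        m = ERROR_RE.search(line)
        if m:
            errors[m.group("host")] += 1
            continue
        m = RESTORED_RE.search(line)
        if m:
            restores[m.group("host")] += 1
    return errors, restores
```

For example, passing an open server log file object to `count_flaps` returns two counters keyed by host name, which makes it easy to check whether only the 10 templated hosts are affected.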