Type: Incident report
Affects Version/s: 2.0.12, 2.2.4, 2.2.5, 2.3.2
Fix Version/s: 2.5.0
Component/s: Agent (G)
$ zabbix_agentd --version
Zabbix Agent (daemon) v2.0.5 (revision 33558) (12 February 2013)
Compilation time: Feb 14 2013 10:58:53
After running Zabbix 2.0.5 in our production environment for more than a year, we have found several critical issues. Here is one:
The built-in key net.tcp.service calls the NET_TCP_LISTEN function in src/libs/zbxsysinfo/linux/net.c, and this function has to read the /proc/net/tcp file.
As we know, when the number of TCP connections is large, for example 100k or more, which is quite normal on our production servers, reading that file is slow: it takes tens or even hundreds of seconds to get the correct result. This causes agent timeouts, so the proxy or server gets no data during that period, which triggers false alerts.
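To illustrate why the cost grows with the connection count, here is a minimal Python sketch of a listen check based on the /proc/net/tcp text format. It is not the agent's C code. The function name proc_has_listener is hypothetical, and the sketch assumes the usual column layout (local address in the second column, state in the fourth, with 0A meaning LISTEN). The file has one row per socket, so a check like this may have to walk every row before it can answer.

```python
TCP_LISTEN = "0A"  # hex state code for LISTEN in /proc/net/tcp


def proc_has_listener(proc_net_tcp_text, port):
    """Return True if any row is in LISTEN state on the given local port.

    proc_net_tcp_text is the full content of /proc/net/tcp as a string.
    """
    rows = proc_net_tcp_text.splitlines()[1:]  # skip the header row
    for row in rows:
        fields = row.split()
        if len(fields) < 4:
            continue  # ignore blank or truncated rows
        # local_address looks like "00000000:0016": hex IP, colon, hex port
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if fields[3] == TCP_LISTEN and local_port == port:
            return True
    return False
```

On a live host this could be called as proc_has_listener(open("/proc/net/tcp").read(), 22). With 100k connections that read returns 100k rows, almost all of which are not listening sockets.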
As a workaround, we now use the ss command to get the correct data as quickly as possible. ss does not need to read that file and returns results very quickly.
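A workaround along these lines could look like the following Python sketch. It is not our exact script. The function names are hypothetical, and it assumes that "ss -ltn" (listening, TCP, numeric) prints the state in the first column and the local address:port in the fourth, which may differ between iproute2 versions.

```python
import subprocess


def ss_has_listener(ss_text, port):
    """Return True if the given `ss -ltn` output shows a listener on port."""
    for line in ss_text.splitlines():
        fields = line.split()
        # The header row and any malformed rows are skipped here.
        if len(fields) < 4 or fields[0] != "LISTEN":
            continue
        # Local address looks like "0.0.0.0:22" or "[::]:22"
        if fields[3].rsplit(":", 1)[-1] == str(port):
            return True
    return False


def port_is_listening(port):
    """Run ss and report 1 if something listens on port, else 0."""
    result = subprocess.run(
        ["ss", "-ltn"], capture_output=True, text=True, check=True
    )
    return 1 if ss_has_listener(result.stdout, port) else 0
```

Because only listening sockets are requested, the output stays small no matter how many established connections exist. A script like this could be exposed to the agent through a UserParameter entry in zabbix_agentd.conf.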
The impact depends on the services running on your servers. Those without many connections need not worry, but for those with a very large number of connections, as in our case, this is a critical issue because it regularly sends false alerts.
At the moment, the latest stable version, 2.2.5, has not fixed this. We hope you can fix it as soon as possible.