ZABBIX BUGS AND ISSUES / ZBX-8545

Low performance for key net.tcp.service[] which reads /proc/net/tcp file

    • Type: Incident report
    • Resolution: Fixed
    • Priority: Critical
    • 2.5.0
    • 2.0.12, 2.2.4, 2.2.5, 2.3.2
    • Agent (G)
    • server:
      zabbix-server-2.0.5-1.el6.x86_64

      $ zabbix_agentd --version
      Zabbix Agent (daemon) v2.0.5 (revision 33558) (12 February 2013)
      Compilation time: Feb 14 2013 10:58:53

      After running Zabbix 2.0.5 in our production environment for more than a year, we have found some critical issues. Here is one:
      The built-in key net.tcp.service[] calls the NET_TCP_LISTEN function in src/libs/zbxsysinfo/linux/net.c, and this function has to read the /proc/net/tcp file.
      When the number of TCP connections is large, say 100k or more, which is quite normal on our production servers, performance is poor: it takes tens or even hundreds of seconds to get the correct result. That causes an agent timeout, so the proxy or server gets no data during that period, which triggers false alerts.
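
To illustrate why the cost grows with the connection count, here is a minimal Python sketch of a listen check done by scanning /proc/net/tcp-style text. It is not the agent's actual C code; the function name `port_is_listening` and the `SAMPLE` data are made up for illustration. The only facts assumed are the standard file layout: a header line, then one line per socket with the local address as hex `IP:PORT` in the second column and the state in the fourth (`0A` is LISTEN).

```python
# Hypothetical sketch: check for a listening port by scanning text in the
# /proc/net/tcp format. Every socket on the host is one line, so a host with
# 100k connections means 100k lines to read and split in the worst case.

TCP_LISTEN = "0A"  # hex state code for LISTEN in /proc/net/tcp

# Made-up sample: port 22 (0x0016) in LISTEN, port 80 (0x0050) ESTABLISHED (01).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 00000000:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 11111
   1: 0100007F:0050 0100007F:C350 01 00000000:00000000 00:00000000 00000000     0        0 22222
"""


def port_is_listening(proc_net_tcp_text, port):
    """Return True if some socket is in LISTEN state on the given local port."""
    wanted = "%04X" % port
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        local_port = fields[1].rsplit(":", 1)[-1].upper()
        if fields[3] == TCP_LISTEN and local_port == wanted:
            return True
    return False


if __name__ == "__main__":
    print(port_is_listening(SAMPLE, 22))
    print(port_is_listening(SAMPLE, 80))
```

On a real host the text would come from reading /proc/net/tcp itself, and the kernel has to generate that whole table on each read, which is where the time goes when there are very many sockets.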

      As a workaround, we now use the ss command to get the correct data. ss doesn't need to read that file and returns results very quickly.
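
A rough sketch of what such a workaround could look like, in Python. This is not the script we actually run; the helper names are hypothetical. It assumes the iproute2 `ss` options `-l` (listening sockets), `-t` (TCP), `-n` (numeric ports) and the `sport = :PORT` filter, and that with those options the output columns are State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port.

```python
import subprocess


def build_ss_command(port):
    """Build the ss argv: listening TCP sockets, numeric, filtered by source port."""
    return ["ss", "-l", "-t", "-n", "sport", "=", ":%d" % port]


def listening_in_ss_output(output, port):
    """Return True if the ss output has a LISTEN row on the given local port."""
    for line in output.splitlines():
        fields = line.split()
        # The header row starts with "State", so it is skipped by this check.
        if len(fields) >= 4 and fields[0] == "LISTEN":
            if fields[3].rsplit(":", 1)[-1] == str(port):
                return True
    return False


def port_is_listening_via_ss(port, timeout=3.0):
    """Run ss with a timeout shorter than the agent timeout and parse the result."""
    result = subprocess.run(
        build_ss_command(port),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return listening_in_ss_output(result.stdout, port)
```

A script like this could then be exposed to the agent through a UserParameter entry; the key name for that is up to you.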

      The impact depends on the services running on the server. Those without many connections have nothing to worry about, but for those with a very large number of connections, like us, it is a critical issue because it regularly sends false alerts.

      At the moment, the latest stable version, 2.2.5, has not fixed this. We hope you can fix it soon.

      ref:
      http://stackoverflow.com/questions/11763376/difference-between-netstat-and-ss-in-linux

      Assignee: Unassigned
      Reporter: jaseywang
      Votes: 0
      Watchers: 4