Zabbix agent is too slow on Solaris hosts running lots of processes

    • Type: Incident report
    • Resolution: Duplicate
    • Priority: Trivial
    • Affects Version/s: 2.2.6
    • Component/s: Agent (G)
    • Environment:
      Zabbix server 2.2.6, polling data (passive mode) from Solaris 10 hosts

      The server constantly logs this error when polling Solaris 10 hosts with many (1300 or more) active processes:

      zabbix_server[27363]: Zabbix agent item "proc.num[,,run]" on host "testhost" failed: first network error, wait for 15 seconds

      This causes problems such as missing data in history, and consequently holes in graphs, not only for proc.num[,,run] but also for other items on these hosts, which I presume are aborted by the same timeout as proc.num[,,run].
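
      To confirm that the agent itself is slow to answer this item, the query can be timed from the server side. The following is only a minimal sketch: it assumes the zabbix_get utility is installed and on the PATH, the helper names time_command and agent_query are made up for illustration, and the 30-second limit is an arbitrary choice rather than a Zabbix setting.

```python
import subprocess
import sys
import time


def time_command(cmd, timeout_s):
    """Run cmd and return (elapsed_seconds, timed_out).

    The child's output is discarded; only the wall-clock time matters here.
    """
    start = time.monotonic()
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
        timed_out = False
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        timed_out = True
    return time.monotonic() - start, timed_out


def agent_query(host, key):
    """Build a zabbix_get command line for one passive check.

    Assumption: zabbix_get is installed; -s selects the host, -k the item key.
    """
    return ["zabbix_get", "-s", host, "-k", key]


# Example call (needs a reachable agent, so it is left commented out):
# elapsed, timed_out = time_command(agent_query("testhost", "proc.num[,,run]"), 30)
# print(f"{elapsed:.2f}s, timed out: {timed_out}")
```

      If the elapsed time on an affected host is close to or above the timeout configured on the server, that would be consistent with the network error shown in the log line above.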

      I did a little analysis of the relevant agent source code (src/libs/zbxsysinfo/solaris/proc.c) and I believe it can be made more efficient, which would avoid the problem, or at least postpone it until process tables are much larger. I plan to write and contribute an agent patch that implements this.

      I'm creating this issue to record the problem and my progress so far, and to serve as a focal point for future work on it.

            Assignee:
            Andris Mednis
            Reporter:
            Durval Menezes
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: