ZABBIX BUGS AND ISSUES / ZBX-10632

Broken fd0 in poller processes

    • Type: Incident report
    • Resolution: Incomplete
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: 2.4.7
    • Component/s: Proxy (P), Server (S)
    • Environment: Ubuntu 14.04 64-bit + Zabbix 2.4.7 from the official Zabbix repos.

      Hi,

      We use external scripts (externalscripts) for some special application monitoring. These scripts are not very complex: each is a bash script that runs an ssh command against a remote system and post-processes its output.

      Yesterday we noticed that the results of these scripts are flapping, but only when Zabbix runs them. When we execute them manually, everything works fine.
      So we looked at the differences between the two kinds of execution and found that the ssh command fails because Zabbix passes a "broken" stdin through to the script.
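
      As a possible workaround on the script side (a sketch only, not a fix for the underlying problem), an external script can stop depending on the stdin it inherits by re-pointing fd 0 at /dev/null before calling ssh. The helper names, the host and the remote command below are hypothetical placeholders; `ssh -n` additionally tells ssh itself to read stdin from /dev/null.

```shell
#!/bin/bash
# Sketch: make an external script independent of the stdin it inherits.

# Print what fd 0 of the calling context points at (Linux /proc only).
stdin_target() {
    readlink /proc/self/fd/0
}

# Run a command with fd 0 re-pointed at /dev/null,
# regardless of what the parent process handed down.
with_null_stdin() {
    "$@" </dev/null
}

# Hypothetical usage inside the external script
# (user, host and remote command are placeholders):
#   with_null_stdin ssh -n -o BatchMode=yes monitor@app01.example.com 'cat /var/run/myapp/status'
```

      Calling `with_null_stdin stdin_target` should print /dev/null even when the script was started with a socket on fd 0.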

      After creating some additional external-script items that report which Zabbix process is affected, we found 7 of 150 poller processes with the same issue. Here are two examples, a correct one and a broken one:

      ### right one ###

      # ps faux | grep "poller #107"
      zabbix   23700  0.0  0.7 23329700 245344 ?     S    Feb24   5:21  \_ /usr/sbin/zabbix_server: poller #107 [got 0 values in 0.000005 sec, idle 1 sec]
      
      # ls -la /proc/23700/fd/
      total 0
      dr-x------ 2 root   root    0 Feb 24 10:09 .
      dr-xr-xr-x 9 zabbix zabbix  0 Feb 24 09:54 ..
      lr-x------ 1 root   root   64 Feb 24 10:09 0 -> /dev/null
      l-wx------ 1 root   root   64 Feb 24 10:09 1 -> /var/log/zabbix/zabbix_server.log.1
      l-wx------ 1 root   root   64 Feb 24 10:09 2 -> /var/log/zabbix/zabbix_server.log.1
      l-wx------ 1 root   root   64 Feb 24 10:09 3 -> /run/zabbix/zabbix_server.pid
      lrwx------ 1 root   root   64 Feb 24 10:09 4 -> socket:[4815415]
      lrwx------ 1 root   root   64 Feb 24 10:09 5 -> socket:[4815416]
      lrwx------ 1 root   root   64 Feb 24 10:09 6 -> socket:[105276876]
      

      ### wrong one ###

      # ps faux | grep "poller #108"
      zabbix   23701  0.0  0.7 23329736 243472 ?     S    Feb24   5:19  \_ /usr/sbin/zabbix_server: poller #108 [got 0 values in 0.000189 sec, idle 1 sec]
      
      # ls -la /proc/23701/fd/
      total 0
      dr-x------ 2 root   root    0 Feb 24 10:09 .
      dr-xr-xr-x 9 zabbix zabbix  0 Feb 24 09:54 ..
      lrwx------ 1 root   root   64 Feb 24 10:09 0 -> socket:[291604878]
      l-wx------ 1 root   root   64 Feb 24 10:09 1 -> /var/log/zabbix/zabbix_server.log.1
      l-wx------ 1 root   root   64 Feb 24 10:09 2 -> /var/log/zabbix/zabbix_server.log.1
      l-wx------ 1 root   root   64 Feb 24 10:09 3 -> /run/zabbix/zabbix_server.pid
      lrwx------ 1 root   root   64 Feb 24 10:09 4 -> socket:[4815415]
      lrwx------ 1 root   root   64 Feb 24 10:09 5 -> socket:[4815416]
      
      # lsof -n | grep 291604878
      zabbix_se 23701           zabbix    0u     IPv4          291604878       0t0        TCP 10.72.64.5:37487->10.0.3.40:postgresql (ESTABLISHED)
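
      The manual ps/ls check above can be repeated for every poller at once. This is a minimal sketch, assuming a Linux /proc filesystem, the procps `pgrep` tool, and enough privileges to read /proc/PID/fd of the zabbix processes; the function names are made up for illustration.

```shell
#!/bin/bash
# Sketch: find processes whose fd 0 does not point at /dev/null.

# Print what fd 0 of the given PID points at.
# Needs permission to read /proc/PID/fd (typically root or the process owner).
fd0_target() {
    readlink "/proc/$1/fd/0"
}

# For every process whose full command line matches the pattern,
# print "PID<TAB>target" if its fd 0 is anything other than /dev/null.
report_bad_stdin() {
    local pattern=$1 pid target
    for pid in $(pgrep -f -- "$pattern"); do
        # Skip processes that exited meanwhile or that we may not inspect.
        target=$(fd0_target "$pid" 2>/dev/null) || continue
        if [ "$target" != "/dev/null" ]; then
            printf '%s\t%s\n' "$pid" "$target"
        fi
    done
}

# Example, run as root on the Zabbix server:
#   report_bad_stdin 'zabbix_server: poller'
```

      Any socket inode reported this way can then be looked up with lsof, as in the listing above.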
      

      So it seems that poller #108 opened a new PostgreSQL connection and the new socket was assigned fd 0. But fd 0 should point to /dev/null.
      We have seen this behaviour on Zabbix 1.8.x and 2.4.x. An upgrade to 3.0 is planned, but could you take a look at this, please? The issue may still exist in 3.0 as well.

      Regards,
      Marcel

            Assignee: Unassigned
            Reporter: Marcel Jäpel (marcel.japel)