ZABBIX BUGS AND ISSUES / ZBX-9733

Zabbix agent port is held by the SYSTEM user with no running process behind it.

    Details

    • Type: Incident report
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.4.1
    • Fix Version/s: 2.2.12rc1, 2.4.8rc1, 3.0.0beta2
    • Component/s: Agent (G)
    • Labels:
    • Environment:
      Zabbix server & proxy 2.4.4 on CentOS 6, DB: MySQL
      Zabbix agent v2.4.1 on Windows 2008 R2

      Description

      We ran into an issue where Zabbix port 10050 was being listened on twice: once by a process under the SYSTEM user, and once by the agent:

      C:\Users\XXXX>netstat -aon | find "10050"
        TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING       16496
        TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING       12600
        TCP    [::]:10050             [::]:0                 LISTENING       12600
        TCP    [::]:10050             [::]:0                 LISTENING       16496
      

      Process 16496 is owned by zabbix_agent, while process 12600 was not in the Task Manager list (which means that the process no longer exists).
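      To spot this condition on other hosts without reading the netstat output by eye, the listing can be grouped by owning PID. The following is only a sketch: it assumes the five-column `netstat -aon` TCP row layout shown above, and the function name `listeners_by_pid` is made up for illustration.

```python
from collections import defaultdict


def listeners_by_pid(netstat_output, port):
    """Group LISTENING TCP sockets on `port` by owning PID.

    `netstat_output` is the text printed by `netstat -aon` on Windows.
    Returns a dict mapping PID -> set of local addresses.
    """
    result = defaultdict(set)
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected TCP row: proto, local address, foreign address, state, PID
        if len(fields) != 5 or fields[0] != "TCP" or fields[3] != "LISTENING":
            continue
        local = fields[1]
        # rpartition copes with both "0.0.0.0:10050" and "[::]:10050"
        _, _, local_port = local.rpartition(":")
        if local_port == str(port):
            result[int(fields[4])].add(local)
    return dict(result)


# Sample copied from the report above.
SAMPLE = """\
  TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING       16496
  TCP    0.0.0.0:10050          0.0.0.0:0              LISTENING       12600
  TCP    [::]:10050             [::]:0                 LISTENING       12600
  TCP    [::]:10050             [::]:0                 LISTENING       16496
"""

owners = listeners_by_pid(SAMPLE, 10050)
if len(owners) > 1:
    print("port 10050 has more than one listening PID:", sorted(owners))
```

      Each reported PID can then be compared against the process list (for example Task Manager) to see which listener has no live process behind it.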

      This prevents our Zabbix proxy from getting responses to passive (non-active) checks from the host: "Get value from agent failed: cannot connect to [[PPP.YYY.XXX.ZZZ]:10050]: [111] Connection refused"
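      The "Connection refused" symptom can be checked from the proxy side with a plain TCP connect, independent of Zabbix itself. This is a minimal sketch using only the Python standard library; the helper name `agent_port_reachable` is hypothetical, and it only tests that a TCP connection is accepted, not that the agent answers a Zabbix request.

```python
import socket


def agent_port_reachable(host, port=10050, timeout=3.0):
    """Return True if a TCP connection to host:port is accepted.

    A refused connection, timeout, or any other socket error
    is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

      Calling `agent_port_reachable("PPP.YYY.XXX.ZZZ")` with the affected host's real address in place of the masked one would be expected to return False while the port is in the state described above.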

      We were unable to kill the process or close the port, since the process did not exist (we tried with TCPView too).

      We also found a number of TIME_WAIT connections on port 10050, which we terminated with TCPView, but this did not clear the LISTENING state on the port.

      Only a reboot solved the issue, but in production we can't reboot around 50 servers that run critical services.

      In addition, we don't see this behavior on all Windows machines.

      Do you have any idea why it's happening?
      Did you encounter this issue before?
      Any idea of how to resolve this issue without rebooting the machines?

      Thanks,
      Natalia

            People

            • Assignee: Unassigned
            • Reporter: natalia (Natalia Kagan)
            • Votes: 0
            • Watchers: 4
