ZABBIX BUGS AND ISSUES / ZBX-16260

ZBXNEXT-4967 may cause problematic behaviour when it can't reach a server


    • Type: Incident report
    • Resolution: Duplicate
    • Priority: Minor
    • None
    • 4.2.1
    • Agent (G)

      The new zabbix_sender behaviour introduced in ZBXNEXT-4967 can cause long timeouts when a server listed in ServerActive is not running or not responding.

      Steps to reproduce:

      1. Add a server to the ServerActive= parameter in zabbix_agentd.conf that exists but cannot be reached on port 10051.
      2. Run zabbix_sender -c zabbix_agentd.conf -s test -k test -o 0 -vv
      3. Note the long timeout


      $ time zabbix_sender -c /etc/zabbix/zabbix_agentd.conf -s test -k test -o 0 -vv
      zabbix_sender [24249]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000050"}]
      Response from "gamezabbix-server.svc.oanda.com:10051": "processed: 0; failed: 1; total: 1; seconds spent: 0.000050"
      zabbix_sender [24250]: Warning: timeout while executing operation
      sent: 1; skipped: 0; total: 1
      real 1m0.030s
      user 0m0.007s
      sys 0m0.038s
      $ grep -i timeout /etc/zabbix/zabbix_agentd.conf
      ### Option: Timeout
      # Spend no more than Timeout seconds on processing
      # Timeout=3
      $ grep ServerActive /etc/zabbix/zabbix_agentd.conf
      ### Option: ServerActive
      # Example: ServerActive=,zabbix.domain,[::1]:30051,::1,[12fc::1]
      # ServerActive=

      In the example above, I had not yet created a firewall rule to allow access to one of the two servers in my config.

      In such cases, removing the unreachable server or allowing access to it (assuming the server is running and port 10051 is open) fixes the issue. However, the change in behaviour is large enough that the cause may not be obvious, particularly in my case: I add extra ServerActive entries to every host by default, although only some hosts actually need them (Ansible laziness).
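Since the cause may not be obvious, a quick check of each ServerActive entry can help. The following is only a rough sketch, not part of Zabbix: the function names list_active_servers and probe_server are hypothetical, it assumes GNU coreutils timeout and a bash built with /dev/tcp support, and it handles only plain hostname or IPv4 entries (bracketed IPv6 entries are not parsed). The 3-second default mirrors the Timeout=3 default shown in the config above.

```shell
# Print one "host port" pair per ServerActive entry (default port 10051).
# Assumption: plain hostname/IPv4 entries only; bracketed IPv6 is not handled.
list_active_servers() {
    # Use the last uncommented ServerActive= line, drop the key and blanks,
    # then split the comma-separated list into one entry per line.
    grep -E '^[[:space:]]*ServerActive=' "$1" | tail -n 1 |
        sed -e 's/^[^=]*=//' | tr -d ' \t' | tr ',' '\n' |
        while IFS= read -r entry; do
            [ -n "$entry" ] || continue
            case $entry in
                *:*) echo "${entry%%:*} ${entry##*:}" ;;
                *)   echo "$entry 10051" ;;
            esac
        done
}

# Try a TCP connection to host:port, giving up after a few seconds.
# Returns 0 if the connection could be opened, non-zero otherwise.
probe_server() {
    timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Possible usage:
# list_active_servers /etc/zabbix/zabbix_agentd.conf | while read -r host port; do
#     probe_server "$host" "$port" || echo "unreachable: $host:$port"
# done
```

An entry reported as unreachable here is a likely candidate for the long wait described above.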

      Ideally, zabbix_sender should honour the existing Timeout setting for these connections to avoid the long wait.
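Until zabbix_sender itself bounds these connections, one possible stopgap is to cap the whole invocation from the outside. This is a minimal sketch, assuming GNU coreutils timeout is available; run_bounded is a hypothetical helper name, and the 10-second limit in the example is an arbitrary choice. Note that cutting the run short may also cut off delivery to a server that is merely slow.

```shell
# Run a command, but give up after a fixed number of seconds.
# The exit status is 124 when the limit was hit (GNU timeout convention),
# otherwise it is the exit status of the command itself.
run_bounded() {
    limit=$1
    shift
    timeout "$limit" "$@"
}

# Example, using the command from this report:
# run_bounded 10 zabbix_sender -c /etc/zabbix/zabbix_agentd.conf -s test -k test -o 0 -vv
```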

      Alternatively, make the new behaviour of sending to all servers require a special argument (or add an argument that restores the old behaviour of sending only to the first server).

            agavrilovs Aleksandrs Petrovs-Gavrilovs
            dangelovich David Angelovich