ZABBIX BUGS AND ISSUES / ZBX-2500

Distributed monitoring: sync to master stops until child zabbix_server restart


    • Type: Incident report
    • Resolution: Won't fix
    • Priority: Critical
    • Fix Version/s: None
    • Affects Version/s: 1.8.2
    • Component/s: Server (S)

      We are using distributed monitoring to monitor a second datacenter, with 1 master and 1 child. At first, logs showed that the child was syncing just fine:

      32152:20100603:083550.916 NODE 2: Sending history_uint_sync of node 2 to node 1 datalen 318187
      32152:20100603:083603.683 NODE 2: Sending configuration changes to master node 1 for node 2 datalen 8
      32152:20100603:083606.063 NODE 2: Received configuration changes from master node 1 for node 2 datalen 8
      32152:20100603:083606.280 NODE 2: Sending history_sync of node 2 to node 1 datalen 345188
      32152:20100603:083643.232 NODE 2: Sending events of node 2 to node 1 datalen 120

      (Snippet is from the current situation, after the child Zabbix server restart.)

      After the initial start, syncing worked fine for over 6 weeks, and we could see graphs on the master. Today we noticed that around 2 weeks ago the child suddenly stopped sending data to the master. When browsing the PHP frontend locally on the child, everything was still being gathered just fine. Looking for child data on the master frontend showed no new data for those 2 weeks.

      The child's zabbix_server.log showed no new entries; the last ones were from around the time it stopped syncing. Restarting the child fixed it again.

      I'm not sure what the cause is, but I think we were having network issues on our master network around the time the sync stopped working.

      Maybe the child doesn't try to reconnect after a connection drop?
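To illustrate the behaviour being asked about, here is a minimal sketch of a sender that retries with exponential backoff after a dropped connection instead of giving up. It is not Zabbix code and says nothing about how zabbix_server actually handles node sync; `send_with_retry`, `backoff_delays`, and all parameter values are hypothetical names and numbers chosen for illustration.

```python
import time


def backoff_delays(base, factor, cap, attempts):
    """Return the wait before each retry: base, base*factor, ..., capped at cap."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def send_with_retry(send, payload, attempts=5, base=1.0, factor=2.0,
                    cap=60.0, sleep=time.sleep):
    """Call send(payload); on a connection error, wait and try again.

    `send` is any callable that raises OSError (or a subclass such as
    ConnectionResetError) when the link is down. Returns the number of
    attempts used, or re-raises the last error once `attempts` is exhausted.
    """
    delays = backoff_delays(base, factor, cap, attempts - 1)
    for attempt in range(1, attempts + 1):
        try:
            send(payload)
            return attempt
        except OSError:
            if attempt == attempts:
                raise  # out of retries: surface the failure to the caller
            sleep(delays[attempt - 1])
```

The point of the sketch is only that a transient network drop should lead to a later retry (and a log entry) rather than a silent, permanent stop.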

      Another improvement would be an internal Zabbix check that verifies the connection between master and children, so we can at least monitor this cleanly.
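Until such a check exists, one possible workaround is to watch the child's zabbix_server.log for the "Sending ... of node X to node Y" lines shown in the snippet above and raise an alarm when the newest one is too old. The sketch below assumes the log format is exactly as in that snippet (pid:YYYYMMDD:HHMMSS.mmm prefix); the function names and the 10-minute threshold are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Matches data sync lines such as:
# 32152:20100603:083550.916 NODE 2: Sending history_uint_sync of node 2 to node 1 datalen 318187
SYNC_LINE = re.compile(
    r"^\s*\d+:(\d{8}):(\d{6})\.\d{3} NODE \d+: "
    r"Sending \S+ of node \d+ to node \d+ "
)


def last_sync_time(lines):
    """Return the timestamp of the newest sync line, or None if there is none."""
    latest = None
    for line in lines:
        m = SYNC_LINE.match(line)
        if m:
            ts = datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S")
            if latest is None or ts > latest:
                latest = ts
    return latest


def sync_is_stale(lines, now, max_age=timedelta(minutes=10)):
    """True if no sync line was found or the newest one is older than max_age."""
    latest = last_sync_time(lines)
    return latest is None or now - latest > max_age
```

For example, `sync_is_stale(open("zabbix_server.log"), datetime.now())` could be run periodically on the child and its result fed into whatever alerting is available. This only detects the symptom on the child side; it does not test the network path to the master.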

            Assignee: Unassigned
            Reporter: Bart Verwilst (verwilst)
            Votes: 8
            Watchers: 7
