[ZBXNEXT-1788] Multi location monitoring voting members Created: 2013 Jun 11  Updated: 2022 May 25  Resolved: 2022 May 25

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Server (S)
Affects Version/s: None
Fix Version/s: None

Type: New Feature Request
Priority: Major
Reporter: Frank
Assignee: Unassigned
Resolution: Declined
Votes: 4
Labels: dm, proxy
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

The current implementation of Distributed Monitoring (DM / Proxy) has weaknesses in both modes.

  • Proxy: The primary Zabbix server is a single point of failure (SPOF). If it fails, the proxy cannot send data to the master and has no way to notify anyone about problems.
  • DM: Each node checks and reports issues individually, which can lead to duplicate reports or false positives/negatives when one of the servers detects a failure (which might be caused by network issues on that Zabbix server's end).

It would be better if the DM setup worked more like a voting system, in which one node is elected master and takes care of reporting while the other nodes only report status.
When the elected primary node fails, the voting system kicks in and picks a different server.
This would make the Zabbix monitoring infrastructure more resilient, and only one server would be responsible for sending reports at any given time, which prevents duplicate reports.

MongoDB uses a similar approach to elect a new master; this might be a valuable improvement to Zabbix's distributed monitoring.
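
As a rough illustration of the voting idea described above, a majority-based election could look like the sketch below. This is not Zabbix code and not MongoDB's actual election protocol; the function `elect_reporter`, the `cluster` and `views` structures, and the node names are all hypothetical. It assumes every live node periodically shares the set of peers it can reach, and that a node may send reports only if a strict majority of the configured cluster (counting itself) can see it.

```python
from typing import Dict, List, Optional, Set


def elect_reporter(cluster: List[str], views: Dict[str, Set[str]]) -> Optional[str]:
    """Return the one node allowed to send reports, or None without quorum.

    cluster: all configured node names (hypothetical).
    views:   for each node that is currently alive, the set of peers it can
             reach. Nodes that are down have no entry.
    """
    quorum = len(cluster) // 2 + 1  # strict majority of the configured nodes
    for node in sorted(cluster):  # deterministic tie-break: lowest name wins
        if node not in views:
            continue  # node is down, so it cannot be elected
        # The node votes for itself; each live peer that can reach it adds a vote.
        votes = 1 + sum(
            1 for peer, seen in views.items() if peer != node and node in seen
        )
        if votes >= quorum:
            return node
    return None


cluster = ["zbx-a", "zbx-b", "zbx-c"]

# All nodes healthy: the lowest-named node wins.
healthy = {"zbx-a": {"zbx-b", "zbx-c"}, "zbx-b": {"zbx-a", "zbx-c"}, "zbx-c": {"zbx-a", "zbx-b"}}
# zbx-a is alive but cut off by its own network problem.
partitioned = {"zbx-a": set(), "zbx-b": {"zbx-c"}, "zbx-c": {"zbx-b"}}

print(elect_reporter(cluster, healthy))
print(elect_reporter(cluster, partitioned))
```

Because an isolated node can never collect a majority, it stays silent instead of raising alerts caused by its own connectivity problem, and at most one node is elected at a time, which addresses the duplicate-report concern.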



 Comments   
Comment by yayo [ 2013 Jun 14 ]

Also, we need a clear description of distributed monitoring in HA: in classic distributed Zabbix monitoring every point is a SPOF, because if one proxy goes down, every server monitored through that proxy fails to send data (or metrics are lost). If the master fails, every proxy loses data without any alert (this can be mitigated with a failover cluster configuration).

Comment by Oleksii Zagorskyi [ 2022 May 25 ]

In 6.0 things are very different, so this is no longer relevant -> Closing.

Generated at Fri Apr 26 19:19:22 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.