[ZBXNEXT-8758] Load balancing and HA for proxies Created: 2023 Oct 16  Updated: 2025 Jan 24  Resolved: 2024 Aug 25

Status: Closed
Project: ZABBIX FEATURE REQUESTS
Component/s: Agent (G), Agent2 plugin (G), Proxy (P)
Affects Version/s: None
Fix Version/s: 7.0.0rc1, 7.0 (plan)

Type: New Feature Request Priority: Trivial
Reporter: Rostislav Palivoda Assignee: Andris Zeila
Resolution: Fixed Votes: 10
Labels: None
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified

Attachments: Text File crash_1.txt     PNG File image-2024-05-27-07-36-30-709.png     PNG File image-2024-05-27-07-38-30-341.png     Text File origin_zabbix_proxy.log     Text File server-crash-log.txt     GIF File uneavenly_distributed_proxies.gif     Zip Archive zabbix_agentd.zip     Text File zabbix_proxy.log     Text File zabbix_proxy_LLD_by_standalone.log     Zip Archive zabbix_proxy_VALGRIND_LLD1.zip     Text File zabbix_proxy_master.log     Text File zabbix_proxy_without_host_proxy_deletion.log     Text File zabbix_server.log     Zip Archive zabbix_server_VALGRIND_1.zip     Zip Archive zabbix_server_log.zip     Text File zabbix_server_log_corresponds_to_Valgrind1.log     Zip Archive zabbix_server_log_corresponds_to_Valgrind1.zip     Zip Archive zabbix_server_uneaven_distribution.zip     Zip Archive zabbix_server_wrong_host_assignment.zip    
Issue Links:
Causes
causes ZBX-25085 zabbix_sender on Windows mostly fails... Closed
causes ZBX-25932 zabbix sender returning incorrect exi... Closed
Duplicate
is duplicated by ZBXNEXT-5911 Automatically distribute hosts betwee... Closed
Sub-task
Sub-Tasks:
Key | Summary | Type | Status | Assignee
ZBXNEXT-8938 | Frontend changes HA for proxies | Specification change (Sub-task) | Closed | Andrejs Griščenko
ZBXNEXT-9105 | Add integration tests | Auto test (Sub-task) | Elaborating | Vladislavs Sokurenko
Epic Link: Zabbix 7.0
Team: Team A
Sprint: Sprint candidates
Story Points: 15

 Description   

Automated proxy load balancing and HA are among the top feature requests and are must-have functionality for enterprise users.



 Comments   
Comment by user185953 [ 2024 Jan 05 ]

Note that TLS auth for proxies won't work correctly in HA: https://support.zabbix.com/browse/ZBXNEXT-8490

I apologize for promoting it like this. I promise to wait patiently for my other tickets, as I should.

 

Comment by Andrejs Griščenko [ 2024 Apr 25 ]

Available in versions:

Comment by Dimitri Bellini [ 2024 May 21 ]

Dear ZabbixDevTeam,
Thanks for this great feature!
After playing with it, I noticed one point that needs attention, which you may already know about.

If a Proxy Group goes offline, how can we receive a notification?

Do you have any plans to create a dedicated template for monitoring the status and performance of a Proxy Group?

Thanks so much

Comment by Alexander Vladishev [ 2024 May 21 ]

Hi,

Thank you for the feedback!

You can use the "zabbix[proxy group, <name>, state]" item to get the current state of the proxy group. It returns one of the following values:

0 - unknown; 1 - offline; 2 - recovering; 3 - online; 4 - degrading.
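
The state codes above can be turned into readable labels when post-processing item values. The following is a minimal sketch, not part of Zabbix itself: the function names are hypothetical, and only the code-to-name mapping comes from the list above.

```python
# Mapping of proxy group state codes to names, as listed in the comment above.
PROXY_GROUP_STATES = {
    0: "unknown",
    1: "offline",
    2: "recovering",
    3: "online",
    4: "degrading",
}


def describe_proxy_group_state(value):
    """Return the state name for a raw item value (hypothetical helper)."""
    code = int(value)
    return PROXY_GROUP_STATES.get(code, "unrecognized ({})".format(code))


def is_offline(value):
    """True when the raw item value reports the 'offline' state (code 1)."""
    return int(value) == 1


if __name__ == "__main__":
    for raw in ("3", "1", "4"):
        print(raw, "->", describe_proxy_group_state(raw))
```

Which states should raise a notification (only offline, or also degrading and recovering) is a policy choice that the comment does not specify.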

Comment by Alexander Vladishev [ 2024 May 21 ]

We are currently working on such a template. It will be available in the next minor release.

Comment by Dimitri Bellini [ 2024 May 21 ]

Perfect! Thanks so much

Comment by Martins Valkovskis [ 2024 May 22 ]

Updated documentation:

Comment by pascal de jessey [ 2024 May 23 ]

Hello,

Great news!

What does this change for the network flows?

Do the proxies talk to each other, or do they only talk to the Zabbix server?

Comment by Leandro Dethloff [ 2024 May 23 ]

I believe that's not the case. I think the proxy will start reporting its status by writing to a new table in the database, similar to how the server's HA works, but I haven't verified this yet to be sure.

Comment by Leandro Dethloff [ 2024 May 27 ]

I've been running some tests with this new feature, and I have a question. Does this feature require something like HAProxy in my environment? My concern is that if I point the "Address for active agents" to a proxy, and that proxy goes offline, then the data might be lost.

Comment by Markku Leiniö [ 2024 May 27 ]

Documentation is still somewhat incomplete, but based on using the latest Wireshark 4.3.0rc0 build with Zabbix traffic, the server sends those addresses to all the proxies in the group. That way the proxies can return a (yet undocumented) redirect response to the active agents (7.0+?) to guide them to another proxy, for better load distribution.

That IP address is thus not a virtual address; it is the IP address of the proxy itself, to be used by the active agents. Otherwise the Zabbix server does not "know" the IP address of the active proxy.

Comment by Aigars Kadikis [ 2024 May 31 ]

For the documentation under section https://www.zabbix.com/documentation/7.0/en/manual/distributed_monitoring/proxies/ha we could add/edit to include these bits:

How it works:

  • An active agent can have a single proxy in the ServerActive field. When the agent service starts, the agent receives a full list of the IP addresses of all Zabbix proxies and keeps it in memory.
  • The proxy manager always knows which of the other proxies are healthy and which are unhealthy.

Possible problems:

  • Misconfiguration of the "Server" field in the Zabbix agent configuration file. All proxies must be listed in this field; this is mandatory. Use subnet-style syntax.
  • Zabbix agent active checks at agent startup: if the first proxy, "alpha", responds and gives proxy "beta" in its answer, but "beta" is unreachable because of a firewall problem, communication stalls while the agent waits for "beta" to respond. The root cause is that proxy "alpha" considered proxy "beta" healthy. It is therefore a high priority to allow agents to reach all proxies at the firewall level. This is not a problem if the first proxy itself fails: the agent will then try the other addresses configured in the "ServerActive" field.
  • Misconfiguration of Zabbix agent active checks: a comma has been used instead of the semicolon syntax.
  • Misconfiguration at the group level: 10 proxies belong to one group and "Minimum number of proxies" is set to 10. If one proxy dies, the whole proxy group goes offline and data collection stops. It is better to set the minimum to 6 proxies, which tolerates 4 unhealthy proxies.
  • The HA setup has been very stable for months, so host rebalancing never happens because it is not needed. The agent does not validate the "backup channel" to the other proxies, so in a failover scenario it might not work, for example because a firewall was modified half a year ago.
  • When using the "Database monitoring" item, the DB object/server must have extended permissions.

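
The group-level arithmetic in the list above (10 proxies with a minimum of 10 tolerates no failures, while a minimum of 6 tolerates 4) can be checked with a short calculation. This is a simplified sketch based only on that description; the function names are hypothetical and it does not model Zabbix's actual state machine or failover timing.

```python
def tolerated_failures(total_proxies, minimum_proxies):
    """Number of proxies that can fail before the group drops below its minimum."""
    if minimum_proxies < 1 or minimum_proxies > total_proxies:
        raise ValueError("minimum must be between 1 and the number of proxies")
    return total_proxies - minimum_proxies


def group_stays_online(total_proxies, minimum_proxies, failed_proxies):
    """Simplified rule: the group stays online while healthy proxies >= minimum."""
    healthy = total_proxies - failed_proxies
    return healthy >= minimum_proxies


if __name__ == "__main__":
    # The two configurations discussed in the comment above.
    print(tolerated_failures(10, 10))     # minimum equals group size
    print(tolerated_failures(10, 6))      # minimum lowered to 6
    print(group_stays_online(10, 10, 1))  # one proxy dies with minimum 10
```
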
Comment by Markku Leiniö [ 2024 May 31 ]

An active agent can have a single proxy in the ServerActive field. When the agent service starts, the agent receives a full list of the IP addresses of all Zabbix proxies and keeps it in memory.

Note that in this case, starting or rebooting the agent while that particular proxy is offline causes the agent to lose monitoring.
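
To illustrate why a single ServerActive entry is fragile, and why the comma versus semicolon distinction mentioned earlier matters, here is a small sketch that splits a ServerActive value the way the agent documentation describes it: commas separate independent targets, semicolons separate alternative nodes of the same target. The parser and the host names are hypothetical and are for illustration only; this is not the agent's actual code.

```python
def parse_server_active(value):
    """Split a ServerActive value into targets, each a list of alternative addresses.

    Commas separate independent targets; semicolons separate alternative
    addresses belonging to one target (hypothetical helper).
    """
    targets = []
    for entry in value.split(","):
        nodes = [node.strip() for node in entry.split(";") if node.strip()]
        if nodes:
            targets.append(nodes)
    return targets


def has_startup_fallback(value):
    """True if at least one target lists more than one address to try."""
    return any(len(nodes) > 1 for nodes in parse_server_active(value))


if __name__ == "__main__":
    # Hypothetical host names.
    single = "proxy-alpha.example.com"
    grouped = "proxy-alpha.example.com;proxy-beta.example.com"
    with_comma = "proxy-alpha.example.com,proxy-beta.example.com"
    for value in (single, grouped, with_comma):
        print(parse_server_active(value), has_startup_fallback(value))
```

With a single address there is nothing else to try at startup, which is the situation described in the note above. With a comma, the two proxies are treated as unrelated targets rather than as alternatives.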

Generated at Sun Mar 30 14:14:25 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.