- Type: Problem report
- Resolution: Duplicate
- Priority: Trivial
- Affects Version/s: 7.0.21
- Component/s: Proxy (P), Server (S)
In our environment, we have 10 proxies serving approximately 40,000 hosts. These proxies are grouped into a proxy group, where each proxy handles around 4,000 hosts and over 1 million items.
We intentionally avoid frequent polling to reduce load, so the average Required vps value remains around 1600.
The environment is generally stable: both VM resource usage and the internal Zabbix proxy process load stay around 20–30%.
All hosts behind the proxy group are mostly uniform and share the same item set. The majority of items come from templates linked to all hosts, and there are very few unique or exceptional hosts with custom items.
Problem
We previously experienced configuration refresh problems after changing item configuration (more details in ticket https://support.zabbix.com/browse/ZBX-26732).
After upgrading the entire environment to 7.0.21, the original problem was resolved, but a new (similar) issue appeared.
We noticed a very long configuration resynchronization time in the following situation:
- detaching a proxy from its proxy group
- the proxy being offline for longer than the failover period, so it is marked “unavailable”
When the proxy becomes available again, the configuration syncer remains at 80–100% load for a very long time (measured in hours), and the proxy does not start collecting data.
The only workaround we have found so far is to:
- detach the proxy from the proxy group
- detach all hosts from this proxy (GUI must show 0 NVPS)
- restart the proxy
- wait until the proxy processes the configuration (several minutes)
- reattach the proxy to the proxy group
After reviewing zabbix-proxy and PostgreSQL logs, we noticed that the configuration syncer has significant problems removing old configuration, especially in the item_rtdata table, after a proxy is detached from a proxy group.
The DELETE query looks very similar to the one involved in the earlier issue (ZBX-26732):
delete from item_rtdata where (itemid in (ITEMID x 1000) or itemid in (ITEMID x 1000) or ...)
This DELETE operation runs extremely slowly, and on top of that it repeatedly collides with concurrent UPDATE queries on the same rows, resulting in deadlocks:
2025-12-03 10:00:02.894 UTC [24585] zabbix@zabbix_proxy ERROR: deadlock detected
2025-12-03 10:00:02.894 UTC [24585] zabbix@zabbix_proxy DETAIL: Process 24585 waits for ShareLock on transaction 947797861; blocked by process 24583.
Process 24583 waits for ShareLock on transaction 947796829; blocked by process 24585.
Process 24585: delete from item_rtdata where (itemid in (....................
Process 24583: update item_rtdata set lastlogsize=10606934,mtime=1764755866 where itemid=33410972;
update item_rtdata set lastlogsize=10547081,mtime=1764755871 where itemid=33410973;
update item_rtdata set lastlogsize=10575986,mtime=1764755871 where itemid=33410974;
..............................
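For context, this matches the classic lock-order inversion on PostgreSQL row locks. A minimal two-session sketch (not Zabbix code; the itemids and values are taken from the log above) of how the bulk DELETE and the per-item UPDATEs can end up waiting on each other:
-- Session A: the configuration syncer's bulk DELETE, which locks the
-- matching rows in whatever order the executor visits them.
BEGIN;
DELETE FROM item_rtdata WHERE itemid IN (33410972, 33410973 /* ...thousands more... */);
-- Session B: per-item UPDATEs of lastlogsize/mtime, locking rows one by
-- one in the order the values arrive.
BEGIN;
UPDATE item_rtdata SET lastlogsize = 10547081, mtime = 1764755871 WHERE itemid = 33410973;
UPDATE item_rtdata SET lastlogsize = 10606934, mtime = 1764755866 WHERE itemid = 33410972;
-- If session A already holds the row lock on 33410972 and waits for
-- 33410973, while session B holds 33410973 and waits for 33410972,
-- PostgreSQL detects the cycle and aborts one transaction with
-- "deadlock detected"; in the log above the victim is the DELETE.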
Each such deadlock aborts the DELETE, which produces the following error on the proxy side:
24385:20251203:100003.058 End of zbx_proxyconfig_process()
24385:20251203:100003.058 cannot process received configuration data from server at "<ZABBIX DNS>": cannot remove old objects from table "item_rtdata"
24385:20251203:100003.152 End of process_configuration_sync()
24385:20251203:100003.153 zbx_setproctitle() title:'configuration syncer [synced config 4680782 bytes in 53.426632 sec, idle 10 sec]'
This cycle repeats indefinitely: the proxy is unable to remove the old configuration from item_rtdata because the deadlocks keep occurring.
The workaround above resolves the issue because, with no items assigned to the proxy, there are no UPDATEs to item_rtdata, so the DELETE can finally complete.
Questions
- Can this DELETE query be optimized in the same way as the SELECT query in ticket ZBX-26732? For example:
  - batching DELETE operations in groups of 1000 item IDs, or
  - performing the DELETE per item (similar to how the UPDATE is executed).
- Does Zabbix use SELECT ... FOR UPDATE when updating or deleting rows to acquire row-level locks proactively? Could this mechanism also be applied here to avoid deadlocks between the DELETE and UPDATE operations? (A rough sketch of both ideas follows below.)
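To make the suggestions concrete, here is a rough sketch of both ideas in plain PostgreSQL SQL (the batch size, the ORDER BY, and the exact statements are our assumptions, not current Zabbix behaviour):
-- Idea 1: delete in bounded batches (e.g. up to 1000 itemids per
-- statement, committed per batch) instead of one OR-chained DELETE, so
-- each statement holds its row locks only briefly.
DELETE FROM item_rtdata WHERE itemid IN (33410972, 33410973 /* ...up to ~1000 ids... */);
-- Idea 2: lock the target rows first in a deterministic order, then
-- delete them. If every writer locks item_rtdata rows in ascending
-- itemid order, the lock-order inversion behind the deadlock cannot
-- occur.
BEGIN;
SELECT itemid
  FROM item_rtdata
 WHERE itemid IN (33410972, 33410973 /* ...batch of ids... */)
 ORDER BY itemid
   FOR UPDATE;
DELETE FROM item_rtdata
 WHERE itemid IN (33410972, 33410973 /* ...same batch... */);
COMMIT;
Either shorter lock hold times or a consistent locking order between the configuration syncer and the processes updating lastlogsize/mtime should allow the DELETE to complete instead of being chosen as the deadlock victim.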
Related to:
- ZBX-26769 Remove slow "or" conditions from "in" statements (Closed)