ZABBIX BUGS AND ISSUES / ZBX-26732

Inefficient configuration sync scope on proxy after item update


    • Type: Problem report
    • Resolution: Unresolved
    • Priority: Trivial
    • Affects Version: 7.0.16
    • Component: Proxy (P)
    • Labels: None
    • Environment: 9× Zabbix proxy 7.0.6: 4 vCPU, 16 GB RAM, PostgreSQL 16.5
    • Sprint: S25-W30/31, S25-W32/33
    • Story Points: 0.5

      In our environment, we have 9 proxies serving approximately 35,000 hosts. These proxies are grouped into a proxy group, where each proxy handles around 4,000 hosts and over 1 million items.

      We intentionally avoid frequent polling to reduce load, so the average required vps (new values per second) stays around 1,600.
      The environment is generally stable: VM resource usage and internal Zabbix proxy process loads sit around 20–30%.

      All hosts behind the proxy group are mostly uniform and share the same item set. The majority of items come from templates linked to all hosts, and there are very few unique or exceptional hosts with custom items.

      Problem:

      We are experiencing significant performance issues during minor template-level changes that affect many hosts. Examples of such changes include:

      • Any modification to item configuration
      • Tag changes on an item
      • Changes to item preprocessing

      These actions trigger extremely high load on the configuration syncer of the proxy, lasting up to 2 hours in some cases.

      Upon reviewing logs, we observed that even a single minor change results in massive amounts of configuration data being sent to the proxy. At the same time, we see slow queries in the database, particularly on the items and item_preproc tables, such as:

      select item_preprocid,itemid,step,type,params,error_handler,error_handler_params from item_preproc where (itemid in (ITEMID x 1000) or itemid in (ITEMID x 1000) or ...)

      We understand that this is a standard SQL query for preprocessing configuration. However, in practice, this query becomes extremely large.
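      To make the shape of the problem concrete, here is a hypothetical Python sketch (not Zabbix source code) of how such a query could end up this large: the itemids are split into IN-lists of at most 1,000 and joined with OR, so the statement grows linearly with the number of affected items:

```python
# Hypothetical illustration of the observed query shape; not Zabbix code.
# Ids are split into IN-lists of at most `chunk` entries, OR-ed together.

def build_preproc_query(itemids, chunk=1000):
    """Build one large SELECT whose WHERE clause ORs IN-lists of ids."""
    clauses = []
    for i in range(0, len(itemids), chunk):
        ids = ",".join(str(x) for x in itemids[i:i + chunk])
        clauses.append(f"itemid in ({ids})")
    where = " or ".join(clauses)
    return ("select item_preprocid,itemid,step,type,params,error_handler,"
            f"error_handler_params from item_preproc where ({where})")

# 2,500 ids produce three IN-lists (1000 + 1000 + 500) in a single statement;
# 900,000 ids would produce 900 of them in one query.
q = build_preproc_query(list(range(1, 2501)))
```

      With 900,000+ itemids this construction yields a single statement with roughly 900 IN-lists, which matches the slow queries we see in the PostgreSQL logs.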

      For example, when a single item is modified on all hosts, we expect at most 35,000 affected items (equal to the number of hosts). But in reality, we observed queries involving 900,000+ itemids — almost the entire item set of the proxy.


      According to Zabbix documentation (https://www.zabbix.com/documentation/7.0/en/manual/distributed_monitoring/proxies/sync):

      “If an item is changed on a host, all configuration of that host will be synced.”

      So this behavior appears intentional. However, treating the entire host as the unit of synchronization feels overly broad, especially in environments like ours. Even a minor change causes what is effectively a full sync, due to the volume of data tied to each host.

      Questions:

      1. Is there any way to reduce the configuration synchronization scope on the proxy level — for example, by treating individual item components as independent configuration entities (as listed in the documentation under host: properties, interfaces, inventory, items, item preprocessing, item parameters, web scenarios)?

      2. During configuration sync, is it possible to apply any form of batching or pagination for database queries?
      For example, the item_preproc query shown earlier includes nearly 1 million itemids, which leads to severe performance degradation.
      We tested the same query with only 10,000 itemids, and it completed in ~100 ms. This shows that splitting the request into smaller chunks would significantly improve responsiveness.
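      As a minimal sketch of the batching we are proposing (using Python and SQLite purely for illustration, with bound parameters and a reduced column set; this is not Zabbix code), each batch becomes a small, fast query and the results are simply concatenated:

```python
import sqlite3

def fetch_preproc_batched(conn, itemids, batch=10000):
    """Fetch item_preproc rows in fixed-size batches instead of one huge query."""
    rows = []
    for i in range(0, len(itemids), batch):
        chunk = itemids[i:i + batch]
        placeholders = ",".join("?" * len(chunk))
        rows.extend(conn.execute(
            "select item_preprocid,itemid,step,type,params,error_handler,"
            "error_handler_params from item_preproc "
            f"where itemid in ({placeholders})", chunk))
    return rows

# Tiny in-memory demo: 100 rows fetched in batches of 30 (30+30+30+10).
conn = sqlite3.connect(":memory:")
conn.execute("create table item_preproc (item_preprocid integer primary key,"
             " itemid integer, step integer, type integer, params text,"
             " error_handler integer, error_handler_params text)")
conn.executemany("insert into item_preproc values (?,?,?,?,?,?,?)",
                 [(i, i, 1, 5, "", 0, "") for i in range(1, 101)])
rows = fetch_preproc_batched(conn, list(range(1, 101)), batch=30)
```

      The result set is identical to the single large query; only the per-statement size changes, which is what kept our 10,000-id test at ~100 ms.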

            ngogolevs Nikita Gogolevs
            aprzybylski Albert Przybylski
            Team A
            Votes: 4
            Watchers: 6


                Estimated: Not Specified
                Remaining: Not Specified
                Logged: 21h