[ZBX-22617] Slow configuration sync "syncing trend data..." Created: 2023 Mar 30  Updated: 2025 Apr 07  Resolved: 2025 Mar 18

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 6.4.0
Fix Version/s: 7.0.11rc1, 7.2.5rc1, 7.4.0beta1

Type: Problem report Priority: Major
Reporter: Karol Woronowicz Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 10
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

DB: (PostgreSQL) 14.5 (Ubuntu 14.5-2.pgdg20.04+2) + timescaledb 2.7.1
32 CPU, 80 GB RAM

Server: HA Zabbix two nodes cluster - 6.4.0 8 CPU, 64 GB RAM

Number of hosts (enabled) - 20136
Number of items (enabled/disabled/not supported) - 8647067 / 27164 / 1542
Number of triggers (enabled/disabled) - 7780347 / 258826
Required server performance, new values per second - 18269.5

Everything is monitored by 10 active zabbix proxies.
99% of items have logrt.count key


Attachments: increased_logs_configuration_syncer.txt, zabbix_server.txt
Issue Links:
Causes
caused by ZBXNEXT-4732 Server-side changes for Host and Temp... Closed
Duplicate
Related
related to ZBX-25651 Add an opportunity to disable trend s... Closed
Sub-task
depends on ZBX-24095 Slow configuration sync on PostgreSQL... Closed
depends on ZBX-24103 Slow configuration sync due to redund... Closed
Team: Team A
Sprint: S25-W10/11
Story Points: 3

 Description   

Hi, we have a problem with slow configuration synchronization, both when the active cluster node changes and during normal operation.

 

  1. Shutting down the cluster node:
    278053:20230329:131439.445 HA manager has been paused
    
    282834:20230329:131439.446 syncing history data in progress...
    
    282838:20230329:131439.558 syncing history data... 100.000000%
    
    282838:20230329:131439.558 syncing history data done
    
    282820:20230329:131440.367 [1] thread stopped [preprocessing worker #1]
    
    282820:20230329:131440.368 [3] thread stopped [preprocessing worker #3]
    
    282820:20230329:131440.368 [2] thread stopped [preprocessing worker #2]
    
    278048:20230329:131443.061 syncing trend data...
    
    278048:20230329:131654.404 syncing trend data done
    
    278053:20230329:131654.423 HA manager has been stopped
    
    278048:20230329:131655.780 Zabbix Server stopped. Zabbix 6.4.0 (revision 5b2736b6027).
    

 

  2. Turning on the cluster node (only one process is responsible for synchronization):
283802:20230329:132044.944 "nodename" node switched to "active" mode

283802:20230329:132044.957 server #0 started [main process]

283942:20230329:132044.958 server #1 started [service manager #1]

283943:20230329:132044.959 server #2 started [configuration syncer #1]

283943:20230329:132058.629 slow query: 13.316544 sec, "select i.itemid,i.hostid,i.status,i.type,i.value_type,i.key_,i.snmp_oid,i.ipmi_sensor,i.delay,i.trapper_hosts,i.logtimefmt,i.params,ir.state,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,ir.lastlogsize,ir.mtime,i.history,i.trends,i.inventory_link,i.valuemapid,i.units,ir.error,i.jmx_endpoint,i.master_itemid,i.timeout,i.url,i.query_fields,i.posts,i.status_codes,i.follow_redirects,i.post_type,i.http_proxy,i.headers,i.retrieve_mode,i.request_method,i.output_format,i.ssl_cert_file,i.ssl_key_file,i.ssl_key_password,i.verify_peer,i.verify_host,i.allow_traps,i.templateid,null from items i inner join hosts h on i.hostid=h.hostid join item_rtdata ir on i.itemid=ir.itemid where h.status in (0,1) and i.flags<>2"

283943:20230329:132158.605 slow query: 4.137951 sec, "select itemtagid,itemid,tag,value from item_tag"

283943:20230329:132215.097 slow query: 5.529707 sec, "select triggerid,description,expression,error,priority,type,value,state,lastchange,status,recovery_mode,recovery_expression,correlation_mode,correlation_tag,opdata,event_name,null,null,null,flags from triggers"

284012:20230329:132319.485 server #3 started [alert manager #1] 

 

  3. Standard work:
    480:55 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 20.678369 sec, syncing configuration]
    
    481:00 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 24.091629 sec, idle 10 sec]
    
    481:02 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 24.091629 sec, syncing configuration]
    
    481:17 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 24.091629 sec, syncing configuration]
    
    481:22 /usr/sbin/zabbix_server: configuration syncer [synced configuration in 23.483421 sec, idle 10 sec]
    

Configuration syncer logs captured at an increased log level are attached.
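
The length of the trend sync during shutdown can be read off the log timestamps. The following is a minimal sketch, assuming the `PID:YYYYMMDD:HHMMSS.mmm message` prefix seen in the excerpts above; the helper names `parse_line` and `phase_duration` are hypothetical and not part of Zabbix.

```python
import re
from datetime import datetime

# Assumed log prefix, as seen in the excerpts: PID:YYYYMMDD:HHMMSS.mmm message
LOG_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    _pid, date, clock, message = m.groups()
    return datetime.strptime(date + clock, "%Y%m%d%H%M%S.%f"), message.strip()


def phase_duration(lines, start_msg, end_msg):
    """Seconds between the first start_msg and the next end_msg, or None."""
    started = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, message = parsed
        if started is None and message == start_msg:
            started = ts
        elif started is not None and message == end_msg:
            return (ts - started).total_seconds()
    return None


# The two trend sync lines from the shutdown log in the description.
shutdown_log = [
    "278048:20230329:131443.061 syncing trend data...",
    "278048:20230329:131654.404 syncing trend data done",
]

trend_sync_seconds = phase_duration(
    shutdown_log, "syncing trend data...", "syncing trend data done"
)
print(f"trend sync took {trend_sync_seconds:.3f} sec")
```

For the excerpt in the description this comes to roughly 131 seconds, that is, a little over two minutes spent in the trend sync alone.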



 Comments   
Comment by Vladislavs Sokurenko [ 2023 Apr 06 ]

Could you please try the following query:

 explain analyze select i.itemid,i.hostid,i.status,i.type,i.value_type,i.key_,i.snmp_oid,i.ipmi_sensor,i.delay,i.trapper_hosts,i.logtimefmt,i.params,ir.state,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,ir.lastlogsize,ir.mtime,i.history,i.trends,i.inventory_link,i.valuemapid,i.units,ir.error,i.jmx_endpoint,i.master_itemid,i.timeout,i.url,i.query_fields,i.posts,i.status_codes,i.follow_redirects,i.post_type,i.http_proxy,i.headers,i.retrieve_mode,i.request_method,i.output_format,i.ssl_cert_file,i.ssl_key_file,i.ssl_key_password,i.verify_peer,i.verify_host,i.allow_traps,i.templateid,null from items i inner join hosts h on i.hostid=h.hostid join item_rtdata ir on i.itemid=ir.itemid;
Comment by Karol Woronowicz [ 2023 Apr 07 ]

Sure, here is the result:

                                                                             QUERY PLAN                                                                              
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=2386.26..2325653.17 rows=11060863 width=280) (actual time=173.061..17036.121 rows=11079075 loops=1)
   Hash Cond: (i.hostid = h.hostid)
   ->  Merge Join  (cost=135.53..2294363.36 rows=11060863 width=248) (actual time=166.556..15484.177 rows=11079075 loops=1)
         Merge Cond: (i.itemid = ir.itemid)
         ->  Index Scan using items_pkey on items i  (cost=0.43..949380.89 rows=11569670 width=233) (actual time=166.498..6887.657 rows=11525867 loops=1)
         ->  Index Scan using item_rtdata_pkey on item_rtdata ir  (cost=0.56..1177892.11 rows=11060863 width=23) (actual time=0.028..6328.448 rows=11079075 loops=1)
   ->  Hash  (cost=1893.66..1893.66 rows=28566 width=8) (actual time=6.483..6.487 rows=29646 loops=1)
         Buckets: 32768  Batches: 1  Memory Usage: 1415kB
         ->  Seq Scan on hosts h  (cost=0.00..1893.66 rows=28566 width=8) (actual time=0.009..4.090 rows=29646 loops=1)
 Planning Time: 1.479 ms
 JIT:
   Functions: 14
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.263 ms, Inlining 37.421 ms, Optimization 80.721 ms, Emission 48.118 ms, Total 167.523 ms
 Execution Time: 17330.515 ms 

I would like to add that the number of items has increased to 10.5+ million.
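
To see where the time goes in a plan like the one above, the per-node timings can be pulled out of the `EXPLAIN ANALYZE` text. This is a rough sketch only: the regular expression and the helper `plan_nodes` are illustrative assumptions, and the reported times are inclusive of child nodes, so they should not be summed across levels.

```python
import re

# Matches plan node lines such as:
#   ->  Merge Join  (cost=...) (actual time=166.556..15484.177 rows=11079075 loops=1)
NODE_RE = re.compile(
    r"^\s*(?:->\s*)?(?P<node>[A-Z][^(]*?)\s+\(cost=[^)]*\)\s+"
    r"\(actual time=(?P<start>[\d.]+)\.\.(?P<end>[\d.]+) "
    r"rows=(?P<rows>\d+) loops=(?P<loops>\d+)\)"
)


def plan_nodes(plan_text):
    """Return (node, total_ms, rows) per plan node, in plan order.

    "actual time" is reported per loop, so the end time is multiplied by loops.
    """
    nodes = []
    for line in plan_text.splitlines():
        m = NODE_RE.match(line)
        if m:
            total_ms = float(m["end"]) * int(m["loops"])
            nodes.append((m["node"], total_ms, int(m["rows"])))
    return nodes


# Plan lines copied from the output above.
plan = """
 Hash Join  (cost=2386.26..2325653.17 rows=11060863 width=280) (actual time=173.061..17036.121 rows=11079075 loops=1)
   Hash Cond: (i.hostid = h.hostid)
   ->  Merge Join  (cost=135.53..2294363.36 rows=11060863 width=248) (actual time=166.556..15484.177 rows=11079075 loops=1)
         Merge Cond: (i.itemid = ir.itemid)
         ->  Index Scan using items_pkey on items i  (cost=0.43..949380.89 rows=11569670 width=233) (actual time=166.498..6887.657 rows=11525867 loops=1)
         ->  Index Scan using item_rtdata_pkey on item_rtdata ir  (cost=0.56..1177892.11 rows=11060863 width=23) (actual time=0.028..6328.448 rows=11079075 loops=1)
   ->  Hash  (cost=1893.66..1893.66 rows=28566 width=8) (actual time=6.483..6.487 rows=29646 loops=1)
         Buckets: 32768  Batches: 1  Memory Usage: 1415kB
         ->  Seq Scan on hosts h  (cost=0.00..1893.66 rows=28566 width=8) (actual time=0.009..4.090 rows=29646 loops=1)
"""

nodes = plan_nodes(plan)
for node, total_ms, rows in sorted(nodes, key=lambda n: n[1], reverse=True):
    print(f"{total_ms:>10.3f} ms  {rows:>9} rows  {node}")
```

Read this way, the merge join of `items` and `item_rtdata` and its two index scans account for most of the roughly 17 seconds, while the scan and hash of `hosts` are negligible.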

Comment by Vladislavs Sokurenko [ 2024 Jan 30 ]

Please increase the log level to debug and provide the statistics from the history syncer.
They should look like this:

238366:20240130:182429.268 zbx_dc_sync_configuration() changelog  : sql:0.013996 sec (1 records)
238366:20240130:182429.271 zbx_dc_sync_configuration() config     : sql:0.001789 sync:0.002462 sec (1/0/0).
238366:20240130:182429.272 zbx_dc_sync_configuration() autoreg    : sql:0.001018 sync:0.000667 sec (0/0/0).
238366:20240130:182429.273 zbx_dc_sync_configuration() autoreg host    : sql:0.000022 sync:0.000648 sec (0/0/0).
238366:20240130:182429.274 zbx_dc_sync_configuration() hosts      : sql:0.005960 sync:0.001522 sec (0/1/0).
238366:20240130:182429.275 zbx_dc_sync_configuration() host_invent: sql:0.003719 sync:0.000871 sec (0/0/0).
238366:20240130:182429.276 zbx_dc_sync_configuration() templates  : sql:0.004195 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() globmacros : sql:0.005729 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() hostmacros : sql:0.210211 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() interfaces : sql:0.006219 sync:0.001725 sec (25/0/0).
238366:20240130:182429.278 zbx_dc_sync_configuration() items      : sql:0.000281 sync:0.001360 sec (0/0/0).
238366:20240130:182429.278 zbx_dc_sync_configuration() template_items      : sql:0.292978 sync:0.000979 sec (0/0/0).
238366:20240130:182429.279 zbx_dc_sync_configuration() prototype_items      : sql:0.000840 sync:0.001099 sec (0/0/0).
238366:20240130:182429.279 zbx_dc_sync_configuration() item_discovery      : sql:0.133969 sync:0.000742 sec (0/0/0).
238366:20240130:182429.280 zbx_dc_sync_configuration() triggers   : sql:0.000202 sync:0.000731 sec (0/0/0).
238366:20240130:182429.280 zbx_dc_sync_configuration() trigdeps   : sql:0.049399 sync:0.000748 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() trig. tags : sql:0.000196 sync:0.000634 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() host tags : sql:0.001544 sync:0.000767 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() item tags : sql:0.000351 sync:0.001047 sec (0/0/0).
238366:20240130:182429.282 zbx_dc_sync_configuration() functions  : sql:0.000302 sync:0.000872 sec (0/0/0).
238366:20240130:182429.282 zbx_dc_sync_configuration() expressions: sql:0.003155 sync:0.000855 sec (0/0/0).
238366:20240130:182429.283 zbx_dc_sync_configuration() actions    : sql:0.001387 sync:0.000683 sec (0/0/0).
238366:20240130:182429.284 zbx_dc_sync_configuration() operations : sql:0.000999 sync:0.000684 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() conditions : sql:0.001738 sync:0.000614 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() corr       : sql:0.001095 sync:0.000852 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() corr_cond  : sql:0.001299 sync:0.000662 sec (0/0/0).
238366:20240130:182429.286 zbx_dc_sync_configuration() corr_op    : sql:0.000980 sync:0.000680 sec (0/0/0).
238366:20240130:182429.287 zbx_dc_sync_configuration() hgroups    : sql:0.003711 sync:0.002239 sec (0/0/0).
238366:20240130:182429.287 zbx_dc_sync_configuration() item pproc : sql:0.000371 sync:0.001095 sec (0/0/0).
238366:20240130:182429.288 zbx_dc_sync_configuration() item script param: sql:0.002087 sync:0.001808 sec (0/0/0).
238366:20240130:182429.288 zbx_dc_sync_configuration() maintenance: sql:0.011640 sync:0.003615 sec (0/0/0).
238366:20240130:182429.289 zbx_dc_sync_configuration() drules     : sql:0.000327 sync:0.001503 sec (0/0/0).
238366:20240130:182429.290 zbx_dc_sync_configuration() dchecks    : (0/0/0).
238366:20240130:182429.291 zbx_dc_sync_configuration() httptests  : sql:0.000584 sync:0.004232 sec (0/0/0).
238366:20240130:182429.291 zbx_dc_sync_configuration() httptestfld : (0/0/0).
238366:20240130:182429.292 zbx_dc_sync_configuration() httpsteps   : (0/0/0).
238366:20240130:182429.293 zbx_dc_sync_configuration() httpstepfld : (0/0/0).
238366:20240130:182429.293 zbx_dc_sync_configuration() connector: sql:0.000292 sync:0.001296 sec (0/0/0).
238366:20240130:182429.294 zbx_dc_sync_configuration() connector_tag: sql:0.000292 sync:0.001296 sec (0/0/0).
238366:20240130:182429.295 zbx_dc_sync_configuration() proxy: sql:0.000199 sync:0.000714 sec (0/0/0).
238366:20240130:182429.296 zbx_dc_sync_configuration() macro cache: 0.000177 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() reindex    : 0.011809 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() total sql  : 0.745157 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() total sync : 0.047818 sec.
238366:20240130:182429.298 zbx_dc_sync_configuration() proxies    : 0 (0 slots)
238366:20240130:182429.298 zbx_dc_sync_configuration() proxies_p    : 0 (0 slots)
238366:20240130:182429.299 zbx_dc_sync_configuration() hosts      : 1 (11 slots)
238366:20240130:182429.299 zbx_dc_sync_configuration() hosts_h    : 1 (11 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() autoreg_hosts: 0 (11 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() psks       : 0 (0 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() ipmihosts  : 0 (0 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() host_invent: 2 (11 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() glob macros: 1 (11 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() host macros: 5238 (9029 slots)
238366:20240130:182429.302 zbx_dc_sync_configuration() kvs_paths : 0
238366:20240130:182429.302 zbx_dc_sync_configuration() interfaces : 1 (11 slots)
238366:20240130:182429.303 zbx_dc_sync_configuration() interfaces_snmp : 0 (0 slots)
238366:20240130:182429.303 zbx_dc_sync_configuration() interfac_ht: 1 (11 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() if_snmpitms: 0 (0 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() if_snmpaddr: 0 (0 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() item_discovery : 6400 (9029 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() items      : 1 (101 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() items_hk   : 1 (101 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() numitems   : 1 (11 slots)
238366:20240130:182429.306 zbx_dc_sync_configuration() preprocitems: 0 (0 slots)
238366:20240130:182429.306 zbx_dc_sync_configuration() snmpitems  : 0 (0 slots)
238366:20240130:182429.307 zbx_dc_sync_configuration() ipmiitems  : 0 (0 slots)
238366:20240130:182429.307 zbx_dc_sync_configuration() trapitems  : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() dependentitems  : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() logitems   : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() dbitems    : 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() sshitems   : 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() telnetitems: 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() simpleitems: 0 (0 slots)
238366:20240130:182429.310 zbx_dc_sync_configuration() jmxitems   : 0 (0 slots)
238366:20240130:182429.310 zbx_dc_sync_configuration() calcitems  : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() httpitems  : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() scriptitems  : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() functions  : 1 (101 slots)
238366:20240130:182429.312 zbx_dc_sync_configuration() triggers   : 5702 (9029 slots)
238366:20240130:182429.313 zbx_dc_sync_configuration() trigdeps   : 2393 (4007 slots)
238366:20240130:182429.313 zbx_dc_sync_configuration() trig. tags : 7421 (13553 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() expressions: 10 (17 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() actions    : 0 (0 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() conditions : 0 (0 slots)
238366:20240130:182429.316 zbx_dc_sync_configuration() corr.      : 0 (0 slots)
238366:20240130:182429.316 zbx_dc_sync_configuration() corr. conds: 0 (0 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() corr. ops  : 0 (0 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() hgroups    : 7 (11 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() item procs : 0 (0 slots)
238366:20240130:182429.318 zbx_dc_sync_configuration() maintenance: 1 (11 slots)
238366:20240130:182429.318 zbx_dc_sync_configuration() maint tags : 1 (11 slots)
238366:20240130:182429.319 zbx_dc_sync_configuration() maint time : 2 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() drules     : 1 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() dchecks    : 1 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() httptests  : 0 (0 slots)
238366:20240130:182429.321 zbx_dc_sync_configuration() httptestfld: 0 (0 slots)
238366:20240130:182429.321 zbx_dc_sync_configuration() httpsteps  : 0 (0 slots)
238366:20240130:182429.322 zbx_dc_sync_configuration() httpstepfld: 0 (0 slots)
238366:20240130:182429.322 zbx_dc_sync_configuration() connector: 0 (0 slots)
238366:20240130:182429.323 zbx_dc_sync_configuration() connector tags : 0 (0 slots)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[0]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[1]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[2]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[3]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[4]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[5]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[6]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[7]   : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[8]   : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() queue[9]   : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() queue[10]   : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() pqueue     : 0 (0 allocated)
238366:20240130:182429.326 zbx_dc_sync_configuration() timer queue: 0 (0 allocated)
238366:20240130:182429.327 zbx_dc_sync_configuration() changelog  : 986
238366:20240130:182429.328 zbx_dc_sync_configuration() configfree : 79.413605%
238366:20240130:182429.333 zbx_dc_sync_configuration() strings    : 7450 (13553 slots)
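
With output of this shape, the slowest sections can be ranked by their `sql:` time. The sketch below assumes the line layout shown above (`name : sql:<sec> [sync:<sec>] sec`); the helper `rank_by_sql` is hypothetical and only reads the log, it is not a Zabbix tool.

```python
import re

# Matches timing lines such as:
#   ... zbx_dc_sync_configuration() hostmacros : sql:0.210211 sec (0/0/0).
#   ... zbx_dc_sync_configuration() items      : sql:0.000281 sync:0.001360 sec (0/0/0).
STAT_RE = re.compile(
    r"zbx_dc_sync_configuration\(\) (?P<name>.+?)\s*: "
    r"sql:(?P<sql>[\d.]+)(?: sync:(?P<sync>[\d.]+))? sec"
)


def rank_by_sql(lines):
    """Return (name, sql_sec, sync_sec) tuples sorted by SQL time, slowest first.

    sync_sec is None for sections that log only an SQL time.
    """
    stats = []
    for line in lines:
        m = STAT_RE.search(line)
        if m:
            sync = float(m["sync"]) if m["sync"] is not None else None
            stats.append((m["name"], float(m["sql"]), sync))
    return sorted(stats, key=lambda s: s[1], reverse=True)


# A few lines copied from the sample output above.
sample = [
    "238366:20240130:182429.277 zbx_dc_sync_configuration() hostmacros : sql:0.210211 sec (0/0/0).",
    "238366:20240130:182429.278 zbx_dc_sync_configuration() items      : sql:0.000281 sync:0.001360 sec (0/0/0).",
    "238366:20240130:182429.278 zbx_dc_sync_configuration() template_items      : sql:0.292978 sync:0.000979 sec (0/0/0).",
    "238366:20240130:182429.279 zbx_dc_sync_configuration() item_discovery      : sql:0.133969 sync:0.000742 sec (0/0/0).",
    "238366:20240130:182429.297 zbx_dc_sync_configuration() total sql  : 0.745157 sec.",
]

ranked = rank_by_sql(sample)
for name, sql_sec, sync_sec in ranked:
    print(f"{sql_sec:.6f} sec sql  {name}")
```

Summary lines such as `total sql` and the cache size lines do not carry a `sql:` field, so they are skipped by the pattern.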
Comment by Vladislavs Sokurenko [ 2024 Mar 04 ]

Please see if ZBX-24095 helps, if not then there is also ZBX-24103

Comment by LivreAcesso.Pro [ 2024 Apr 16 ]

6.0.28:

                                                                   QUERY PLAN                                                                   
------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=23177.81..82987.22 rows=675696 width=261) (actual time=171.942..883.898 rows=675696 loops=1)
   Hash Cond: (i.hostid = h.hostid)
   ->  Hash Join  (cost=21694.16..79729.59 rows=675696 width=229) (actual time=165.575..710.250 rows=675696 loops=1)
         Hash Cond: (i.itemid = ir.itemid)
         ->  Seq Scan on items i  (cost=0.00..56218.73 rows=692073 width=210) (actual time=0.003..72.073 rows=690402 loops=1)
         ->  Hash  (cost=13247.96..13247.96 rows=675696 width=27) (actual time=165.076..165.077 rows=675696 loops=1)
               Buckets: 1048576  Batches: 1  Memory Usage: 50230kB
               ->  Seq Scan on item_rtdata ir  (cost=0.00..13247.96 rows=675696 width=27) (actual time=0.006..66.350 rows=675696 loops=1)
   ->  Hash  (cost=1146.09..1146.09 rows=27005 width=8) (actual time=6.347..6.348 rows=27004 loops=1)
         Buckets: 32768  Batches: 1  Memory Usage: 1311kB
         ->  Index Only Scan using hosts_pkey on hosts h  (cost=0.29..1146.09 rows=27005 width=8) (actual time=0.008..3.851 rows=27004 loops=1)
               Heap Fetches: 10723
 Planning Time: 0.301 ms
 Execution Time: 903.723 ms
(14 rows)

Comment by Evgeny Kravchenko [ 2024 Apr 25 ]

I have the same issue, but with the database on a MySQL server. Restarting or shutting down the active Zabbix server node can take a few hours. During shutdown the server is not overloaded on CPU, memory, disk, or network. A log of the service taking 6 hours to stop is attached.

Comment by Vladislavs Sokurenko [ 2025 Mar 04 ]

Fixed in pull requests:

feature/ZBX-22617-7.0
feature/ZBX-22617-7.2

Comment by Vladislavs Sokurenko [ 2025 Mar 10 ]

Fixed in:

Comment by Ruslan Aznabaev [ 2025 Apr 07 ]

vso Now it works perfectly, thank you!
We had connection problems between our DCs last week, and Zabbix was not affected by them at all.

Comment by Vladislavs Sokurenko [ 2025 Apr 07 ]

Thank you for your feedback! I'm glad to hear that everything is working perfectly now. I assume the issue also occurred when TimescaleDB was used?

Comment by Ruslan Aznabaev [ 2025 Apr 07 ]

Yes, with TimescaleDB.

Generated at Sun Apr 20 20:59:29 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.