[ZBX-22617] Slow configuration sync "syncing trend data..." Created: 2023 Mar 30 Updated: 2025 Apr 07 Resolved: 2025 Mar 18 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | 6.4.0 |
Fix Version/s: | 7.0.11rc1, 7.2.5rc1, 7.4.0beta1 |
Type: | Problem report | Priority: | Major |
Reporter: | Karol Woronowicz | Assignee: | Vladislavs Sokurenko |
Resolution: | Fixed | Votes: | 10 |
Labels: | performance |
Remaining Estimate: | Not Specified |
Time Spent: | Not Specified |
Original Estimate: | Not Specified |
Environment: |
DB: (PostgreSQL) 14.5 (Ubuntu 14.5-2.pgdg20.04+2) + timescaledb 2.7.1
Server: HA Zabbix two nodes cluster - 6.4.0
8 CPU, 64 GB RAM
Number of hosts (enabled) - 20136
Everything is monitored by 10 active zabbix proxies. |
Attachments: |
Team: | |
Sprint: | S25-W10/11 |
Story Points: | 3 |
Description |
Hi, we have a problem with slow configuration synchronization, both when the active cluster node changes and during normal operation.
283802:20230329:132044.944 "nodename" node switched to "active" mode
283802:20230329:132044.957 server #0 started [main process]
283942:20230329:132044.958 server #1 started [service manager #1]
283943:20230329:132044.959 server #2 started [configuration syncer #1]
283943:20230329:132058.629 slow query: 13.316544 sec, "select i.itemid,i.hostid,i.status,i.type,i.value_type,i.key_,i.snmp_oid,i.ipmi_sensor,i.delay,i.trapper_hosts,i.logtimefmt,i.params,ir.state,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,ir.lastlogsize,ir.mtime,i.history,i.trends,i.inventory_link,i.valuemapid,i.units,ir.error,i.jmx_endpoint,i.master_itemid,i.timeout,i.url,i.query_fields,i.posts,i.status_codes,i.follow_redirects,i.post_type,i.http_proxy,i.headers,i.retrieve_mode,i.request_method,i.output_format,i.ssl_cert_file,i.ssl_key_file,i.ssl_key_password,i.verify_peer,i.verify_host,i.allow_traps,i.templateid,null from items i inner join hosts h on i.hostid=h.hostid join item_rtdata ir on i.itemid=ir.itemid where h.status in (0,1) and i.flags<>2"
283943:20230329:132158.605 slow query: 4.137951 sec, "select itemtagid,itemid,tag,value from item_tag"
283943:20230329:132215.097 slow query: 5.529707 sec, "select triggerid,description,expression,error,priority,type,value,state,lastchange,status,recovery_mode,recovery_expression,correlation_mode,correlation_tag,opdata,event_name,null,null,null,flags from triggers"
284012:20230329:132319.485 server #3 started [alert manager #1]
A log with the configuration syncer log level increased is attached. |
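A minimal sketch, assuming the log line format shown in the description, of how the slow-query entries could be extracted from a server log and ranked by duration. The function name `parse_slow_queries` is hypothetical, and two of the SQL strings in the sample are shortened for readability; only the PIDs, timestamps, and durations are taken from the log above.

```python
import re

# Matches server log entries such as:
#   283943:20230329:132058.629 slow query: 13.316544 sec, "select ..."
SLOW_QUERY_RE = re.compile(
    r'^(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) '
    r'slow query: (?P<seconds>\d+\.\d+) sec, "(?P<sql>.*)"\s*$'
)


def parse_slow_queries(lines):
    """Return (seconds, pid, sql) tuples for slow-query lines, slowest first."""
    found = []
    for line in lines:
        match = SLOW_QUERY_RE.match(line.strip())
        if match:
            found.append(
                (float(match["seconds"]), int(match["pid"]), match["sql"])
            )
    return sorted(found, reverse=True)


# Sample lines: durations are from the description, SQL text is abbreviated.
sample_log = [
    '283943:20230329:132044.959 server #2 started [configuration syncer #1]',
    '283943:20230329:132058.629 slow query: 13.316544 sec, '
    '"select i.itemid,i.hostid from items i"',
    '283943:20230329:132158.605 slow query: 4.137951 sec, '
    '"select itemtagid,itemid,tag,value from item_tag"',
    '283943:20230329:132215.097 slow query: 5.529707 sec, '
    '"select triggerid,description from triggers"',
]

slow = parse_slow_queries(sample_log)
total_seconds = sum(seconds for seconds, _pid, _sql in slow)

for seconds, pid, sql in slow:
    print(f"{seconds:10.6f} s  pid={pid}  {sql[:60]}")
print(f"total slow-query time: {total_seconds:.6f} s")
```

For the three entries quoted in the description, the `items` query is ranked first, followed by the `triggers` and `item_tag` queries.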
Comments |
Comment by Vladislavs Sokurenko [ 2023 Apr 06 ] |
Could you please try the following query:
explain analyze select i.itemid,i.hostid,i.status,i.type,i.value_type,i.key_,i.snmp_oid,i.ipmi_sensor,i.delay,i.trapper_hosts,i.logtimefmt,i.params,ir.state,i.authtype,i.username,i.password,i.publickey,i.privatekey,i.flags,i.interfaceid,ir.lastlogsize,ir.mtime,i.history,i.trends,i.inventory_link,i.valuemapid,i.units,ir.error,i.jmx_endpoint,i.master_itemid,i.timeout,i.url,i.query_fields,i.posts,i.status_codes,i.follow_redirects,i.post_type,i.http_proxy,i.headers,i.retrieve_mode,i.request_method,i.output_format,i.ssl_cert_file,i.ssl_key_file,i.ssl_key_password,i.verify_peer,i.verify_host,i.allow_traps,i.templateid,null from items i inner join hosts h on i.hostid=h.hostid join item_rtdata ir on i.itemid=ir.itemid; |
Comment by Karol Woronowicz [ 2023 Apr 07 ] |
Sure, here is the result:
                                 QUERY PLAN
---------------------------------------------------------------------------
Hash Join  (cost=2386.26..2325653.17 rows=11060863 width=280) (actual time=173.061..17036.121 rows=11079075 loops=1)
  Hash Cond: (i.hostid = h.hostid)
  ->  Merge Join  (cost=135.53..2294363.36 rows=11060863 width=248) (actual time=166.556..15484.177 rows=11079075 loops=1)
        Merge Cond: (i.itemid = ir.itemid)
        ->  Index Scan using items_pkey on items i  (cost=0.43..949380.89 rows=11569670 width=233) (actual time=166.498..6887.657 rows=11525867 loops=1)
        ->  Index Scan using item_rtdata_pkey on item_rtdata ir  (cost=0.56..1177892.11 rows=11060863 width=23) (actual time=0.028..6328.448 rows=11079075 loops=1)
  ->  Hash  (cost=1893.66..1893.66 rows=28566 width=8) (actual time=6.483..6.487 rows=29646 loops=1)
        Buckets: 32768  Batches: 1  Memory Usage: 1415kB
        ->  Seq Scan on hosts h  (cost=0.00..1893.66 rows=28566 width=8) (actual time=0.009..4.090 rows=29646 loops=1)
Planning Time: 1.479 ms
JIT:
  Functions: 14
  Options: Inlining true, Optimization true, Expressions true, Deforming true
  Timing: Generation 1.263 ms, Inlining 37.421 ms, Optimization 80.721 ms, Emission 48.118 ms, Total 167.523 ms
Execution Time: 17330.515 ms
I would like to add that the number of items has increased to 10.5+ million. |
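A rough sketch, assuming the text format of PostgreSQL `EXPLAIN ANALYZE` output as pasted above, of how the plan nodes could be listed by their reported time. The names `PlanNode` and `parse_plan` are hypothetical. The "actual time" upper bound of a node is inclusive of its children, so these numbers show where time accumulates rather than each node's exclusive cost.

```python
import re
from typing import NamedTuple


class PlanNode(NamedTuple):
    name: str
    total_ms: float  # upper "actual time" bound multiplied by loops (inclusive)
    rows: int


# Matches plan node lines that carry both cost and actual-time sections.
NODE_RE = re.compile(
    r'^\s*(?:->\s*)?(?P<node>.+?)\s+'
    r'\(cost=\d+\.\d+\.\.\d+\.\d+ rows=\d+ width=\d+\) '
    r'\(actual time=(?P<start>\d+\.\d+)\.\.(?P<end>\d+\.\d+) '
    r'rows=(?P<rows>\d+) loops=(?P<loops>\d+)\)'
)


def parse_plan(plan_text):
    """Extract every plan node with its inclusive time and row count."""
    nodes = []
    for line in plan_text.splitlines():
        m = NODE_RE.match(line)
        if m:
            total_ms = float(m["end"]) * int(m["loops"])
            nodes.append(PlanNode(m["node"].strip(), total_ms, int(m["rows"])))
    return nodes


# Node lines copied from the plan posted in this comment.
PLAN = """\
Hash Join  (cost=2386.26..2325653.17 rows=11060863 width=280) (actual time=173.061..17036.121 rows=11079075 loops=1)
  Hash Cond: (i.hostid = h.hostid)
  ->  Merge Join  (cost=135.53..2294363.36 rows=11060863 width=248) (actual time=166.556..15484.177 rows=11079075 loops=1)
        Merge Cond: (i.itemid = ir.itemid)
        ->  Index Scan using items_pkey on items i  (cost=0.43..949380.89 rows=11569670 width=233) (actual time=166.498..6887.657 rows=11525867 loops=1)
        ->  Index Scan using item_rtdata_pkey on item_rtdata ir  (cost=0.56..1177892.11 rows=11060863 width=23) (actual time=0.028..6328.448 rows=11079075 loops=1)
  ->  Hash  (cost=1893.66..1893.66 rows=28566 width=8) (actual time=6.483..6.487 rows=29646 loops=1)
        Buckets: 32768  Batches: 1  Memory Usage: 1415kB
        ->  Seq Scan on hosts h  (cost=0.00..1893.66 rows=28566 width=8) (actual time=0.009..4.090 rows=29646 loops=1)
"""

nodes = parse_plan(PLAN)
slowest = sorted(nodes, key=lambda n: n.total_ms, reverse=True)

for node in slowest:
    print(f"{node.total_ms:12.3f} ms  {node.rows:>10} rows  {node.name}")
```

On this plan, the two index scans over `items` and `item_rtdata` each report several seconds, while the `hosts` side of the join takes only a few milliseconds.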
Comment by Vladislavs Sokurenko [ 2024 Jan 30 ] |
Please increase log level to debug and provide statistics from history syncer
38366:20240130:182429.268 zbx_dc_sync_configuration() changelog : sql:0.013996 sec (1 records)
238366:20240130:182429.271 zbx_dc_sync_configuration() config : sql:0.001789 sync:0.002462 sec (1/0/0).
238366:20240130:182429.272 zbx_dc_sync_configuration() autoreg : sql:0.001018 sync:0.000667 sec (0/0/0).
238366:20240130:182429.273 zbx_dc_sync_configuration() autoreg host : sql:0.000022 sync:0.000648 sec (0/0/0).
238366:20240130:182429.274 zbx_dc_sync_configuration() hosts : sql:0.005960 sync:0.001522 sec (0/1/0).
238366:20240130:182429.275 zbx_dc_sync_configuration() host_invent: sql:0.003719 sync:0.000871 sec (0/0/0).
238366:20240130:182429.276 zbx_dc_sync_configuration() templates : sql:0.004195 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() globmacros : sql:0.005729 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() hostmacros : sql:0.210211 sec (0/0/0).
238366:20240130:182429.277 zbx_dc_sync_configuration() interfaces : sql:0.006219 sync:0.001725 sec (25/0/0).
238366:20240130:182429.278 zbx_dc_sync_configuration() items : sql:0.000281 sync:0.001360 sec (0/0/0).
238366:20240130:182429.278 zbx_dc_sync_configuration() template_items : sql:0.292978 sync:0.000979 sec (0/0/0).
238366:20240130:182429.279 zbx_dc_sync_configuration() prototype_items : sql:0.000840 sync:0.001099 sec (0/0/0).
238366:20240130:182429.279 zbx_dc_sync_configuration() item_discovery : sql:0.133969 sync:0.000742 sec (0/0/0).
238366:20240130:182429.280 zbx_dc_sync_configuration() triggers : sql:0.000202 sync:0.000731 sec (0/0/0).
238366:20240130:182429.280 zbx_dc_sync_configuration() trigdeps : sql:0.049399 sync:0.000748 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() trig. tags : sql:0.000196 sync:0.000634 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() host tags : sql:0.001544 sync:0.000767 sec (0/0/0).
238366:20240130:182429.281 zbx_dc_sync_configuration() item tags : sql:0.000351 sync:0.001047 sec (0/0/0).
238366:20240130:182429.282 zbx_dc_sync_configuration() functions : sql:0.000302 sync:0.000872 sec (0/0/0).
238366:20240130:182429.282 zbx_dc_sync_configuration() expressions: sql:0.003155 sync:0.000855 sec (0/0/0).
238366:20240130:182429.283 zbx_dc_sync_configuration() actions : sql:0.001387 sync:0.000683 sec (0/0/0).
238366:20240130:182429.284 zbx_dc_sync_configuration() operations : sql:0.000999 sync:0.000684 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() conditions : sql:0.001738 sync:0.000614 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() corr : sql:0.001095 sync:0.000852 sec (0/0/0).
238366:20240130:182429.285 zbx_dc_sync_configuration() corr_cond : sql:0.001299 sync:0.000662 sec (0/0/0).
238366:20240130:182429.286 zbx_dc_sync_configuration() corr_op : sql:0.000980 sync:0.000680 sec (0/0/0).
238366:20240130:182429.287 zbx_dc_sync_configuration() hgroups : sql:0.003711 sync:0.002239 sec (0/0/0).
238366:20240130:182429.287 zbx_dc_sync_configuration() item pproc : sql:0.000371 sync:0.001095 sec (0/0/0).
238366:20240130:182429.288 zbx_dc_sync_configuration() item script param: sql:0.002087 sync:0.001808 sec (0/0/0).
238366:20240130:182429.288 zbx_dc_sync_configuration() maintenance: sql:0.011640 sync:0.003615 sec (0/0/0).
238366:20240130:182429.289 zbx_dc_sync_configuration() drules : sql:0.000327 sync:0.001503 sec (0/0/0).
238366:20240130:182429.290 zbx_dc_sync_configuration() dchecks : (0/0/0).
238366:20240130:182429.291 zbx_dc_sync_configuration() httptests : sql:0.000584 sync:0.004232 sec (0/0/0).
238366:20240130:182429.291 zbx_dc_sync_configuration() httptestfld : (0/0/0).
238366:20240130:182429.292 zbx_dc_sync_configuration() httpsteps : (0/0/0).
238366:20240130:182429.293 zbx_dc_sync_configuration() httpstepfld : (0/0/0).
238366:20240130:182429.293 zbx_dc_sync_configuration() connector: sql:0.000292 sync:0.001296 sec (0/0/0).
238366:20240130:182429.294 zbx_dc_sync_configuration() connector_tag: sql:0.000292 sync:0.001296 sec (0/0/0).
238366:20240130:182429.295 zbx_dc_sync_configuration() proxy: sql:0.000199 sync:0.000714 sec (0/0/0).
238366:20240130:182429.296 zbx_dc_sync_configuration() macro cache: 0.000177 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() reindex : 0.011809 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() total sql : 0.745157 sec.
238366:20240130:182429.297 zbx_dc_sync_configuration() total sync : 0.047818 sec.
238366:20240130:182429.298 zbx_dc_sync_configuration() proxies : 0 (0 slots)
238366:20240130:182429.298 zbx_dc_sync_configuration() proxies_p : 0 (0 slots)
238366:20240130:182429.299 zbx_dc_sync_configuration() hosts : 1 (11 slots)
238366:20240130:182429.299 zbx_dc_sync_configuration() hosts_h : 1 (11 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() autoreg_hosts: 0 (11 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() psks : 0 (0 slots)
238366:20240130:182429.300 zbx_dc_sync_configuration() ipmihosts : 0 (0 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() host_invent: 2 (11 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() glob macros: 1 (11 slots)
238366:20240130:182429.301 zbx_dc_sync_configuration() host macros: 5238 (9029 slots)
238366:20240130:182429.302 zbx_dc_sync_configuration() kvs_paths : 0
238366:20240130:182429.302 zbx_dc_sync_configuration() interfaces : 1 (11 slots)
238366:20240130:182429.303 zbx_dc_sync_configuration() interfaces_snmp : 0 (0 slots)
238366:20240130:182429.303 zbx_dc_sync_configuration() interfac_ht: 1 (11 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() if_snmpitms: 0 (0 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() if_snmpaddr: 0 (0 slots)
238366:20240130:182429.304 zbx_dc_sync_configuration() item_discovery : 6400 (9029 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() items : 1 (101 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() items_hk : 1 (101 slots)
238366:20240130:182429.305 zbx_dc_sync_configuration() numitems : 1 (11 slots)
238366:20240130:182429.306 zbx_dc_sync_configuration() preprocitems: 0 (0 slots)
238366:20240130:182429.306 zbx_dc_sync_configuration() snmpitems : 0 (0 slots)
238366:20240130:182429.307 zbx_dc_sync_configuration() ipmiitems : 0 (0 slots)
238366:20240130:182429.307 zbx_dc_sync_configuration() trapitems : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() dependentitems : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() logitems : 0 (0 slots)
238366:20240130:182429.308 zbx_dc_sync_configuration() dbitems : 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() sshitems : 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() telnetitems: 0 (0 slots)
238366:20240130:182429.309 zbx_dc_sync_configuration() simpleitems: 0 (0 slots)
238366:20240130:182429.310 zbx_dc_sync_configuration() jmxitems : 0 (0 slots)
238366:20240130:182429.310 zbx_dc_sync_configuration() calcitems : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() httpitems : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() scriptitems : 0 (0 slots)
238366:20240130:182429.311 zbx_dc_sync_configuration() functions : 1 (101 slots)
238366:20240130:182429.312 zbx_dc_sync_configuration() triggers : 5702 (9029 slots)
238366:20240130:182429.313 zbx_dc_sync_configuration() trigdeps : 2393 (4007 slots)
238366:20240130:182429.313 zbx_dc_sync_configuration() trig. tags : 7421 (13553 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() expressions: 10 (17 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() actions : 0 (0 slots)
238366:20240130:182429.314 zbx_dc_sync_configuration() conditions : 0 (0 slots)
238366:20240130:182429.316 zbx_dc_sync_configuration() corr. : 0 (0 slots)
238366:20240130:182429.316 zbx_dc_sync_configuration() corr. conds: 0 (0 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() corr. ops : 0 (0 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() hgroups : 7 (11 slots)
238366:20240130:182429.317 zbx_dc_sync_configuration() item procs : 0 (0 slots)
238366:20240130:182429.318 zbx_dc_sync_configuration() maintenance: 1 (11 slots)
238366:20240130:182429.318 zbx_dc_sync_configuration() maint tags : 1 (11 slots)
238366:20240130:182429.319 zbx_dc_sync_configuration() maint time : 2 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() drules : 1 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() dchecks : 1 (11 slots)
238366:20240130:182429.320 zbx_dc_sync_configuration() httptests : 0 (0 slots)
238366:20240130:182429.321 zbx_dc_sync_configuration() httptestfld: 0 (0 slots)
238366:20240130:182429.321 zbx_dc_sync_configuration() httpsteps : 0 (0 slots)
238366:20240130:182429.322 zbx_dc_sync_configuration() httpstepfld: 0 (0 slots)
238366:20240130:182429.322 zbx_dc_sync_configuration() connector: 0 (0 slots)
238366:20240130:182429.323 zbx_dc_sync_configuration() connector tags : 0 (0 slots)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[0] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[1] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[2] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[3] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[4] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[5] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[6] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[7] : 0 (0 allocated)
238366:20240130:182429.324 zbx_dc_sync_configuration() queue[8] : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() queue[9] : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() queue[10] : 0 (0 allocated)
238366:20240130:182429.325 zbx_dc_sync_configuration() pqueue : 0 (0 allocated)
238366:20240130:182429.326 zbx_dc_sync_configuration() timer queue: 0 (0 allocated)
238366:20240130:182429.327 zbx_dc_sync_configuration() changelog : 986
238366:20240130:182429.328 zbx_dc_sync_configuration() configfree : 79.413605%
238366:20240130:182429.333 zbx_dc_sync_configuration() strings : 7450 (13553 slots) |
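A minimal sketch, assuming the `zbx_dc_sync_configuration()` debug line format quoted in this comment, of how the per-section timings could be collected to see which sections dominate the SQL time. The function name `parse_sync_stats` is hypothetical; the sample lines are a subset copied from the output above.

```python
import re

# Matches per-section timing lines, with or without a "sync:" value, e.g.
#   ... zbx_dc_sync_configuration() hosts : sql:0.005960 sync:0.001522 sec (0/1/0).
#   ... zbx_dc_sync_configuration() templates : sql:0.004195 sec (0/0/0).
STAT_RE = re.compile(
    r'zbx_dc_sync_configuration\(\) (?P<name>.+?)\s*: '
    r'sql:(?P<sql>\d+\.\d+)(?: sync:(?P<sync>\d+\.\d+))? sec'
)


def parse_sync_stats(lines):
    """Map section name to (sql_seconds, sync_seconds or None)."""
    stats = {}
    for line in lines:
        m = STAT_RE.search(line)
        if m:
            sync = float(m["sync"]) if m["sync"] is not None else None
            stats[m["name"]] = (float(m["sql"]), sync)
    return stats


# A subset of the debug lines quoted in this comment.
sample = [
    "38366:20240130:182429.268 zbx_dc_sync_configuration() changelog : sql:0.013996 sec (1 records)",
    "238366:20240130:182429.274 zbx_dc_sync_configuration() hosts : sql:0.005960 sync:0.001522 sec (0/1/0).",
    "238366:20240130:182429.276 zbx_dc_sync_configuration() templates : sql:0.004195 sec (0/0/0).",
    "238366:20240130:182429.277 zbx_dc_sync_configuration() hostmacros : sql:0.210211 sec (0/0/0).",
    "238366:20240130:182429.278 zbx_dc_sync_configuration() template_items : sql:0.292978 sync:0.000979 sec (0/0/0).",
    "238366:20240130:182429.279 zbx_dc_sync_configuration() item_discovery : sql:0.133969 sync:0.000742 sec (0/0/0).",
    "238366:20240130:182429.280 zbx_dc_sync_configuration() trigdeps : sql:0.049399 sync:0.000748 sec (0/0/0).",
    "238366:20240130:182429.297 zbx_dc_sync_configuration() total sql : 0.745157 sec.",
]

stats = parse_sync_stats(sample)

# Rank sections by SQL time, largest first, and keep the top three.
top = sorted(stats.items(), key=lambda kv: kv[1][0], reverse=True)[:3]

for name, (sql_s, sync_s) in top:
    sync_text = f"{sync_s:.6f}" if sync_s is not None else "n/a"
    print(f"{name:<16} sql={sql_s:.6f} s  sync={sync_text} s")
```

Summary lines such as `total sql` carry no `sql:` field and are skipped by the pattern, so they do not distort the ranking.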
Comment by Vladislavs Sokurenko [ 2024 Mar 04 ] |
Please see if |
Comment by LivreAcesso.Pro [ 2024 Apr 16 ] |
6.0.28:
                                 QUERY PLAN
---------------------------------------------------------------------------
Hash Join  (cost=23177.81..82987.22 rows=675696 width=261) (actual time=171.942..883.898 rows=675696 loops=1)
  Hash Cond: (i.hostid = h.hostid)
  ->  Hash Join  (cost=21694.16..79729.59 rows=675696 width=229) (actual time=165.575..710.250 rows=675696 loops=1)
        Hash Cond: (i.itemid = ir.itemid)
        ->  Seq Scan on items i  (cost=0.00..56218.73 rows=692073 width=210) (actual time=0.003..72.073 rows=690402 loops=1)
        ->  Hash  (cost=13247.96..13247.96 rows=675696 width=27) (actual time=165.076..165.077 rows=675696 loops=1)
              Buckets: 1048576  Batches: 1  Memory Usage: 50230kB
              ->  Seq Scan on item_rtdata ir  (cost=0.00..13247.96 rows=675696 width=27) (actual time=0.006..66.350 rows=675696 loops=1)
  ->  Hash  (cost=1146.09..1146.09 rows=27005 width=8) (actual time=6.347..6.348 rows=27004 loops=1)
        Buckets: 32768  Batches: 1  Memory Usage: 1311kB
        ->  Index Only Scan using hosts_pkey on hosts h  (cost=0.29..1146.09 rows=27005 width=8) (actual time=0.008..3.851 rows=27004 loops=1)
              Heap Fetches: 10723
Planning Time: 0.301 ms
Execution Time: 903.723 ms
(14 rows) |
Comment by Evgeny Kravchenko [ 2024 Apr 25 ] |
I have the same issue, but with the database on a MySQL server. Restarting or shutting down the active Zabbix server node can take a few hours. During shutdown, the server is not overloaded: not on CPU, memory, disk, or network. A log of the service taking 6 hours to stop is attached. |
Comment by Vladislavs Sokurenko [ 2025 Mar 04 ] |
Fixed in pull requests: |
Comment by Vladislavs Sokurenko [ 2025 Mar 10 ] |
Fixed in:
|
Comment by Ruslan Aznabaev [ 2025 Apr 07 ] |
vso Now it works perfectly, thank you! |
Comment by Vladislavs Sokurenko [ 2025 Apr 07 ] |
Thank you for your feedback! I'm glad to hear that everything is working perfectly now. I assume the issue also occurred when TimescaleDB was in use? |
Comment by Ruslan Aznabaev [ 2025 Apr 07 ] |
Yes, with TimescaleDB. |