[ZBX-16170] History syncer process crash Created: 2019 May 24  Updated: 2024 Apr 10  Resolved: 2019 Jun 01

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 4.0.7
Fix Version/s: 4.0.10rc1, 4.2.4rc1, 4.4.0alpha1, 4.4 (plan)

Type: Problem report Priority: Blocker
Reporter: Edgar Akhmetshin Assignee: Vladislavs Sokurenko
Resolution: Fixed Votes: 1
Labels: crash
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File ZBX-16170-4.0-be1300be5cd.diff     File ZBX-16170.diff     File log.txt.7z     Zip Archive objdump.zip     File strace.txt.7z     File zabbix-server-mysql-4.0.7-1.el7.x86_64.rpm    
Issue Links:
Causes
Team: Team A
Sprint: Sprint 52 (May 2019)
Story Points: 0.25

 Description   

Steps to reproduce:

  1. Install Zabbix Server MySQL 4.0.7
  2. monitor something (VMware/IPMI/etc)

Result:
log.txt.7z
strace.txt.7z
objdump.zip

Expected:
No crash



 Comments   
Comment by Andris Mednis [ 2019 May 24 ]
 64366:20190524:021927.864 Got signal [signal:11(SIGSEGV),reason:128,refaddr:(nil)]. Crashing ...
 64366:20190524:021927.864 ====== Fatal information: ======
 64366:20190524:021927.864 Program counter: 0x538b6c
 64366:20190524:021927.864 === Registers: ===
 64366:20190524:021927.864 r8      = cccccccccccccccd = 14757395258967641293 = -3689348814741910323
 64366:20190524:021927.864 r9      = ffffffffffffffff = 18446744073709551615 =                   -1
 64366:20190524:021927.864 r10     =               18 =                   24 =                   24
 64366:20190524:021927.864 r11     =     7fd2272dcdc0 =      140540577172928 =      140540577172928
 64366:20190524:021927.865 r12     =     7ffd2fa69f00 =      140725402902272 =      140725402902272
 64366:20190524:021927.865 r13     =                0 =                    0 =                    0
 64366:20190524:021927.865 r14     =           a316d0 =             10688208 =             10688208
 64366:20190524:021927.865 r15     = b91f000000002000 = 13339380621294706688 = -5107363452414844928
 64366:20190524:021927.865 rdi     =     7ffd2fa69f00 =      140725402902272 =      140725402902272
 64366:20190524:021927.865 rsi     =           a10340 =             10552128 =             10552128
 64366:20190524:021927.865 rbp     =                0 =                    0 =                    0
 64366:20190524:021927.865 rbx     =           b91e48 =             12131912 =             12131912
 64366:20190524:021927.865 rdx     =                0 =                    0 =                    0
 64366:20190524:021927.865 rax     =           b91ecb =             12132043 =             12132043
 64366:20190524:021927.865 rcx     =             3832 =                14386 =                14386
 64366:20190524:021927.865 rsp     =     7ffd2fa69d80 =      140725402901888 =      140725402901888
 64366:20190524:021927.865 rip     =           538b6c =              5475180 =              5475180
 64366:20190524:021927.865 efl     =            10202 =                66050 =                66050
 64366:20190524:021927.865 csgsfs  =               33 =                   51 =                   51
 64366:20190524:021927.865 err     =                0 =                    0 =                    0
 64366:20190524:021927.865 trapno  =                d =                   13 =                   13
 64366:20190524:021927.865 oldmask =                0 =                    0 =                    0
 64366:20190524:021927.865 cr2     =                0 =                    0 =                    0
 64366:20190524:021927.865 === Backtrace: ===
 64366:20190524:021927.866 13: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](zbx_backtrace+0x46) [0x4afad6]
 64366:20190524:021927.866 12: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](zbx_log_fatal_info+0x255) [0x4afe35]
 64366:20190524:021927.866 11: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history]() [0x4affd6]
 64366:20190524:021927.866 10: /lib64/libpthread.so.0(+0xf5d0) [0x7fd22a2cc5d0]
 64366:20190524:021927.866 9: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](zbx_process_events+0xbcc) [0x538b6c]
 64366:20190524:021927.866 8: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history]() [0x486c74]
 64366:20190524:021927.866 7: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](zbx_sync_history_cache+0x3a) [0x487e1a]
 64366:20190524:021927.866 6: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](dbsyncer_thread+0x15f) [0x42a0ef]
 64366:20190524:021927.866 5: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](zbx_thread_start+0x3e) [0x4b916e]
 64366:20190524:021927.866 4: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](MAIN_ZABBIX_ENTRY+0x7d6) [0x424fe6]
 64366:20190524:021927.866 3: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](daemon_start+0x1b4) [0x4af584]
 64366:20190524:021927.866 2: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history](main+0x3c3) [0x424053]
 64366:20190524:021927.866 1: /lib64/libc.so.6(__libc_start_main+0xf5) [0x7fd2271723d5]
 64366:20190524:021927.866 0: /usr/sbin/zabbix_server: history syncer #4 [processed 657 values, 388 triggers in 0.569434 sec, syncing history]() [0x42437f]
Comment by Vladislavs Sokurenko [ 2019 May 24 ]

Is global correlation used ?

Comment by Charan kumar Naik [ 2019 May 27 ]

Yes, Global correlation used.

Comment by Vladislavs Sokurenko [ 2019 May 27 ]

Is it an option that I provide patch with additional debug and you recompile Zabbix server binaries ?

Comment by Charan kumar Naik [ 2019 May 27 ]

Zabbix packages are used. If it is needed to re-install, it can be done.

Comment by Vladislavs Sokurenko [ 2019 May 27 ]

Please try ZBX-16170.diff it contains additional debug to find out the exact place of crash, it should help us to diagnose the problem.

Are you using any history export modules ?

Comment by Vladislavs Sokurenko [ 2019 May 27 ]

Similar issue:
ZBX-15185

Comment by Vladislavs Sokurenko [ 2019 May 28 ]

How often is configuration sync performed ?

Comment by Charan kumar Naik [ 2019 May 28 ]

Configuration sync is set to default, One hour.

Comment by Vladislavs Sokurenko [ 2019 May 28 ]

Issue can be reproduced easily with following patch and event correlation closing previous event

diff --git a/src/zabbix_server/events.c b/src/zabbix_server/events.c
index 31ac5956db..4c18075a01 100644
--- a/src/zabbix_server/events.c
+++ b/src/zabbix_server/events.c
@@ -116,7 +116,7 @@ int zbx_add_event(unsigned char source, unsigned char object, zbx_uint64_t objec
 
        if (events_num == events_alloc)
        {
-               events_alloc += 64;
+               events_alloc += 1;
                events = (DB_EVENT *)zbx_realloc(events, sizeof(DB_EVENT) * events_alloc);
        }

==9703== Invalid read of size 8
==9703==    at 0x7F5A25: correlation_execute_operations (events.c:1044)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703==    by 0x648CDD: daemon_start (daemon.c:386)
==9703==    by 0x41E69B: main (server.c:879)
==9703==  Address 0x6ca0da0 is 160 bytes inside a block of size 168 free'd
==9703==    at 0x483AD19: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7F0FE3: close_trigger_event (events.c:229)
==9703==    by 0x7F59C0: correlation_execute_operations (events.c:1039)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==  Block was alloc'd at
==9703==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==9703==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7A6327: zbx_process_trigger (trigger.c:121)
==9703==    by 0x7A705B: zbx_process_triggers (trigger.c:277)
==9703==    by 0x580D3A: recalculate_triggers (dbcache.c:1562)
==9703==    by 0x58AF19: sync_server_history (dbcache.c:3008)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703== 
==9703== Invalid write of size 8
==9703==    at 0x7F5A8D: correlation_execute_operations (events.c:1044)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703==    by 0x648CDD: daemon_start (daemon.c:386)
==9703==    by 0x41E69B: main (server.c:879)
==9703==  Address 0x6ca0da0 is 160 bytes inside a block of size 168 free'd
==9703==    at 0x483AD19: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7F0FE3: close_trigger_event (events.c:229)
==9703==    by 0x7F59C0: correlation_execute_operations (events.c:1039)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==  Block was alloc'd at
==9703==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==9703==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7A6327: zbx_process_trigger (trigger.c:121)
==9703==    by 0x7A705B: zbx_process_triggers (trigger.c:277)
==9703==    by 0x580D3A: recalculate_triggers (dbcache.c:1562)
==9703==    by 0x58AF19: sync_server_history (dbcache.c:3008)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703== 
==9703== Invalid read of size 8
==9703==    at 0x7F66D9: correlate_events_by_global_rules (events.c:1207)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703==    by 0x648CDD: daemon_start (daemon.c:386)
==9703==    by 0x41E69B: main (server.c:879)
==9703==  Address 0x6ca0da0 is 160 bytes inside a block of size 168 free'd
==9703==    at 0x483AD19: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7F0FE3: close_trigger_event (events.c:229)
==9703==    by 0x7F59C0: correlation_execute_operations (events.c:1039)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==  Block was alloc'd at
==9703==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==9703==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7A6327: zbx_process_trigger (trigger.c:121)
==9703==    by 0x7A705B: zbx_process_triggers (trigger.c:277)
==9703==    by 0x580D3A: recalculate_triggers (dbcache.c:1562)
==9703==    by 0x58AF19: sync_server_history (dbcache.c:3008)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703== 
==9703== Invalid read of size 8
==9703==    at 0x64CC54: zbx_default_uint64_compare_func (algodefs.c:283)
==9703==    by 0x64CD88: zbx_default_uint64_ptr_compare_func (algodefs.c:293)
==9703==    by 0x61B3178: bsearch (in /usr/lib64/libc-2.29.so)
==9703==    by 0x6716B2: zbx_vector_ptr_bsearch (vector.c:28)
==9703==    by 0x7F6749: correlate_events_by_global_rules (events.c:1209)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703==    by 0x648CDD: daemon_start (daemon.c:386)
==9703==  Address 0x6ca0d48 is 72 bytes inside a block of size 168 free'd
==9703==    at 0x483AD19: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7F0FE3: close_trigger_event (events.c:229)
==9703==    by 0x7F59C0: correlation_execute_operations (events.c:1039)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==  Block was alloc'd at
==9703==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==9703==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7A6327: zbx_process_trigger (trigger.c:121)
==9703==    by 0x7A705B: zbx_process_triggers (trigger.c:277)
==9703==    by 0x580D3A: recalculate_triggers (dbcache.c:1562)
==9703==    by 0x58AF19: sync_server_history (dbcache.c:3008)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703== 
==9703== Invalid read of size 8
==9703==    at 0x64CCB4: zbx_default_uint64_compare_func (algodefs.c:283)
==9703==    by 0x64CD88: zbx_default_uint64_ptr_compare_func (algodefs.c:293)
==9703==    by 0x61B3178: bsearch (in /usr/lib64/libc-2.29.so)
==9703==    by 0x6716B2: zbx_vector_ptr_bsearch (vector.c:28)
==9703==    by 0x7F6749: correlate_events_by_global_rules (events.c:1209)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
==9703==    by 0x648CDD: daemon_start (daemon.c:386)
==9703==  Address 0x6ca0d48 is 72 bytes inside a block of size 168 free'd
==9703==    at 0x483AD19: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7F0FE3: close_trigger_event (events.c:229)
==9703==    by 0x7F59C0: correlation_execute_operations (events.c:1039)
==9703==    by 0x7F648B: correlate_event_by_global_rules (events.c:1165)
==9703==    by 0x7F668D: correlate_events_by_global_rules (events.c:1203)
==9703==    by 0x7FEDC7: zbx_process_events (events.c:2500)
==9703==    by 0x58AF32: sync_server_history (dbcache.c:3011)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==  Block was alloc'd at
==9703==    at 0x4838748: malloc (vg_replace_malloc.c:308)
==9703==    by 0x483AD63: realloc (vg_replace_malloc.c:836)
==9703==    by 0x686719: zbx_realloc2 (misc.c:550)
==9703==    by 0x7EF376: zbx_add_event (events.c:120)
==9703==    by 0x7A6327: zbx_process_trigger (trigger.c:121)
==9703==    by 0x7A705B: zbx_process_triggers (trigger.c:277)
==9703==    by 0x580D3A: recalculate_triggers (dbcache.c:1562)
==9703==    by 0x58AF19: sync_server_history (dbcache.c:3008)
==9703==    by 0x58BCAC: zbx_sync_history_cache (dbcache.c:3216)
==9703==    by 0x432B54: dbsyncer_thread (dbsyncer.c:86)
==9703==    by 0x679B7B: zbx_thread_start (threads.c:132)
==9703==    by 0x41F753: MAIN_ZABBIX_ENTRY (server.c:1160)
Comment by Vladislavs Sokurenko [ 2019 May 28 ]

Please try patch ZBX-16170-4.0-be1300be5cd.diff, it should fix the issue

Comment by Vladislavs Sokurenko [ 2019 May 31 ]

Fixed in:

  • pre-4.0.9rc1 d7b001463ea
  • pre-4.2.3rc1 157d8447f2e
  • 4.4.0alpha1 (trunk)* a795c27355a
Generated at Thu Apr 25 11:35:19 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.