[ZBX-4151] server crash: memory corruption Created: 2011 Sep 16 Updated: 2017 May 30 Resolved: 2011 Oct 06 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | None |
Fix Version/s: | 1.9.7 (beta) |
Type: | Incident report | Priority: | Blocker |
Reporter: | richlv | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 0 |
Labels: | crash |
Remaining Estimate: | Not Specified |
Time Spent: | Not Specified |
Original Estimate: | Not Specified |
Environment: |
trunk rev 21658 |
Attachments: | objdump_-DSswx_zabbix_server_crash.bz2, server_crash_mallox_2.txt, zabbix_server_crash.log, zabbix_server_dev_branch_corrupted_double-linked_list.log.bz2, zabbix_server_dev_branch_corrupted_double-linked_list_2.log.bz2 |
Issue Links: |
|
Comments |
Comment by Aleksandrs Saveljevs [ 2011 Sep 22 ] |
Rich, we have tried investigating this memory issue and |
Comment by richlv [ 2011 Sep 30 ] |
it just did |
Comment by richlv [ 2011 Sep 30 ] |
and it looks like in both cases the killed process is the timer |
Comment by Aleksandrs Saveljevs [ 2011 Oct 03 ] |
Seems to crash right after start. Can you reliably reproduce the problem? |
Comment by richlv [ 2011 Oct 04 ] |
not really, it seems to happen every now and then. i could run a start/stop cycle and see how often it happens - would that help at all? maybe some debugging output could be added to the server that i could run it with? |
Comment by Aleksandrs Saveljevs [ 2011 Oct 05 ] |
There are two things we can try: (a) running Zabbix server under Valgrind and (b) running Zabbix server with additional debugging output. I propose we start with (b). To that end, could you please try running Zabbix server from svn://svn.zabbix.com/branches/dev/ZBX-4151 ? It adds debugging output to the memory allocation routines so that we can find out which allocated buffer is closest to the corrupted part of memory. You can probably keep DebugLevel=3, but if you could run it with DebugLevel=4, that would be nice, too. |
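The idea behind option (b) can be sketched roughly as follows. This is not the code from the dev branch, only a minimal illustration: the names `dbg_malloc`, `DBG_MALLOC`, `dbg_nearest_below`, and `alloc_rec_t` are hypothetical, and the fixed-size table is an assumption made to keep the sketch short. Every allocation is logged with its address and size, so that a corrupted address reported at crash time can be matched against the buffer that starts closest below it.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_TRACKED 1024	/* assumption: enough slots for a short debugging run */

/* hypothetical record of one allocation; not a Zabbix structure */
typedef struct
{
	void		*ptr;
	size_t		size;
	const char	*file;
	int		line;
}
alloc_rec_t;

static alloc_rec_t	tracked[MAX_TRACKED];
static size_t		tracked_num = 0;

/* allocate memory, remember where it came from, and log it */
static void	*dbg_malloc(const char *file, int line, size_t size)
{
	void	*ptr = malloc(size);

	if (NULL != ptr && tracked_num < MAX_TRACKED)
		tracked[tracked_num++] = (alloc_rec_t){ptr, size, file, line};

	fprintf(stderr, "%s:%d malloc(%zu) = %p\n", file, line, size, ptr);

	return ptr;
}

#define DBG_MALLOC(size)	dbg_malloc(__FILE__, __LINE__, (size))

/* return the tracked buffer with the highest start address at or below addr, */
/* i.e. the buffer most likely to have been overrun into addr; NULL if none  */
static const alloc_rec_t	*dbg_nearest_below(const void *addr)
{
	uintptr_t		a = (uintptr_t)addr;
	const alloc_rec_t	*best = NULL;
	size_t			i;

	assert(NULL != addr);

	for (i = 0; i < tracked_num; i++)
	{
		uintptr_t	start = (uintptr_t)tracked[i].ptr;

		if (start <= a && (NULL == best || start > (uintptr_t)best->ptr))
			best = &tracked[i];
	}

	return best;
}
```

With such a log, the address printed by glibc in a "corrupted double-linked list" abort can be looked up with `dbg_nearest_below()` to see which call site allocated the neighbouring buffer.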
Comment by richlv [ 2011 Oct 05 ] |
i created a script to repeatedly start/stop the server. after running it, the server crashed on the first try... log of that start/crash session at DebugLevel=4 attached (zabbix_server_dev_branch_corrupted_double-linked_list.log.bz2) |
Comment by richlv [ 2011 Oct 05 ] |
zabbix_server_dev_branch_corrupted_double-linked_list_2.log.bz2 is another crash right after the startup. additionally, in this case one zabbix_server process did not terminate upon kill -15. stracing it reveals that it got stuck on : futex(0xb7356380, FUTEX_WAIT_PRIVATE, 2, NULL |
Comment by Aleksandrs Saveljevs [ 2011 Oct 06 ] |
Thanks, that was useful! The fix is available in development branch svn://svn.zabbix.com/branches/dev/ZBX-4151 . The problem was that the buffer for time-based triggers was allocated for 0 triggers, then the configuration cache was synced, and then a non-zero number of triggers was processed, which corrupted memory. |
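The failure mode described above can be illustrated with a small sketch. It is not the actual change from the dev branch: `trigger_t`, `trigger_buf_t`, `buf_reserve`, and `buf_fill` are hypothetical names, and the real timer code may size its buffer differently. The point is only that the buffer capacity must be checked against the trigger count known after the configuration cache sync, not the count seen before it.

```c
#include <assert.h>
#include <stdlib.h>

/* hypothetical stand-in for a time-based trigger record */
typedef struct
{
	unsigned long	triggerid;
}
trigger_t;

typedef struct
{
	trigger_t	*items;
	size_t		alloc;	/* slots allocated */
	size_t		num;	/* slots in use */
}
trigger_buf_t;

/* make sure buf can hold at least `needed` triggers; 0 on success, -1 on failure */
static int	buf_reserve(trigger_buf_t *buf, size_t needed)
{
	trigger_t	*p;

	assert(NULL != buf);

	if (needed <= buf->alloc)
		return 0;

	if (NULL == (p = realloc(buf->items, needed * sizeof(trigger_t))))
		return -1;

	buf->items = p;
	buf->alloc = needed;

	return 0;
}

/* copy n triggers into buf; the capacity is re-checked here, so a buffer that */
/* was sized for 0 triggers before the cache sync is grown instead of overrun  */
static int	buf_fill(trigger_buf_t *buf, const trigger_t *src, size_t n)
{
	size_t	i;

	if (0 != buf_reserve(buf, n))
		return -1;

	for (i = 0; i < n; i++)
		buf->items[i] = src[i];

	buf->num = n;

	return 0;
}
```

Without the `buf_reserve()` call in `buf_fill()`, the loop would write past a zero-length allocation, which is the kind of heap damage glibc later reports as a corrupted double-linked list.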
Comment by Aleksandrs Saveljevs [ 2011 Oct 06 ] |
Fixed in pre-1.9.7 in r22185. |