[ZBX-4444] Got signal [signal:11(SIGSEGV),reason:128,refaddr:(nil)]. Crashing ... Created: 2011 Dec 15  Updated: 2017 May 30  Resolved: 2012 Sep 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.8.9
Fix Version/s: 1.8.9

Type: Incident report Priority: Critical
Reporter: leszek Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: crash
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

2.6.26-2-xen-amd64 #1 SMP Thu Nov 25 06:39:26 UTC 2010 x86_64 GNU/Linux


Attachments: Text File 7417_crash.txt    
Issue Links:
Duplicate
duplicates ZBX-4980 Trappers can hang (futex) or crash on... Closed

 Description   

The crashes appeared after upgrading to the new version just 10 days ago, especially after some configuration changes. Yesterday I added 2 new hosts by cloning an existing one and got several crashes: the server runs for 1 to 6 hours and then crashes.

24711:20111214:164445.399 Sending list of active checks to [10.1.2.22] failed: host [10.1.2.22] not found
24719:20111214:164445.803 Sending list of active checks to [10.1.2.20] failed: host [10.1.2.20] not found
24673:20111214:164447.155 Got signal [signal:11(SIGSEGV),reason:128,refaddr:(nil)]. Crashing ...
24673:20111214:164447.155 ====== Fatal information: ======
24673:20111214:164447.155 Program counter: 0x7f84921fbf31
24673:20111214:164447.155 === Registers: ===
24673:20111214:164447.155 r8 = 29726576726553 = 11666254149608787 = 11666254149608787
24673:20111214:164447.155 r9 = 702087c4797a726f = 8079607009424077423 = 8079607009424077423
24673:20111214:164447.155 r10 = 657a6385c482c56f = 7312266371168126319 = 7312266371168126319
24673:20111214:164447.155 r11 = 202 = 514 = 514
24673:20111214:164447.155 r12 = 29c = 668 = 668
24673:20111214:164447.155 r13 = 6b2a70 = 7023216 = 7023216
24673:20111214:164447.155 r14 = e = 14 = 14
24673:20111214:164447.155 r15 = 4ee8c45c = 1323877468 = 1323877468
24673:20111214:164447.155 rdi = 2cbac5647761750c = 3223105518627288332 = 3223105518627288332
24673:20111214:164447.155 rsi = 1309660 = 19961440 = 19961440
24673:20111214:164447.155 rbp = 1309660 = 19961440 = 19961440
24673:20111214:164447.155 rbx = 7f848e82d358 = 140207303349080 = 140207303349080
24673:20111214:164447.155 rdx = e = 14 = 14
24673:20111214:164447.155 rax = 2cbac5647761750c = 3223105518627288332 = 3223105518627288332
24673:20111214:164447.155 rcx = 7250 = 29264 = 29264
24673:20111214:164447.155 rsp = 7fff443699e8 = 140734337817064 = 140734337817064
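
Each register line above appears to list the same value three times: hexadecimal, unsigned decimal, and signed decimal. The following is a minimal Python sketch for reading such a line; the helper name `parse_register` and the regular expression are illustrative assumptions, not part of Zabbix. Viewed as little-endian bytes, r8 holds the ASCII text `Server)`, which hints (but does not prove) that the process was handling string data when it crashed.

```python
import re

# Assumed line layout: "<name> = <hex> = <unsigned decimal> = <signed decimal>"
REGISTER_RE = re.compile(r"\b(r[a-z0-9]+) = ([0-9a-f]+) = (\d+) = (-?\d+)")

def parse_register(line):
    """Return (name, value, raw_bytes) for one register line, or None."""
    match = REGISTER_RE.search(line)
    if match is None:
        return None
    name, hex_text, unsigned_text, _signed_text = match.groups()
    value = int(hex_text, 16)
    # Sanity check: the hex and unsigned decimal columns should agree.
    if value != int(unsigned_text):
        raise ValueError(f"columns disagree for {name}")
    # x86_64 is little-endian, so this is the byte order as stored in memory.
    return name, value, value.to_bytes(8, "little")

line = "24673:20111214:164447.155 r8 = 29726576726553 = 11666254149608787 = 11666254149608787"
name, value, raw = parse_register(line)
print(name, hex(value), raw)
```

Lines that are not register lines (for example the "Program counter" line) return `None`, so the function can be applied to every line of the crash block.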



 Comments   
Comment by Alexander Vladishev [ 2011 Dec 16 ]

1. Please attach all information about the crash from the server log file (from "Got signal.." to the end of the file).

2. Could you please execute and attach output of:

objdump -Dswx sbin/zabbix_server | gzip -c > zabbix_server.objdump.gz

Thank you!
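
The first request (everything from "Got signal.." to the end of the log) can be sketched in Python as below. This is only an illustration: `crash_report` is a hypothetical helper, and the sample lines are copied from the description of this issue.

```python
def crash_report(log_lines, marker="Got signal"):
    """Return the lines from the first one containing `marker` to the end."""
    for index, line in enumerate(log_lines):
        if marker in line:
            return log_lines[index:]
    return []  # no crash marker found

sample = [
    "24719:20111214:164445.803 Sending list of active checks to [10.1.2.20] failed: host [10.1.2.20] not found",
    "24673:20111214:164447.155 Got signal [signal:11(SIGSEGV),reason:128,refaddr:(nil)]. Crashing ...",
    "24673:20111214:164447.155 ====== Fatal information: ======",
]
for line in crash_report(sample):
    print(line)
```

For a real log, the lines would be read from the server log file, whose path depends on the server configuration.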

Comment by leszek [ 2011 Dec 19 ]

Hi!
Because of lack of time I had to go back to the previous version, 1.8.4, so it is now impossible to make any dump.
Version 1.8.9 seems unstable in my environment, but when my system is under less load I'll try to upgrade again.
By the way, I wish you a Merry Christmas and a Happy New Year!
L.J.

Comment by leszek [ 2012 Jan 26 ]

In December I tried to upgrade to the new Zabbix version (1.8.9), but it was unstable in my environment, so 10 days later I decided to compile the previous stable version, 1.8.4 (case ZBX-4444), and I have been running it again without problems ... except for the Windows event logging functionality.
In the frontend (Latest data -> Windows_Logs), for every Windows client I can see the last events dated 23.12.2011 and nothing later. When I clear the history, Windows_Logs disappears from Latest data. No other functionality is lost! I have no errors in the server log or in the client log. I use about 50 Windows clients and all of them have their last Windows_Logs entry dated 23.12.2011; no station has anything later.
As I mentioned before, everything else works properly: I can telnet to the client from the Zabbix server and check the version, Zabbix monitors the client's state with ping, and I can check all other system parameters online (processor, memory, disk space, etc.).
I tried creating new items and triggers to revive the functionality, without effect. I have no new ideas on where and how to check or fix this. I noticed similar descriptions in some threads, but no solution at all.
Does anybody have an idea what happened and how to revive Windows_Logs?
L.J.

Comment by leszek [ 2012 Jan 27 ]

The problem is solved. The culprit was the firewall on the Zabbix server, where one rule dropped all outgoing traffic on port 10051 from the server. Checking logs on Windows probably needs a first activation from the Zabbix server on port 10051 and a response back to the server on port 10050. On the Zabbix server I logged only incoming traffic, so the diagnosis was harder. I changed the firewall rule on 23 December because a colleague complained about heavy traffic coming from Zabbix to his server. I created the rule, then came a lot of holidays, and everybody forgot about the change.
The thread can be closed.
L.J.
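
For context: by default the Zabbix server listens on TCP port 10051 and the agent on 10050, and event log items are active checks, where the agent opens the connection to the server. A firewall problem like the one described above can be spotted with a plain TCP connection test. The Python sketch below is a hedged example only; `tcp_port_open` is a hypothetical helper, and the address in the usage line is a placeholder.

```python
import socket

def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    # Placeholder address: replace with the Zabbix server's address
    # and run this from an agent host.
    print(tcp_port_open("127.0.0.1", 10051))
```

A result of `False` from an agent host while the server process is running would point at filtering somewhere between the two, though it does not say which rule is responsible.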

Comment by richlv [ 2012 Jan 27 ]

that shouldn't result in segfaults, though. can you reliably reproduce the crash?

Comment by leszek [ 2012 Jan 27 ]

If you are asking about the crash, I cannot diagnose it right now because I had to downgrade to the stable version 1.8.4 in December 2011.
I initially connected the Windows logs problem with the Zabbix upgrade and downgrade because they happened at the same time. In fact they are two different problems. So I resolved the Windows logs problem (by changing the firewall rule for port 10051), which is separate from the crash after the upgrade to 1.8.9.
I will wait for the next Zabbix version, later than 1.8.9, and come back to the upgrade then. I can't upgrade now because of more important things in my IT systems, and so far version 1.8.4 is enough for me.
Thanks for assistance anyway.
L.J.

Comment by Oleksii Zagorskyi [ 2012 May 10 ]

I know for certain that an identical log line:
Got signal [signal:11(SIGSEGV),reason:128,refaddr:(nil)]. Crashing ...
was generated on 64-bit RedHat 6, where the running zabbix_server was a 32-bit binary (do not ask me why).
The version was 1.8.10.

I'm attaching a file "7417_crash.txt" where it is visible (it was a trapper).
That case is described in ZBX-4980.
I suppose this issue may be similar or identical to ZBX-4980 (the crash there looks different, but I suppose it can have the same cause).

leszek, please let us know: do you use Orabbix?

Comment by Alexei Vladishev [ 2012 Sep 08 ]

I am closing this issue, it seems to be fixed elsewhere.
