-
Incident report
-
Resolution: Fixed
-
Major
-
None
-
1.8.11
-
CentOS 6.2 64bit
The Zabbix server hung. Below are a backtrace (from a gcore dump) and strace output of the hung process.
Backtrace
Core was generated by `zabbix_server_pgsql'.
#0 0x00007fed9cd15893 in __select_nocancel () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install zabbix-server-pgsql-1.8.11-0.x86_64
(gdb) bt full
#0 0x00007fed9cd15893 in __select_nocancel () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007fed9d8c267d in ?? () from /usr/lib64/libOpenIPMIposix.so.0
No symbol table info available.
#2 0x00007fed9d8c2c5e in sel_select () from /usr/lib64/libOpenIPMIposix.so.0
No symbol table info available.
#3 0x00007fed9d8c057c in ?? () from /usr/lib64/libOpenIPMIposix.so.0
No symbol table info available.
#4 0x00000000004173e3 in init_ipmi_host ()
No symbol table info available.
#5 0x0000000000418412 in get_value_ipmi ()
No symbol table info available.
#6 0x000000000041a350 in get_values ()
No symbol table info available.
#7 0x000000000041ab4f in main_poller_loop ()
No symbol table info available.
#8 0x000000000041199d in MAIN_ZABBIX_ENTRY ()
No symbol table info available.
#9 0x0000000000440297 in daemon_start ()
No symbol table info available.
#10 0x00007fed9cc55cdd in __libc_start_main () from /lib64/libc.so.6
No symbol table info available.
#11 0x000000000040dd19 in _start ()
No symbol table info available.
(gdb) quit
strace
0.000000 select(1, [0], [], [], {7, 858908}) = -1 EBADF (Bad file descriptor)
0.000077 select(1, [0], [], [], {7, 858825}) = -1 EBADF (Bad file descriptor)
0.000042 select(1, [0], [], [], {7, 858782}) = -1 EBADF (Bad file descriptor)
0.000041 select(1, [0], [], [], {7, 858741}) = -1 EBADF (Bad file descriptor)
0.000074 select(1, [0], [], [], {7, 858674}) = -1 EBADF (Bad file descriptor)
0.000050 select(1, [0], [], [], {7, 858617}) = -1 EBADF (Bad file descriptor)
0.000041 select(1, [0], [], [], {7, 858575}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858536}) = -1 EBADF (Bad file descriptor)
0.000038 select(1, [0], [], [], {7, 858498}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858459}) = -1 EBADF (Bad file descriptor)
0.000064 select(1, [0], [], [], {7, 858395}) = -1 EBADF (Bad file descriptor)
0.000040 select(1, [0], [], [], {7, 858355}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858316}) = -1 EBADF (Bad file descriptor)
0.000038 select(1, [0], [], [], {7, 858278}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858239}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858199}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858161}) = -1 EBADF (Bad file descriptor)
0.000039 select(1, [0], [], [], {7, 858121}) = -1 EBADF (Bad file descriptor)
0.000040 select(1, [0], [], [], {7, 858082}) = -1 EBADF (Bad file descriptor)
lsof tail
zabbix_se 18214 zabbix DEL REG 0,4 753667 /SYSV7801030c
zabbix_se 18214 zabbix DEL REG 0,4 884743 /SYSV5301030c
zabbix_se 18214 zabbix 1w REG 9,2 93959114 262695 /var/zabbix/zabbix_server.log
zabbix_se 18214 zabbix 2w REG 9,2 93959114 262695 /var/zabbix/zabbix_server.log
zabbix_se 18214 zabbix 3w REG 9,2 5 524521 /var/run/zabbix/zabbix.pid
zabbix_se 18214 zabbix 4u IPv4 57254714 0t0 TCP *:zabbix-trapper (LISTEN)
zabbix_se 18214 zabbix 5u unix 0xffff88010e921c80 0t0 57254932 socket
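For readers trying to reproduce the symptom: the strace output shows select() failing with EBADF on descriptor 0 every few tens of microseconds, and the lsof tail lists descriptors 1 to 5 but no descriptor 0. One possible reading, which the traces alone do not confirm, is that the select loop is waiting on a descriptor that has already been closed and retries immediately instead of treating the error as fatal. The Python sketch below only illustrates that kernel behaviour on Linux. It is not Zabbix or OpenIPMI code, and the function name `select_on_closed_fd` is hypothetical.

```python
import errno
import os
import select


def select_on_closed_fd(attempts=5, timeout=7.0):
    """Call select() on a closed descriptor and collect the errno of each failure.

    Hypothetical helper for illustration only. select() fails immediately
    with EBADF and never waits for the timeout, so a loop that simply
    retries on error will spin.
    """
    read_end, write_end = os.pipe()
    os.close(write_end)
    os.close(read_end)  # read_end is now an invalid descriptor number

    errors = []
    for _ in range(attempts):
        try:
            # Same shape as the traced call: one read descriptor, a timeout.
            select.select([read_end], [], [], timeout)
        except OSError as exc:
            errors.append(exc.errno)
    return errors


if __name__ == "__main__":
    codes = select_on_closed_fd()
    print([errno.errorcode[code] for code in codes])
```

If the reading above is correct, a loop of this shape would need to stop retrying on EBADF (or drop the stale descriptor) rather than call select() again with the same descriptor set.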
Here is the log tail for this PID:
18214:20120327:075815.715 server #33 started [unreachable poller #1]
18214:20120402:145629.524 temporarily disabling Zabbix agent checks on host [xxx]: host unavailable
18214:20120402:145936.829 IPMI item [Analog_Fan_RPM[FAN 1]] on host [xxx] failed: another network error, wait for 15 seconds
18214:20120402:145939.831 IPMI item [Analog_Fan_RPM[Fan1]] on host [xxx] failed: another network error, wait for 15 seconds
18214:20120402:145951.955 temporarily disabling IPMI checks on host [xxx]: host unavailable
18214:20120402:145955.697 resuming IPMI checks on host [xxx]: connection restored
18214:20120402:162901.379 Got signal [signal:15(SIGTERM),sender_pid:21889,sender_uid:0,reason:0]. Exiting ...
Thanks,
Alex
- part of
- ZBXNEXT-3386 IPMI connection to a single device is kept open by multiple processes (Closed)
- ZBX-10983 zabbix server 3.0 crash - ipmi problem (Closed)