[ZBX-14559] Zabbix agent crash on AIX when processing net.dns[] items
Created: 2018 Jul 03 Updated: 2024 Apr 10 Resolved: 2018 Sep 03 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 3.4.11 |
Fix Version/s: | 3.0.22rc1, 3.4.14rc1, 4.0.0beta2, 4.0 (plan) |
Type: | Problem report | Priority: | Major |
Reporter: | Alexey Pustovalov | Assignee: | Andris Mednis |
Resolution: | Fixed | Votes: | 0 |
Labels: | AIX, crash | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
AIX 6.1, pcre, openssl |
Attachments: | ZBX-14559_patch_with_padding_for_30.txt ZBX-14559_patch_without_res_ninit_on_AIX_for_30.txt |
Team: | Team A |
Sprint: | Sprint 37, Sprint 38, Sprint 41 |
Story Points: | 2 |
Description |
Zabbix agent crashed while using -p option (print all metrics):
# /usr/local/sbin/zabbix_agentd -p
agent.hostname [s|Zabbix server]
agent.ping [u|1]
agent.version [s|3.4.8]
system.localtime[utc] [u|1530603016]
system.run[echo test] [m|ZBX_NOTSUPPORTED] [Remote commands are not enabled.]
web.page.get[localhost,,80] [t|]
web.page.perf[localhost,,80] [d|0.000000]
web.page.regexp[localhost,,80,OK] [s|]
vfs.file.size[/etc/passwd] [u|1162]
vfs.file.time[/etc/passwd,modify] [u|1529397119]
vfs.file.exists[/etc/passwd] [u|1]
vfs.file.contents[/etc/passwd] [t|root:!:0:0::/home/root:/usr/bin/ksh
daemon:!:1:1::/etc:
.... truncated...
zabbix:x:21602:11261:zabbix:/opt/zabbix:/bin/false]
vfs.file.regexp[/etc/passwd,root] [s|root:!:0:0::/home/root:/usr/bin/ksh]
vfs.file.regmatch[/etc/passwd,root] [u|1]
vfs.file.md5sum[/etc/passwd] [s|some_cksum]
vfs.file.cksum[/etc/passwd] [u|some_cksum]
vfs.dir.size[/var/log] [u|44015584]
zabbix_agentd [14745676]: ERROR: Got signal [signal:4(SIGILL),reason:30,refaddr:0]. Crashing ...
zabbix_agentd [14745676]: ERROR: ====== Fatal information: ======
zabbix_agentd [14745676]: ERROR: program counter not available for this architecture
zabbix_agentd [14745676]: ERROR: === Registers: ===
zabbix_agentd [14745676]: ERROR: register dump not available for this architecture
zabbix_agentd [14745676]: ERROR: === Backtrace: ===
zabbix_agentd [14745676]: ERROR: backtrace not available for this platform
zabbix_agentd [14745676]: ERROR: === Memory map: ===
zabbix_agentd [14745676]: ERROR: memory map not available for this platform
zabbix_agentd [14745676]: ERROR: ================================
net.dns[,zabbix.com]
#
# |
Comments |
Comment by Vladislavs Sokurenko [ 2018 Jul 03 ] |
Is it vfs.dir.size that crashes? Could you please execute it separately? It would also be nice to see the output with debug log level. |
Comment by Glebs Ivanovskis [ 2018 Jul 04 ] |
Dear dotneft, I believe -C works with -p, so you can raise logging there. |
Comment by Andris Mednis [ 2018 Jul 06 ] |
So, vfs.dir.size[/var/log] was the last successful metric before the crash.
# /usr/local/sbin/zabbix_agentd -t net.dns[,zabbix.com] |
Comment by Andris Mednis [ 2018 Aug 23 ] |
Confirmed with 3.0.15, 3.0.21rc1 on AIX 7.1 TL0. |
Comment by Andris Mednis [ 2018 Aug 23 ] |
Crash happens when returning from dns_query() with SYSINFO_RET_OK in
static int	dns_query(AGENT_REQUEST *request, AGENT_RESULT *result, int short_answer)
{
...
	hp = (HEADER *)answer.buffer;

	if (1 == short_answer)
	{
		SET_UI64_RESULT(result, NOERROR != hp->rcode || 0 == ntohs(hp->ancount) || -1 == res ? 0 : 1);
		<--- It goes successfully until here,
		<--- then crashes in the process of returning.
		return SYSINFO_RET_OK;
	}
...
|
Comment by Andris Mednis [ 2018 Aug 28 ] |
In src/libs/zbxsysinfo/common/net.c there is a function dns_query() which calls res_ninit():
...
#ifdef HAVE_RES_NINIT
	struct __res_state	res_state_local;
#else	/* thread-unsafe resolver API */
...
#ifdef HAVE_RES_NINIT
	memset(&res_state_local, 0, sizeof(res_state_local));
	if (-1 == res_ninit(&res_state_local))	/* initialize always, settings might have changed */
#else
It seems that on some AIX systems with no updates installed, res_ninit() can corrupt the stack, causing a crash when returning from dns_query().
A similar problem was described in 2005 on the Kerberos mailing list [krbdev.mit.edu #3172] |
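One of the attachments is named "patch_with_padding", but its contents are not shown in this ticket. The following is only a minimal sketch of what padding a structure generally means, under stated assumptions: the resolver is replaced by a simulated initializer, and every name below (declared_state, padded_state, simulated_init, guard_bytes_overwritten, GUARD_SIZE, GUARD_BYTE) is hypothetical and not taken from the Zabbix sources. The idea is that if a library writes more bytes than the header-declared structure size, spare bytes placed directly after the structure absorb the overrun instead of the caller's stack frame.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the state structure as the system headers declare it. */
struct declared_state
{
	char	data[64];
};

#define GUARD_SIZE	128	/* assumed padding size, chosen arbitrarily for this sketch */
#define GUARD_BYTE	0x5A	/* known pattern, so that overwritten bytes can be detected */

/* The declared structure followed by spare bytes that absorb an overrun. */
struct padded_state
{
	struct declared_state	state;
	unsigned char		guard[GUARD_SIZE];
};

/* Simulates an initializer that writes 'written' bytes, which may exceed */
/* sizeof(struct declared_state). It is NOT the real res_ninit().          */
static void	simulated_init(struct padded_state *ps, size_t written)
{
	assert(written <= sizeof(*ps));	/* the sketch never writes past the padding */
	memset(ps, 0, written);
}

/* Fills the guard area with the pattern, runs the simulated initializer and */
/* returns how many guard bytes no longer hold the pattern.                   */
static size_t	guard_bytes_overwritten(size_t written)
{
	struct padded_state	ps;
	size_t			i, changed = 0;

	memset(ps.guard, GUARD_BYTE, sizeof(ps.guard));
	simulated_init(&ps, written);

	for (i = 0; i < sizeof(ps.guard); i++)
	{
		if (GUARD_BYTE != ps.guard[i])
			changed++;
	}

	return changed;
}
```

Calling guard_bytes_overwritten() with the declared size leaves the guard area untouched, while a larger size shows the extra bytes landing in the padding rather than beyond the object. A nonzero count would indicate that the library and the headers disagree about the structure size.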
Comment by Andris Mednis [ 2018 Aug 29 ] |
Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-6565-13645-14559-30 (based on 3.0.22rc1) which contains proposed fixes for:
Tested on AIX 6.1 TL0, 7.1 TL0, 7.1 TL4. |
Comment by Andris Mednis [ 2018 Aug 31 ] |
Fixed in versions:
|
Comment by Andris Mednis [ 2018 Aug 31 ] |
No documentation update required. |