[ZBX-15544] Zabbix proxy crashing after received configuration data from server Created: 2019 Jan 29 Updated: 2019 Feb 28 Resolved: 2019 Feb 28 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | 4.0.2, 4.0.3 |
Fix Version/s: | None |
Type: | Problem report | Priority: | Critical |
Reporter: | Ivan | Assignee: | Vladislavs Sokurenko |
Resolution: | Won't fix | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
PRETTY_NAME="Debian GNU/Linux 8 (jessie)" Linux hostname 4.19.12-amd64-vyos #1 SMP Tue Jan 22 01:09:06 CET 2019 x86_64 GNU/Linux |
Attachments: | ZBX-15544-2.diff ZBX-15544-3.diff ZBX-15544-4.diff ZBX-15544-5.diff ZBX-15544-6.diff ZBX-15544.diff gethostbyaddr.c zabbix_agentd.conf zabbix_proxy.conf zabbix_proxy.log zabbix_proxy.log zabbix_proxy.log zabbix_proxy.log_discovery zabbix_proxy.log_ipv6_agent zabbix_proxy.log_ipv6_discov zabbix_proxy.log_patch3 zabbix_proxy.log_wa zabbix_proxy.log_wadisco |
Team: | Team A |
Sprint: | Sprint 49 (Feb 2019) |
Story Points: | 0 |
Description |
After installing Zabbix Proxy for a new branch, I got the error below.
Server version: zabbix_server -V
Copyright (C) 2018 Zabbix SIA
This product includes software developed by the OpenSSL Project
Compiled with OpenSSL 1.1.0f 25 May 2017
With proxy 4.0.2 and server 4.0.2 the result was the same...
3478:20190130:011019.908 proxy #25 started [trapper #5]
3454:20190130:011020.193 received configuration data from server at "10.255.0.1", datalen 3123
3478:20190130:011036.282 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
3478:20190130:011036.283 ====== Fatal information: ======
3478:20190130:011036.283 Program counter: 0xffffffffff600000
3478:20190130:011036.283 === Registers: ===
3478:20190130:011036.283 r8 = 3 = 3 = 3
3478:20190130:011036.283 r9 = 0 = 0 = 0
3478:20190130:011036.284 r10 = fffffffffffffb77 = 18446744073709550455 = -1161
3478:20190130:011036.284 r11 = ffffffffff600000 = 18446744073699065856 = -10485760
3478:20190130:011036.284 r12 = c = 12 = 12
3478:20190130:011036.284 r13 = 5 = 5 = 5
3478:20190130:011036.284 r14 = 0 = 0 = 0
3478:20190130:011036.284 r15 = 5 = 5 = 5
3478:20190130:011036.285 rdi = 7ffe1054bcf0 = 140729172409584 = 140729172409584
3478:20190130:011036.285 rsi = 0 = 0 = 0
3478:20190130:011036.285 rbp = 3 = 3 = 3
3478:20190130:011036.285 rbx = 7ffe1054be60 = 140729172409952 = 140729172409952
3478:20190130:011036.285 rdx = 10 = 16 = 16
3478:20190130:011036.286 rax = 7ffe1054be50 = 140729172409936 = 140729172409936
3478:20190130:011036.286 rcx = 7f14cca23350 = 139727309255504 = 139727309255504
3478:20190130:011036.287 rsp = 7ffe1054bce8 = 140729172409576 = 140729172409576
3478:20190130:011036.287 rip = ffffffffff600000 = 18446744073699065856 = -10485760
3478:20190130:011036.287 efl = 10206 = 66054 = 66054
3478:20190130:011036.288 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259
3478:20190130:011036.288 err = 15 = 21 = 21
3478:20190130:011036.288 trapno = e = 14 = 14
3478:20190130:011036.288 oldmask = 0 = 0 = 0
3478:20190130:011036.288 cr2 = ffffffffff600000 = 18446744073699065856 = -10485760
3478:20190130:011036.288 === Backtrace: ===
3453:20190130:011036.313 One child process died (PID:3478,exitcode/signal:11). Exiting ...
zabbix_proxy [3453]: Error waiting for process with PID 3478: [10] No child processes
3453:20190130:011036.330 syncing history data...
3453:20190130:011036.330 syncing history data done
3453:20190130:011036.331 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993).
|
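For readers decoding the register dump in the description: the `trapno`, `err` and `cr2` fields follow the usual x86-64 page-fault conventions (trap 14 is a page fault, `err` is the page-fault error code, `cr2` is the faulting address). The sketch below decodes the values from the log. The helper names are illustrative and are not part of Zabbix. The address 0xffffffffff600000 is the fixed location of the legacy vsyscall page on x86-64 Linux; whether that is what actually causes the crash is not established anywhere in this report.

```python
# Page-fault error code bits on x86-64: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation (page present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

# Fixed address of the legacy vsyscall page on x86-64 Linux (one 4 KiB page).
VSYSCALL_PAGE = 0xFFFFFFFFFF600000


def decode_page_fault(err):
    """Return a human-readable list describing a page-fault error code."""
    out = []
    for mask, if_set, if_clear in PF_BITS:
        if err & mask:
            out.append(if_set)
        elif if_clear is not None:
            out.append(if_clear)
    return out


def in_vsyscall_page(addr):
    """True if addr falls inside the legacy vsyscall page."""
    return VSYSCALL_PAGE <= addr < VSYSCALL_PAGE + 0x1000


# Values copied from the log: "err = 15" (hex) and "cr2 = ffffffffff600000".
print(decode_page_fault(0x15))
print(in_vsyscall_page(0xFFFFFFFFFF600000))
```

Read this way, the dump describes a user-mode instruction fetch from an address that is mapped but not executable for the process, which is consistent with `rip` and `cr2` holding the same value.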
Comments |
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ] |
Could you please attach a log file that shows the whole backtrace section, together with the objdump output? |
Comment by Ivan [ 2019 Jan 29 ] |
As soon as I stopped the agent on the same host, the proxy stopped crashing. At debug level 4 I found:
4103:20190130:013236.486 trapper got '\{"request":"active checks","host":"agent-hostname"}' 4103:20190130:013236.486 In send_list_of_active_checks_json() 4103:20190130:013236.487 In is_ip4() ip:'10.255.38.5' 4103:20190130:013236.487 End of is_ip4():SUCCEED 4103:20190130:013236.487 In get_hostid_by_host() host:'agent-hostname' metadata:'' 4103:20190130:013236.487 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null] 4103:20190130:013236.490 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ... 4103:20190130:013236.491 ====== Fatal information: ====== 4103:20190130:013236.491 Program counter: 0xffffffffff600000 4103:20190130:013236.491 === Registers: === 4103:20190130:013236.491 r8 = 3 = 3 = 3 4103:20190130:013236.491 r9 = 0 = 0 = 0 4103:20190130:013236.492 r10 = fffffffffffffb77 = 18446744073709550455 = -1161 4103:20190130:013236.492 r11 = ffffffffff600000 = 18446744073699065856 = -10485760 4103:20190130:013236.492 r12 = c = 12 = 12 4103:20190130:013236.492 r13 = 5 = 5 = 5 4103:20190130:013236.492 r14 = 0 = 0 = 0 4103:20190130:013236.492 r15 = 5 = 5 = 5 4103:20190130:013236.493 rdi = 7ffd43099510 = 140725728154896 = 140725728154896 4103:20190130:013236.493 rsi = 0 = 0 = 0 4103:20190130:013236.493 rbp = 3 = 3 = 3 4103:20190130:013236.493 rbx = 7ffd43099680 = 140725728155264 = 140725728155264 4103:20190130:013236.493 rdx = 10 = 16 = 16 4103:20190130:013236.493 rax = 7ffd43099670 = 140725728155248 = 140725728155248 4103:20190130:013236.494 rcx = 7f82c7ac3350 = 140199672427344 = 140199672427344 4103:20190130:013236.494 rsp = 7ffd43099508 = 140725728154888 = 140725728154888 4103:20190130:013236.494 rip = ffffffffff600000 = 18446744073699065856 = -10485760 4103:20190130:013236.495 efl = 10202 = 66050 = 66050 4103:20190130:013236.495 csgsfs = 
2b000000000033 = 12103423998558259 = 12103423998558259 4103:20190130:013236.496 err = 15 = 21 = 21 4103:20190130:013236.496 trapno = e = 14 = 14 4103:20190130:013236.496 oldmask = 0 = 0 = 0 4103:20190130:013236.496 cr2 = ffffffffff600000 = 18446744073699065856 = -10485760 4103:20190130:013236.496 === Backtrace: === 4078:20190130:013236.513 One child process died (PID:4103,exitcode/signal:11). Exiting ... 4078:20190130:013236.514 zbx_on_exit() called 4079:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4080:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4081:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4082:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4083:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4084:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... 4085:20190130:013236.516 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ... |
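Because the backtrace section is empty, the debug-level "In ..." and "End of ..." entries are the only hint about where the crashing process was. A minimal sketch of how such a log could be scanned is shown here. It assumes the `PID:YYYYMMDD:HHMMSS.mmm message` entry layout seen in this report, and the function names are hypothetical helpers, not Zabbix tooling. The result is only approximate, since not every function logs its entry and exit.

```python
import re

# One log entry starts with "PID:YYYYMMDD:HHMMSS.mmm ".
ENTRY = re.compile(r"(\d+):(\d{8}):(\d{6}\.\d{3}) ")
ENTER = re.compile(r"In (\w+)\(\)")
LEAVE = re.compile(r"End of (\w+)\(\)")


def split_entries(text):
    """Yield (pid, message) pairs, even when entries are flattened onto one line."""
    matches = list(ENTRY.finditer(text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        yield int(m.group(1)), text[m.end():end].strip()


def stack_at_crash(text):
    """Return (pid, logged call stack) for the first SIGSEGV entry, or None."""
    stacks = {}
    for pid, msg in split_entries(text):
        stack = stacks.setdefault(pid, [])
        enter = ENTER.match(msg)
        leave = LEAVE.match(msg)
        if enter:
            stack.append(enter.group(1))
        elif leave and stack and stack[-1] == leave.group(1):
            stack.pop()
        elif "signal:11(SIGSEGV)" in msg:
            return pid, list(stack)
    return None


# Excerpt of the entries from the comment above (the SQL query entry is left out).
SAMPLE = (
    "4103:20190130:013236.486 In send_list_of_active_checks_json() "
    "4103:20190130:013236.487 In is_ip4() ip:'10.255.38.5' "
    "4103:20190130:013236.487 End of is_ip4():SUCCEED "
    "4103:20190130:013236.487 In get_hostid_by_host() host:'agent-hostname' metadata:'' "
    "4103:20190130:013236.490 Got signal [signal:11(SIGSEGV),reason:1,"
    "refaddr:0xffffffffff600000]. Crashing ..."
)

print(stack_at_crash(SAMPLE))
```

The last logged function is only an upper bound on where the fault occurred: code that runs after it without its own trace entries will not appear in the stack.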
Comment by Ivan [ 2019 Jan 29 ] |
Vladislav, this is actually not a regular distribution, and it is hard to install objdump there. Do you want to see the full proxy log? At which debug level? |
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ] |
There is only the entry itself, but the actual backtrace that should follow it is missing. |
Comment by Ivan [ 2019 Jan 29 ] |
That's all i got: cat zabbix_proxy.log 4334:20190130:013716.572 Starting Zabbix Proxy (active) [some-hostname]. Zabbix 4.0.3 (revision 87993). 4334:20190130:013716.572 **** Enabled features **** 4334:20190130:013716.572 SNMP monitoring: YES 4334:20190130:013716.573 IPMI monitoring: NO 4334:20190130:013716.573 Web monitoring: NO 4334:20190130:013716.573 VMware monitoring: NO 4334:20190130:013716.573 ODBC: NO 4334:20190130:013716.573 SSH2 support: NO 4334:20190130:013716.573 IPv6 support: NO 4334:20190130:013716.573 TLS support: NO 4334:20190130:013716.573 ************************** 4334:20190130:013716.573 using configuration file: /etc/zabbix/zabbix_proxy.conf 4334:20190130:013716.588 current database version (mandatory/optional): 04000000/04000003 4334:20190130:013716.588 required mandatory version: 04000000 4334:20190130:013716.604 proxy #0 started [main process] 4342:20190130:013716.607 proxy #7 started [discoverer #2] 4341:20190130:013716.610 proxy #6 started [discoverer #1] 4340:20190130:013716.616 proxy #5 started [http poller #1] 4343:20190130:013716.618 proxy #8 started [discoverer #3] 4339:20190130:013716.623 proxy #4 started [housekeeper #1] 4344:20190130:013716.623 proxy #9 started [history syncer #1] 4338:20190130:013716.624 proxy #3 started [data sender #1] 4345:20190130:013716.626 proxy #10 started [history syncer #2] 4346:20190130:013716.626 proxy #11 started [history syncer #3] 4347:20190130:013716.628 proxy #12 started [history syncer #4] 4337:20190130:013716.629 proxy #2 started [heartbeat sender #1] 4350:20190130:013716.631 proxy #15 started [poller #1] 4351:20190130:013716.634 proxy #16 started [poller #2] 4352:20190130:013716.639 proxy #17 started [poller #3] 4353:20190130:013716.643 proxy #18 started [poller #4] 4348:20190130:013716.647 proxy #13 started [self-monitoring #1] 4349:20190130:013716.647 proxy #14 started [task manager #1] 4354:20190130:013716.649 proxy #19 started [unreachable poller #1] 4355:20190130:013716.651 proxy #20 started 
[unreachable poller #2] 4356:20190130:013716.655 proxy #21 started [trapper #1] 4357:20190130:013716.656 proxy #22 started [trapper #2] 4358:20190130:013716.658 proxy #23 started [trapper #3] 4359:20190130:013716.659 proxy #24 started [trapper #4] 4336:20190130:013716.660 proxy #1 started [configuration syncer #1] 4360:20190130:013716.661 proxy #25 started [trapper #5] 4361:20190130:013716.663 proxy #26 started [icmp pinger #1] 4362:20190130:013716.663 proxy #27 started [icmp pinger #2] 4336:20190130:013716.934 received configuration data from server at "10.255.0.1", datalen 3123 4360:20190130:015150.293 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ... 4360:20190130:015150.293 ====== Fatal information: ====== 4360:20190130:015150.293 Program counter: 0xffffffffff600000 4360:20190130:015150.293 === Registers: === 4360:20190130:015150.293 r8 = 3 = 3 = 3 4360:20190130:015150.293 r9 = 0 = 0 = 0 4360:20190130:015150.293 r10 = fffffffffffffb77 = 18446744073709550455 = -1161 4360:20190130:015150.293 r11 = ffffffffff600000 = 18446744073699065856 = -10485760 4360:20190130:015150.294 r12 = c = 12 = 12 4360:20190130:015150.294 r13 = 5 = 5 = 5 4360:20190130:015150.294 r14 = 0 = 0 = 0 4360:20190130:015150.294 r15 = 5 = 5 = 5 4360:20190130:015150.294 rdi = 7ffd98a28ff0 = 140727164243952 = 140727164243952 4360:20190130:015150.294 rsi = 0 = 0 = 0 4360:20190130:015150.294 rbp = 3 = 3 = 3 4360:20190130:015150.294 rbx = 7ffd98a29160 = 140727164244320 = 140727164244320 4360:20190130:015150.294 rdx = 10 = 16 = 16 4360:20190130:015150.294 rax = 7ffd98a29150 = 140727164244304 = 140727164244304 4360:20190130:015150.294 rcx = 7f4af4d36350 = 139959911801680 = 139959911801680 4360:20190130:015150.294 rsp = 7ffd98a28fe8 = 140727164243944 = 140727164243944 4360:20190130:015150.294 rip = ffffffffff600000 = 18446744073699065856 = -10485760 4360:20190130:015150.294 efl = 10206 = 66054 = 66054 4360:20190130:015150.295 csgsfs = 2b000000000033 = 12103423998558259 = 
12103423998558259 4360:20190130:015150.295 err = 15 = 21 = 21 4360:20190130:015150.295 trapno = e = 14 = 14 4360:20190130:015150.295 oldmask = 0 = 0 = 0 4360:20190130:015150.295 cr2 = ffffffffff600000 = 18446744073699065856 = -10485760 4360:20190130:015150.295 === Backtrace: === 4334:20190130:015150.303 One child process died (PID:4360,exitcode/signal:11). Exiting ... zabbix_proxy [4334]: Error waiting for process with PID 4360: [10] No child processes 4334:20190130:015150.319 syncing history data... 4334:20190130:015150.320 syncing history data done 4334:20190130:015150.320 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993). |
Comment by Ivan [ 2019 Jan 29 ] |
4517:20190130:015150.290 Starting Zabbix Agent [agent-hostname]. Zabbix 4.0.3 (revision 87993). |
Comment by Ivan [ 2019 Jan 29 ] |
And some debug lvl4 log of proxy after startup agent: 4909:20190130:020150.866 In get_pinger_hosts() 4909:20190130:020150.866 In DCconfig_get_poller_items() poller_type:3 4909:20190130:020150.866 End of DCconfig_get_poller_items():0 4909:20190130:020150.866 End of get_pinger_hosts():0 4909:20190130:020150.866 In process_pinger_hosts() 4909:20190130:020150.866 End of process_pinger_hosts() 4909:20190130:020150.866 In DCconfig_get_poller_nextcheck() poller_type:3 4909:20190130:020150.866 End of DCconfig_get_poller_nextcheck():-1 4909:20190130:020150.867 __zbx_zbx_setproctitle() title:'icmp pinger #1 [got 0 values in 0.000402 sec, idle 5 sec]' 4907:20190130:020150.909 __zbx_zbx_setproctitle() title:'trapper #4 [processing data]' 4907:20190130:020150.909 trapper got '\{"request":"active checks","host":"agent-hostname"}' 4907:20190130:020150.909 In send_list_of_active_checks_json() 4907:20190130:020150.909 In is_ip4() ip:'10.255.38.5' 4907:20190130:020150.909 End of is_ip4():SUCCEED 4907:20190130:020150.909 In get_hostid_by_host() host:'agent-hostname' metadata:'' 4907:20190130:020150.909 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null] 4907:20190130:020150.911 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ... 
4907:20190130:020150.911 ====== Fatal information: ====== 4907:20190130:020150.911 Program counter: 0xffffffffff600000 4907:20190130:020150.911 === Registers: === 4907:20190130:020150.911 r8 = 3 = 3 = 3 4907:20190130:020150.911 r9 = 0 = 0 = 0 4907:20190130:020150.911 r10 = fffffffffffffb77 = 18446744073709550455 = -1161 4907:20190130:020150.911 r11 = ffffffffff600000 = 18446744073699065856 = -10485760 4907:20190130:020150.911 r12 = c = 12 = 12 4907:20190130:020150.911 r13 = 5 = 5 = 5 4907:20190130:020150.911 r14 = 0 = 0 = 0 4907:20190130:020150.911 r15 = 5 = 5 = 5 4907:20190130:020150.911 rdi = 7fff2b944aa0 = 140733924526752 = 140733924526752 4907:20190130:020150.911 rsi = 0 = 0 = 0 4907:20190130:020150.911 rbp = 3 = 3 = 3 4907:20190130:020150.911 rbx = 7fff2b944c10 = 140733924527120 = 140733924527120 4907:20190130:020150.911 rdx = 10 = 16 = 16 4907:20190130:020150.911 rax = 7fff2b944c00 = 140733924527104 = 140733924527104 4907:20190130:020150.911 rcx = 7fbf79a6c350 = 140460356453200 = 140460356453200 4907:20190130:020150.912 rsp = 7fff2b944a98 = 140733924526744 = 140733924526744 4907:20190130:020150.912 rip = ffffffffff600000 = 18446744073699065856 = -10485760 4907:20190130:020150.912 efl = 10202 = 66050 = 66050 4907:20190130:020150.912 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259 4907:20190130:020150.912 err = 15 = 21 = 21 4907:20190130:020150.912 trapno = e = 14 = 14 4907:20190130:020150.912 oldmask = 0 = 0 = 0 4907:20190130:020150.912 cr2 = ffffffffff600000 = 18446744073699065856 = -10485760 4907:20190130:020150.912 === Backtrace: === 4883:20190130:020150.921 One child process died (PID:4907,exitcode/signal:11). Exiting ... 4883:20190130:020150.921 zbx_on_exit() called 4884:20190130:020150.921 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4885:20190130:020150.921 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 
4886:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4887:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4888:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4889:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4890:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4891:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4892:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4893:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4894:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4895:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4896:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4897:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4898:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4899:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4900:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4901:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4902:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4903:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 
4904:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4905:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4906:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4908:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4909:20190130:020150.928 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... 4910:20190130:020150.928 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ... zabbix_proxy [4883]: Error waiting for process with PID 4907: [10] No child processes 4883:20190130:020150.932 In DBconnect() flag:1 4883:20190130:020150.932 query without transaction detected 4883:20190130:020150.932 query [txnlev:0] [pragma synchronous=0] 4883:20190130:020150.939 query without transaction detected 4883:20190130:020150.939 query [txnlev:0] [pragma temp_store=2] 4883:20190130:020150.939 query without transaction detected 4883:20190130:020150.940 query [txnlev:0] [pragma temp_store_directory='/config/user-data/zabbix/'] 4883:20190130:020150.940 End of DBconnect():0 4883:20190130:020150.940 In free_database_cache() 4883:20190130:020150.940 In DCsync_all() 4883:20190130:020150.940 In sync_history_cache_full() history_num:0 4883:20190130:020150.940 syncing history data... 
4883:20190130:020150.940 syncing history data done 4883:20190130:020150.940 End of sync_history_cache_full() 4883:20190130:020150.940 End of DCsync_all() 4883:20190130:020150.940 End of free_database_cache() 4883:20190130:020150.940 In free_configuration_cache() 4883:20190130:020150.940 End of free_configuration_cache() 4883:20190130:020150.941 In free_selfmon_collector() collector:0x7fbf7a150000 4883:20190130:020150.941 End of free_selfmon_collector() 4883:20190130:020150.941 In zbx_unload_modules() 4883:20190130:020150.941 End of zbx_unload_modules() 4883:20190130:020150.941 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993). 4971:20190130:020201.154 Starting Zabbix Proxy (active) [some-hostname]. Zabbix 4.0.3 (revision 87993). 4971:20190130:020201.155 **** Enabled features **** 4971:20190130:020201.155 SNMP monitoring: YES 4971:20190130:020201.155 IPMI monitoring: NO 4971:20190130:020201.155 Web monitoring: NO 4971:20190130:020201.155 VMware monitoring: NO 4971:20190130:020201.155 ODBC: NO 4971:20190130:020201.155 SSH2 support: NO 4971:20190130:020201.155 IPv6 support: NO 4971:20190130:020201.155 TLS support: NO 4971:20190130:020201.155 ************************** |
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ] |
Is the agent using OpenSSL? Could you please increase the log level and attach the log for that particular moment? Also, please run show create table hosts; |
Comment by Ivan [ 2019 Jan 29 ] |
"Is agent using OpenSSL" - No increase log level - LLevel of proxy? 5? In attach. and with started agent: zabbix_proxy.log_wa |
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ] |
OK, I see it is crashing when processing a discovery rule:
3052:20190130:022657.756 process_rule() range:'192.168.100.0/24'
3052:20190130:022657.756 process_rule() ip:'192.168.100.1'
3052:20190130:022657.756 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
Is it fine if I add a patch with additional logging? |
Comment by Ivan [ 2019 Jan 29 ] |
Yes, that's OK. Please also tell me how to apply this patch. |
Comment by Ivan [ 2019 Jan 29 ] |
I think you are wrong about discovery... I disabled this rule on the server and got: zabbix_proxy.log_wadisco |
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ] |
Yes, now it's different: it happens after getting the list of active checks.
3920:20190130:025038.684 __zbx_zbx_setproctitle() title:'trapper #5 [processing data]'
3920:20190130:025038.684 trapper got '{"request":"active checks","host":"agent-hostname"}'
3920:20190130:025038.684 In send_list_of_active_checks_json()
3920:20190130:025038.684 In is_ip4() ip:'x.x.x.x'
3920:20190130:025038.684 End of is_ip4():SUCCEED
3920:20190130:025038.684 In get_hostid_by_host() host:'agent-hostname' metadata:''
3920:20190130:025038.684 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
3920:20190130:025038.686 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ... |
Comment by Ivan [ 2019 Jan 30 ] |
I got 1 row as result: "10566" "0" "1" null |
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
Please run patch -p0 -iZBX-15544.diff |
Comment by Ivan [ 2019 Jan 30 ] |
Where should I put it? Which folder? |
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
It should be executed in the folder with source code. |
Comment by Ivan [ 2019 Jan 30 ] |
28991:20190130:191222.987 In send_list_of_active_checks_json()
SQL Result: |
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
It seems to be crashing on a call to the gethostbyaddr() function:
28991:20190130:191222.987 In db_register_host()
28991:20190130:191222.987 before zbx_gethost_by_ip
28991:20190130:191222.987 1
28991:20190130:191222.987 3
28991:20190130:191222.987 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
Can you please try one more patch, ZBX-15544-2.diff, and see if the issue persists? |
Comment by Ivan [ 2019 Jan 30 ] |
zabbix-4.0.3# patch -p0 -iZBX-15544-2.diff
Do I need to patch fresh sources? |
Comment by Ivan [ 2019 Jan 30 ] |
|
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
Please patch fresh sources |
Comment by Ivan [ 2019 Jan 30 ] |
And one more log with crash on discovery
Upper log with crash on agent response. |
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
Sorry for asking, but did you recompile after patching again? |
Comment by Ivan [ 2019 Jan 30 ] |
I did everything again (patch, compile), and now everything works perfectly! |
Comment by Vladislavs Sokurenko [ 2019 Jan 30 ] |
I have removed the resolving of the IP address into a DNS name. You can also try compiling with the --enable-ipv6 flag, which uses different functions, in order to avoid this issue. You can also try compiling the test program gethostbyaddr.c and running it with your IP address. |
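The contents of the attached gethostbyaddr.c are not shown in this thread. As a rough analogue only, the following Python sketch performs the same kind of reverse lookup (IP address to host name) for a given address. The `reverse_lookup` name is a hypothetical helper, and since Python's resolver path may differ from the one Zabbix takes, a clean run here does not prove that the proxy's own call is safe.

```python
import ipaddress
import socket


def reverse_lookup(ip):
    """Return the host name for ip, or None if no reverse record is found.

    Raises ValueError if ip is not a valid IPv4/IPv6 address.
    """
    ipaddress.ip_address(ip)  # validate first so typos are not sent to DNS
    try:
        name, _aliases, _addresses = socket.gethostbyaddr(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None
    return name


# Usage: look up the loopback address; the result depends on the local resolver.
print(reverse_lookup("127.0.0.1"))
```

Running it once with the agent address and once with the proxy address covers both lookups discussed in this thread.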
Comment by Ivan [ 2019 Jan 31 ] |
"have you tried updating your system to latest version ?" - Last version of OS or Zabbix? OS impossible to update. Zabbix 4.0.3 ldd --version " Does pinging this agent IP address result in some kind of DNS address ?" - No "./a.out x.x.x.x" - Which IP? ZabbixServer (10.255.0.1) - VPN - 10.255.30.5 ZabbixProxy 192.168.1.234 - LAN |
Comment by Vladislavs Sokurenko [ 2019 Jan 31 ] |
Please try all of the mentioned. |
Comment by Ivan [ 2019 Jan 31 ] |
Do you mean IPv6? Should I try it on fresh or patched sources? a.out: with which IP? See my comment above. |
Comment by Vladislavs Sokurenko [ 2019 Jan 31 ] |
For IPv6 you should try clean sources. |
Comment by Ivan [ 2019 Jan 31 ] |
Patched sources + IPv6 below:
I'll do the clean sources soon (in a few minutes). /tmp/a.out 192.168.1.234 |
Comment by Vladislavs Sokurenko [ 2019 Jan 31 ] |
OK, so the test program does not crash. This is a long shot, but you could try one more patch, ZBX-15544-3.diff, with IPv6. |
Comment by Ivan [ 2019 Jan 31 ] |
Clean sources with ipv6 (without patch): 14434:20190131:161838.210 End of DCconfig_get_poller_nextcheck():1548937119 14434:20190131:161838.210 End of get_values():0 14434:20190131:161838.210 __zbx_zbx_setproctitle() title:'poller #1 [got 0 values in 0.000329 sec, idle 1 sec]' 14440:20190131:161838.724 __zbx_zbx_setproctitle() title:'trapper #1 [processing data]' 14440:20190131:161838.724 trapper got '\{"request":"active checks","host":"agent-hostname"}' 14440:20190131:161838.724 In send_list_of_active_checks_json() 14440:20190131:161838.724 In is_ip4() ip:'10.255.30.5' 14440:20190131:161838.724 End of is_ip4():SUCCEED 14440:20190131:161838.724 In get_hostid_by_host() host:'agent-hostname' metadata:'' 14440:20190131:161838.724 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' 14440:20190131:161838.725 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ... 
14440:20190131:161838.725 ====== Fatal information: ====== 14440:20190131:161838.725 Program counter: 0xffffffffff600000 14440:20190131:161838.725 === Registers: === 14440:20190131:161838.725 r8 = 3 = 3 = 3 14440:20190131:161838.725 r9 = 0 = 0 = 0 14440:20190131:161838.725 r10 = 7ffc91cbb4b0 = 140722754532528 = 140722754532528 14440:20190131:161838.725 r11 = ffffffffff600000 = 18446744073699065856 = -10485760 14440:20190131:161838.725 r12 = 8 = 8 = 8 14440:20190131:161838.725 r13 = 5 = 5 = 5 14440:20190131:161838.725 r14 = 0 = 0 = 0 14440:20190131:161838.725 r15 = 5 = 5 = 5 14440:20190131:161838.725 rdi = 7ffc91cbb6c0 = 140722754533056 = 140722754533056 14440:20190131:161838.725 rsi = 0 = 0 = 0 14440:20190131:161838.725 rbp = 2 = 2 = 2 14440:20190131:161838.725 rbx = 7ffc91cbb830 = 140722754533424 = 140722754533424 14440:20190131:161838.725 rdx = 10 = 16 = 16 14440:20190131:161838.725 rax = 7ffc91cbb820 = 140722754533408 = 140722754533408 14440:20190131:161838.725 rcx = 7f313172b350 = 139849259725648 = 139849259725648 14440:20190131:161838.725 rsp = 7ffc91cbb6b8 = 140722754533048 = 140722754533048 14440:20190131:161838.725 rip = ffffffffff600000 = 18446744073699065856 = -10485760 14440:20190131:161838.725 efl = 10206 = 66054 = 66054 14440:20190131:161838.725 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259 14440:20190131:161838.725 err = 15 = 21 = 21 14440:20190131:161838.725 trapno = e = 14 = 14 14440:20190131:161838.725 oldmask = 0 = 0 = 0 14440:20190131:161838.725 cr2 = ffffffffff600000 = 18446744073699065856 = -10485760 14440:20190131:161838.725 === Backtrace: === 14420:20190131:161838.731 One child process died (PID:14440,exitcode/signal:11). Exiting ... 14420:20190131:161838.732 zbx_on_exit() called 14421:20190131:161838.732 Got signal [signal:15(SIGTERM),sender_pid:14420,sender_uid:117,reason:0]. Exiting ... 14422:20190131:161838.732 Got signal [signal:15(SIGTERM),sender_pid:14420,sender_uid:117,reason:0]. Exiting ... |
Comment by Ivan [ 2019 Jan 31 ] |
Patch3 + ipv6 |
Comment by Vladislavs Sokurenko [ 2019 Jan 31 ] |
Does running the same Zabbix proxy on some other operating system help? |
Comment by Ivan [ 2019 Feb 01 ] |
On another router it works perfectly: zabbix_proxy -V ldd --version uname -a On another one: uname -a No problems...
But after some changes to the router OS (I don't know exactly which), I got this issue... I have 16 proxies on other routers. |
Comment by Vladislavs Sokurenko [ 2019 Feb 01 ] |
Have you tried running gethostbyaddr.c on the Zabbix proxy that crashes? Are you doing some kind of cross-compilation? |
Comment by Ivan [ 2019 Feb 04 ] |
I ran it on exactly that host. I compiled it on my PC and copied it to the proxy. |
Comment by Ivan [ 2019 Feb 08 ] |
No ideas? |
Comment by Vladislavs Sokurenko [ 2019 Feb 08 ] |
Can you try compiling on the router? |
Comment by Ivan [ 2019 Feb 09 ] |
Try to compile Zabbix there? No, it's impossible. The required libraries are not available... |
Comment by Vladislavs Sokurenko [ 2019 Feb 09 ] |
Which version of glibc is on that router? Is it possible to build with the same version and see if that helps? |
Comment by Ivan [ 2019 Feb 10 ] |
ii libglib2.0-0:amd64 2.42.1-1+b1 amd64 GLib library of C routines |
Comment by Vladislavs Sokurenko [ 2019 Feb 11 ] |
Does it mean that glibc on the router is newer than on the machine you compile on? Router: |
Comment by Vladislavs Sokurenko [ 2019 Feb 11 ] |
Currently there is no indication of a bug in Zabbix; please feel free to reopen if you disagree. |
Comment by Ivan [ 2019 Feb 11 ] |
No. Router: dpkg -l | grep GLib ldd --version Building machine: #dpkg -l | grep GLib ii libglib2.0-0:amd64 2.42.1-1+b1 amd64 GLib library of C routines ldd --version |
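Note that the `libglib2.0-0` package listed above is GLib ("GLib library of C routines"), which is a different library from glibc, the GNU C library that the question was about. As a hedged sketch, the glibc version of a machine could be read like this; `glibc_version` is an illustrative helper name, and the function returns None where the information is unavailable (for example on a non-glibc system).

```python
import os
import platform


def glibc_version():
    """Return a string of the form 'glibc X.Y', or None if it cannot be determined."""
    try:
        # glibc exposes its version through confstr() under this name.
        value = os.confstr("CS_GNU_LIBC_VERSION")
    except (ValueError, OSError, AttributeError):
        value = None
    if value:
        return value
    # Fallback: inspect the C library the Python interpreter is linked against.
    lib, version = platform.libc_ver()
    return f"{lib} {version}".strip() or None


print(glibc_version())
```

Running this on both the router and the build machine gives the two values that the comparison in this thread needs, which `dpkg -l | grep GLib` does not provide.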
Comment by Ivan [ 2019 Feb 13 ] |
"Currently there is no indication of a bug in zabbix, please feel free to reopen if you disagree." I didn't see it, sorry. But where is a bug? How do you think? glibc? or where? If you removed resolving of IP address into DNS and it helps, BUT it didn't helps with discovery rules... AND your test app a.out didn't crashes... |
Comment by Vladislavs Sokurenko [ 2019 Feb 14 ] |
Could you please try one more patch, ZBX-15544-4.diff, to see what is passed to the gethostbyaddr() function? Unfortunately, it currently looks like the problem is inside gethostbyaddr(), but without a backtrace it's hard to tell. Is it possible to get one? |
Comment by Ivan [ 2019 Feb 14 ] |
Maybe you could log the parameter that is passed as input and what we get as output... |
Comment by Vladislavs Sokurenko [ 2019 Feb 15 ] |
Attached ZBX-15544-6.diff |
Comment by Ivan [ 2019 Feb 15 ] |
5147:20190215:195333.002 __zbx_zbx_setproctitle() title:'discoverer #3 [processed 0 rules in 0.000000 sec, performing discovery]' |
Comment by Vladislavs Sokurenko [ 2019 Feb 15 ] |
I am sorry but it does not look like patch has been applied. |
Comment by Ivan [ 2019 Feb 15 ] |
6154:20190215:195857.916 trapper got '{"request":"active checks","host":"SL-VPN-Router"}' |
Comment by Ivan [ 2019 Feb 15 ] |
The first was the discovery rule crash and the second was the agent crash. |
Comment by Ivan [ 2019 Feb 15 ] |
zabbix_proxy -V Copyright (C) 2019 Zabbix SIA
6430:20190215:200902.640 In proxy_data_sender() |
Comment by Ivan [ 2019 Feb 15 ] |
I took fresh clean sources 4.0.4, applied patch |
Comment by Ivan [ 2019 Feb 19 ] |
Vladislav, no ideas? |
Comment by Vladislavs Sokurenko [ 2019 Feb 19 ] |
I am sorry but it does not look like patch has been applied |
Comment by Ivan [ 2019 Feb 22 ] |
Is it possible that it's connected? https://phabricator.vyos.net/T1214 |
Comment by Vladislavs Sokurenko [ 2019 Feb 22 ] |
I am sorry, but I don't know much about VyOS. For some reason the backtrace is also missing. Which flags did you use during compilation? |
Comment by Ivan [ 2019 Feb 25 ] |
./configure --enable-ipv6 --enable-agent --enable-proxy --with-sqlite3 --enable-static --with-net-snmp --prefix=/home/vyos/zabbix_sources/vyos-noc-zabbix/sbin/ --sysconfdir=/home/vyos/zabbix_sources/vyos-noc-zabbix/etc/zabbix/ |
Comment by Vladislavs Sokurenko [ 2019 Feb 25 ] |
Can you please try without --enable-static? |
Comment by Ivan [ 2019 Feb 25 ] |
I can (i'll try tomorrow), but why?
distribute compiled binaries among different proxy servers... |
Comment by Vladislavs Sokurenko [ 2019 Feb 25 ] |
Please also show the output of ldd zabbix_proxy. VyOS might have some differences in its glibc implementation, so it's better not to link statically and to use the system's shared library instead. |
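The two checks requested here (was `--enable-static` used, and does `ldd` report shared libraries) can be sketched as below. The helper names are hypothetical, and the "dynamic" sample line is an illustrative example of the usual ldd layout rather than output from this system. The configure command is the one quoted earlier in this thread.

```python
def is_static(ldd_output):
    """True if ldd output indicates a binary without shared-library dependencies."""
    text = ldd_output.lower()
    return "not a dynamic executable" in text or "statically linked" in text


def configure_requests_static(configure_cmd):
    """True if the configure command line contains the --enable-static flag."""
    return "--enable-static" in configure_cmd.split()


CONFIGURE = (
    "./configure --enable-ipv6 --enable-agent --enable-proxy --with-sqlite3 "
    "--enable-static --with-net-snmp "
    "--prefix=/home/vyos/zabbix_sources/vyos-noc-zabbix/sbin/ "
    "--sysconfdir=/home/vyos/zabbix_sources/vyos-noc-zabbix/etc/zabbix/"
)

# Hypothetical ldd output lines, for illustration only.
STATIC_SAMPLE = "\tnot a dynamic executable"
DYNAMIC_SAMPLE = "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000000000000000)"

print(configure_requests_static(CONFIGURE))
print(is_static(STATIC_SAMPLE), is_static(DYNAMIC_SAMPLE))
```

A statically linked binary carries the C library of the build machine with it, which is why the same executable can behave differently on hosts whose kernel or system libraries differ.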
Comment by Vladislavs Sokurenko [ 2019 Feb 28 ] |
Closing as Won't fix, as there is no indication of a bug in Zabbix. |