[ZBX-15544] Zabbix proxy crashing after received configuration data from server Created: 2019 Jan 29  Updated: 2019 Feb 28  Resolved: 2019 Feb 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 4.0.2, 4.0.3
Fix Version/s: None

Type: Problem report Priority: Critical
Reporter: Ivan Assignee: Vladislavs Sokurenko
Resolution: Won't fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian

Linux hostname 4.19.12-amd64-vyos #1 SMP Tue Jan 22 01:09:06 CET 2019 x86_64 GNU/Linux


Attachments: File ZBX-15544-2.diff     File ZBX-15544-3.diff     File ZBX-15544-4.diff     File ZBX-15544-5.diff     File ZBX-15544-6.diff     File ZBX-15544.diff     File gethostbyaddr.c     File zabbix_agentd.conf     File zabbix_proxy.conf     File zabbix_proxy.log     File zabbix_proxy.log     File zabbix_proxy.log     File zabbix_proxy.log_discovery     File zabbix_proxy.log_ipv6_agent     File zabbix_proxy.log_ipv6_discov     File zabbix_proxy.log_patch3     File zabbix_proxy.log_wa     File zabbix_proxy.log_wadisco    
Team: Team A
Sprint: Sprint 49 (Feb 2019)
Story Points: 0

 Description   

After installing Zabbix Proxy for a new branch, I got the error below:

Server version:

zabbix_server -V
zabbix_server (Zabbix) 4.0.2
Revision 87228 26 November 2018, compilation time: Nov 26 2018 09:32:37

Copyright (C) 2018 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it according to
the license. There is NO WARRANTY, to the extent permitted by law.

This product includes software developed by the OpenSSL Project
for use in the OpenSSL Toolkit (http://www.openssl.org/).

Compiled with OpenSSL 1.1.0f  25 May 2017
Running with OpenSSL 1.1.0j  20 Nov 2018

With proxy 4.0.2 and server 4.0.2 the result was the same.

 

  3478:20190130:011019.908 proxy #25 started [trapper #5]
  3454:20190130:011020.193 received configuration data from server at "10.255.0.1", datalen 3123
  3478:20190130:011036.282 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  3478:20190130:011036.283 ====== Fatal information: ======
  3478:20190130:011036.283 Program counter: 0xffffffffff600000
  3478:20190130:011036.283 === Registers: ===
  3478:20190130:011036.283 r8      =                3 =                    3 =                    3
  3478:20190130:011036.283 r9      =                0 =                    0 =                    0
  3478:20190130:011036.284 r10     = fffffffffffffb77 = 18446744073709550455 =                -1161
  3478:20190130:011036.284 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  3478:20190130:011036.284 r12     =                c =                   12 =                   12
  3478:20190130:011036.284 r13     =                5 =                    5 =                    5
  3478:20190130:011036.284 r14     =                0 =                    0 =                    0
  3478:20190130:011036.284 r15     =                5 =                    5 =                    5
  3478:20190130:011036.285 rdi     =     7ffe1054bcf0 =      140729172409584 =      140729172409584
  3478:20190130:011036.285 rsi     =                0 =                    0 =                    0
  3478:20190130:011036.285 rbp     =                3 =                    3 =                    3
  3478:20190130:011036.285 rbx     =     7ffe1054be60 =      140729172409952 =      140729172409952
  3478:20190130:011036.285 rdx     =               10 =                   16 =                   16
  3478:20190130:011036.286 rax     =     7ffe1054be50 =      140729172409936 =      140729172409936
  3478:20190130:011036.286 rcx     =     7f14cca23350 =      139727309255504 =      139727309255504
  3478:20190130:011036.287 rsp     =     7ffe1054bce8 =      140729172409576 =      140729172409576
  3478:20190130:011036.287 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  3478:20190130:011036.287 efl     =            10206 =                66054 =                66054
  3478:20190130:011036.288 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  3478:20190130:011036.288 err     =               15 =                   21 =                   21
  3478:20190130:011036.288 trapno  =                e =                   14 =                   14
  3478:20190130:011036.288 oldmask =                0 =                    0 =                    0
  3478:20190130:011036.288 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  3478:20190130:011036.288 === Backtrace: ===
  3453:20190130:011036.313 One child process died (PID:3478,exitcode/signal:11). Exiting ...
zabbix_proxy [3453]: Error waiting for process with PID 3478: [10] No child processes
  3453:20190130:011036.330 syncing history data...
  3453:20190130:011036.330 syncing history data done
  3453:20190130:011036.331 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993).

 

 



 Comments   
Comment by Vladislavs Sokurenko [ 2019 Jan 29 ]

Could you please attach a log file that shows the whole backtrace part, together with the objdump output?

Comment by Ivan [ 2019 Jan 29 ]

As soon as I stopped the agent on the same host, the proxy stopped crashing.

At debug level 4 I found:

 


  4103:20190130:013236.486 trapper got '{"request":"active checks","host":"agent-hostname"}'
   4103:20190130:013236.486 In send_list_of_active_checks_json()
   4103:20190130:013236.487 In is_ip4() ip:'10.255.38.5'
   4103:20190130:013236.487 End of is_ip4():SUCCEED
   4103:20190130:013236.487 In get_hostid_by_host() host:'agent-hostname' metadata:''
   4103:20190130:013236.487 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
   4103:20190130:013236.490 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
   4103:20190130:013236.491 ====== Fatal information: ======
   4103:20190130:013236.491 Program counter: 0xffffffffff600000
   4103:20190130:013236.491 === Registers: ===
   4103:20190130:013236.491 r8      =                3 =                    3 =                    3
   4103:20190130:013236.491 r9      =                0 =                    0 =                    0
   4103:20190130:013236.492 r10     = fffffffffffffb77 = 18446744073709550455 =                -1161
   4103:20190130:013236.492 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
   4103:20190130:013236.492 r12     =                c =                   12 =                   12
   4103:20190130:013236.492 r13     =                5 =                    5 =                    5
   4103:20190130:013236.492 r14     =                0 =                    0 =                    0
   4103:20190130:013236.492 r15     =                5 =                    5 =                    5
   4103:20190130:013236.493 rdi     =     7ffd43099510 =      140725728154896 =      140725728154896
   4103:20190130:013236.493 rsi     =                0 =                    0 =                    0
   4103:20190130:013236.493 rbp     =                3 =                    3 =                    3
   4103:20190130:013236.493 rbx     =     7ffd43099680 =      140725728155264 =      140725728155264
   4103:20190130:013236.493 rdx     =               10 =                   16 =                   16
   4103:20190130:013236.493 rax     =     7ffd43099670 =      140725728155248 =      140725728155248
   4103:20190130:013236.494 rcx     =     7f82c7ac3350 =      140199672427344 =      140199672427344
   4103:20190130:013236.494 rsp     =     7ffd43099508 =      140725728154888 =      140725728154888
   4103:20190130:013236.494 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
   4103:20190130:013236.495 efl     =            10202 =                66050 =                66050
   4103:20190130:013236.495 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
   4103:20190130:013236.496 err     =               15 =                   21 =                   21
   4103:20190130:013236.496 trapno  =                e =                   14 =                   14
   4103:20190130:013236.496 oldmask =                0 =                    0 =                    0
   4103:20190130:013236.496 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
   4103:20190130:013236.496 === Backtrace: ===
   4078:20190130:013236.513 One child process died (PID:4103,exitcode/signal:11). Exiting ...
   4078:20190130:013236.514 zbx_on_exit() called
   4079:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4080:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4081:20190130:013236.514 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4082:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4083:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4084:20190130:013236.515 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
   4085:20190130:013236.516 Got signal [signal:15(SIGTERM),sender_pid:4078,sender_uid:117,reason:0]. Exiting ...
 
Comment by Ivan [ 2019 Jan 29 ]

Vladislav, this is actually not a regular distribution, and it is hard to install objdump there.

Do you want to see the full proxy log? At which debug level?

Comment by Vladislavs Sokurenko [ 2019 Jan 29 ]

There is only this entry:
4103:20190130:013236.496 === Backtrace: ===

But the actual backtrace is missing after it.

Comment by Ivan [ 2019 Jan 29 ]

That's all I got:

cat zabbix_proxy.log 
  4334:20190130:013716.572 Starting Zabbix Proxy (active) [some-hostname]. Zabbix 4.0.3 (revision 87993).
  4334:20190130:013716.572 **** Enabled features ****
  4334:20190130:013716.572 SNMP monitoring:       YES
  4334:20190130:013716.573 IPMI monitoring:        NO
  4334:20190130:013716.573 Web monitoring:         NO
  4334:20190130:013716.573 VMware monitoring:      NO
  4334:20190130:013716.573 ODBC:                   NO
  4334:20190130:013716.573 SSH2 support:           NO
  4334:20190130:013716.573 IPv6 support:           NO
  4334:20190130:013716.573 TLS support:            NO
  4334:20190130:013716.573 **************************
  4334:20190130:013716.573 using configuration file: /etc/zabbix/zabbix_proxy.conf
  4334:20190130:013716.588 current database version (mandatory/optional): 04000000/04000003
  4334:20190130:013716.588 required mandatory version: 04000000
  4334:20190130:013716.604 proxy #0 started [main process]
  4342:20190130:013716.607 proxy #7 started [discoverer #2]
  4341:20190130:013716.610 proxy #6 started [discoverer #1]
  4340:20190130:013716.616 proxy #5 started [http poller #1]
  4343:20190130:013716.618 proxy #8 started [discoverer #3]
  4339:20190130:013716.623 proxy #4 started [housekeeper #1]
  4344:20190130:013716.623 proxy #9 started [history syncer #1]
  4338:20190130:013716.624 proxy #3 started [data sender #1]
  4345:20190130:013716.626 proxy #10 started [history syncer #2]
  4346:20190130:013716.626 proxy #11 started [history syncer #3]
  4347:20190130:013716.628 proxy #12 started [history syncer #4]
  4337:20190130:013716.629 proxy #2 started [heartbeat sender #1]
  4350:20190130:013716.631 proxy #15 started [poller #1]
  4351:20190130:013716.634 proxy #16 started [poller #2]
  4352:20190130:013716.639 proxy #17 started [poller #3]
  4353:20190130:013716.643 proxy #18 started [poller #4]
  4348:20190130:013716.647 proxy #13 started [self-monitoring #1]
  4349:20190130:013716.647 proxy #14 started [task manager #1]
  4354:20190130:013716.649 proxy #19 started [unreachable poller #1]
  4355:20190130:013716.651 proxy #20 started [unreachable poller #2]
  4356:20190130:013716.655 proxy #21 started [trapper #1]
  4357:20190130:013716.656 proxy #22 started [trapper #2]
  4358:20190130:013716.658 proxy #23 started [trapper #3]
  4359:20190130:013716.659 proxy #24 started [trapper #4]
  4336:20190130:013716.660 proxy #1 started [configuration syncer #1]
  4360:20190130:013716.661 proxy #25 started [trapper #5]
  4361:20190130:013716.663 proxy #26 started [icmp pinger #1]
  4362:20190130:013716.663 proxy #27 started [icmp pinger #2]
  4336:20190130:013716.934 received configuration data from server at "10.255.0.1", datalen 3123
  4360:20190130:015150.293 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  4360:20190130:015150.293 ====== Fatal information: ======
  4360:20190130:015150.293 Program counter: 0xffffffffff600000
  4360:20190130:015150.293 === Registers: ===
  4360:20190130:015150.293 r8      =                3 =                    3 =                    3
  4360:20190130:015150.293 r9      =                0 =                    0 =                    0
  4360:20190130:015150.293 r10     = fffffffffffffb77 = 18446744073709550455 =                -1161
  4360:20190130:015150.293 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4360:20190130:015150.294 r12     =                c =                   12 =                   12
  4360:20190130:015150.294 r13     =                5 =                    5 =                    5
  4360:20190130:015150.294 r14     =                0 =                    0 =                    0
  4360:20190130:015150.294 r15     =                5 =                    5 =                    5
  4360:20190130:015150.294 rdi     =     7ffd98a28ff0 =      140727164243952 =      140727164243952
  4360:20190130:015150.294 rsi     =                0 =                    0 =                    0
  4360:20190130:015150.294 rbp     =                3 =                    3 =                    3
  4360:20190130:015150.294 rbx     =     7ffd98a29160 =      140727164244320 =      140727164244320
  4360:20190130:015150.294 rdx     =               10 =                   16 =                   16
  4360:20190130:015150.294 rax     =     7ffd98a29150 =      140727164244304 =      140727164244304
  4360:20190130:015150.294 rcx     =     7f4af4d36350 =      139959911801680 =      139959911801680
  4360:20190130:015150.294 rsp     =     7ffd98a28fe8 =      140727164243944 =      140727164243944
  4360:20190130:015150.294 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4360:20190130:015150.294 efl     =            10206 =                66054 =                66054
  4360:20190130:015150.295 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  4360:20190130:015150.295 err     =               15 =                   21 =                   21
  4360:20190130:015150.295 trapno  =                e =                   14 =                   14
  4360:20190130:015150.295 oldmask =                0 =                    0 =                    0
  4360:20190130:015150.295 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4360:20190130:015150.295 === Backtrace: ===
  4334:20190130:015150.303 One child process died (PID:4360,exitcode/signal:11). Exiting ...
zabbix_proxy [4334]: Error waiting for process with PID 4360: [10] No child processes
  4334:20190130:015150.319 syncing history data...
  4334:20190130:015150.320 syncing history data done
  4334:20190130:015150.320 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993).
Comment by Ivan [ 2019 Jan 29 ]

  4517:20190130:015150.290 Starting Zabbix Agent [agent-hostname]. Zabbix 4.0.3 (revision 87993).
  4517:20190130:015150.290 **** Enabled features ****
  4517:20190130:015150.290 IPv6 support:           NO
  4517:20190130:015150.290 TLS support:            NO
  4517:20190130:015150.290 **************************
  4517:20190130:015150.290 using configuration file: /etc/zabbix/zabbix_agentd.conf
  4517:20190130:015150.291 agent #0 started [main process]
  4522:20190130:015150.291 agent #4 started active checks #1
  4521:20190130:015150.292 agent #3 started listener #2
  4520:20190130:015150.292 agent #2 started listener #1
  4519:20190130:015150.298 agent #1 started [collector]
  4522:20190130:015150.302 cannot parse list of active checks:
  4522:20190130:015350.400 cannot parse list of active checks:
  4522:20190130:015550.501 cannot parse list of active checks:
  4522:20190130:015750.618 cannot parse list of active checks:

Comment by Ivan [ 2019 Jan 29 ]

And some debug level 4 proxy log after starting the agent:

  4909:20190130:020150.866 In get_pinger_hosts()
  4909:20190130:020150.866 In DCconfig_get_poller_items() poller_type:3
  4909:20190130:020150.866 End of DCconfig_get_poller_items():0
  4909:20190130:020150.866 End of get_pinger_hosts():0
  4909:20190130:020150.866 In process_pinger_hosts()
  4909:20190130:020150.866 End of process_pinger_hosts()
  4909:20190130:020150.866 In DCconfig_get_poller_nextcheck() poller_type:3
  4909:20190130:020150.866 End of DCconfig_get_poller_nextcheck():-1
  4909:20190130:020150.867 __zbx_zbx_setproctitle() title:'icmp pinger #1 [got 0 values in 0.000402 sec, idle 5 sec]'
  4907:20190130:020150.909 __zbx_zbx_setproctitle() title:'trapper #4 [processing data]'
  4907:20190130:020150.909 trapper got '{"request":"active checks","host":"agent-hostname"}'
  4907:20190130:020150.909 In send_list_of_active_checks_json()
  4907:20190130:020150.909 In is_ip4() ip:'10.255.38.5'
  4907:20190130:020150.909 End of is_ip4():SUCCEED
  4907:20190130:020150.909 In get_hostid_by_host() host:'agent-hostname' metadata:''
  4907:20190130:020150.909 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
  4907:20190130:020150.911 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  4907:20190130:020150.911 ====== Fatal information: ======
  4907:20190130:020150.911 Program counter: 0xffffffffff600000
  4907:20190130:020150.911 === Registers: ===
  4907:20190130:020150.911 r8      =                3 =                    3 =                    3
  4907:20190130:020150.911 r9      =                0 =                    0 =                    0
  4907:20190130:020150.911 r10     = fffffffffffffb77 = 18446744073709550455 =                -1161
  4907:20190130:020150.911 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4907:20190130:020150.911 r12     =                c =                   12 =                   12
  4907:20190130:020150.911 r13     =                5 =                    5 =                    5
  4907:20190130:020150.911 r14     =                0 =                    0 =                    0
  4907:20190130:020150.911 r15     =                5 =                    5 =                    5
  4907:20190130:020150.911 rdi     =     7fff2b944aa0 =      140733924526752 =      140733924526752
  4907:20190130:020150.911 rsi     =                0 =                    0 =                    0
  4907:20190130:020150.911 rbp     =                3 =                    3 =                    3
  4907:20190130:020150.911 rbx     =     7fff2b944c10 =      140733924527120 =      140733924527120
  4907:20190130:020150.911 rdx     =               10 =                   16 =                   16
  4907:20190130:020150.911 rax     =     7fff2b944c00 =      140733924527104 =      140733924527104
  4907:20190130:020150.911 rcx     =     7fbf79a6c350 =      140460356453200 =      140460356453200
  4907:20190130:020150.912 rsp     =     7fff2b944a98 =      140733924526744 =      140733924526744
  4907:20190130:020150.912 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4907:20190130:020150.912 efl     =            10202 =                66050 =                66050
  4907:20190130:020150.912 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  4907:20190130:020150.912 err     =               15 =                   21 =                   21
  4907:20190130:020150.912 trapno  =                e =                   14 =                   14
  4907:20190130:020150.912 oldmask =                0 =                    0 =                    0
  4907:20190130:020150.912 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  4907:20190130:020150.912 === Backtrace: ===
  4883:20190130:020150.921 One child process died (PID:4907,exitcode/signal:11). Exiting ...
  4883:20190130:020150.921 zbx_on_exit() called
  4884:20190130:020150.921 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4885:20190130:020150.921 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4886:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4887:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4888:20190130:020150.922 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4889:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4890:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4891:20190130:020150.923 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4892:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4893:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4894:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4895:20190130:020150.924 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4896:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4897:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4898:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4899:20190130:020150.925 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4900:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4901:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4902:20190130:020150.926 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4903:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4904:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4905:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4906:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4908:20190130:020150.927 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4909:20190130:020150.928 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
  4910:20190130:020150.928 Got signal [signal:15(SIGTERM),sender_pid:4883,sender_uid:117,reason:0]. Exiting ...
zabbix_proxy [4883]: Error waiting for process with PID 4907: [10] No child processes
  4883:20190130:020150.932 In DBconnect() flag:1
  4883:20190130:020150.932 query without transaction detected
  4883:20190130:020150.932 query [txnlev:0] [pragma synchronous=0]
  4883:20190130:020150.939 query without transaction detected
  4883:20190130:020150.939 query [txnlev:0] [pragma temp_store=2]
  4883:20190130:020150.939 query without transaction detected
  4883:20190130:020150.940 query [txnlev:0] [pragma temp_store_directory='/config/user-data/zabbix/']
  4883:20190130:020150.940 End of DBconnect():0
  4883:20190130:020150.940 In free_database_cache()
  4883:20190130:020150.940 In DCsync_all()
  4883:20190130:020150.940 In sync_history_cache_full() history_num:0
  4883:20190130:020150.940 syncing history data...
  4883:20190130:020150.940 syncing history data done
  4883:20190130:020150.940 End of sync_history_cache_full()
  4883:20190130:020150.940 End of DCsync_all()
  4883:20190130:020150.940 End of free_database_cache()
  4883:20190130:020150.940 In free_configuration_cache()
  4883:20190130:020150.940 End of free_configuration_cache()
  4883:20190130:020150.941 In free_selfmon_collector() collector:0x7fbf7a150000
  4883:20190130:020150.941 End of free_selfmon_collector()
  4883:20190130:020150.941 In zbx_unload_modules()
  4883:20190130:020150.941 End of zbx_unload_modules()
  4883:20190130:020150.941 Zabbix Proxy stopped. Zabbix 4.0.3 (revision 87993).
  4971:20190130:020201.154 Starting Zabbix Proxy (active) [some-hostname]. Zabbix 4.0.3 (revision 87993).
  4971:20190130:020201.155 **** Enabled features ****
  4971:20190130:020201.155 SNMP monitoring:       YES
  4971:20190130:020201.155 IPMI monitoring:        NO
  4971:20190130:020201.155 Web monitoring:         NO
  4971:20190130:020201.155 VMware monitoring:      NO
  4971:20190130:020201.155 ODBC:                   NO
  4971:20190130:020201.155 SSH2 support:           NO
  4971:20190130:020201.155 IPv6 support:           NO
  4971:20190130:020201.155 TLS support:            NO
  4971:20190130:020201.155 **************************

Comment by Vladislavs Sokurenko [ 2019 Jan 29 ]

Is the agent using OpenSSL? Could you please increase the log level and attach the log for that particular moment? Also, please run: show create table hosts;
If that does not help, I can provide a patch with additional debug output.

Comment by Ivan [ 2019 Jan 29 ]

"Is agent using OpenSSL" - No

"increase log level" - The log level of the proxy? 5? Attached:
zabbix_proxy.log

and with the agent started: zabbix_proxy.log_wa

Comment by Vladislavs Sokurenko [ 2019 Jan 29 ]

OK, I see it is crashing when processing a discovery rule:

  3052:20190130:022657.756 process_rule() range:'192.168.100.0/24'
  3052:20190130:022657.756 process_rule() ip:'192.168.100.1'
  3052:20190130:022657.756 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...

Is it fine if I provide a patch with additional logging?

Comment by Ivan [ 2019 Jan 29 ]

Yes, that's OK. Please also tell me how to apply the patch.

Comment by Ivan [ 2019 Jan 29 ]

I think you are wrong about discovery... I disabled this rule on the server and got: zabbix_proxy.log_wadisco

Comment by Vladislavs Sokurenko [ 2019 Jan 29 ]

Yes, now it's different and happens after getting the list of active checks.
Could you please provide the result of the following query?

  3920:20190130:025038.684 __zbx_zbx_setproctitle() title:'trapper #5 [processing data]'
  3920:20190130:025038.684 trapper got '{"request":"active checks","host":"agent-hostname"}'
  3920:20190130:025038.684 In send_list_of_active_checks_json()
  3920:20190130:025038.684 In is_ip4() ip:'x.x.x.x'
  3920:20190130:025038.684 End of is_ip4():SUCCEED
  3920:20190130:025038.684 In get_hostid_by_host() host:'agent-hostname' metadata:''
  3920:20190130:025038.684 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
  3920:20190130:025038.686 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
Comment by Ivan [ 2019 Jan 30 ]

I got 1 row as the result:

"10566"    "0"    "1"   null

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

Please run: patch -p0 -iZBX-15544.diff

Comment by Ivan [ 2019 Jan 30 ]

Where should I put it? Which folder?

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

It should be executed in the folder containing the source code.

Comment by Ivan [ 2019 Jan 30 ]

zabbix_proxy.log

28991:20190130:191222.987 In send_list_of_active_checks_json()
 28991:20190130:191222.987 In is_ip4() ip:'10.255.30.5'
 28991:20190130:191222.987 End of is_ip4():SUCCEED
 28991:20190130:191222.987 In get_hostid_by_host() host:'agent-hostname' metadata:''
 28991:20190130:191222.987 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
 28991:20190130:191222.987 In db_register_host()
 28991:20190130:191222.987 before zbx_gethost_by_ip
 28991:20190130:191222.987 1
 28991:20190130:191222.987 3

 

 

SQL Result:
"10564"    "0"    "1"    NULL

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

It seems to be crashing on a call to the gethostbyaddr() function:

 28991:20190130:191222.987 In db_register_host()
 28991:20190130:191222.987 before zbx_gethost_by_ip
 28991:20190130:191222.987 1
 28991:20190130:191222.987 3
 28991:20190130:191222.987 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...

Can you please try one more patch, ZBX-15544-2.diff, and see if the issue persists?

Comment by Ivan [ 2019 Jan 30 ]

zabbix-4.0.3# patch -p0 -iZBX-15544-2.diff
patching file src/libs/zbxcomms/comms.c
Hunk #1 FAILED at 164.
1 out of 1 hunk FAILED -- saving rejects to file src/libs/zbxcomms/comms.c.rej
patching file src/libs/zbxdbhigh/db.c
Reversed (or previously applied) patch detected!  Assume -R? [n]

 

Do I need to patch fresh sources?

Comment by Ivan [ 2019 Jan 30 ]

zabbix_proxy.log

 

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

Please patch fresh sources.

Comment by Ivan [ 2019 Jan 30 ]

And one more log, with the crash on discovery:

zabbix_proxy.log_discovery

The log above is the one with the crash on the agent response.

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

Sorry for asking, but did you recompile after patching again?

Comment by Ivan [ 2019 Jan 30 ]

I did everything again (patch, compile), and now everything works perfectly!
No crashes, thanks a lot!
Could you explain what the problem was, and why I had no such problems earlier?

Comment by Vladislavs Sokurenko [ 2019 Jan 30 ]

I have removed the resolution of the IP address into a DNS name. You can also try compiling with the --enable-ipv6 flag, which uses different functions, in order to avoid this issue.
Can you please provide a little more information about your system? It is very strange that the gethostbyaddr() function crashes. Have you tried updating your system to the latest version?
Can you please show the output of this command?
ldd --version
Does pinging this agent IP address result in some kind of DNS address?

You can also try compiling this program, gethostbyaddr.c, and supplying your IP address:
gcc gethostbyaddr.c
./a.out x.x.x.x

Comment by Ivan [ 2019 Jan 31 ]

"have you tried updating your system to latest version ?" - Last version of OS or Zabbix? OS impossible to update. Zabbix 4.0.3

ldd --version
ldd (Debian GLIBC 2.19-18+deb8u10) 2.19
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

" Does pinging this agent IP address result in some kind of DNS address ?" - No

"./a.out x.x.x.x" - Which IP?   ZabbixServer (10.255.0.1) - VPN - 10.255.30.5 ZabbixProxy 192.168.1.234 - LAN

Comment by Vladislavs Sokurenko [ 2019 Jan 31 ]

Please try all of the above.

Comment by Ivan [ 2019 Jan 31 ]

Do you mean IPv6? Should I try it on fresh or patched sources?

a.out - which IP? See above.

Comment by Vladislavs Sokurenko [ 2019 Jan 31 ]

For IPv6 you should try clean sources.
a.out - try them all and please see if a crash occurs.

Comment by Ivan [ 2019 Jan 31 ]

Patched sources + IPv6 below:

zabbix_proxy.log_ipv6_discov

zabbix_proxy.log_ipv6_agent

I'll do the clean sources soon (in a few minutes).

/tmp/a.out 192.168.1.234
/tmp/a.out 192.168.1.14
/tmp/a.out 10.255.0.1
/tmp/a.out 10.255.30.5
(all results are empty)
/tmp/a.out 127.0.0.1
host name 'localhost'

Comment by Vladislavs Sokurenko [ 2019 Jan 31 ]

OK, so the test program does not crash. This is a long shot, but you could try one more patch, ZBX-15544-3.diff, with IPv6.
Unfortunately, I cannot reproduce the same issue.

Comment by Ivan [ 2019 Jan 31 ]

Clean sources with ipv6 (without patch):

 14434:20190131:161838.210 End of DCconfig_get_poller_nextcheck():1548937119
 14434:20190131:161838.210 End of get_values():0
 14434:20190131:161838.210 __zbx_zbx_setproctitle() title:'poller #1 [got 0 values in 0.000329 sec, idle 1 sec]'
 14440:20190131:161838.724 __zbx_zbx_setproctitle() title:'trapper #1 [processing data]'
 14440:20190131:161838.724 trapper got '{"request":"active checks","host":"agent-hostname"}'
 14440:20190131:161838.724 In send_list_of_active_checks_json()
 14440:20190131:161838.724 In is_ip4() ip:'10.255.30.5'
 14440:20190131:161838.724 End of is_ip4():SUCCEED
 14440:20190131:161838.724 In get_hostid_by_host() host:'agent-hostname' metadata:''
 14440:20190131:161838.724 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='agent-hostname'
 14440:20190131:161838.725 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
 14440:20190131:161838.725 ====== Fatal information: ======
 14440:20190131:161838.725 Program counter: 0xffffffffff600000
 14440:20190131:161838.725 === Registers: ===
 14440:20190131:161838.725 r8      =                3 =                    3 =                    3
 14440:20190131:161838.725 r9      =                0 =                    0 =                    0
 14440:20190131:161838.725 r10     =     7ffc91cbb4b0 =      140722754532528 =      140722754532528
 14440:20190131:161838.725 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
 14440:20190131:161838.725 r12     =                8 =                    8 =                    8
 14440:20190131:161838.725 r13     =                5 =                    5 =                    5
 14440:20190131:161838.725 r14     =                0 =                    0 =                    0
 14440:20190131:161838.725 r15     =                5 =                    5 =                    5
 14440:20190131:161838.725 rdi     =     7ffc91cbb6c0 =      140722754533056 =      140722754533056
 14440:20190131:161838.725 rsi     =                0 =                    0 =                    0
 14440:20190131:161838.725 rbp     =                2 =                    2 =                    2
 14440:20190131:161838.725 rbx     =     7ffc91cbb830 =      140722754533424 =      140722754533424
 14440:20190131:161838.725 rdx     =               10 =                   16 =                   16
 14440:20190131:161838.725 rax     =     7ffc91cbb820 =      140722754533408 =      140722754533408
 14440:20190131:161838.725 rcx     =     7f313172b350 =      139849259725648 =      139849259725648
 14440:20190131:161838.725 rsp     =     7ffc91cbb6b8 =      140722754533048 =      140722754533048
 14440:20190131:161838.725 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
 14440:20190131:161838.725 efl     =            10206 =                66054 =                66054
 14440:20190131:161838.725 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
 14440:20190131:161838.725 err     =               15 =                   21 =                   21
 14440:20190131:161838.725 trapno  =                e =                   14 =                   14
 14440:20190131:161838.725 oldmask =                0 =                    0 =                    0
 14440:20190131:161838.725 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
 14440:20190131:161838.725 === Backtrace: ===
 14420:20190131:161838.731 One child process died (PID:14440,exitcode/signal:11). Exiting ...
 14420:20190131:161838.732 zbx_on_exit() called
 14421:20190131:161838.732 Got signal [signal:15(SIGTERM),sender_pid:14420,sender_uid:117,reason:0]. Exiting ...
 14422:20190131:161838.732 Got signal [signal:15(SIGTERM),sender_pid:14420,sender_uid:117,reason:0]. Exiting ...

Comment by Ivan [ 2019 Jan 31 ]

Patch3 + ipv6

zabbix_proxy.log_patch3

Comment by Vladislavs Sokurenko [ 2019 Jan 31 ]

Does running the same Zabbix proxy on some other operating system help?

Comment by Ivan [ 2019 Feb 01 ]

On another router it works perfectly:

zabbix_proxy -V
zabbix_proxy (Zabbix) 4.0.2
Revision 87228 26 November 2018, compilation time: Dec  2 2018 22:39:16

ldd --version
ldd (Debian GLIBC 2.19-18+deb8u10) 2.19
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

uname -a
Linux other-host 4.14.26-amd64-vyos #9 SMP Mon Jul 30 14:36:44 CEST 2018 x86_64 GNU/Linux

On another one:

uname -a
Linux other-host 4.4.113-amd64-vyos #3 SMP Fri Jan 26 12:30:00 CET 2018 x86_64 GNU/Linux

No problems...

But after some changes to the router OS (I don't know exactly which), I got this issue...

I have 16 proxies on other routers.

Comment by Vladislavs Sokurenko [ 2019 Feb 01 ]

Have you tried running gethostbyaddr.c on the Zabbix proxy that crashes? Are you doing some kind of cross compilation?

Comment by Ivan [ 2019 Feb 04 ]

I ran it on exactly that host. I compiled it on my PC and copied it to the proxy.

Comment by Ivan [ 2019 Feb 08 ]

Any ideas?

Comment by Vladislavs Sokurenko [ 2019 Feb 08 ]

Can you try compiling on the router?

Comment by Ivan [ 2019 Feb 09 ]

Compile Zabbix on the router? No, that's impossible; the required libraries are not there.

Comment by Vladislavs Sokurenko [ 2019 Feb 09 ]

Which version of glibc is on that router? Is it possible to build with the same version and see if that helps?

Comment by Ivan [ 2019 Feb 10 ]

ii  libglib2.0-0:amd64               2.42.1-1+b1                      amd64        GLib library of C routines

Comment by Vladislavs Sokurenko [ 2019 Feb 11 ]

Does it mean that glibc on the router is newer than on the machine you compile on?

Router:
2.42
Build machine:
2.19

Comment by Vladislavs Sokurenko [ 2019 Feb 11 ]

Currently there is no indication of a bug in zabbix, please feel free to reopen if you disagree.

Comment by Ivan [ 2019 Feb 11 ]

No.

Router:

dpkg -l | grep GLib
ii libglib2.0-0:amd64 2.42.1-1+b1 amd64 GLib library of C routines
ii libglib2.0-data 2.42.1-1 all Common files for GLib library

ldd --version
ldd (Debian GLIBC 2.19-18+deb8u10) 2.19
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

Build machine:

#dpkg -l | grep GLib

ii  libglib2.0-0:amd64                    2.42.1-1+b1                                amd64        GLib library of C routines
ii  libglib2.0-bin                        2.42.1-1+b1                                amd64        Programs for the GLib library
ii  libglib2.0-data                       2.42.1-1                                   all          Common files for GLib library

ldd --version
ldd (Debian GLIBC 2.19-18+deb8u10) 2.19
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
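
The two outputs above describe different libraries: the `dpkg -l | grep GLib` rows list GLib (`libglib2.0-0`, version 2.42.1), while `ldd --version` reports glibc (2.19 on both machines). A minimal sketch of how the two could be told apart programmatically is shown below. The helper names are hypothetical, and the `libc6` package name is the Debian convention for glibc, not something taken from this thread.

```python
import ctypes


def parse_dpkg_line(line):
    """Split one 'dpkg -l' row into (package, version); None if it is not an 'ii' row."""
    fields = line.split()
    if len(fields) < 3 or fields[0] != "ii":
        return None
    package = fields[1].split(":")[0]  # drop an architecture suffix such as ':amd64'
    return package, fields[2]


def is_glibc_package(package):
    """Debian ships the C library (glibc) as 'libc6'; 'libglib2.0-*' is GLib."""
    return package == "libc6" or package.startswith("libc6-")


def glibc_runtime_version():
    """Ask the C library loaded into this process for its version; None if it is not glibc."""
    try:
        fn = ctypes.CDLL(None).gnu_get_libc_version
    except (OSError, AttributeError, TypeError):
        return None
    fn.restype = ctypes.c_char_p
    return fn().decode()


if __name__ == "__main__":
    row = "ii  libglib2.0-0:amd64  2.42.1-1+b1  amd64  GLib library of C routines"
    package, version = parse_dpkg_line(row)
    print(package, version, "glibc package:", is_glibc_package(package))
    print("glibc at run time:", glibc_runtime_version())
```

Note that `glibc_runtime_version()` reports the dynamically loaded C library of the running process, so it says nothing about a glibc copy linked statically into another binary.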

Comment by Ivan [ 2019 Feb 13 ]

"Currently there is no indication of a bug in zabbix, please feel free to reopen if you disagree."

Sorry, I didn't see that. But where is the bug, then? What do you think: glibc, or somewhere else?

You removed the resolving of IP addresses into DNS names and it helped, BUT it didn't help with discovery rules...

AND your test app a.out didn't crash...

Comment by Vladislavs Sokurenko [ 2019 Feb 14 ]

Could you please try one more patch, ZBX-15544-4.diff, to see what is passed to the gethostbyaddr() function? Unfortunately, it currently looks like the problem is inside gethostbyaddr(), but without a backtrace it's hard to tell. Is it possible to get one?

Comment by Ivan [ 2019 Feb 14 ]

Maybe you could log the parameter that is passed as input and what we get as output...

Comment by Vladislavs Sokurenko [ 2019 Feb 15 ]

Attached ZBX-15544-6.diff

Comment by Ivan [ 2019 Feb 15 ]

 5147:20190215:195333.002 __zbx_zbx_setproctitle() title:'discoverer #3 [processed 0 rules in 0.000000 sec, performing discovery]'
  5147:20190215:195333.002 query [txnlev:0] [select distinct r.druleid,r.iprange,r.name,c.dcheckid,r.proxy_hostid,r.delay from drules r left join dchecks c on c.druleid=r.druleid and c.uniq=1 where r.status=0 and r.nextcheck<=1550249612 and r.druleid%3=2]
  5147:20190215:195333.002 In substitute_simple_macros() data:'3600'
  5147:20190215:195333.002 In process_rule() rule:'LLD SL' range:'192.168.100.0/24'
  5147:20190215:195333.002 process_rule() range:'192.168.100.0/24'
  5147:20190215:195333.002 process_rule() ip:'192.168.100.1'
  5147:20190215:195333.003 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  5147:20190215:195333.003 ====== Fatal information: ======
  5147:20190215:195333.003 Program counter: 0xffffffffff600000
  5147:20190215:195333.003 === Registers: ===
  5147:20190215:195333.003 r8      =                3 =                    3 =                    3
  5147:20190215:195333.003 r9      =                0 =                    0 =                    0
  5147:20190215:195333.003 r10     =     7fff450b4d40 =      140734351756608 =      140734351756608
  5147:20190215:195333.003 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  5147:20190215:195333.003 r12     =                c =                   12 =                   12
  5147:20190215:195333.003 r13     =                5 =                    5 =                    5
  5147:20190215:195333.003 r14     =                0 =                    0 =                    0
  5147:20190215:195333.003 r15     =                5 =                    5 =                    5
  5147:20190215:195333.003 rdi     =     7fff450b4f50 =      140734351757136 =      140734351757136
  5147:20190215:195333.003 rsi     =                0 =                    0 =                    0
  5147:20190215:195333.003 rbp     =                3 =                    3 =                    3
  5147:20190215:195333.003 rbx     =     7fff450b50c0 =      140734351757504 =      140734351757504
  5147:20190215:195333.003 rdx     =               10 =                   16 =                   16
  5147:20190215:195333.003 rax     =     7fff450b50b0 =      140734351757488 =      140734351757488
  5147:20190215:195333.003 rcx     =     7f71e6815350 =      140127175267152 =      140127175267152
  5147:20190215:195333.003 rsp     =     7fff450b4f48 =      140734351757128 =      140734351757128
  5147:20190215:195333.003 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  5147:20190215:195333.003 efl     =            10206 =                66054 =                66054
  5147:20190215:195333.003 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  5147:20190215:195333.003 err     =               15 =                   21 =                   21
  5147:20190215:195333.003 trapno  =                e =                   14 =                   14
  5147:20190215:195333.003 oldmask =                0 =                    0 =                    0
  5147:20190215:195333.003 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  5147:20190215:195333.004 === Backtrace: ===
  5145:20190215:195333.004 query without transaction detected
  5145:20190215:195333.004 query [txnlev:0] [pragma temp_store_directory='/config/user-data/zabbix/']
  5160:20190215:195333.004 query without transaction detected
  5160:20190215:195333.004 query [txnlev:0] [pragma temp_store_directory='/config/user-data/zabbix/']
  5140:20190215:195333.004 query [txnlev:1] [begin;]
  5140:20190215:195333.004 In process_proxyconfig_table() table:'globalmacro'
  5140:20190215:195333.004 query [txnlev:1] [select globalmacroid,macro,value from globalmacro]
  5140:20190215:195333.004 End of process_proxyconfig_table():SUCCEED
  5140:20190215:195333.004 In process_proxyconfig_table() table:'hosts'
  5140:20190215:195333.004 query [txnlev:1] [select hostid,host,status,available,ipmi_authtype,ipmi_privilege,ipmi_username,ipmi_password,ipmi_available,snmp_available,jmx_available,name,tls_connect,tls_accept,tls_issuer,tls_subject,tls_psk_identity,tls_psk from hosts]
  5140:20190215:195333.005 End of process_proxyconfig_table():SUCCEED
  5140:20190215:195333.005 In process_proxyconfig_table() table:'interface'

Comment by Vladislavs Sokurenko [ 2019 Feb 15 ]

I am sorry, but it does not look like the patch has been applied.

Comment by Ivan [ 2019 Feb 15 ]

  6154:20190215:195857.916 trapper got '{"request":"active checks","host":"SL-VPN-Router"}'
  6154:20190215:195857.916 In send_list_of_active_checks_json()
  6154:20190215:195857.916 In is_ip4() ip:'10.255.82.5'
  6154:20190215:195857.916 End of is_ip4():SUCCEED
  6154:20190215:195857.916 In get_hostid_by_host() host:'SL-VPN-Router' metadata:''
  6154:20190215:195857.916 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='SL-VPN-Router' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
  6154:20190215:195857.917 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  6154:20190215:195857.917 ====== Fatal information: ======
  6154:20190215:195857.917 Program counter: 0xffffffffff600000
  6154:20190215:195857.917 === Registers: ===
  6154:20190215:195857.917 r8      =                3 =                    3 =                    3
  6154:20190215:195857.917 r9      =                0 =                    0 =                    0
  6154:20190215:195857.917 r10     =     7ffd08a38010 =      140724748386320 =      140724748386320
  6154:20190215:195857.917 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6154:20190215:195857.917 r12     =                c =                   12 =                   12
  6154:20190215:195857.917 r13     =                5 =                    5 =                    5
  6154:20190215:195857.917 r14     =                0 =                    0 =                    0
  6154:20190215:195857.917 r15     =                5 =                    5 =                    5
  6154:20190215:195857.917 rdi     =     7ffd08a38220 =      140724748386848 =      140724748386848
  6154:20190215:195857.917 rsi     =                0 =                    0 =                    0
  6154:20190215:195857.917 rbp     =                3 =                    3 =                    3
  6154:20190215:195857.917 rbx     =     7ffd08a38390 =      140724748387216 =      140724748387216
  6154:20190215:195857.917 rdx     =               10 =                   16 =                   16
  6154:20190215:195857.917 rax     =     7ffd08a38380 =      140724748387200 =      140724748387200
  6154:20190215:195857.917 rcx     =     7f538d200350 =      139996826698576 =      139996826698576
  6154:20190215:195857.917 rsp     =     7ffd08a38218 =      140724748386840 =      140724748386840
  6154:20190215:195857.917 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6154:20190215:195857.917 efl     =            10206 =                66054 =                66054
  6154:20190215:195857.917 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  6154:20190215:195857.917 err     =               15 =                   21 =                   21
  6154:20190215:195857.917 trapno  =                e =                   14 =                   14
  6154:20190215:195857.917 oldmask =                0 =                    0 =                    0
  6154:20190215:195857.917 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6154:20190215:195857.917 === Backtrace: ===
  6109:20190215:195857.939 One child process died (PID:6154,exitcode/signal:11). Exiting ...
  6109:20190215:195857.939 zbx_on_exit() called

Comment by Ivan [ 2019 Feb 15 ]

The first was the discovery rule crash and the second the agent crash.
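
The register dumps above all carry the same `trapno`, `err`, and `rip` values. As a rough aid for reading them, the sketch below parses the register lines in the format shown in these logs and labels the values. The bit meanings are those of the x86 page-fault error code, and `0xffffffffff600000` is the fixed address of the legacy vsyscall page on x86-64 Linux. The function names are hypothetical, and the output only labels the values; it does not by itself establish why the process jumped to that address.

```python
import re

# Bits of the x86 page-fault error code (the "err" field when trapno is 14).
PF_BITS = {
    0: "protection violation",
    1: "write access",
    2: "user mode",
    3: "reserved bit set",
    4: "instruction fetch",
}

# Fixed address range of the legacy vsyscall page on x86-64 Linux.
VSYSCALL_START = 0xFFFFFFFFFF600000
VSYSCALL_END = VSYSCALL_START + 0x1000

# Matches lines such as "  6154:20190215:195857.917 err     =   15 =   21 =   21".
REG_LINE = re.compile(r"^\s*\d+:\d{8}:\d{6}\.\d{3}\s+(\w+)\s+=\s+([0-9a-f]+)\s+=")


def parse_registers(log_text):
    """Return {register name: value} taken from the hexadecimal column."""
    regs = {}
    for line in log_text.splitlines():
        m = REG_LINE.match(line)
        if m:
            regs[m.group(1)] = int(m.group(2), 16)
    return regs


def describe_fault(regs):
    """Describe trapno, err and rip in words."""
    notes = []
    if regs.get("trapno") == 14:
        notes.append("trap 14: page fault")
        err = regs.get("err", 0)
        notes += [text for bit, text in PF_BITS.items() if err & (1 << bit)]
    rip = regs.get("rip")
    if rip is not None and VSYSCALL_START <= rip < VSYSCALL_END:
        notes.append("rip is inside the legacy vsyscall page")
    return notes


if __name__ == "__main__":
    sample = "\n".join([
        "  6154:20190215:195857.917 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760",
        "  6154:20190215:195857.917 err     =               15 =                   21 =                   21",
        "  6154:20190215:195857.917 trapno  =                e =                   14 =                   14",
    ])
    for note in describe_fault(parse_registers(sample)):
        print(note)
```

Feeding it a whole zabbix_proxy.log excerpt works the same way, since lines that are not register lines are ignored.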

Comment by Ivan [ 2019 Feb 15 ]

zabbix_proxy -V
zabbix_proxy (Zabbix) 4.0.4
Revision 89349 4 February 2019, compilation time: Feb 16 2019 00:05:54

Copyright (C) 2019 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it according to
the license. There is NO WARRANTY, to the extent permitted by law.

  6430:20190215:200902.640 In proxy_data_sender()
  6430:20190215:200902.640 query [txnlev:0] [select t.taskid,t.type,t.clock,t.ttl,r.status,r.parent_taskid,r.info from task t,task_remote_command_result r where t.taskid=r.taskid and t.status=1 and t.type=3 order by t.taskid]
  6430:20190215:200902.640 In connect_to_server() [10.255.0.1]:10051 [timeout:600]
  6430:20190215:200902.683 In put_data_to_server() datalen:153
  6430:20190215:200902.683 In zbx_recv_response()
  6430:20190215:200902.728 zbx_tcp_recv_ext(): received 30 bytes with compression ratio 0.7
  6430:20190215:200902.728 zbx_recv_response() '{"response":"success"}'
  6430:20190215:200902.728 End of zbx_recv_response():SUCCEED
  6430:20190215:200902.728 End of put_data_to_server():SUCCEED
  6430:20190215:200902.728 End of proxy_data_sender():SUCCEED more:0 flags:0x8000
  6430:20190215:200902.728 __zbx_zbx_setproctitle() title:'data sender [sent 0 values in 0.088535 sec, idle 30 sec]'
  6449:20190215:200903.075 __zbx_zbx_setproctitle() title:'trapper #1 [processing data]'
  6449:20190215:200903.075 trapper got '{"request":"active checks","host":"SL-VPN-Router"}'
  6449:20190215:200903.075 In send_list_of_active_checks_json()
  6449:20190215:200903.075 In is_ip4() ip:'10.255.82.5'
  6449:20190215:200903.075 End of is_ip4():SUCCEED
  6449:20190215:200903.075 In get_hostid_by_host() host:'SL-VPN-Router' metadata:''
  6449:20190215:200903.075 query [txnlev:0] [select h.hostid,h.status,h.tls_accept,a.host_metadata from hosts h left join autoreg_host a on a.proxy_hostid is null and a.host=h.host where h.host='SL-VPN-Router' and h.status in (0,1) and h.flags<>2 and h.proxy_hostid is null]
  6449:20190215:200903.076 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0xffffffffff600000]. Crashing ...
  6449:20190215:200903.076 ====== Fatal information: ======
  6449:20190215:200903.076 Program counter: 0xffffffffff600000
  6449:20190215:200903.076 === Registers: ===
  6449:20190215:200903.076 r8      =                3 =                    3 =                    3
  6449:20190215:200903.076 r9      =                0 =                    0 =                    0
  6449:20190215:200903.076 r10     =     7ffe89ad4b20 =      140731208256288 =      140731208256288
  6449:20190215:200903.076 r11     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6449:20190215:200903.076 r12     =                c =                   12 =                   12
  6449:20190215:200903.076 r13     =                5 =                    5 =                    5
  6449:20190215:200903.076 r14     =                0 =                    0 =                    0
  6449:20190215:200903.076 r15     =                5 =                    5 =                    5
  6449:20190215:200903.076 rdi     =     7ffe89ad4d30 =      140731208256816 =      140731208256816
  6449:20190215:200903.076 rsi     =                0 =                    0 =                    0
  6449:20190215:200903.076 rbp     =                3 =                    3 =                    3
  6449:20190215:200903.076 rbx     =     7ffe89ad4ea0 =      140731208257184 =      140731208257184
  6449:20190215:200903.076 rdx     =               10 =                   16 =                   16
  6449:20190215:200903.076 rax     =     7ffe89ad4e90 =      140731208257168 =      140731208257168
  6449:20190215:200903.076 rcx     =     7f30d200f350 =      139847658435408 =      139847658435408
  6449:20190215:200903.076 rsp     =     7ffe89ad4d28 =      140731208256808 =      140731208256808
  6449:20190215:200903.076 rip     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6449:20190215:200903.076 efl     =            10206 =                66054 =                66054
  6449:20190215:200903.076 csgsfs  =   2b000000000033 =    12103423998558259 =    12103423998558259
  6449:20190215:200903.076 err     =               15 =                   21 =                   21
  6449:20190215:200903.076 trapno  =                e =                   14 =                   14
  6449:20190215:200903.076 oldmask =                0 =                    0 =                    0
  6449:20190215:200903.077 cr2     = ffffffffff600000 = 18446744073699065856 =            -10485760
  6449:20190215:200903.077 === Backtrace: ===
  6441:20190215:200903.089 __zbx_zbx_setproctitle() title:'self-monitoring [processing data]'
  6441:20190215:200903.089 In collect_selfmon_stats()
  6441:20190215:200903.089 End of collect_selfmon_stats()
  6441:20190215:200903.089 __zbx_zbx_setproctitle() title:'self-monitoring [processed data in 0.000110 sec, idle 1 sec]'
  6405:20190215:200903.094 One child process died (PID:6449,exitcode/signal:11). Exiting ...

Comment by Ivan [ 2019 Feb 15 ]

I took fresh, clean 4.0.4 sources, applied patch ZBX-15544-6.diff, and compiled them...

Comment by Ivan [ 2019 Feb 19 ]

Vladislav, any ideas?

Comment by Vladislavs Sokurenko [ 2019 Feb 19 ]

I am sorry, but it does not look like the patch has been applied.

Comment by Ivan [ 2019 Feb 22 ]

Is it possible that this is related? https://phabricator.vyos.net/T1214

Comment by Vladislavs Sokurenko [ 2019 Feb 22 ]

I am sorry, but I don't know much about VyOS. For some reason the backtrace is also missing. Which flags did you use during compilation?

Comment by Ivan [ 2019 Feb 25 ]

./configure --enable-ipv6 --enable-agent --enable-proxy --with-sqlite3 --enable-static --with-net-snmp --prefix=/home/vyos/zabbix_sources/vyos-noc-zabbix/sbin/ --sysconfdir=/home/vyos/zabbix_sources/vyos-noc-zabbix/etc/zabbix/

Comment by Vladislavs Sokurenko [ 2019 Feb 25 ]

Can you please try without --enable-static?

Comment by Ivan [ 2019 Feb 25 ]

I can (I'll try tomorrow), but why?
"You may use the --enable-static flag to statically link libraries. If you plan to distribute compiled binaries among different servers, you must use this flag to make these binaries work without required libraries."

That is my case: I distribute compiled binaries among different proxy servers...

Comment by Vladislavs Sokurenko [ 2019 Feb 25 ]

Please also show the output of ldd zabbix_proxy. VyOS might have some differences in its glibc implementation, so it's better not to link statically and to use their shared library instead.
Also, please show the output messages produced when compiling the proxy.
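
Where running ldd is inconvenient, a rough equivalent is to look for a `PT_INTERP` program header in the binary, since a dynamically linked executable names its loader there. The sketch below is only an illustration under stated assumptions: it handles little-endian ELF64 files only (the case on x86_64), `has_interpreter` is a hypothetical helper name, and a missing `PT_INTERP` is a hint of static linking rather than proof.

```python
import struct
import sys

PT_INTERP = 3  # program header type that names the dynamic loader


def has_interpreter(elf_bytes):
    """True if a little-endian ELF64 image contains a PT_INTERP program header."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("only little-endian ELF64 is handled in this sketch")
    # ELF64 header: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", elf_bytes, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 54)
    for i in range(e_phnum):
        # p_type is the first 4-byte field of each program header entry.
        (p_type,) = struct.unpack_from("<I", elf_bytes, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False


if __name__ == "__main__":
    # Usage: python check_elf.py /path/to/zabbix_proxy
    with open(sys.argv[1], "rb") as fh:
        linked = "dynamically" if has_interpreter(fh.read()) else "probably statically"
    print(sys.argv[1], "is", linked, "linked")
```

For a binary built with --enable-static, ldd itself would normally report "not a dynamic executable", which is the quicker check when ldd is available.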

Comment by Vladislavs Sokurenko [ 2019 Feb 28 ]

Closing as Won't fix, as there is no indication of a bug in Zabbix.

Generated at Fri Mar 29 12:22:25 EET 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.