[ZBX-9655] Zabbix Server CRASH Created: 2015 Jun 23 Updated: 2017 Oct 25 Resolved: 2017 Oct 25 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 2.4.5 |
Fix Version/s: | 2.2.11rc1, 2.4.7rc1, 3.0.0alpha2 |
Type: | Incident report | Priority: | Major |
Reporter: | sysops | Assignee: | Unassigned |
Resolution: | Duplicate | Votes: | 0 |
Labels: | crash, freebsd | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401: [email protected]:/usr/obj/usr/src/sys/GENERIC amd64 |
Attachments: |
Issue Links: |
|
Description |
Suddenly Crashing ...
526:20150622:172758.659 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
526:20150622:172758.659 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
526:20150622:172758.659 ====== Fatal information: ======
526:20150622:172758.659 program counter not available for this architecture
526:20150622:172758.659 === Registers: ===
526:20150622:172758.659 register dump not available for this architecture
526:20150622:172758.660 === Backtrace: ===
526:20150622:172758.660 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
526:20150622:172758.660 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
526:20150622:172758.660 === Memory map: ===
526:20150622:172758.660 memory map not available for this platform
526:20150622:172758.660 ================================
513:20150622:172758.662 One child process died (PID:526,exitcode/signal:1). Exiting ...
513:20150622:172800.726 syncing history data...
513:20150622:172800.726 syncing history data done
513:20150622:172800.726 syncing trends data...
513:20150622:172800.735 syncing trends data done
513:20150622:172800.736 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282).
Zabbix server v2.4.5 (revision 53282) (21 April 2015) |
Comments |
Comment by sysops [ 2015 Jun 24 ] |
List of packages installed on the FreeBSD server (where the Zabbix server runs) |
Comment by sysops [ 2015 Jun 24 ] |
Another crash...
539:20150623:233751.082 executing housekeeper
539:20150623:233751.564 housekeeper [deleted 11745 hist/trends, 0 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 0.479474 sec, idle 1 hour(s)]
556:20150624:000002.120 [Z3005] query failed: [1317] Query execution was interrupted [select escalationid,actionid,triggerid,eventid,r_eventid,nextcheck,esc_step,status,itemid from escalations order by actionid,triggerid,itemid,escalationid]
544:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
544:20150624:000002.121 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
544:20150624:000002.121 database is down: reconnecting in 10 seconds
522:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [update hosts set errors_from=1435104002,disable_until=1435104017,error='Get value from agent failed: cannot connect to [[192.168.77.111]:10050]: [61] Connection refused' where hostid=10112]
522:20150624:000002.122 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
522:20150624:000002.122 ====== Fatal information: ======
522:20150624:000002.122 program counter not available for this architecture
522:20150624:000002.122 === Registers: ===
522:20150624:000002.122 register dump not available for this architecture
522:20150624:000002.122 === Backtrace: ===
522:20150624:000002.122 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
522:20150624:000002.122 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
522:20150624:000002.122 === Memory map: ===
522:20150624:000002.122 memory map not available for this platform
522:20150624:000002.122 ================================
513:20150624:000002.124 One child process died (PID:522,exitcode/signal:1). Exiting ...
513:20150624:000004.193 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
513:20150624:000004.193 Cannot connect to the database. Exiting... |
Comment by sysops [ 2015 Jun 24 ] |
One more crash:
3048:20150624:080703.419 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
3049:20150624:081459.419 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='kirk01' and status in (0,1) and flags<>2 and proxy_hostid is null]
3055:20150624:081643.479 executing housekeeper
3055:20150624:081644.608 housekeeper [deleted 46463 hist/trends, 15 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 1.126811 sec, idle 1 hour(s)]
3051:20150624:081659.609 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='kirk01' and status in (0,1) and flags<>2 and proxy_hostid is null]
3050:20150624:082504.109 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
3043:20150624:082758.105 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
3043:20150624:082758.105 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
3043:20150624:082758.105 ====== Fatal information: ======
3043:20150624:082758.105 program counter not available for this architecture
3043:20150624:082758.105 === Registers: ===
3043:20150624:082758.105 register dump not available for this architecture
3043:20150624:082758.105 === Backtrace: ===
3043:20150624:082758.106 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
3043:20150624:082758.106 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
3043:20150624:082758.106 === Memory map: ===
3043:20150624:082758.106 memory map not available for this platform
3043:20150624:082758.106 ================================
3039:20150624:082758.109 One child process died (PID:3043,exitcode/signal:1). Exiting ...
3039:20150624:082800.137 syncing history data...
3039:20150624:082800.141 syncing history data done
3039:20150624:082800.141 syncing trends data...
3039:20150624:082800.209 syncing trends data done
3039:20150624:082800.209 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282). |
Comment by sysops [ 2015 Jun 24 ] |
I was able to reproduce the crash:
... start zabbix server on HOST (A)
13758:20150624:083400.035 server #13 started [trapper #5]
13763:20150624:083400.036 server #18 started [http poller #1]
13766:20150624:083400.036 server #21 started [history syncer #2]
13767:20150624:083400.038 server #22 started [history syncer #3]
13765:20150624:083400.039 server #20 started [history syncer #1]
13770:20150624:083400.040 server #25 started [proxy poller #1]
13764:20150624:083400.041 server #19 started [discoverer #1]
13771:20150624:083400.041 server #26 started [self-monitoring #1]
13769:20150624:083400.042 server #24 started [escalator #1]
13768:20150624:083400.042 server #23 started [history syncer #4]
.. everything is ok...
.. manually stopped the mysql server on remote host (B) where the zabbix database is stored [[systemctl stop mysql.service]]
13762:20150624:091631.778 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
13762:20150624:091631.779 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13762:20150624:091631.779 database is down: reconnecting in 10 seconds
13754:20150624:091738.040 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13754:20150624:091738.040 database is down: reconnecting in 10 seconds
..after one minute... manually started the mysql service on remote host (B) where the zabbix database is stored
13768:20150624:091741.745 database connection re-established
13754:20150624:091748.073 database connection re-established
..after a while the zabbix server log (on host A) is populated with:
13755:20150624:091750.370 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='hanzo.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
13757:20150624:091806.090 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
13758:20150624:092151.090 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='hanzo.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
..and ten minutes later, without any other log entries, the zabbix server (on host A) crashes:
13752:20150624:092758.041 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
13752:20150624:092758.042 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
13752:20150624:092758.042 ====== Fatal information: ======
13752:20150624:092758.042 program counter not available for this architecture
13752:20150624:092758.042 === Registers: ===
13752:20150624:092758.042 register dump not available for this architecture
13752:20150624:092758.042 === Backtrace: ===
13752:20150624:092758.043 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
13752:20150624:092758.043 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
13752:20150624:092758.043 === Memory map: ===
13752:20150624:092758.043 memory map not available for this platform
13752:20150624:092758.043 ================================
13745:20150624:092758.047 One child process died (PID:13752,exitcode/signal:1). Exiting ...
13745:20150624:092800.089 syncing history data...
13745:20150624:092800.091 syncing history data done
13745:20150624:092800.091 syncing trends data...
13745:20150624:092800.171 syncing trends data done
13745:20150624:092800.171 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282). |
Comment by sysops [ 2015 Jun 24 ] |
One hour after the mysql restart (see attachment) the zabbix server crashed (see log with debug level=4) |
Comment by Aleksandrs Saveljevs [ 2015 Jun 25 ] |
This looks like a duplicate of |
Comment by Andris Zeila [ 2015 Jun 30 ] |
I've been trying to reproduce it without any luck. My setup was two virtual machines running 10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401. I tried running zabbix server from the default package and also one compiled from sources. It had default configuration monitoring one host. On the other system I was running mysql server from the default mysql56-server package. So the setup was as close as I could get to the described one, but no stopping/starting mysql server caused any zabbix server crashes. What I noticed you had mysql56-client-5.6.24 package listed, while on my installations I had mysql56-client-5.6.24_1. Could not find anything about fixed crashes in changelog - so I doubt it was the reason. Still maybe you could upgrade your mysql package and try to reproduce the crash again? |
Comment by sysops [ 2015 Jun 30 ] |
Reproduced. [root@k.......]]# date root@zabbix# cat /var/log/zabbix_server.log | grep stopped |
Comment by sysops [ 2015 Jun 30 ] |
I have upgraded to mysql56-client-5.6.24_1 on the zabbix server via source ports, then stopped the mysql server (on Oracle Linux) for 3 min |
Comment by Andris Zeila [ 2015 Jul 02 ] |
Thanks, but still can't reproduce it. Nothing really suspicious in DB connection code. Would it be possible to build Zabbix server from sources with the attached patch (freebsd_backtrace_x64.diff) and reproduce the crash? It should give proper stack trace. (Note that the patch will work only on 64 bit freebsd systems). |
Comment by sysops [ 2015 Jul 09 ] |
..the problem persists after each mysql outage. I tried your patch but I am unable to compile:
root@zabbix01:/usr/home/user/zabbix-2.4.5 # tar -xvf zabbix-2.4.5.tar.gz
root@zabbix01:/usr/home/user # patch zabbix-2.4.5/src/libs/zbxnix/fatal.c freebsd_backtrace_x64.diff
Hmm... Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: src/libs/zbxnix/fatal.c
|===================================================================
|--- src/libs/zbxnix/fatal.c (revision 54194)
|+++ src/libs/zbxnix/fatal.c (working copy)
--------------------------
Patching file zabbix-2.4.5/src/libs/zbxnix/fatal.c using Plan A...
Hunk #1 succeeded at 66.
Hunk #2 succeeded at 231.
Hunk #3 succeeded at 311.
done
root@zabbix01:/usr/home/user# cd zabbix-2.4.5
root@zabbix01:/usr/home/user/zabbix-2.4.5 # ./configure --enable-server --with-mysql --with-libcurl --with-libxml2
***********************************************************
* Now run 'make install' *
* *
* Thank you for using Zabbix! *
* <http://www.zabbix.com> *
***********************************************************
root@zabbix01:/usr/home/user/zabbix-2.4.5 # make install
Making install in src
Making install in libs
Making install in zbxcrypto
cc -DHAVE_CONFIG_H -I. -I../../../include -g -O2 -I/usr/local/include/mysql -pipe -fstack-protector -fno-strict-aliasing -g -fno-omit-frame-pointer -fno-strict-aliasing -I/usr/local/include/libxml2 -I/usr/include -MT md5.o -MD -MP -MF .deps/md5.Tpo -c -o md5.o md5.c
In file included from md5.c:54:
In file included from ../../../include/common.h:23:
In file included from ../../../include/sysinc.h:372:
/usr/include/sys/timeb.h:42:2: warning: "this file includes <sys/timeb.h> which is deprecated" [-W#warnings]
#warning "this file includes <sys/timeb.h> which is deprecated"
^
In file included from md5.c:54:
In file included from ../../../include/common.h:23:
../../../include/sysinc.h:381:11: fatal error: 'curl/curl.h' file not found
# include <curl/curl.h>
^
1 warning and 1 error generated.
*** Error code 1
Stop.
make[3]: stopped in /usr/home/user/zabbix-2.4.5/src/libs/zbxcrypto
*** Error code 1
Stop.
make[2]: stopped in /usr/home/user/zabbix-2.4.5/src/libs
*** Error code 1
Stop.
make[1]: stopped in /usr/home/user/zabbix-2.4.5/src
*** Error code 1
Stop.
make: stopped in /usr/home/user/zabbix-2.4.5 |
Comment by Aleksandrs Saveljevs [ 2015 Jul 09 ] |
The compilation error you see is not because of the patch (it is not related to cURL functionality in any way). Was the ./configure step successful? What was ./configure's output? |
Comment by sysops [ 2015 Jul 20 ] |
Sorry for the late response ... I was on holiday. OK, I have run configure without curl: ./configure --enable-server --with-mysql then then After that, to run the zabbix server I had to copy my conf to another path: /usr/local/etc/rc.d/zabbix_server start Do you need debug level 3 or 4? |
Comment by Andris Zeila [ 2015 Jul 27 ] |
No problem, so was I. 4 would be better (though 3 should be enough if the crash originates from the place I thought). |
Comment by sysops [ 2015 Jul 27 ] |
So, and I reproduced the problem simply turning off mysqld service and leave it down. |
Comment by Andris Zeila [ 2015 Jul 27 ] |
You should try configure with the following parameters: ./configure --enable-server --enable-agent --with-mysql --with-libcurl --with-libxml2 --with-unixodbc --with-ssh2 There are no specific fping build options in Zabbix, so I'm not sure what FPING=on means in ports, probably access rights configuration. |
Comment by sysops [ 2015 Jul 27 ] |
error: checking for SSH2 support... yes |
Comment by Andris Zeila [ 2015 Jul 27 ] |
Strange, 10.1 should have native iconv support. Could you check why the libiconv check failed by running the following in the zabbix build directory: |
Comment by sysops [ 2015 Jul 27 ] |
/home/bid12/zabbix-2.4.5 # grep -A20 'checking for ICONV' config.log
configure:11064: checking for ICONV support
configure:11106: cc -o conftest -g -O2 -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing -g -fno-omit-frame-pointer -fno-strict-aliasing -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include -I/usr/local/include -rdynamic conftest.c -lkvm -lm -lexecinfo -ldevstat >&5
/tmp/conftest-b7e05b.o: In function `main':
/home/user/zabbix-2.4.5/conftest.c:126: undefined reference to `libiconv_open'
/home/user/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv'
/home/user/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
# freebsd-version
10.1-RELEASE-p15
# pkg info | grep libiconv
libiconv-1.14_8 Character set conversion library |
Comment by Andris Zeila [ 2015 Jul 27 ] |
Then you can try specifying the iconv installation directory:
If you want to specify iconv installation directories:
--with-iconv=[DIR] use iconv from given base install directory (DIR), default is to search through a number of common places for the iconv files.
--with-iconv-include=[DIR] use iconv include headers from given path.
--with-iconv-lib=[DIR] use iconv libraries from given path.
You should be able to find the installation directories with pkg info -l libiconv-1.14_8 (or just grep filesystem for libiconv.so/iconv.h). |
Comment by sysops [ 2015 Jul 27 ] |
Always fail:
./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv=/usr/local/include/iconv.h
or
./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv-include=/usr/local/include/iconv.h
or
./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv-lib=/usr/local/include/iconv.h
/home/user/zabbix-2.4.5 # grep -A20 'checking for ICONV' config.log
configure:11064: checking for ICONV support
configure:11106: cc -o conftest -g -O2 -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing -g -fno-omit-frame-pointer -fno-strict-aliasing -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include -I/usr/local/include -I//usr/local/include/iconv.h/include -rdynamic -L//usr/local/include/iconv.h/lib conftest.c -lkvm -lm -lexecinfo -ldevstat >&5
/tmp/conftest-dcc6c6.o: In function `main':
/home/user/zabbix-2.4.5/conftest.c:126: undefined reference to `libiconv_open'
/home/user/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv'
/home/user/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
# pkg info -l libiconv
libiconv-1.14_8:
/usr/local/bin/iconv
/usr/local/include/iconv.h
/usr/local/include/libcharset.h
/usr/local/include/localcharset.h
/usr/local/lib/charset.alias
/usr/local/lib/libcharset.a
/usr/local/lib/libcharset.so
/usr/local/lib/libcharset.so.1
/usr/local/lib/libcharset.so.1.0.0
/usr/local/lib/libiconv.a
/usr/local/lib/libiconv.so
/usr/local/lib/libiconv.so.2
/usr/local/lib/libiconv.so.2.5.1
/usr/local/lib/libiconv.so.3
/usr/local/man/man1/iconv.1.gz
/usr/local/man/man3/iconv.3.gz
/usr/local/man/man3/iconv_close.3.gz
/usr/local/man/man3/iconv_open.3.gz
/usr/local/man/man3/iconv_open_into.3.gz
/usr/local/man/man3/iconvctl.3.gz
/usr/local/share/doc/libiconv/iconv.1.html
/usr/local/share/doc/libiconv/iconv.3.html
/usr/local/share/doc/libiconv/iconv_close.3.html
/usr/local/share/doc/libiconv/iconv_open.3.html
/usr/local/share/doc/libiconv/iconv_open_into.3.html
/usr/local/share/doc/libiconv/iconvctl.3.html
# find / -name iconv.h
/usr/local/include/iconv.h
/usr/include/iconv.h
/usr/include/sys/iconv.h |
Comment by Andris Zeila [ 2015 Jul 27 ] |
Oh, try then LDFLAGS=-liconv ./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 |
Comment by sysops [ 2015 Jul 27 ] |
Ok
./configure CFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib" --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2
But after this:
Configuration:
Detected OS: freebsd10.1
Install path: /usr/local
Compilation arch: freebsd
Compiler: cc
Compiler flags: -I/usr/local/include -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing -g -fno-omit-frame-pointer -fno-strict-aliasing -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include -I/usr/local/include
Enable server: yes
Server details:
With database: MySQL
WEB Monitoring: no
Native Jabber: no
SNMP: no
IPMI: no
SSH: yes
ODBC: yes
Linker flags: -rdynamic -L/usr/local/lib -L/usr/local/lib/mysql -L/usr/local/lib -L/usr/lib -L/usr/local/lib -L/usr/local/lib
Libraries: -lkvm -lm -lexecinfo -ldevstat -liconv -lmysqlclient -lxml2 -lodbc -lssh2
Enable proxy: no
Enable agent: yes
Agent details:
Linker flags: -rdynamic -L/usr/local/lib
Libraries: -lkvm -lm -lexecinfo -ldevstat -liconv
Enable Java gateway: no
LDAP support: no
IPv6 support: no
***********************************************************
* Now run 'make install' *
* *
* Thank you for using Zabbix! *
* <http://www.zabbix.com> *
***********************************************************
I found an error in the config.log file:
grep -A20 'checking for ICONV' config.log
configure:11064: checking for ICONV support
configure:11106: cc -o conftest -I/usr/local/include -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing -g -fno-omit-frame-pointer -fno-strict-aliasing -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include -I/usr/local/include -rdynamic -L/usr/local/lib conftest.c -lkvm -lm -lexecinfo -ldevstat >&5
/tmp/conftest-85ee20.o: In function `main':
/home/bid12/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv_open'
/home/bid12/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv'
/home/bid12/zabbix-2.4.5/conftest.c:129: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
Should I run 'make install' anyway? |
Comment by Andris Zeila [ 2015 Jul 27 ] |
Maybe it was the old config.log file? The final configure output looks ok and it would not finish if there were any errors. Also -liconv has been added to the server and agent libraries. Try making it. |
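Editor's note: the repeated "undefined reference to `libiconv_open'" errors in the config.log excerpts are what one would expect when the GNU libiconv header from /usr/local/include is picked up (it renames iconv_open() to libiconv_open()) but the test program is linked without -liconv. The following is a minimal sketch of a program that exercises the same three calls as the configure check; it is not the actual conftest.c, and the helper name latin1_to_utf8() is made up for this illustration.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of Zabbix): converts a NUL-terminated
 * ISO-8859-1 string to UTF-8 using iconv_open(), iconv() and iconv_close().
 * Returns the number of bytes written (excluding the NUL) or -1 on failure. */
long latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    iconv_t cd;
    char *src = (char *)in;     /* iconv() takes a non-const pointer on most systems */
    char *dst = out;
    size_t in_left, out_left, rc;

    assert(NULL != in && NULL != out);

    if (0 == out_size)
        return -1;

    in_left = strlen(in);
    out_left = out_size - 1;    /* keep room for the terminating NUL */

    cd = iconv_open("UTF-8", "ISO-8859-1");
    if ((iconv_t)-1 == cd)
        return -1;

    rc = iconv(cd, &src, &in_left, &dst, &out_left);
    iconv_close(cd);

    if ((size_t)-1 == rc)
        return -1;

    *dst = '\0';
    return (long)(dst - out);
}
```

If a file containing this function links only when -liconv is added (for example with -I/usr/local/include -L/usr/local/lib -liconv, paths taken from the thread), that would be consistent with the LDFLAGS=-liconv suggestion above; on systems where iconv lives in libc no extra library should be needed.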
Comment by sysops [ 2015 Jul 27 ] |
ok I do make install, make install ... no crashing .... |
Comment by sysops [ 2015 Jul 27 ] |
restarted mysql 15:16:03 zabbix server stopped 15:28:00 |
Comment by Andris Zeila [ 2015 Jul 28 ] |
Thanks! As I suspected, it crashed in mysql_real_connect(), which almost certainly means that mysql_init() returned NULL. According to the MySQL documentation it can return NULL only if there was insufficient memory to allocate a new object, which sounds unlikely. Would you be up to trying a few more patches while I'm trying to poke around this issue? I have a few ideas which might or might not work, but nothing certain. |
Comment by sysops [ 2015 Jul 28 ] |
OK, |
Comment by Andris Zeila [ 2015 Jul 28 ] |
Thanks! I attached 3 small patches:
You can apply all 4 patches (the backtrace patch included) first to check if it fixes the crash. If everything is okay then you could remove either mysql_library_init.diff or mysql_static_conn.diff to find the fix. The patches should be applied in the following order:
|
Comment by sysops [ 2015 Jul 29 ] |
Crash again... patch < mysql_static_conn.diff but, when I shutdown mysql on another server: ..20 sec after, zabbix server died: |
Comment by sysops [ 2015 Jul 29 ] |
See attachment"patch_command.txt" for Command History used for patching |
Comment by sysops [ 2015 Jul 29 ] |
again 15minutes after mysql down with debug level=4 See attachment "zabbix_server_crash2.log " |
Comment by Andris Zeila [ 2015 Jul 29 ] |
Yes, thanks. It's clear where the crash happens, but why mysql_init() might return NULL, I currently have no idea. Will keep digging around. |
Comment by Andris Zeila [ 2015 Aug 31 ] |
At the end there is nothing much we can do if mysql connection initialization fails with what is supposed to be 'out of memory' error. It would be better to gracefully terminate the process, but the end result would be the same - the rest of processes would exit too. If the default configuration is working fine it must be something with the shared libraries. You could try enabling zabbix features one by one to find out which library causes zabbix to crash. Maybe updating/rebuilding it would fix things. |
Comment by Andris Zeila [ 2015 Aug 31 ] |
Added check for mysql_init() returning NULL in development branch svn://svn.zabbix.com/branches/dev/ZBX-9655 |
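Editor's note: the kind of check described here can be sketched as follows. This is not the change from the development branch; it is a self-contained illustration in which fake_mysql_t, init_ok(), init_fail() and db_connect_checked() are all hypothetical stand-ins, so that it compiles without the MySQL headers. In real code the initializer would be mysql_init() and the step marked below would be mysql_real_connect().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DB_OK   0
#define DB_FAIL (-1)

/* Stand-in for the MYSQL handle type from <mysql.h>. */
typedef struct { int connected; } fake_mysql_t;

/* Initializer signature, so a failing stand-in for mysql_init() can be injected. */
typedef fake_mysql_t *(*init_fn_t)(void);

/* Stand-in for a successful mysql_init(). */
fake_mysql_t *init_ok(void) { return calloc(1, sizeof(fake_mysql_t)); }

/* Stand-in for mysql_init() returning NULL, as seen in the crash. */
fake_mysql_t *init_fail(void) { return NULL; }

/* Initializes a connection object and refuses to continue if that fails,
 * instead of passing a NULL handle on to the connect step. */
int db_connect_checked(init_fn_t init, fake_mysql_t **conn)
{
    assert(NULL != init && NULL != conn);

    *conn = init();
    if (NULL == *conn)
    {
        fprintf(stderr, "cannot allocate or initialize database connection object\n");
        return DB_FAIL;
    }

    (*conn)->connected = 1;     /* stands in for mysql_real_connect() */
    return DB_OK;
}
```

With such a guard the process can log the failure and exit in a controlled way; as noted in the previous comment, the remaining server processes would still exit too.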
Comment by dimir [ 2015 Sep 04 ] |
Please review r55391 wiper reviewed, thanks |
Comment by Andris Zeila [ 2015 Sep 04 ] |
Released in:
|
Comment by sysops [ 2015 Sep 29 ] |
I moved the mysql server onto the same server where the zabbix server runs. No problem for many days.
77512:20150929:102758.179 [Z3005] query failed: [2006] MySQL server has gone away [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
77512:20150929:102758.179 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
77512:20150929:102758.179 ====== Fatal information: ======
77512:20150929:102758.179 program counter not available for this architecture
77512:20150929:102758.179 === Registers: ===
77512:20150929:102758.179 register dump not available for this architecture
77512:20150929:102758.179 === Backtrace: ===
77512:20150929:102758.184 3: 0x456046 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
77512:20150929:102758.184 2: 0x4564bc <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
77512:20150929:102758.184 1: 0x802c94997 <pthread_sigmask+0x497> at /lib/libthr.so.3
77512:20150929:102758.184 0: 0x802c941a8 <pthread_getspecific+0xdd8> at /lib/libthr.so.3
77512:20150929:102758.185 === Memory map: ===
77512:20150929:102758.185 memory map not available for this platform
77512:20150929:102758.185 ================================
77497:20150929:102758.200 One child process died (PID:77512,exitcode/signal:1). Exiting ...
77497:20150929:102800.211 syncing history data...
77497:20150929:102800.215 syncing history data done
77497:20150929:102800.215 syncing trends data...
77497:20150929:102800.361 syncing trends data done
77497:20150929:102800.361 Zabbix Server stopped. Zabbix 2.4.6 (revision 54796).
zabbix agent log (on the same server)
root@monitor001:/usr/ports/net-mgmt/zabbix24-agent # cat /var/log/zabbix_agentd.log
36905:20150929:093133.368 Starting Zabbix Agent [monitor001.myserver.intra]. Zabbix 2.4.6 (revision 54796).
36905:20150929:093133.368 using configuration file: /usr/local/etc/zabbix24/zabbix_agentd.conf
36905:20150929:093133.368 agent #0 started [main process]
36906:20150929:093133.369 agent #1 started [collector]
36907:20150929:093133.369 agent #2 started [listener #1]
36908:20150929:093133.370 agent #3 started [listener #2]
36909:20150929:093133.370 agent #4 started [listener #3]
36910:20150929:093133.370 agent #5 started [active checks #1]
36910:20150929:102838.372 active check data upload to [zabbix.myserver.intra:10051] started to fail ([connect] cannot connect to [[zabbix.yourewardyourlife.intra]:10051]: [61] Connection refused)
36910:20150929:102937.536 active check configuration update from [zabbix.myserver.intra:10051] started to fail (cannot connect to [[zabbix.yourewardyourlife.intra]:10051]: [61] Connection refused)
root@monitor001:/usr/ports/net-mgmt/zabbix24-agent # |
Comment by sysops [ 2015 Sep 29 ] |
Mysql status |
Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ] |
An observation. This:
77512:20150929:102758.179 [Z3005] query failed: [2006] MySQL server has gone away [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
77512:20150929:102758.179 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
looks like processing of an LLD rule in lld_process_discovery_rule(), and this:
522:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [update hosts set errors_from=1435104002,disable_until=1435104017,error='Get value from agent failed: cannot connect to [[192.168.77.111]:10050]: [61] Connection refused' where hostid=10112]
522:20150624:000002.122 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
looks like a query from db_host_update_availability() - the only two occasions when poller accesses our DB directly.
Also, I see that unixODBC is involved (monitoring MySQL databases?) making it very similar to ZBX-9082.
Wild guess... Is it possible that during our ODBC check cycle (connect - select - fetch - close connection) under certain circumstances the ODBC driver requests a too deep clean from the DB client library and it cleans the Zabbix DB handle too, without poller noticing it? |
Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ] |
So... When we perform an ODBC check it ends with a cleanup. In odbc_DBclose() we free the environment handle with the unixODBC function SQLFreeHandle(), which hands the work over to MySQL ODBC Connector's my_SQLFreeEnv() (mysql-connector-odbc-5.3.6-src/driver/handle.c):

    SQLRETURN SQL_API my_SQLFreeEnv(SQLHENV henv)
    {
        ENV *env= (ENV *) henv;
        myodbc_mutex_destroy(&env->lock);
    #ifndef _UNIX_
        GlobalUnlock(GlobalHandle((HGLOBAL) henv));
        GlobalFree(GlobalHandle((HGLOBAL) henv));
    #else
        x_free(henv);
        myodbc_end();
    #endif /* _UNIX_ */
        return(SQL_SUCCESS);
    }

myodbc_end() (mysql-connector-odbc-5.3.6-src/driver/dll.c) calls my_end() (mysql-5.6/mysys/my_init.c) which does my_thread_end(); and my_thread_global_end(); amongst everything else.

When poller loses connection to the backend DB and attempts to reconnect, it calls mysql_init() and, as we see in the attached logs, for some reason it returns NULL. MySQL documentation says this could only be due to insufficient memory, but (mysql-5.6/sql-common/client.c):

    MYSQL * STDCALL mysql_init(MYSQL *mysql)
    {
        if (mysql_server_init(0, NULL, NULL))
            return 0;
        ...
    }

mysql_server_init() may return a non-zero value in several cases (mysql-5.6/libmysql/libmysql.c):

    int STDCALL mysql_server_init(int argc __attribute__((unused)),
                                  char **argv __attribute__((unused)),
                                  char **groups __attribute__((unused)))
    {
        int result= 0;
        if (!mysql_client_init)
        {
            ...
            if (my_init()) /* Will init threads */
                return 1;
            ...
            if (mysql_client_plugin_init())
                return 1;
            ...
        }
        else
            result= (int)my_thread_init(); /* Init if new thread */
        return result;
    }

I will not comment on the first two, but the last one seems to be what the doctor ordered (mysql-5.6/mysys/my_thr_init.c):

    my_bool my_thread_init(void)
    {
        struct st_my_thread_var *tmp;
        my_bool error=0;
        if (!my_thread_global_init_done)
            return 1; /* cannot proceed with unintialized library */
        ...

Follow up. wiper asked a legitimate question whether the mysql_client_init flag is reset in my_end(). Seems like it is not. It is reset only in mysql_server_end() (mysql-5.6/libmysql/libmysql.c) and mysql_server_end() is not called from my_end().

Actually, I'm not that sure about what I just said after I found this funny thing (mysql-5.6/sql/client_settings.h):

    #define mysql_server_init(a,b,c) mysql_client_plugin_init()
    #define mysql_server_end() mysql_client_plugin_deinit()

This makes life more complicated |
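Editor's note: the hypothesis in this comment can be illustrated with a toy model. The code below is not MySQL code; every toy_* name is made up, and the model simply assumes what the comment suspects (and then itself questions), namely that my_end() clears the thread-global state while the "client initialized" flag stays set. The two static flags model mysql_client_init and my_thread_global_init_done.

```c
#include <stddef.h>

/* Toy model only: two flags standing in for library-global state. */
static int client_init_done = 0;          /* models mysql_client_init */
static int thread_global_init_done = 0;   /* models my_thread_global_init_done */

typedef struct { int unused; } toy_handle_t;
static toy_handle_t toy_handle;

/* Models my_thread_init(): non-zero means failure. */
int toy_thread_init(void)
{
    return thread_global_init_done ? 0 : 1;
}

/* Models mysql_server_init(): full init on first call, thread init afterwards. */
int toy_server_init(void)
{
    if (!client_init_done)
    {
        thread_global_init_done = 1;      /* models my_init() */
        client_init_done = 1;
        return 0;
    }
    return toy_thread_init();
}

/* Models mysql_init(): returns NULL when library init fails. */
toy_handle_t *toy_mysql_init(void)
{
    if (0 != toy_server_init())
        return NULL;
    return &toy_handle;
}

/* Models my_end() reached via the ODBC cleanup: thread state is torn down,
 * but (by assumption) the client-init flag is left untouched. */
void toy_my_end(void)
{
    thread_global_init_done = 0;
}

/* Models mysql_server_end(): full reset, so the next init starts from scratch. */
void toy_server_end(void)
{
    toy_my_end();
    client_init_done = 0;
}
```

In this model, calling toy_mysql_init(), then toy_my_end(), then toy_mysql_init() again yields NULL on the second call, which mirrors the NULL from mysql_init() seen after the ODBC cleanup; whether the real library behaves this way depends on the macro definitions mentioned at the end of the comment.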
Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ] |
Managed to reproduce! Created a separate |
Comment by Andris Mednis [ 2017 Sep 20 ] |
By the way, the ICONV detection bug on FreeBSD was observed also in |
Comment by Rostislav Palivoda [ 2017 Sep 20 ] |
Investigation continues for supported versions in |