[ZBX-9655] Zabbix Server CRASH Created: 2015 Jun 23  Updated: 2017 Oct 25  Resolved: 2017 Oct 25

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.4.5
Fix Version/s: 2.2.11rc1, 2.4.7rc1, 3.0.0alpha2

Type: Incident report Priority: Major
Reporter: sysops Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: crash, freebsd
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401: [email protected]:/usr/obj/usr/src/sys/GENERIC amd64


Attachments: PNG File Screen Shot 2015-09-29 at 15.08.31.png     File freebsd_backtrace_x64.diff     Text File list.txt     Text File mysql.log     Text File mysql2.log     File mysql_init_check.diff     File mysql_library_init.diff     File mysql_static_conn.diff     Text File mysqld.log     Text File patch_command.txt     Text File pkg.txt     Text File zabbix_server.log     Text File zabbix_server.log     Text File zabbix_server_crash.log     Text File zabbix_server_crash2.log    
Issue Links:
Duplicate
duplicates ZBX-7765 zbx_set_common_signal_handlers crash ... Closed
duplicates ZBX-10962 Crash/exit in poller performing ODBC ... Closed
Sub-task
depends on ZBX-12159 Resolving TNS names via LDAP crash on... Confirmed

 Description   

Suddenly Crashing ...

   526:20150622:172758.659 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
   526:20150622:172758.659 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
   526:20150622:172758.659 ====== Fatal information: ======
   526:20150622:172758.659 program counter not available for this architecture
   526:20150622:172758.659 === Registers: ===
   526:20150622:172758.659 register dump not available for this architecture
   526:20150622:172758.660 === Backtrace: ===
   526:20150622:172758.660 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
   526:20150622:172758.660 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
   526:20150622:172758.660 === Memory map: ===
   526:20150622:172758.660 memory map not available for this platform
   526:20150622:172758.660 ================================
   513:20150622:172758.662 One child process died (PID:526,exitcode/signal:1). Exiting ...
   513:20150622:172800.726 syncing history data...
   513:20150622:172800.726 syncing history data done
   513:20150622:172800.726 syncing trends data...
   513:20150622:172800.735 syncing trends data done
   513:20150622:172800.736 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282).

Zabbix server v2.4.5 (revision 53282) (21 April 2015)
Compilation time: Jun 12 2015 06:53:41



 Comments   
Comment by sysops [ 2015 Jun 24 ]

List of packages installed on the FreeBSD server (where Zabbix server runs).

Comment by sysops [ 2015 Jun 24 ]

Another crash...
At the moment of the crash, the remote MySQL server (where the Zabbix database resides) was being rebooted, but that may be a coincidence.

   539:20150623:233751.082 executing housekeeper
   539:20150623:233751.564 housekeeper [deleted 11745 hist/trends, 0 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 0.479474 sec, idle 1 hour(s)]
   556:20150624:000002.120 [Z3005] query failed: [1317] Query execution was interrupted [select escalationid,actionid,triggerid,eventid,r_eventid,nextcheck,esc_step,status,itemid from escalations order by actionid,triggerid,itemid,escalationid]
   544:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
   544:20150624:000002.121 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
   544:20150624:000002.121 database is down: reconnecting in 10 seconds
   522:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [update hosts set errors_from=1435104002,disable_until=1435104017,error='Get value from agent failed: cannot connect to [[192.168.77.111]:10050]: [61] Connection refused' where hostid=10112]
   522:20150624:000002.122 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
   522:20150624:000002.122 ====== Fatal information: ======
   522:20150624:000002.122 program counter not available for this architecture
   522:20150624:000002.122 === Registers: ===
   522:20150624:000002.122 register dump not available for this architecture
   522:20150624:000002.122 === Backtrace: ===
   522:20150624:000002.122 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
   522:20150624:000002.122 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
   522:20150624:000002.122 === Memory map: ===
   522:20150624:000002.122 memory map not available for this platform
   522:20150624:000002.122 ================================
   513:20150624:000002.124 One child process died (PID:522,exitcode/signal:1). Exiting ...
   513:20150624:000004.193 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
   513:20150624:000004.193 Cannot connect to the database. Exiting...
Comment by sysops [ 2015 Jun 24 ]

Another crash:

  3048:20150624:080703.419 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
  3049:20150624:081459.419 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='kirk01' and status in (0,1) and flags<>2 and proxy_hostid is null]
  3055:20150624:081643.479 executing housekeeper
  3055:20150624:081644.608 housekeeper [deleted 46463 hist/trends, 15 items, 0 events, 0 sessions, 0 alarms, 0 audit items in 1.126811 sec, idle 1 hour(s)]
  3051:20150624:081659.609 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='kirk01' and status in (0,1) and flags<>2 and proxy_hostid is null]
  3050:20150624:082504.109 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
  3043:20150624:082758.105 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
  3043:20150624:082758.105 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
  3043:20150624:082758.105 ====== Fatal information: ======
  3043:20150624:082758.105 program counter not available for this architecture
  3043:20150624:082758.105 === Registers: ===
  3043:20150624:082758.105 register dump not available for this architecture
  3043:20150624:082758.105 === Backtrace: ===
  3043:20150624:082758.106 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
  3043:20150624:082758.106 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
  3043:20150624:082758.106 === Memory map: ===
  3043:20150624:082758.106 memory map not available for this platform
  3043:20150624:082758.106 ================================
  3039:20150624:082758.109 One child process died (PID:3043,exitcode/signal:1). Exiting ...
  3039:20150624:082800.137 syncing history data...
  3039:20150624:082800.141 syncing history data done
  3039:20150624:082800.141 syncing trends data...
  3039:20150624:082800.209 syncing trends data done
  3039:20150624:082800.209 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282).
Comment by sysops [ 2015 Jun 24 ]

I was able to reproduce the crash:
HOST A : 192.168.77.222 server with installed and running zabbix24-server-2.4.5
HOST B : 192.168.77.111 server with installed and running mysql server (with zabbix database)

… start zabbix server on HOST (A)

 13758:20150624:083400.035 server #13 started [trapper #5]
 13763:20150624:083400.036 server #18 started [http poller #1]
 13766:20150624:083400.036 server #21 started [history syncer #2]
 13767:20150624:083400.038 server #22 started [history syncer #3]
 13765:20150624:083400.039 server #20 started [history syncer #1]
 13770:20150624:083400.040 server #25 started [proxy poller #1]
 13764:20150624:083400.041 server #19 started [discoverer #1]
 13771:20150624:083400.041 server #26 started [self-monitoring #1]
 13769:20150624:083400.042 server #24 started [escalator #1]
 13768:20150624:083400.042 server #23 started [history syncer #4]

.. everything is ok…

.. manually stopping the MySQL server on remote host (B), where the Zabbix database is stored (systemctl stop mysql.service)

 13762:20150624:091631.778 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
 13762:20150624:091631.779 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
 13762:20150624:091631.779 database is down: reconnecting in 10 seconds
 13754:20150624:091738.040 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
 13754:20150624:091738.040 database is down: reconnecting in 10 seconds

.. after one minute, manually starting the MySQL service on remote host (B), where the Zabbix database is stored
(systemctl start mysql.service)

 13768:20150624:091741.745 database connection re-established
 13754:20150624:091748.073 database connection re-established

.. after a while the Zabbix server log (on host A) is populated with:

 13755:20150624:091750.370 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='hanzo.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
 13757:20150624:091806.090 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='zabbix01.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]
 13758:20150624:092151.090 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,status from hosts where host='hanzo.bi-different.intra' and status in (0,1) and flags<>2 and proxy_hostid is null]

.. and ten minutes later, with no other log entries in between, Zabbix server (on host A) crashes:

13752:20150624:092758.041 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
 13752:20150624:092758.042 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
 13752:20150624:092758.042 ====== Fatal information: ======
 13752:20150624:092758.042 program counter not available for this architecture
 13752:20150624:092758.042 === Registers: ===
 13752:20150624:092758.042 register dump not available for this architecture
 13752:20150624:092758.042 === Backtrace: ===
 13752:20150624:092758.043 1: 0x466fc6 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
 13752:20150624:092758.043 0: 0x46743c <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
 13752:20150624:092758.043 === Memory map: ===
 13752:20150624:092758.043 memory map not available for this platform
 13752:20150624:092758.043 ================================
 13745:20150624:092758.047 One child process died (PID:13752,exitcode/signal:1). Exiting ...
 13745:20150624:092800.089 syncing history data...
 13745:20150624:092800.091 syncing history data done
 13745:20150624:092800.091 syncing trends data...
 13745:20150624:092800.171 syncing trends data done
 13745:20150624:092800.171 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282).
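
The crashing process and the last line it logged before the signal can be pulled out of a log like the ones above with a short script. This is only a sketch: the function name crash_summary is made up for illustration, and it assumes the "PID:date:time message" line layout seen in these excerpts.

```shell
# crash_summary FILE
# For every SIGSEGV report, print the PID that crashed and the last line
# that same PID logged before the signal (often the failed query).
crash_summary() {
    awk -F: '
        {
            pid = $1
            sub(/^[ \t]+/, "", pid)      # PIDs are padded with leading spaces
            line = $0
            sub(/^[ \t]+/, "", line)
        }
        /Got signal \[signal:11\(SIGSEGV\)/ {
            print "crashed pid: " pid
            if (pid in last) print "last line: " last[pid]
            next
        }
        { last[pid] = line }
    ' "$1"
}
```

It would be run as crash_summary /var/log/zabbix_server.log, adjusting the path to wherever the server writes its log.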
Comment by sysops [ 2015 Jun 24 ]

One hour after the MySQL restart (see attachment), Zabbix server crashed (see the log with debug level=4).

Comment by Aleksandrs Saveljevs [ 2015 Jun 25 ]

This looks like a duplicate of ZBX-7765, but this task has a nice description and a way to reproduce the problem.

Comment by Andris Zeila [ 2015 Jun 30 ]

I've been trying to reproduce it without any luck. My setup was two virtual machines running 10.1-RELEASE FreeBSD 10.1-RELEASE #0 r274401. I tried running zabbix server from the default package and also one compiled from sources. It had default configuration monitoring one host. On the other system I was running mysql server from the default mysql56-server package.

So the setup was as close as I could get to the described one, but stopping/starting the MySQL server never caused a Zabbix server crash.

I noticed that you had the mysql56-client-5.6.24 package listed, while my installations had mysql56-client-5.6.24_1. I could not find anything about fixed crashes in the changelog, so I doubt that was the reason. Still, maybe you could upgrade your mysql package and try to reproduce the crash again?

Comment by sysops [ 2015 Jun 30 ]

Reproduced.
[root@k.......]# date
Tue Jun 30 15:34:27 UTC 2015
[root@k.......]]# systemctl stop mysql.service

[root@k.......]]# date
Tue Jun 30 15:37:25 UTC 2015
[root@.......]# systemctl start mysql.service

root@zabbix# cat /var/log/zabbix_server.log | grep stopped
59902:20150630:153935.199 Zabbix Server stopped. Zabbix 2.4.5 (revision 53282).

Comment by sysops [ 2015 Jun 30 ]

I have upgraded the MySQL client on the Zabbix server to mysql56-client-5.6.24_1 via source ports,
but the crash persists:

Stopped the MySQL server (on Oracle Linux) for 3 min;
Zabbix (on FreeBSD 10.1) crashed a few minutes later (see the log with debug level=4).

Comment by Andris Zeila [ 2015 Jul 02 ]

Thanks, but still can't reproduce it. Nothing really suspicious in DB connection code.

Would it be possible to build Zabbix server from sources with the attached patch (freebsd_backtrace_x64.diff) and reproduce the crash? It should give proper stack trace. (Note that the patch will work only on 64 bit freebsd systems).

Comment by sysops [ 2015 Jul 09 ]

.. the problem persists after each MySQL outage.
Have you installed mysql-server on an "Oracle Linux" distribution, as in my environment?

I tried your patch but I am unable to compile:

root@zabbix01:/usr/home/user/zabbix-2.4.5 # tar -xvf zabbix-2.4.5.tar.gz

root@zabbix01:/usr/home/user # patch zabbix-2.4.5/src/libs/zbxnix/fatal.c freebsd_backtrace_x64.diff
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--------------------------
|Index: src/libs/zbxnix/fatal.c
|===================================================================
|--- src/libs/zbxnix/fatal.c	(revision 54194)
|+++ src/libs/zbxnix/fatal.c	(working copy)
--------------------------
Patching file zabbix-2.4.5/src/libs/zbxnix/fatal.c using Plan A...
Hunk #1 succeeded at 66.
Hunk #2 succeeded at 231.
Hunk #3 succeeded at 311.
done

root@zabbix01:/usr/home/user# cd zabbix-2.4.5

root@zabbix01:/usr/home/user/zabbix-2.4.5 # ./configure --enable-server --with-mysql --with-libcurl --with-libxml2


***********************************************************
*            Now run 'make install'                       *
*                                                         *
*            Thank you for using Zabbix!                  *
*              <http://www.zabbix.com>                    *
***********************************************************

root@zabbix01:/usr/home/user/zabbix-2.4.5 # make install
Making install in src
Making install in libs
Making install in zbxcrypto
cc -DHAVE_CONFIG_H -I. -I../../../include      -g -O2  -I/usr/local/include/mysql -pipe  -fstack-protector -fno-strict-aliasing  -g -fno-omit-frame-pointer -fno-strict-aliasing     -I/usr/local/include/libxml2 -I/usr/include -MT md5.o -MD -MP -MF .deps/md5.Tpo -c -o md5.o md5.c
In file included from md5.c:54:
In file included from ../../../include/common.h:23:
In file included from ../../../include/sysinc.h:372:
/usr/include/sys/timeb.h:42:2: warning: "this file includes <sys/timeb.h> which is deprecated" [-W#warnings]
#warning "this file includes <sys/timeb.h> which is deprecated"
 ^
In file included from md5.c:54:
In file included from ../../../include/common.h:23:
../../../include/sysinc.h:381:11: fatal error: 'curl/curl.h' file not found
#       include <curl/curl.h>
                ^
1 warning and 1 error generated.
*** Error code 1

Stop.
make[3]: stopped in /usr/home/user/zabbix-2.4.5/src/libs/zbxcrypto
*** Error code 1

Stop.
make[2]: stopped in /usr/home/user/zabbix-2.4.5/src/libs
*** Error code 1

Stop.
make[1]: stopped in /usr/home/user/zabbix-2.4.5/src
*** Error code 1

Stop.
make: stopped in /usr/home/user/zabbix-2.4.5
Comment by Aleksandrs Saveljevs [ 2015 Jul 09 ]

The compilation error you see is not because of the patch (it is not related to cURL functionality in any way). Was the ./configure step successful? What was ./configure's output?

Comment by sysops [ 2015 Jul 20 ]

Sorry for the late response ... I was on holiday

OK, I have run configure without curl:
/usr/local/etc/rc.d/zabbix_server stop

./configure --enable-server --with-mysql

then
make install

then
/usr/local/etc/rc.d/zabbix_server start
Starting zabbix_server.

After that, to run Zabbix server I had to copy my configuration file to another path:
cp /usr/local/etc/zabbix24/zabbix_server.conf /usr/local/etc/zabbix_server.conf

/usr/local/etc/rc.d/zabbix_server start

Do you need debug level 3 or 4?

Comment by Andris Zeila [ 2015 Jul 27 ]

Sorry for the late response ... I was on holiday

No problem, so was I

4 would be better (though 3 should be enough if the crash originates from the place I thought).

Comment by sysops [ 2015 Jul 27 ]

So,
installing from source (with your patch) with this configure:
./configure --enable-server --enable-agent --with-mysql
I wasn't able to reproduce the problem.
But I need ODBC, SSH and others,
so I reinstalled from ports with this config:
CURL=on: Support for web monitoring
FPING=on: Build/install fping for ping checks
IPMI=off: Support for IPMI checks
IPV6=off: IPv6 protocol support
JABBER=off: Support for Jabber media type
JAVAGW=off: Support for Java gateway
LDAP=off: Support for LDAP server checks
LIBXML2=on: Support for libxml2 (required by monitoring VMware)
NMAP=off: Build/install nmap for o/s detection
SSH=on: Support for SSH-based checks
====> Options available for the single DB: you have to select exactly one of them
MYSQL=on: MySQL database support
PGSQL=off: PostgreSQL database support
SQLITE=off: SQLite database support
ORACLE=off: Oracle database support
====> Support for database checks via ODBC: you have to select exactly one of them
IODBC=off: ODBC backend via iODBC
UNIXODBC=on: ODBC backend via unixODBC

and I reproduced the problem simply by turning off the mysqld service and leaving it down.
Could you suggest how to install Zabbix server from source with
CURL, FPING, LIBXML2, MYSQL, UNIXODBC?

Comment by Andris Zeila [ 2015 Jul 27 ]

You should try configure with the following parameters:

./configure --enable-server --enable-agent --with-mysql --with-libcurl --with-libxml2 --with-unixodbc --with-ssh2

There are no specific fping build options in Zabbix, so I'm not sure what FPING=on means in ports; it is probably access rights configuration.

Comment by sysops [ 2015 Jul 27 ]

error:

checking for SSH2 support... yes
checking for gawk... (cached) nawk
checking for curl-config... /usr/local/bin/curl-config
checking for the version of libcurl... 7.43.0
checking for libcurl >= version 7.13.1... yes
checking for main in -lcurl... yes
checking whether libcurl is usable... yes
checking for curl_free... yes
checking for curl_easy_escape... yes
checking for ICONV support... configure: error: Unable to use iconv (libiconv check failed)

Comment by Andris Zeila [ 2015 Jul 27 ]

Strange, 10.1 should have native iconv support. Could you check why the libiconv check failed by running the following in the Zabbix build directory:
grep -A20 'checking for ICONV' config.log

Comment by sysops [ 2015 Jul 27 ]

/home/bid12/zabbix-2.4.5 # grep -A20 'checking for ICONV' config.log

configure:11064: checking for ICONV support
configure:11106: cc -o conftest -g -O2  -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing  -g -fno-omit-frame-pointer -fno-strict-aliasing     -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include  -I/usr/local/include      -rdynamic   conftest.c -lkvm -lm -lexecinfo -ldevstat   >&5
/tmp/conftest-b7e05b.o: In function `main':
/home/user/zabbix-2.4.5/conftest.c:126: undefined reference to `libiconv_open'
/home/user/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv'
/home/user/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
# freebsd-version 
10.1-RELEASE-p15
# pkg info | grep libiconv
libiconv-1.14_8                Character set conversion library
Comment by Andris Zeila [ 2015 Jul 27 ]

Then you can try specifying the iconv installation directory:

If you want to specify iconv installation directories:
  --with-iconv=[DIR]      use iconv from given base install directory (DIR),
                          default is to search through a number of common
                          places for the iconv files.
  --with-iconv-include=[DIR]
                          use iconv include headers from given path.
  --with-iconv-lib=[DIR]  use iconv libraries from given path.

You should be able to find the installation directories with pkg info -l libiconv-1.14_8 (or just grep filesystem for libiconv.so/iconv.h).
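
The help text above says --with-iconv takes the base install directory, not the path to iconv.h itself. A minimal sketch of deriving that directory from a header path, assuming the header lives at /usr/local/include/iconv.h (an assumption; confirm it with pkg info -l) and using a made-up helper name iconv_prefix:

```shell
# iconv_prefix HEADER_PATH
# Strip "/include/iconv.h" from a header path to get the install prefix
# that --with-iconv expects.
iconv_prefix() {
    dirname "$(dirname "$1")"
}

# Assumed header location; replace with the path reported on your system.
prefix=$(iconv_prefix /usr/local/include/iconv.h)
echo "--with-iconv=$prefix"
```

The printed option would be added to the ./configure line. Whether it alone is enough to satisfy the ICONV check on a given system is not something this sketch can confirm.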

Comment by sysops [ 2015 Jul 27 ]

It always fails:

./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv=/usr/local/include/iconv.h

or

./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv-include=/usr/local/include/iconv.h

or

./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2 --with-iconv-lib=/usr/local/include/iconv.h
/home/user/zabbix-2.4.5 # grep -A20 'checking for ICONV' config.log
configure:11064: checking for ICONV support
configure:11106: cc -o conftest -g -O2  -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing  -g -fno-omit-frame-pointer -fno-strict-aliasing     -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include  -I/usr/local/include    -I//usr/local/include/iconv.h/include  -rdynamic  -L//usr/local/include/iconv.h/lib conftest.c -lkvm -lm -lexecinfo -ldevstat   >&5
/tmp/conftest-dcc6c6.o: In function `main':
/home/user/zabbix-2.4.5/conftest.c:126: undefined reference to `libiconv_open'
/home/user/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv'
/home/user/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
# pkg info -l libiconv
libiconv-1.14_8:
	/usr/local/bin/iconv
	/usr/local/include/iconv.h
	/usr/local/include/libcharset.h
	/usr/local/include/localcharset.h
	/usr/local/lib/charset.alias
	/usr/local/lib/libcharset.a
	/usr/local/lib/libcharset.so
	/usr/local/lib/libcharset.so.1
	/usr/local/lib/libcharset.so.1.0.0
	/usr/local/lib/libiconv.a
	/usr/local/lib/libiconv.so
	/usr/local/lib/libiconv.so.2
	/usr/local/lib/libiconv.so.2.5.1
	/usr/local/lib/libiconv.so.3
	/usr/local/man/man1/iconv.1.gz
	/usr/local/man/man3/iconv.3.gz
	/usr/local/man/man3/iconv_close.3.gz
	/usr/local/man/man3/iconv_open.3.gz
	/usr/local/man/man3/iconv_open_into.3.gz
	/usr/local/man/man3/iconvctl.3.gz
	/usr/local/share/doc/libiconv/iconv.1.html
	/usr/local/share/doc/libiconv/iconv.3.html
	/usr/local/share/doc/libiconv/iconv_close.3.html
	/usr/local/share/doc/libiconv/iconv_open.3.html
	/usr/local/share/doc/libiconv/iconv_open_into.3.html
	/usr/local/share/doc/libiconv/iconvctl.3.html
# find / -name iconv.h
/usr/local/include/iconv.h
/usr/include/iconv.h
/usr/include/sys/iconv.h
Comment by Andris Zeila [ 2015 Jul 27 ]

Oh, try then

LDFLAGS=-liconv ./configure --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2
Comment by sysops [ 2015 Jul 27 ]

Ok
rm config.log

 ./configure CFLAGS="-I/usr/local/include" LDFLAGS="-L/usr/local/lib" --enable-server --enable-agent --with-mysql --with-libxml2 --with-unixodbc --with-ssh2

BUT AFTER THIS:

Configuration:

  Detected OS:           freebsd10.1
  Install path:          /usr/local
  Compilation arch:      freebsd

  Compiler:              cc
  Compiler flags:        -I/usr/local/include  -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing  -g -fno-omit-frame-pointer -fno-strict-aliasing     -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include  -I/usr/local/include    

  Enable server:         yes
  Server details:
    With database:         MySQL
    WEB Monitoring:        no
    Native Jabber:         no
    SNMP:                  no
    IPMI:                  no
    SSH:                   yes
    ODBC:                  yes
    Linker flags:          -rdynamic -L/usr/local/lib     -L/usr/local/lib/mysql      -L/usr/local/lib -L/usr/lib -L/usr/local/lib  -L/usr/local/lib   
    Libraries:             -lkvm -lm -lexecinfo -ldevstat   -liconv   -lmysqlclient      -lxml2  -lodbc  -lssh2   

  Enable proxy:          no

  Enable agent:          yes
  Agent details:
    Linker flags:          -rdynamic -L/usr/local/lib    
    Libraries:             -lkvm -lm -lexecinfo -ldevstat   -liconv   

  Enable Java gateway:   no

  LDAP support:          no
  IPv6 support:          no

***********************************************************
*            Now run 'make install'                       *
*                                                         *
*            Thank you for using Zabbix!                  *
*              <http://www.zabbix.com>                    *
***********************************************************

I found an error in the config.log file:

 grep -A20 'checking for ICONV' config.log
configure:11064: checking for ICONV support
configure:11106: cc -o conftest -I/usr/local/include  -I/usr/local/include/mysql -fstack-protector -fno-strict-aliasing  -g -fno-omit-frame-pointer -fno-strict-aliasing     -I/usr/local/include/libxml2 -I/usr/include -I/usr/local/include  -I/usr/local/include      -rdynamic -L/usr/local/lib  conftest.c -lkvm -lm -lexecinfo -ldevstat   >&5
/tmp/conftest-85ee20.o: In function `main':
/home/bid12/zabbix-2.4.5/conftest.c:127: undefined reference to `libiconv_open'
/home/bid12/zabbix-2.4.5/conftest.c:128: undefined reference to `libiconv'
/home/bid12/zabbix-2.4.5/conftest.c:129: undefined reference to `libiconv_close'
cc: error: linker command failed with exit code 1 (use -v to see invocation)
configure:11106: $? = 1
configure: failed program was:
| /* confdefs.h */
| #define PACKAGE_NAME "Zabbix"
| #define PACKAGE_TARNAME "zabbix"
| #define PACKAGE_VERSION "2.4.5"
| #define PACKAGE_STRING "Zabbix 2.4.5"
| #define PACKAGE_BUGREPORT ""
| #define PACKAGE_URL ""
| #define PACKAGE "zabbix"
| #define VERSION "2.4.5"
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1

Should I run 'make install' anyway?

Comment by Andris Zeila [ 2015 Jul 27 ]

Maybe it was the old config.log file? The final configure output looks ok and it would not finish if there were any errors. Also -liconv has been added to the server and agent libraries. Try making it.

Comment by sysops [ 2015 Jul 27 ]

OK, I will run make install,
but it was a new config.log (see the first line of my previous post: rm config.log).
So it wasn't an old config.log file.

make install
stopped mysql on another server
150727 14:27:02 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

... no crashing ....

Comment by sysops [ 2015 Jul 27 ]

Restarted mysql at 15:16:03.

Zabbix server stopped at 15:28:00.
See attachment.

Comment by Andris Zeila [ 2015 Jul 28 ]

Thanks!

As I suspected, it crashed in mysql_real_connect(), which almost certainly means that mysql_init() returned NULL. According to the MySQL documentation, it can return NULL only if there was insufficient memory to allocate a new object, which sounds unlikely.

Would you be up for trying a few more patches while I poke around this issue? I have a few ideas which might or might not work, but nothing certain.

Comment by sysops [ 2015 Jul 28 ]

OK,
send me your patches

Comment by Andris Zeila [ 2015 Jul 28 ]

Thanks!

I attached 3 small patches:

  • mysql_init_check.diff - adds a check for whether mysql_init() really returned NULL. In this case the server would log a critical error and exit.
  • mysql_library_init.diff - adds explicit MySQL library initialization in the main process. mysql_library_init() is called automatically from mysql_init(), but in that case some thread-related data would be initialized only on the next mysql_init() call. Zabbix does not use threads, so in theory it should not make any difference, but it is best to check to be certain.
  • mysql_static_conn.diff - instead of allocating a connection object for each mysql_init() call, it defines a static connection object and reuses it.

You can apply all 4 patches (the backtrace patch included) first to check if it fixes the crash. If everything is okay then you could remove either mysql_library_init.diff or mysql_static_conn.diff to find the fix.

The patches should be applied in the following order:

  • freebsd_backtrace_x64.diff
  • mysql_init_check.diff
  • mysql_library_init.diff
  • mysql_static_conn.diff
Comment by sysops [ 2015 Jul 29 ]

Crash again...
I have applied all patches in the following order:

patch < mysql_static_conn.diff
patch < mysql_static_conn.diff mysql_init_check.diff
patch < mysql_static_conn.diffmysql_library_init.diff

but, when I shutdown mysql on another server:
2015-07-29 07:38:17 25153 [Note] /usr/sbin/mysqld: Shutdown complete
150729 07:38:17 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended

..20 sec after, zabbix server died:
13668:20150729:073836.205 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13668:20150729:073836.205 database is down: reconnecting in 10 seconds
13651:20150729:073836.394 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13651:20150729:073836.394 database is down: reconnecting in 10 seconds
13665:20150729:073836.394 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13665:20150729:073836.394 database is down: reconnecting in 10 seconds
13663:20150729:073838.078 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13663:20150729:073838.078 database is down: reconnecting in 10 seconds
13658:20150729:073839.195 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13658:20150729:073839.195 database is down: reconnecting in 10 seconds
13655:20150729:073839.787 [Z3005] query failed: [2013] Lost connection to MySQL server during query [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23919]
13655:20150729:073839.787 cannot initialize MYSQL connection object
13650:20150729:073839.788 One child process died (PID:13655,exitcode/signal:1). Exiting ...
13650:20150729:073841.821 [Z3001] connection to database 'zabbix' failed: [2003] Can't connect to MySQL server on '192.168.77.111' (61)
13650:20150729:073841.821 Cannot connect to the database. Exiting...

Comment by sysops [ 2015 Jul 29 ]

See attachment "patch_command.txt" for the command history used for patching.

Comment by sysops [ 2015 Jul 29 ]

Again, 15 minutes after MySQL went down, this time with debug level=4:
2015-07-29 08:13:11 12724 [Note] /usr/sbin/mysqld: Shutdown complete

See attachment "zabbix_server_crash2.log".

Comment by Andris Zeila [ 2015 Jul 29 ]

Yes, thanks.

It's clear where the crash happens, but I currently have no idea why mysql_init() might return NULL. I will keep digging.

Comment by Andris Zeila [ 2015 Aug 31 ]

In the end, there is not much we can do if MySQL connection initialization fails with what is supposed to be an 'out of memory' error. It would be better to terminate the process gracefully, but the end result would be the same: the rest of the processes would exit too.

If the default configuration works fine, the problem must be in one of the shared libraries. You could try enabling Zabbix features one by one to find out which library causes Zabbix to crash. Updating or rebuilding that library might fix things.

Comment by Andris Zeila [ 2015 Aug 31 ]

Added a check for mysql_init() returning NULL in the development branch svn://svn.zabbix.com/branches/dev/ZBX-9655
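For readers following along, here is a minimal sketch of what such a check can look like. It is not the code from the development branch: db_handle, db_init_fn and db_connect_checked() are hypothetical stand-ins, used so that the example compiles without libmysqlclient. Only the log message is taken from the log excerpt earlier in this issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for MYSQL and mysql_init(), so that the sketch
 * builds without the MySQL client library. */
typedef struct db_handle { int connected; } db_handle;
typedef db_handle *(*db_init_fn)(db_handle *);

enum { DB_OK = 0, DB_FAIL = -1 };

/* Call the initializer and refuse to continue with a NULL handle instead of
 * dereferencing it later (which is where the SIGSEGV came from). */
static int db_connect_checked(db_init_fn init, db_handle **out)
{
    db_handle *h;

    assert(NULL != init && NULL != out);

    h = init(NULL);
    if (NULL == h) {
        fprintf(stderr, "cannot initialize MYSQL connection object\n");
        *out = NULL;
        return DB_FAIL;
    }

    *out = h;
    return DB_OK;
}

/* Fake initializers for demonstration: one succeeds, one returns NULL. */
static db_handle *init_ok(db_handle *unused)
{
    static db_handle h;
    (void)unused;
    return &h;
}

static db_handle *init_null(db_handle *unused)
{
    (void)unused;
    return NULL;
}
```

With a real client library the caller would pass a thin wrapper around mysql_init() and, on DB_FAIL, log the error and exit the process cleanly. As noted above, the remaining processes would still exit, but without a segmentation fault.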

Comment by dimir [ 2015 Sep 04 ]

Please review r55391

wiper reviewed, thanks

Comment by Andris Zeila [ 2015 Sep 04 ]

Released in:

  • pre-2.2.11rc1 r55405
  • pre-2.4.7rc1 r55406
  • pre-3.0.0alpha2 r55407
Comment by sysops [ 2015 Sep 29 ]

I moved the MySQL server onto the same server where the Zabbix server runs. No problems for many days.
But this morning, 58 minutes after adding a Zabbix agent on the same server (so one server now hosts MySQL, the Zabbix server and a Zabbix agent), the Zabbix server suddenly crashed.
MySQL has been up and running the whole time, without a reboot.

77512:20150929:102758.179 [Z3005] query failed: [2006] MySQL server has gone away [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
 77512:20150929:102758.179 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
 77512:20150929:102758.179 ====== Fatal information: ======
 77512:20150929:102758.179 program counter not available for this architecture
 77512:20150929:102758.179 === Registers: ===
 77512:20150929:102758.179 register dump not available for this architecture
 77512:20150929:102758.179 === Backtrace: ===
 77512:20150929:102758.184 3: 0x456046 <print_fatal_info+0x86> at /usr/local/sbin/zabbix_server
 77512:20150929:102758.184 2: 0x4564bc <zbx_set_common_signal_handlers+0x2fc> at /usr/local/sbin/zabbix_server
 77512:20150929:102758.184 1: 0x802c94997 <pthread_sigmask+0x497> at /lib/libthr.so.3
 77512:20150929:102758.184 0: 0x802c941a8 <pthread_getspecific+0xdd8> at /lib/libthr.so.3
 77512:20150929:102758.185 === Memory map: ===
 77512:20150929:102758.185 memory map not available for this platform
 77512:20150929:102758.185 ================================
 77497:20150929:102758.200 One child process died (PID:77512,exitcode/signal:1). Exiting ...
 77497:20150929:102800.211 syncing history data...
 77497:20150929:102800.215 syncing history data done
 77497:20150929:102800.215 syncing trends data...
 77497:20150929:102800.361 syncing trends data done
 77497:20150929:102800.361 Zabbix Server stopped. Zabbix 2.4.6 (revision 54796).

Zabbix agent log (on the same server):

root@monitor001:/usr/ports/net-mgmt/zabbix24-agent # cat /var/log/zabbix_agentd.log
 36905:20150929:093133.368 Starting Zabbix Agent [monitor001.myserver.intra]. Zabbix 2.4.6 (revision 54796).
 36905:20150929:093133.368 using configuration file: /usr/local/etc/zabbix24/zabbix_agentd.conf
 36905:20150929:093133.368 agent #0 started [main process]
 36906:20150929:093133.369 agent #1 started [collector]
 36907:20150929:093133.369 agent #2 started [listener #1]
 36908:20150929:093133.370 agent #3 started [listener #2]
 36909:20150929:093133.370 agent #4 started [listener #3]
 36910:20150929:093133.370 agent #5 started [active checks #1]
 36910:20150929:102838.372 active check data upload to [zabbix.myserver.intra:10051] started to fail ([connect] cannot connect to [[zabbix.yourewardyourlife.intra]:10051]: [61] Connection refused)
 36910:20150929:102937.536 active check configuration update from [zabbix.myserver.intra:10051] started to fail (cannot connect to [[zabbix.yourewardyourlife.intra]:10051]: [61] Connection refused)
root@monitor001:/usr/ports/net-mgmt/zabbix24-agent # 
Comment by sysops [ 2015 Sep 29 ]

MySQL status: see the attached screenshot.

Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ]

An observation. This:

77512:20150929:102758.179 [Z3005] query failed: [2006] MySQL server has gone away [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=23278]
 77512:20150929:102758.179 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...

looks like processing of an LLD rule in lld_process_discovery_rule() and this:

   522:20150624:000002.121 [Z3005] query failed: [2013] Lost connection to MySQL server during query [update hosts set errors_from=1435104002,disable_until=1435104017,error='Get value from agent failed: cannot connect to [[192.168.77.111]:10050]: [61] Connection refused' where hostid=10112]
   522:20150624:000002.122 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...

looks like a query from db_host_update_availability(). These are the only two occasions when a poller accesses our DB directly. Also, I see that unixODBC is involved (monitoring MySQL databases?), which makes this very similar to ZBX-9082.

Wild guess: is it possible that during our ODBC check cycle (connect, select, fetch, close connection), under certain circumstances, the ODBC driver requests too deep a cleanup from the DB client library, which then also cleans up the Zabbix DB handle without the poller noticing?

Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ]

So... When we perform an ODBC check, it ends with a cleanup. In odbc_DBclose() we free the environment handle with the unixODBC function SQLFreeHandle(), which hands the work over to MySQL ODBC Connector's my_SQLFreeEnv() (mysql-connector-odbc-5.3.6-src/driver/handle.c):

SQLRETURN SQL_API my_SQLFreeEnv(SQLHENV henv)
{
    ENV *env= (ENV *) henv;
    myodbc_mutex_destroy(&env->lock);
#ifndef _UNIX_
    GlobalUnlock(GlobalHandle((HGLOBAL) henv));
    GlobalFree(GlobalHandle((HGLOBAL) henv));
#else
    x_free(henv);
    myodbc_end();
#endif /* _UNIX_ */
    return(SQL_SUCCESS);
}

myodbc_end() (mysql-connector-odbc-5.3.6-src/driver/dll.c) calls my_end() (mysql-5.6/mysys/my_init.c), which, among other things, calls my_thread_end() and my_thread_global_end().

When a poller loses its connection to the backend DB and attempts to reconnect, it calls mysql_init(), and as we see in the attached logs, for some reason it returns NULL. The MySQL documentation says this can only be due to insufficient memory, but (mysql-5.6/sql-common/client.c):

MYSQL * STDCALL
mysql_init(MYSQL *mysql)
{
  if (mysql_server_init(0, NULL, NULL))
    return 0;
  ...
}

mysql_server_init() may return a non-zero value in several cases (mysql-5.6/libmysql/libmysql.c):

int STDCALL mysql_server_init(int argc __attribute__((unused)),
			      char **argv __attribute__((unused)),
			      char **groups __attribute__((unused)))
{
  int result= 0;
  if (!mysql_client_init)
  {
    ...
    if (my_init())				/* Will init threads */
      return 1;
    ...
    if (mysql_client_plugin_init())
      return 1;
    ...
  }
  else
    result= (int)my_thread_init();         /* Init if new thread */
  return result;
}

I will not comment on the first two, but the last one seems to be just what the doctor ordered (mysql-5.6/mysys/my_thr_init.c):

my_bool my_thread_init(void)
{
  struct st_my_thread_var *tmp;
  my_bool error=0;

  if (!my_thread_global_init_done)
    return 1; /* cannot proceed with unintialized library */
  ...

Follow-up: wiper asked a legitimate question: is the mysql_client_init flag reset in my_end()? It seems it is not. It is reset only in mysql_server_end() (mysql-5.6/libmysql/libmysql.c), and mysql_server_end() is not called from my_end().

Actually, I'm not that sure about what I just said after I found this funny thing (mysql-5.6/sql/client_settings.h):

#define mysql_server_init(a,b,c) mysql_client_plugin_init()
#define mysql_server_end()       mysql_client_plugin_deinit()

This makes life more complicated
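Setting the client_settings.h macros aside, the suspected sequence can be sketched as a toy model of the two library-wide flags. None of this is MySQL code: client_lib and the model_* functions are hypothetical names, and the behaviour of my_end() is the assumption stated in this comment (global thread state torn down, mysql_client_init left set), not something verified here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the client library's process-wide state. */
typedef struct client_lib {
    bool client_init_done;        /* models mysql_client_init */
    bool thread_global_init_done; /* models my_thread_global_init_done */
} client_lib;

/* Models mysql_server_init(): full initialization on first use,
 * per-thread initialization on every later call. */
static int model_server_init(client_lib *lib)
{
    assert(NULL != lib);

    if (!lib->client_init_done) {
        lib->client_init_done = true;
        lib->thread_global_init_done = true; /* my_init() path */
        return 0;
    }

    /* my_thread_init() path: fails once the global thread state is gone */
    return lib->thread_global_init_done ? 0 : 1;
}

/* Models mysql_init(): false stands for the NULL return value. */
static bool model_mysql_init(client_lib *lib)
{
    return 0 == model_server_init(lib);
}

/* Models my_end() as reached from the ODBC driver cleanup. ASSUMPTION:
 * it tears down the global thread state but leaves client_init_done set. */
static void model_my_end(client_lib *lib)
{
    assert(NULL != lib);
    lib->thread_global_init_done = false;
}
```

In this model the first model_mysql_init() call (poller connects to the backend DB) succeeds, a model_my_end() call (end of an ODBC check) follows, and the next model_mysql_init() call (reconnect after a lost connection) fails. That would match the NULL return seen in the logs, if the assumption about my_end() holds.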

Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 01 ]

Managed to reproduce! Created a separate ZBX-10962 with detailed instructions.

Comment by Andris Mednis [ 2017 Sep 20 ]

By the way, the ICONV detection bug on FreeBSD was also observed in ZBX-12466.
The bug occurs if the libiconv package is installed AND the --with-libcurl configure option is used.
Workaround: add --with-iconv=/usr/local to the configure options.

Comment by Rostislav Palivoda [ 2017 Sep 20 ]

Investigation continues for supported versions in ZBX-10962

Generated at Sat Apr 27 02:17:30 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.