[ZBX-10753] 3.0 server on MySQL (at least) stops immediately when cannot connect to database on start Created: 2016 May 05  Updated: 2017 May 30  Resolved: 2016 Aug 04

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.0.2
Fix Version/s: 3.0.5rc1, 3.2.0alpha1

Type: Incident report Priority: Major
Reporter: Oleksii Zagorskyi Assignee: Unassigned
Resolution: Fixed Votes: 3
Labels: connections, database, mysqld, start
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates ZBX-11675 Zabbix server crashing due to long qu... Closed
is duplicated by ZBX-11004 Zabbix exiting when unable to connect... Closed
is duplicated by ZBX-11228 Zabbix server terminates on MySQL res... Closed

 Description   

Something has definitely changed in 3.0 (at least with MySQL).
The 2.4 server keeps retrying the connection; here it was stopped manually after a few connection attempts:

 28591:20160505:174712.923 Starting Zabbix Server. Zabbix 2.4.8rc1 (revision 58407).
 28591:20160505:174712.923 ****** Enabled features ******
 28591:20160505:174712.923 SNMP monitoring:           YES
 28591:20160505:174712.923 IPMI monitoring:           YES
 28591:20160505:174712.923 WEB monitoring:            YES
 28591:20160505:174712.923 VMware monitoring:         YES
 28591:20160505:174712.923 Jabber notifications:      YES
 28591:20160505:174712.923 Ez Texting notifications:  YES
 28591:20160505:174712.923 ODBC:                      YES
 28591:20160505:174712.923 SSH2 support:              YES
 28591:20160505:174712.923 IPv6 support:              YES
 28591:20160505:174712.923 ******************************
 28591:20160505:174712.923 using configuration file: /zab/bin/2.4/zabbix_server.conf
 28591:20160505:174712.925 [Z3001] connection to database '2.4' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 28591:20160505:174712.925 database is down: reconnecting in 10 seconds
 28591:20160505:174722.926 [Z3001] connection to database '2.4' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 28591:20160505:174722.926 database is down: reconnecting in 10 seconds
 28591:20160505:174730.299 Got signal [signal:15(SIGTERM),sender_pid:28599,sender_uid:0,reason:0]. Exiting ...
 28591:20160505:174732.299 [Z3001] connection to database '2.4' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 28591:20160505:174732.299 Cannot connect to the database. Exiting...

The 3.0 server exits on its own immediately:

 28686:20160505:174754.788 Starting Zabbix Server. Zabbix 3.0.3rc1 (revision 59873).
 28686:20160505:174754.788 ****** Enabled features ******
 28686:20160505:174754.788 SNMP monitoring:           YES
 28686:20160505:174754.789 IPMI monitoring:           YES
 28686:20160505:174754.789 Web monitoring:            YES
 28686:20160505:174754.789 VMware monitoring:         YES
 28686:20160505:174754.789 SMTP authentication:       YES
 28686:20160505:174754.789 Jabber notifications:      YES
 28686:20160505:174754.789 Ez Texting notifications:  YES
 28686:20160505:174754.789 ODBC:                      YES
 28686:20160505:174754.789 SSH2 support:              YES
 28686:20160505:174754.789 IPv6 support:              YES
 28686:20160505:174754.789 TLS support:                NO
 28686:20160505:174754.789 ******************************
 28686:20160505:174754.789 using configuration file: /zab/bin/3.0/zabbix_server.conf
 28686:20160505:174754.790 loaded modules: dummy.so
 28686:20160505:174754.791 [Z3001] connection to database '3.0def' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 28686:20160505:174754.792 Cannot connect to the database. Exiting...

I discussed it briefly with Sasha, and he said this is a bug and has to be fixed.

A related discussion is in ZBX-4611.



 Comments   
Comment by Oleksii Zagorskyi [ 2016 May 05 ]

Also consider carefully this log excerpt from a production system, taken when the MySQL server suffered a serious failure. Here the Zabbix server terminated not at startup but at run time (or so it looks to me):

 11126:20160505:030001.537 [Z3005] query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [commit;]
 11125:20160505:030001.572 [Z3005] query failed: [1213] Deadlock found when trying to get lock; try restarting transaction [insert into history_uint (itemid,clock,ns,value) values (28561,1462435201,190482764,1462435201),(114841,1462435201,207893842,0),(253081,1462435201,209905590,0), trimmed ..... );
]
 11126:20160505:030002.054 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
 11125:20160505:030002.054 [Z3005] query failed: [2013] Lost connection to MySQL server during query [commit;]
 11126:20160505:030002.059 [Z3001] connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'reading initial communication packet', system error: 104
 11125:20160505:030002.059 [Z3001] connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'reading initial communication packet', system error: 104
 11126:20160505:030002.059 database is down: reconnecting in 10 seconds
 11125:20160505:030002.059 database is down: reconnecting in 10 seconds
...
 11126:20160505:030022.062 [Z3001] connection to database 'zabbix' failed: [2013] Lost connection to MySQL server at 'reading initial communication packet', system error: 104
 11126:20160505:030022.062 database is down: reconnecting in 10 seconds
 11128:20160505:030022.193 [Z3001] connection to database 'zabbix' failed: [1047] WSREP has not yet prepared node for application use
 11128:20160505:030022.193 Cannot connect to the database. Exiting...
 10964:20160505:030022.265 One child process died (PID:11128,exitcode/signal:1). Exiting ...
 10964:20160505:030024.270 [Z3001] connection to database 'zabbix' failed: [1047] WSREP has not yet prepared node for application use
 10964:20160505:030024.270 Cannot connect to the database. Exiting...
 30929:20160505:030034.378 Starting Zabbix Server. Zabbix 3.0.2 (revision 59540).
 30929:20160505:030034.378 ****** Enabled features ******
 30929:20160505:030034.379 SNMP monitoring:           YES
 30929:20160505:030034.379 IPMI monitoring:           YES
 30929:20160505:030034.379 Web monitoring:            YES
 30929:20160505:030034.379 VMware monitoring:         YES
 30929:20160505:030034.379 SMTP authentication:       YES
 30929:20160505:030034.379 Jabber notifications:      YES
 30929:20160505:030034.379 Ez Texting notifications:  YES
 30929:20160505:030034.379 ODBC:                      YES
 30929:20160505:030034.379 SSH2 support:              YES
 30929:20160505:030034.379 IPv6 support:              YES
 30929:20160505:030034.379 TLS support:               YES
 30929:20160505:030034.379 ******************************
 30929:20160505:030034.379 using configuration file: /etc/zabbix/zabbix_server.conf
 30929:20160505:030034.394 [Z3001] connection to database 'zabbix' failed: [1047] WSREP has not yet prepared node for application use
 30929:20160505:030034.394 Cannot connect to the database. Exiting...
 30936:20160505:030044.619 Starting Zabbix Server. Zabbix 3.0.2 (revision 59540).
 30936:20160505:030044.619 ****** Enabled features ******
 30936:20160505:030044.619 SNMP monitoring:           YES
 30936:20160505:030044.619 IPMI monitoring:           YES
 30936:20160505:030044.619 Web monitoring:            YES
 30936:20160505:030044.619 VMware monitoring:         YES
 30936:20160505:030044.619 SMTP authentication:       YES
 30936:20160505:030044.619 Jabber notifications:      YES
 30936:20160505:030044.619 Ez Texting notifications:  YES
 30936:20160505:030044.619 ODBC:                      YES
 30936:20160505:030044.619 SSH2 support:              YES
 30936:20160505:030044.619 IPv6 support:              YES
 30936:20160505:030044.619 TLS support:               YES
 30936:20160505:030044.619 ******************************
 30936:20160505:030044.619 using configuration file: /etc/zabbix/zabbix_server.conf
 30936:20160505:030044.625 [Z3001] connection to database 'zabbix' failed: [1047] WSREP has not yet prepared node for application use
 30936:20160505:030044.625 Cannot connect to the database. Exiting...

... restarts repeated here 

Note: the server daemon was started again automatically by systemd (by default).

viktors.tjarve The solution for the issue in this comment might be very similar to that of the original issue in this ticket. I have created a new ticket for it, ZBX-11025, mostly because it has to do with a communication error, whereas the problem in the original ticket is related to a stopped DB: a connection cannot be established, or is lost, because of that.

Comment by pfoo [ 2016 Jun 11 ]

This issue will also be triggered when mysql is restarted, either by you or your package manager. This is quite bad as zabbix will exit every time a security fix is applied to mysql (my mysql server only takes 4s to restart).

I think a few connection retries and a timer should be added (with the corresponding values defined in zabbix_server.conf?)

 16474:20160611:125345.351 [Z3005] query failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) [select hostid,key_,state,evaltype,formula,error,lifetime from items where itemid=24825]
 16474:20160611:125345.351 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 16474:20160611:125345.351 Cannot connect to the database. Exiting...
 16452:20160611:125345.353 One child process died (PID:16474,exitcode/signal:1). Exiting ...
 16452:20160611:125347.353 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
 16452:20160611:125347.353 Cannot connect to the database. Exiting...
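
The retry suggestion in the comment above can be illustrated with a minimal Python sketch. It is not the Zabbix implementation (the server is written in C), and every name in it is hypothetical: `connect_with_retry`, `DatabaseDown`, `retry_delay` and `max_attempts` are illustration only and are not real zabbix_server.conf parameters. The 10-second default only mirrors the "reconnecting in 10 seconds" message in the 2.4 log.

```python
import time
from typing import Callable, Optional


class DatabaseDown(Exception):
    """Raised by the (hypothetical) connect callable when the DB is unreachable."""


def connect_with_retry(
    connect: Callable[[], object],
    retry_delay: float = 10.0,
    max_attempts: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call connect() until it succeeds and return its result.

    max_attempts=None retries forever; otherwise the last DatabaseDown
    is re-raised once the attempt budget is used up.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except DatabaseDown as exc:
            if max_attempts is not None and attempt >= max_attempts:
                raise
            print(f"database is down: reconnecting in {retry_delay:g} seconds ({exc})")
            sleep(retry_delay)


def make_flaky_connect(failures: int) -> Callable[[], str]:
    """Return a stand-in connect() that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def connect() -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise DatabaseDown("[2002] Can't connect to local MySQL server")
        return "connection"

    return connect
```

With `max_attempts=None` the loop never gives up, which would survive a 4-second MySQL restart; passing a number bounds the wait instead. The `sleep` parameter exists only so the delay can be replaced in tests.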
Comment by Frank Wall [ 2016 Jul 12 ]

Zabbix Server (and Proxy) should definitely keep trying to connect to the database (forever) and should NOT stop itself.

Comment by Oleksii Zagorskyi [ 2016 Jul 21 ]

Take a look at ZBX-11004 too; it might be the same issue.
If so, it may be closed as a duplicate.

viktors.tjarve Yes, you are right: the same thing is causing these two issues.
ZBX-11004 closed as a duplicate.

Comment by Viktors Tjarve [ 2016 Jul 22 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-10753

Comment by Andrey Melnikov [ 2016 Jul 22 ]

You are fixing the symptom, not the root cause: you must not call any mysql_* function (except mysql_errno/mysql_error) when mysql_connect() has failed.

Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 25 ]

Caused by ZBX-6163.

Comment by Glebs Ivanovskis (Inactive) [ 2016 Jul 26 ]

Dear lynxchaus, your statement is not true. See Example here, mysql_options() is called before mysql_real_connect().

Comment by Andris Zeila [ 2016 Jul 28 ]

Successfully tested

Comment by Viktors Tjarve [ 2016 Jul 28 ]

Released in:

  • 3.0.5rc1 r61246
  • 3.1.0 r61247
Comment by Nico Hänsel [ 2016 Sep 19 ]

Hello,

I just updated my proxy to 3.2.0 today, but the problem still exists...

Zabbix proxy log after reboot:
12175:20160919:100121.076 Got signal [signal:15(SIGTERM),sender_pid:6475,sender_uid:0,reason:0]. Exiting ...
12175:20160919:100123.079 [Z3001] connection to database 'zabbix_proxy' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
12175:20160919:100123.079 Cannot connect to the database. Exiting...

root@xxxx:~# zabbix_proxy -V
zabbix_proxy (Zabbix) 3.2.0

Comment by Oleksii Zagorskyi [ 2016 Sep 19 ]

Nico, your proxy log doesn't look related to the current issue.

Comment by Glebs Ivanovskis (Inactive) [ 2016 Sep 19 ]

Dear pc-nico, your issue looks more like ZBX-11203.

Comment by Nico Hänsel [ 2016 Sep 19 ]

Why not?

After a reboot, all my proxies are in this situation...

A quick /etc/init.d/zabbix_proxy restart resolves the problem.

I have investigated the problem: MySQL starts after Zabbix. This was also the case in the past, but on 2.6 the proxy waited, rechecked whether MySQL had come up, and then resumed.
Zabbix proxy 3.x does not wait and recheck for MySQL; it terminates immediately...

Comment by Glebs Ivanovskis (Inactive) [ 2016 Sep 19 ]

Because your proxies were killed:

12175:20160919:100121.076 Got signal [signal:15(SIGTERM),sender_pid:6475,sender_uid:0,reason:0]. Exiting ...

and this was the root cause of them exiting, not the unavailability of the database server.
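
The distinction drawn here (a final "Cannot connect to the database. Exiting..." line that follows a signal versus one caused by the database itself) can be checked mechanically. The following Python sketch is only an illustration: the function name `exit_cause` and the regular expressions are assumptions derived from the log lines quoted in this ticket, not from any Zabbix tool, and the function reports only the first such exit line it finds.

```python
import re
from typing import Dict, Iterable, Optional

# "PID:YYYYMMDD:HHMMSS.mmm message", as seen in the excerpts in this ticket.
LOG_LINE = re.compile(r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) (?P<msg>.*)$")
SIGNAL = re.compile(r"Got signal \[signal:\d+\((?P<name>SIG[A-Z0-9]+)\)")
DB_EXIT = "Cannot connect to the database. Exiting..."


def exit_cause(lines: Iterable[str]) -> Optional[str]:
    """Classify the first database-exit line in a log excerpt.

    Returns "signal <NAME>" if the same PID logged "Got signal [...]" earlier,
    "database unavailable" if it did not, and None if no exit line is present.
    """
    signalled: Dict[str, str] = {}  # pid -> name of the signal it received
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # continuation lines, "..." markers, blank lines
        pid, msg = match["pid"], match["msg"]
        sig = SIGNAL.search(msg)
        if sig:
            signalled[pid] = sig["name"]
        elif msg.startswith(DB_EXIT):
            if pid in signalled:
                return f"signal {signalled[pid]}"
            return "database unavailable"
    return None
```

Applied to the proxy log quoted above, the function attributes the exit to SIGTERM; applied to the 3.0 startup log in the description, it attributes the exit to the database.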

Comment by Nico Hänsel [ 2016 Sep 19 ]

OK, you are right... this line is from rebooting the server.

Zabbix proxy wasn't configured to start automatically on reboot.

systemctl enable zabbix-proxy resolved my problem, but I don't know why the Zabbix installation does not set this. It could also be a Debian problem...

Generated at Fri Apr 26 23:26:15 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.