[ZBX-16078] Zabbix hangs at shutdown Created: 2019 May 03  Updated: 2024 Aug 03  Resolved: 2019 Aug 12

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Incident report Priority: Trivial
Reporter: Franky Van Liedekerke Assignee: Andrei Gushchin (Inactive)
Resolution: Won't fix Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File image-2019-08-11-10-25-53-482.png    

 Description   

When rebooting the zabbix server (redhat 7.6, systemd), the server hangs on zabbix shutdown:

"a stop job is running for Zabbix Server"

Relevant rpm package info: zabbix-server-pgsql-4.2.1-1.el7.x86_64

Since the server was patched beforehand (yum update), could that have triggered this (a second reboot didn't cause this)? Because that would mean zabbix needs to be shut down manually before patching begins.



 Comments   
Comment by Vladislavs Sokurenko [ 2019 May 03 ]

Could you please show the output of "ps -ax | grep zabbix"?

Comment by Andrei Gushchin (Inactive) [ 2019 May 03 ]

Anything in the zabbix_server.log? How long does it take?

Comment by Franky Van Liedekerke [ 2019 May 04 ]

ps is impossible, since the server was shutting down (no login was possible anymore).

I checked /var/log/messages and the php/apache logs and found nothing. I'll check zabbix_server.log on Monday. As for how long it took: forever. It never finished (I needed a hard reset to recover).

Comment by Franky Van Liedekerke [ 2019 May 04 ]

Btw: I need to check it, but it seems the systemd timeout when stopping the service was set to forever. Maybe TimeoutStopSec could help.
I'll check that on Monday too.

Comment by Andrei Gushchin (Inactive) [ 2019 May 07 ]

Thank you. Is the Zabbix server PID file defined properly in the systemd service file?
Can systemd restart the Zabbix server service?
I suppose that this is some misconfiguration.

Comment by Franky Van Liedekerke [ 2019 May 07 ]

The systemd file is the one provided by the zabbix rpm by default:

[Unit]
Description=Zabbix Server
After=syslog.target
After=network.target

[Service]
Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
EnvironmentFile=-/etc/sysconfig/zabbix-server
Type=forking
Restart=on-failure
PIDFile=/run/zabbix/zabbix_server.pid
KillMode=control-group
ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
ExecStop=/bin/kill -SIGTERM $MAINPID
RestartSec=10s
TimeoutSec=0

[Install]
WantedBy=multi-user.target

Like I said: on normal startup, it works fine. But these messages show up during a reboot:

systemd: Stopping Zabbix Server...
systemd: zabbix-server.service: main process exited, code=exited, status=1/FAILURE
systemd: Stopped Zabbix Server.
systemd: Unit zabbix-server.service entered failed state.
systemd: zabbix-server.service failed.
...
systemd: Starting Zabbix Server...
systemd: PID file /run/zabbix/zabbix_agentd.pid not readable (yet?) after start.
systemd: Started Zabbix Server.

The two failure lines are from shutdown, the PID file message is from startup. Taking this into account, I think the zabbix service file could use some improvements.
Also, the TimeoutSec=0 param should not be used (this caused the infinite wait at shutdown the other time). If you want, use TimeoutStartSec=0 and set TimeoutStopSec to some reasonable value (like 60 sec or 120 sec or so).
Since zabbix_server has the "-f" option, this should be used together with Type=simple or so (and not forking).

(just suggesting here).
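
A minimal sketch of that suggestion as a drop-in, assuming it is saved under a hypothetical name such as /etc/systemd/system/zabbix-server.service.d/timeout.conf. The 120 second value is only illustrative and does not come from the packaged unit file:

```ini
# Hypothetical drop-in: /etc/systemd/system/zabbix-server.service.d/timeout.conf
[Service]
# Keep startup unbounded, as TimeoutSec=0 does in the packaged unit.
TimeoutStartSec=0
# Bound the stop job: if the service has not stopped within this time,
# systemd terminates it forcibly with SIGKILL.
# 120s is an illustrative value, not a tested recommendation.
TimeoutStopSec=120s
```

A later TimeoutStopSec= assignment overrides the stop half of TimeoutSec=, so the packaged file does not need to be edited. Run "systemctl daemon-reload" after creating the file. The trade-off is that a forced kill can interrupt whatever the server is still doing at shutdown.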

Comment by richlv [ 2019 May 07 ]

TimeoutSec=0 is used to avoid nuking server process during the DB upgrade.

Comment by Franky Van Liedekerke [ 2019 May 07 ]

While I can believe this to be true at startup, it can't be the case at shutdown. So then you should use TimeoutStartSec=0

Comment by richlv [ 2019 May 07 ]

It is in particular important for shutdown - let's say server starts up, DB upgrade is started. While it is in progress, Zabbix server process is stopped. This would leave the DB in a broken state.

Comment by Franky Van Liedekerke [ 2019 May 07 ]

While the process is starting up, it should not mark itself as started; that way a shutdown will not happen until the process has completed its startup. The correct method of doing that in systemd is using Type=notify and notifying systemd when startup has finished, e.g.:

https://www.freedesktop.org/software/systemd/man/systemd-notify.html
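
As a rough sketch of what that could look like, assuming zabbix_server were changed to send READY=1 through sd_notify() once startup (including any DB upgrade) has finished. Nothing in this thread indicates that it does so today, so this fragment of a full unit file is illustrative only:

```ini
[Service]
# Assumption: the daemon calls sd_notify(0, "READY=1") when fully started.
# Until that message arrives the unit stays in the "activating" state.
Type=notify
# Run in the foreground (-f) so systemd tracks the main process directly
# and no PIDFile= is needed.
ExecStart=/usr/sbin/zabbix_server -f -c $CONFFILE
# No limit on startup, so a long DB upgrade is not cut short.
TimeoutStartSec=0
```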

 

Comment by Matt Stephenson [ 2019 Aug 11 ]

On Ubuntu 18.04.3 with zabbix-server-mysql 4.0.11-1+bionic, I get this behaviour at every reboot of the server.

This is caused by MySQL being shut down before Zabbix Server, which fills the log with repeated MySQL connection attempts until systemd eventually times out and forces a shutdown/reboot.

Log:-
   776:20190810:210239.281 Got signal [signal:15(SIGTERM),sender_pid:1417,sender_uid:0,reason:0]. Exiting ...
  1173:20190810:210239.281 syncing history data in progress...
  1173:20190810:210239.281 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210239.281 database is down: reconnecting in 10 seconds
  1181:20190810:210239.303 [Z3005] query failed: [1053] Server shutdown in progress [select taskid,type,clock,ttl from task where status in (1,2) order by taskid]
  1181:20190810:210239.303 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210239.303 database is down: reconnecting in 10 seconds
  1173:20190810:210249.281 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210249.282 database is down: reconnecting in 10 seconds
  1181:20190810:210249.303 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210249.303 database is down: reconnecting in 10 seconds
  1173:20190810:210259.282 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210259.282 database is down: reconnecting in 10 seconds
  1181:20190810:210259.303 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210259.303 database is down: reconnecting in 10 seconds
  1173:20190810:210309.282 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210309.282 database is down: reconnecting in 10 seconds
  1181:20190810:210309.304 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210309.304 database is down: reconnecting in 10 seconds
  1173:20190810:210319.282 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210319.283 database is down: reconnecting in 10 seconds
  1181:20190810:210319.304 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210319.304 database is down: reconnecting in 10 seconds
  1167:20190810:210324.291 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1167:20190810:210324.291 database is down: reconnecting in 10 seconds
  1173:20190810:210329.283 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210329.283 database is down: reconnecting in 10 seconds
  1181:20190810:210329.304 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210329.304 database is down: reconnecting in 10 seconds
  1167:20190810:210334.291 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1167:20190810:210334.291 database is down: reconnecting in 10 seconds
  1173:20190810:210339.283 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210339.283 database is down: reconnecting in 10 seconds
  1181:20190810:210339.305 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1181:20190810:210339.305 database is down: reconnecting in 10 seconds
  1167:20190810:210344.291 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1167:20190810:210344.291 database is down: reconnecting in 10 seconds
  1173:20190810:210349.284 [Z3001] connection to database 'zabbix' failed: [2002] Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
  1173:20190810:210349.284 database is down: reconnecting in 10 seconds

Comment by Vladislavs Sokurenko [ 2019 Aug 11 ]

mattstephenson, Zabbix server cannot shut down without a working database, as the history syncers need to sync data to the database.

Comment by Marek Krolikowski [ 2019 Aug 11 ]

Same problem here with 4.0.11 on Debian 10:

root@zabbix:~# dpkg -l |grep zabbix
ii zabbix-agent 1:4.0.11-1+buster amd64 Zabbix network monitoring solution - agent
ii zabbix-frontend-php 1:4.0.11-1+buster all Zabbix network monitoring solution - PHP front-end
ii zabbix-get 1:4.0.11-1+buster amd64 Zabbix network monitoring solution - get
ii zabbix-java-gateway 1:4.0.11-1+buster all Zabbix network monitoring solution - java-gateway
ii zabbix-release 1:4.0-3+buster all Zabbix official repository configuration
ii zabbix-server-mysql 1:4.0.11-1+buster amd64 Zabbix network monitoring solution - server (MySQL)

 

And like other people, I can't log in via SSH or console.

Best regards

TaKeN

Comment by Vladislavs Sokurenko [ 2019 Aug 11 ]

Could you please show the contents of the zabbix unit file that is used by systemd?

Comment by Vladislavs Sokurenko [ 2019 Aug 11 ]

As richlv mentioned, the originally reported issue could be caused by ZBX-11203, which is expected behavior: the database upgrade patch should not be interrupted, because some queries cannot be rolled back.

But the new reports might be about another issue. It is possible that systemd for some reason stops the MySQL server first and then tries to stop Zabbix server. However, Zabbix server depends on the MySQL server and should always be stopped before MySQL is stopped; otherwise history cannot be synced reliably.
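
For reference, systemd stops units in the reverse of their start order, so an After= line naming the database unit is what makes Zabbix server stop first. A minimal sketch of a drop-in, assuming the local database unit is named mysql.service (it may be mysqld.service or mariadb.service on other systems) and using a hypothetical file name:

```ini
# Hypothetical drop-in: /etc/systemd/system/zabbix-server.service.d/db-order.conf
[Unit]
# Start zabbix-server after MySQL; at shutdown the order is reversed,
# so zabbix-server is stopped before MySQL.
After=mysql.service
```

After= only orders units; it does not start MySQL by itself, and it has no effect when the database runs on another host.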

Comment by Matt Stephenson [ 2019 Aug 11 ]

Unit file has no mention of MySQL

 

[Unit]
Description=Zabbix Server
After=syslog.target
After=network.target

[Service]
Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
EnvironmentFile=-/etc/default/zabbix-server
Type=forking
Restart=on-failure
PIDFile=/run/zabbix/zabbix_server.pid
KillMode=control-group
ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
ExecStop=/bin/kill -SIGTERM $MAINPID
RestartSec=10s
TimeoutSec=infinity

[Install]
WantedBy=multi-user.target

Comment by Marek Krolikowski [ 2019 Aug 11 ]

I have exactly the same as mattstephenson:

root@zabbix:~# md5sum /lib/systemd/system/zabbix-server.service
b1556e90e644d17175a48dccbebcff5a /lib/systemd/system/zabbix-server.service

root@zabbix:~# cat /lib/systemd/system/zabbix-server.service
[Unit]
Description=Zabbix Server
After=syslog.target
After=network.target

[Service]
Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
EnvironmentFile=-/etc/default/zabbix-server
Type=forking
Restart=on-failure
PIDFile=/run/zabbix/zabbix_server.pid
KillMode=control-group
ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
ExecStop=/bin/kill -SIGTERM $MAINPID
RestartSec=10s
TimeoutSec=infinity

[Install]
WantedBy=multi-user.target

 

 

Best Regards

TaKeN

Comment by Vladislavs Sokurenko [ 2019 Aug 11 ]

Thanks, I think that an optional dependency on the MySQL server must be added there.

Comment by Vladislavs Sokurenko [ 2019 Aug 12 ]

As for the second part of the issue, this is a duplicate of ZBX-15602.

Comment by Vladislavs Sokurenko [ 2019 Aug 12 ]

Closing this as Won't Fix, as the originally reported issue resolved itself after the upgrade was complete.

Comment by Goran [ 2024 Aug 01 ]

The default settings in 2024 also make it hang if the mysql service is down.

I didn't think to restart mysql before I powered off the server forcefully, so I lost about 6 hours of logs.

 

[Unit]
Description=Zabbix Server
After=syslog.target
After=network.target
After=mysql.service
After=mysqld.service
After=mariadb.service

[Service]
Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
EnvironmentFile=-/etc/default/zabbix-server
Type=forking
Restart=on-failure
PIDFile=/run/zabbix/zabbix_server.pid
KillMode=control-group
ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
ExecStop=/bin/sh -c '[ -n "$1" ] && kill -s TERM "$1"' -- "$MAINPID"
RestartSec=10s
TimeoutSec=infinity

[Install]
WantedBy=multi-user.target

Comment by Matt Stephenson [ 2024 Aug 01 ]

I still experience this, so I have a systemd override in place:

 

[Unit]
Requires=mysql.service

Comment by Goran [ 2024 Aug 01 ]

Thanks, I'll try that.

I don't understand systemd that well. What does that line do? I thought that was for boot time. Will it actually start or restart mysql if it's down on boot? I also guess that when the machine is rebooting, if mysql goes down before zabbix, it will restart mysql so that zabbix can shut down, and then mysql will shut down afterwards.

 

Well, I couldn't get the override to work with systemctl edit zabbix-server. No changes show up when I do systemctl cat zabbix.

Now I have to learn how to edit systemd files.

Comment by Matt Stephenson [ 2024 Aug 02 ]

I just make a file at: /etc/systemd/system/zabbix-server.service.d/override.conf

With contents:-

[Unit]
Requires=mysql.service

Restart server and this seems to make everything start/shutdown in the correct order.

 

Comment by Goran [ 2024 Aug 02 ]

Awesome, thanks. I did 'systemctl edit mysql' by accident, so the override.conf file was created for mysql.service instead. It works now. I tested by stopping mysql and then rebooting.

Comment by Marek Krolikowski [ 2024 Aug 02 ]

Since Zabbix 4.0, I have been using my own entries for systemctl and haven't had any issues. When setting up Zabbix for a client, I didn't create these entries, and the problem still occurs even in version 6.0. I don't quite understand why this BUG was closed if it still persists. Here are the entries I have in systemctl:

root@zabbix:~# systemctl edit zabbix-server
[Unit]
Description=Zabbix Server
Wants=mariadb.service
After=mariadb.service
After=syslog.target
After=network.target
Wants=mysql.service
After=mysql.service
Wants=postgresql.service
After=postgresql.service

Comment by Matt Stephenson [ 2024 Aug 02 ]

TaKeN I completely agree - I am using version 6.0 also. Surely this must be affecting more users of Zabbix Server/Proxy.

Comment by Marek Krolikowski [ 2024 Aug 02 ]

In the default file /lib/systemd/system/zabbix-server.service, developers should add the Wants= lines in addition to the lines starting with After=.
I don't quite understand why the developers haven't done this so far, and instead, we have to make our own entries on each machine using systemctl edit zabbix-server.service.

Comment by Goran [ 2024 Aug 03 ]

I guess the reason is

we can't depend on mysql service for 2 reasons:

  1. They might be using a flavor of MySQL, e. g. MariaDB (mariadb.service in this case).
  2. The database might be running on a separate, dedicated host.

https://support.zabbix.com/browse/ZBX-15602

 

That being said, I don't really know what exactly 'Wants' does, so I might be completely missing the point. I've read what the manual says about Wants, but I don't really have experience with it.

Comment by Matt Stephenson [ 2024 Aug 03 ]

I am no expert in systemd either, but I use "Requires" because this is a 'hard dependency', where stopping MySQL will stop Zabbix too (Zabbix cannot be running without MySQL).

"Wants" by comparison is a 'soft dependency': Zabbix will start MySQL (if not already started), but stopping MySQL will allow Zabbix to keep running.

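To make the difference concrete, here is a sketch of the two variants as a drop-in fragment, assuming the database unit is mysql.service. Only one of the two dependency lines would be active at a time. After= is needed in both cases because neither Wants= nor Requires= implies any ordering:

```ini
[Unit]
# Variant A (hard dependency): zabbix-server is stopped or restarted
# whenever mysql.service is explicitly stopped or restarted.
Requires=mysql.service

# Variant B (soft dependency): mysql.service is pulled in on start,
# but zabbix-server keeps running if MySQL is stopped later.
#Wants=mysql.service

# Ordering for either variant: start after MySQL, stop before it.
After=mysql.service
```

Both variants pull in mysql.service when zabbix-server is started, so neither fits a setup where the database runs on a separate host.
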
Generated at Wed Apr 30 06:29:59 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.