[ZBX-22122] Systemctl restart zabbix-server hangs
Created: 2022 Dec 22  Updated: 2024 Apr 10  Resolved: 2023 Jan 04

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 6.2.6
Fix Version/s: None

Type: Problem report
Priority: Trivial
Reporter: C H
Assignee: Juris Lambda
Resolution: Duplicate
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 20.04


Issue Links:
Causes
caused by ZBX-22097 HA manager is not stopped if Zabbix s... Closed
Team: Team B
Sprint: Sprint 95 (Dec 2022), Sprint 96 (Jan 2023)

 Description   

Steps to reproduce:

  1. Configuration: 
    LogFile=/var/log/zabbix/zabbix_server.log
    LogFileSize=0
    PidFile=/run/zabbix/zabbix_server.pid
    SocketDir=/run/zabbix
    DBHost=localhost
    DBName=zabbix
    DBUser=zabbix
    DBPassword=omitted
    DBPort=5000
    StartPollersUnreachable=3
    StartTrappers=5
    StartPingers=3
    StartDiscoverers=3
    SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
    MaxHousekeeperDelete=50000
    CacheSize=512M
    TrendCacheSize=8M
    TrendFunctionCacheSize=4M
    ValueCacheSize=36M
    Timeout=4
    FpingLocation=/usr/bin/fping
    Fping6Location=/usr/bin/fping6
    LogSlowQueries=3000
    StatsAllowedIP=127.0.0.1
    HANodeName=zabbix-prod-van2
    NodeAddress=127.0.0.1:10051 
  2. Run:
    sudo systemctl restart zabbix-server
  3. Wait; the restart command never returns.
  4. In another terminal, run:
    sudo pkill -9 zabbix_server

Result:
The restart hangs. Some server processes are marked [terminated], while others keep running:

$ ps auxww | grep zabbix_server

zabbix   1970382  0.0  0.0 112972  6164 ?        S    15:47   0:00 /usr/sbin/zabbix_server: ha manager
zabbix   1970389  0.0  0.0 678500  7280 ?        S    15:47   0:00 /usr/sbin/zabbix_server: service manager #1 [processed 0 events, updated 0 event tags, deleted 0 problems, synced 0 service updates, idle 4.002459 sec during 5.419148 sec]
zabbix   1970390  0.2  1.3 783264 220944 ?       S    15:47   0:06 /usr/sbin/zabbix_server: configuration syncer #1 [terminated]
zabbix   1970419  0.0  0.0 678236  5976 ?        S    15:48   0:00 /usr/sbin/zabbix_server: alert manager #1 [terminated]
zabbix   1970420  0.0  0.0 678236  4220 ?        S    15:48   0:00 /usr/sbin/zabbix_server: alerter #1 started
zabbix   1970421  0.0  0.0 678236  4220 ?        S    15:48   0:00 /usr/sbin/zabbix_server: alerter #2 started
zabbix   1970422  0.0  0.0 678236  4220 ?        S    15:48   0:00 /usr/sbin/zabbix_server: alerter #3 started
zabbix   1970423  0.1  0.8 761916 140084 ?       S    15:48   0:04 /usr/sbin/zabbix_server: preprocessing manager #1 [terminated]
zabbix   1970424  0.0  0.0 679472  6612 ?        S    15:48   0:00 /usr/sbin/zabbix_server: preprocessing worker #1 started
zabbix   1970425  0.0  0.0 679760  6876 ?        S    15:48   0:00 /usr/sbin/zabbix_server: preprocessing worker #2 started
zabbix   1970427  0.0  0.0 679688  6876 ?        S    15:48   0:00 /usr/sbin/zabbix_server: preprocessing worker #3 started
zabbix   1970428  0.0  0.0 678236  5212 ?        S    15:48   0:00 /usr/sbin/zabbix_server: lld manager #1 [terminated]
zabbix   1970429  0.0  0.0 682820 14652 ?        S    15:48   0:00 /usr/sbin/zabbix_server: lld worker #1 [processed 1 LLD rules, idle 295.191830 sec during 300.755677 sec]
zabbix   1970430  0.0  0.0 680956 12760 ?        S    15:48   0:00 /usr/sbin/zabbix_server: lld worker #2 [processed 1 LLD rules, idle 214.989587 sec during 220.226943 sec]
zabbix   1970431  0.0  0.1 689680 18804 ?        S    15:48   0:00 /usr/sbin/zabbix_server: housekeeper [removing old history and trends]
zabbix   1970432  0.0  0.0 678500  7512 ?        S    15:48   0:00 /usr/sbin/zabbix_server: timer #1 [terminated]
zabbix   1970433  0.0  0.0 678236  5936 ?        S    15:48   0:00 /usr/sbin/zabbix_server: http poller #1 [terminated]
zabbix   1970434  0.0  0.0 684144 13848 ?        S    15:48   0:00 /usr/sbin/zabbix_server: discoverer #1 [processed 0 rules in 0.000000 sec, performing discovery]
zabbix   1970435  0.0  0.0 684148 13848 ?        S    15:48   0:00 /usr/sbin/zabbix_server: discoverer #2 [processed 0 rules in 0.000000 sec, performing discovery]
zabbix   1970436  0.0  0.0 678236  7456 ?        S    15:48   0:00 /usr/sbin/zabbix_server: discoverer #3 [terminated]
zabbix   1970437  0.1  1.0 793348 176636 ?       S    15:48   0:02 /usr/sbin/zabbix_server: history syncer #1 [processed 867 values, 1417 triggers in 93.567296 sec, syncing history]
zabbix   1970438  0.1  0.9 772128 156484 ?       S    15:48   0:02 /usr/sbin/zabbix_server: history syncer #2 [processed 835 values, 1296 triggers in 87.592510 sec, syncing history]
zabbix   1970439  0.1  1.0 772836 165108 ?       S    15:48   0:02 /usr/sbin/zabbix_server: history syncer #3 [processed 885 values, 1387 triggers in 76.867496 sec, syncing history]
zabbix   1970440  0.1  1.1 790540 184104 ?       S    15:48   0:02 /usr/sbin/zabbix_server: history syncer #4 [processed 834 values, 1362 triggers in 92.601755 sec, syncing history]
zabbix   1970441  0.0  0.0 678236  7456 ?        S    15:48   0:00 /usr/sbin/zabbix_server: escalator #1 [terminated]
zabbix   1970442  0.0  0.0 678236  7456 ?        S    15:48   0:00 /usr/sbin/zabbix_server: proxy poller #1 [terminated]
zabbix   1970443  0.0  0.0 678236  4236 ?        S    15:48   0:00 /usr/sbin/zabbix_server: self-monitoring #1 [terminated]
zabbix   1970444  0.0  0.0 678236  5940 ?        S    15:48   0:00 /usr/sbin/zabbix_server: task manager #1 [terminated]
zabbix   1970445  0.0  0.1 688392 29760 ?        S    15:48   0:00 /usr/sbin/zabbix_server: poller #1 [got 6 values in 0.285501 sec, getting values]
zabbix   1970446  0.0  0.1 688360 29992 ?        S    15:48   0:00 /usr/sbin/zabbix_server: poller #2 [got 60 values in 0.734005 sec, getting values]
zabbix   1970447  0.0  0.1 690916 31224 ?        S    15:48   0:00 /usr/sbin/zabbix_server: poller #3 [got 3 values in 0.142524 sec, getting values]
zabbix   1970448  0.0  0.1 688488 29732 ?        S    15:48   0:00 /usr/sbin/zabbix_server: poller #4 [got 0 values in 0.000010 sec, getting values]
zabbix   1970449  0.0  0.1 688376 29704 ?        S    15:48   0:00 /usr/sbin/zabbix_server: poller #5 [got 3 values in 0.212871 sec, getting values]
zabbix   1970450  0.0  0.0 678236  8120 ?        S    15:48   0:00 /usr/sbin/zabbix_server: unreachable poller #1 [terminated]
zabbix   1970451  0.0  0.0 678236  8644 ?        S    15:48   0:00 /usr/sbin/zabbix_server: unreachable poller #2 [terminated]
zabbix   1970452  0.0  0.0 678236  8644 ?        S    15:48   0:00 /usr/sbin/zabbix_server: unreachable poller #3 [terminated]
zabbix   1970453  0.2  0.3 711668 55552 ?        S    15:48   0:05 /usr/sbin/zabbix_server: trapper #1 [processing data]
zabbix   1970454  0.2  0.2 700244 47196 ?        S    15:48   0:05 /usr/sbin/zabbix_server: trapper #2 [processing data]
zabbix   1970455  0.2  0.4 700560 67724 ?        S    15:48   0:05 /usr/sbin/zabbix_server: trapper #3 [processing data]
zabbix   1970456  0.2  0.3 711752 55624 ?        S    15:48   0:05 /usr/sbin/zabbix_server: trapper #4 [processing data]
zabbix   1970457  0.2  0.3 711676 55336 ?        S    15:48   0:05 /usr/sbin/zabbix_server: trapper #5 [processing data]
zabbix   1970458  0.0  0.0 681744  6756 ?        S    15:48   0:00 /usr/sbin/zabbix_server: icmp pinger #1 [terminated]
zabbix   1970459  0.0  0.0 681744  6756 ?        S    15:48   0:00 /usr/sbin/zabbix_server: icmp pinger #2 [terminated]
zabbix   1970460  0.0  0.0 681744  6756 ?        S    15:48   0:00 /usr/sbin/zabbix_server: icmp pinger #3 [terminated]
zabbix   1970461  0.0  0.0 678236  5976 ?        S    15:48   0:00 /usr/sbin/zabbix_server: alert syncer #1 [terminated]
zabbix   1970462  0.0  0.1 678380 20216 ?        S    15:48   0:00 /usr/sbin/zabbix_server: history poller #1 [got 1 values in 0.001266 sec, getting values]
zabbix   1970463  0.0  0.1 678380 20476 ?        S    15:48   0:00 /usr/sbin/zabbix_server: history poller #2 [got 8 values in 0.001009 sec, getting values]
zabbix   1970464  0.0  0.1 678380 20460 ?        S    15:48   0:00 /usr/sbin/zabbix_server: history poller #3 [got 13 values in 0.000897 sec, getting values]
zabbix   1970465  0.0  0.1 678380 20196 ?        S    15:48   0:00 /usr/sbin/zabbix_server: history poller #4 [got 6 values in 0.001527 sec, getting values]
zabbix   1970466  0.0  0.1 678380 20472 ?        S    15:48   0:00 /usr/sbin/zabbix_server: history poller #5 [got 7 values in 0.001050 sec, getting values]
zabbix   1970472  0.0  0.0 678236  5936 ?        S    15:48   0:00 /usr/sbin/zabbix_server: trigger housekeeper #1 [terminated]
zabbix   1970473  0.0  0.0 678236  5976 ?        S    15:48   0:00 /usr/sbin/zabbix_server: odbc poller #1 [terminated]
root     1978597  0.0  0.0   9032   648 pts/0    R+   16:27   0:00 grep --color=auto zabbix_server
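
A listing like the one above is easier to compare between runs when it is reduced to counts. The following is a minimal sketch, assuming that finished workers always carry the "[terminated]" suffix in their process title as they do here; the function name summarize_zabbix_ps is hypothetical and not part of Zabbix.

```shell
# Summarize a `ps auxww` listing of zabbix_server processes read from stdin.
# Assumption: a worker that has finished shows "[terminated]" at the end of
# its title; every other zabbix_server process is counted as still running.
summarize_zabbix_ps() {
    awk '
        /zabbix_server: /                         { total++ }
        /zabbix_server: .*\[terminated\][ \t]*$/  { term++ }
        END {
            printf "total=%d terminated=%d running=%d\n", total, term, total - term
        }
    '
}

# Usage:
#   ps auxww | summarize_zabbix_ps
```

The pattern requires the colon after "zabbix_server", so the grep process itself is not counted.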
 

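Expected:

The restart completes without hanging. For reference, the unit file and the override currently in place: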

$ sudo systemctl cat zabbix-server

# /etc/systemd/system/zabbix-server.service
[Unit]
Description=Zabbix Server
After=syslog.target
After=network.target
After=postgresql.service

[Service]
Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
EnvironmentFile=-/etc/default/zabbix-server
Type=forking
Restart=on-failure
PIDFile=/run/zabbix/zabbix_server.pid
KillMode=control-group
ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
ExecStop=/bin/sh -c '[ -n "$1" ] && kill -s TERM "$1"' -- "$MAINPID"
RestartSec=10s
TimeoutSec=infinity

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/zabbix-server.service.d/override.conf
[Service]
TimeoutSec=5s
 

Adding the TimeoutSec=5s override "fixed" it, but ZBX-15602 indicates that the infinity timeout is there to help diagnose missing database dependencies on shutdown.

See also https://support.zabbix.com/browse/ZBX-16078
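
In systemd, TimeoutSec is shorthand for both TimeoutStartSec and TimeoutStopSec, so the 5s override above also limits startup. A narrower workaround could bound only the stop phase. The sketch below writes such a drop-in; the helper name write_stop_timeout_override and the 60s default are assumptions chosen for illustration, not values recommended by Zabbix.

```shell
# Hypothetical helper: write a drop-in that bounds only the stop phase.
#   $1 = drop-in directory
#   $2 = stop timeout (defaults to 60s, an arbitrary assumption)
write_stop_timeout_override() {
    dir=$1
    stop_timeout=${2:-60s}
    mkdir -p "$dir" || return 1
    printf '[Service]\nTimeoutStopSec=%s\n' "$stop_timeout" > "$dir/override.conf"
}

# Usage on the host (needs root), followed by a reload so systemd sees it:
#   write_stop_timeout_override /etc/systemd/system/zabbix-server.service.d 60s
#   systemctl daemon-reload
```

This is still a workaround: once the stop timeout expires, systemd kills the remaining processes, which hides the underlying hang in the same way the 5s override does.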



 Comments   
Comment by Juris Lambda [ 2022 Dec 22 ]

I can't reproduce this in a clean, current 20.04 install with a 6.2.6 server.

Which database are you using? Is it up and accepting connections while you shut down the zabbix-server service? Do the states of the zabbix_server processes change? I see that some of them have already terminated, while others are still syncing.

c.h., could you please reproduce this behavior with DebugLevel set to 4 and attach the log file for that session here? Thanks in advance.

Comment by C H [ 2022 Dec 23 ]

Hi; the database is Postgres 15, and it's up.  I'm just restarting zabbix-server because I've edited the settings in /etc/zabbix/zabbix_server.conf and want them to take effect.

The problem is currently not reproducible.

It's possible that the database was unreachable, as we're figuring out how to use Zabbix HA with Postgres, Patroni, HAProxy, and Etcd, and this system is not ready for production yet.

I'm looking through the logs to see if that was the case.

Comment by C H [ 2022 Dec 23 ]

Ok, it does appear that database connectivity was lost.

If I do `systemctl stop haproxy` on the active zabbix server, the `systemctl restart zabbix-server` command hangs as described.
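
Since the hang shows up when the database endpoint is unreachable, a quick reachability check before restarting can flag the condition in advance. This is a minimal sketch, assuming bash and the coreutils timeout command are available; the function name db_port_open is hypothetical, and localhost:5000 is taken from DBHost and DBPort in the configuration above. It only shows that the port accepts TCP connections, not that PostgreSQL behind HAProxy is healthy (pg_isready would be a more specific check where the PostgreSQL client tools are installed).

```shell
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
#   $1 = host, $2 = port
# Relies on bash's /dev/tcp redirection, so bash is invoked explicitly.
db_port_open() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Usage:
#   if db_port_open localhost 5000; then
#       sudo systemctl restart zabbix-server
#   else
#       echo "database endpoint unreachable; restart may hang" >&2
#   fi
```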

Comment by Juris Lambda [ 2022 Dec 27 ]

Hey, c.h.!

I managed to reproduce this behaviour, and it appears to be a known bug, filed as ZBX-22097. With the patch applied, the service terminated fully.

If you have the option, do apply that patch, rebuild the server and redeploy.

Comment by Juris Lambda [ 2023 Jan 04 ]

Closing this as a duplicate of ZBX-22097.

Generated at Wed May 07 07:14:01 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.