Type: Problem report
Resolution: Unresolved
Priority: Major
None
Affects Version/s: 7.0.22, 7.4.6, 8.0.0alpha1
Component/s: Server (S)
Support backlog
Action
Kill the zabbix_server main process on the active node.
Expected
Zabbix server switches to the standby node.
Observed
Zabbix server does not respond to runtime commands and does not switch to the secondary node. At least some data continues to come in (not tested for how long).
Replication:
Two cleanly installed Ubuntu servers with the latest patches:
192.168.196.23 ha1.local ha1
192.168.196.24 ha2.local ha2
# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble
Standard Zabbix 7.0.22 installation on Ubuntu; the frontend and DB are on ha1 for testing.
Zabbix 7.0.22 is installed on both servers.
ha1 packages
zabbix-agent/zabbix,now 1:7.0.22-1+ubuntu24.04 amd64 [installed]
zabbix-frontend-php/zabbix,now 1:7.0.22-1+ubuntu24.04 all [installed]
zabbix-nginx-conf/zabbix,now 1:7.0.22-1+ubuntu24.04 all [installed]
zabbix-release/zabbix,zabbix,now 1:7.0-2+ubuntu24.04 all [installed]
zabbix-server-mysql/zabbix,now 1:7.0.22-1+ubuntu24.04 amd64 [installed]
zabbix-sql-scripts/zabbix,now 1:7.0.22-1+ubuntu24.04 all [installed]
ha2 packages
zabbix-agent/zabbix,now 1:7.0.22-1+ubuntu24.04 amd64 [installed]
zabbix-release/zabbix,zabbix,now 1:7.0-2+ubuntu24.04 all [installed]
zabbix-server-mysql/zabbix,now 1:7.0.22-1+ubuntu24.04 amd64 [installed]
Servers configured in HA mode
ha1
HANodeName=ha1
NodeAddress=192.168.196.23:10051
ha2
HANodeName=ha2
NodeAddress=192.168.196.24:10051
Processes on ha1
systemd-+-ModemManager---3*[{ModemManager}]
        |-agetty
        |-cron
        |-dbus-daemon
        |-mariadbd---79*[{mariadbd}]
        |-multipathd---6*[{multipathd}]
        |-nginx---7*[nginx]
        |-php-fpm8.3---9*[php-fpm8.3]
        |-polkitd---3*[{polkitd}]
        |-rsyslogd---3*[{rsyslogd}]
        |-snmpd
        |-sshd---sshd---sshd---bash---tmux: client
        |-systemd-+-(sd-pam)
        |         `-dbus-daemon
        |-systemd-journal
        |-systemd-logind
        |-systemd-network
        |-systemd-resolve
        |-systemd-timesyn---{systemd-timesyn}
        |-systemd-udevd
        |-tmux: server---bash---sudo---sudo---su---bash---pstree
        |-udisksd---5*[{udisksd}]
        |-unattended-upgr---{unattended-upgr}
        `-zabbix_server-+-45*[zabbix_server]
                        |-zabbix_server---16*[{zabbix_server}]
                        |-zabbix_server---5*[{zabbix_server}]
                        `-4*[zabbix_server---{zabbix_server}]
Processes on ha2
systemd-+-ModemManager---3*[{ModemManager}]
        |-agetty
        |-cron
        |-dbus-daemon
        |-multipathd---6*[{multipathd}]
        |-polkitd---3*[{polkitd}]
        |-rsyslogd---3*[{rsyslogd}]
        |-snmpd
        |-sshd---sshd---sshd---bash---tmux: client
        |-systemd-+-(sd-pam)
        |         `-dbus-daemon
        |-systemd-journal
        |-systemd-logind
        |-systemd-network
        |-systemd-resolve
        |-systemd-timesyn---{systemd-timesyn}
        |-systemd-udevd
        |-tmux: server---bash---sudo---sudo---su---bash---pstree
        |-udisksd---5*[{udisksd}]
        |-unattended-upgr---{unattended-upgr}
        `-zabbix_server---zabbix_server
Zabbix server shows working HA with Failover delay 60 seconds
root@ha1:~# zabbix_server -R ha_status
Failover delay: 60 seconds
Cluster status:
   #  ID                        Name  Address              Status   Last Access
   1. cmlfdgn8f0001b8819kmay2tp ha1   192.168.196.23:10051 active   1s
   2. cmlfdgub50001le82yj2ulgwr ha2   192.168.196.24:10051 standby  5s
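When repeating this check, the active node can be pulled out of the ha_status listing with a small filter. The following is only a sketch: the function name active_ha_node is hypothetical (not a Zabbix command), and it assumes the column layout shown above and node names without whitespace.

```shell
# Hypothetical helper (not part of Zabbix): read `zabbix_server -R ha_status`
# output on stdin and print the name of every node whose status is "active".
# Assumes the columns are: "#." ID Name Address Status "Last Access".
active_ha_node() {
    awk '$1 ~ /^[0-9]+\.$/ && $5 == "active" { print $3 }'
}
```

Possible usage on the active node: zabbix_server -R ha_status | active_ha_node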
Find and kill the main process of the active Zabbix server
root@ha1:~# ps -ef|grep "zabbix_server -c"
zabbix 1992 1 0 14:51 ? 00:00:00 /usr/sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf
root 2110 1291 0 14:54 pts/2 00:00:00 grep --color=auto zabbix_server -c
root@ha1:~# kill -9 1992
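For repeated test runs, the main PID can be selected without reading the ps listing by eye. This is a sketch under assumptions: the function name find_zabbix_main_pid is hypothetical, it expects `ps -ef` column order (PID in column 2, command starting in column 8), and it relies on the main process being the one started as "zabbix_server -c ...", as in the listing above.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID of the
# process whose command is exactly ".../zabbix_server -c <config>".
# The `grep` line from the listing above is skipped because its command
# (column 8) is "grep", not zabbix_server.
find_zabbix_main_pid() {
    awk '$8 ~ "(^|/)zabbix_server$" && $9 == "-c" { print $2 }'
}
```

Possible usage: ps -ef | find_zabbix_main_pid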
The remaining zabbix_server processes are now children of systemd
systemd-+-ModemManager---3*[{ModemManager}]
        |-agetty
        |-cron
        |-dbus-daemon
        |-mariadbd---79*[{mariadbd}]
        |-multipathd---6*[{multipathd}]
        |-nginx---7*[nginx]
        |-php-fpm8.3---10*[php-fpm8.3]
        |-polkitd---3*[{polkitd}]
        |-rsyslogd---3*[{rsyslogd}]
        |-snmpd
        |-sshd---sshd---sshd---bash---tmux: client
        |-systemd-+-(sd-pam)
        |         `-dbus-daemon
        |-systemd-journal
        |-systemd-logind
        |-systemd-network
        |-systemd-resolve
        |-systemd-timesyn---{systemd-timesyn}
        |-systemd-udevd
        |-tmux: server---bash---sudo---sudo---su---bash---pstree
        |-udisksd---5*[{udisksd}]
        |-unattended-upgr---{unattended-upgr}
        |-45*[zabbix_server]
        |-zabbix_server---16*[{zabbix_server}]
        |-zabbix_server---5*[{zabbix_server}]
        `-4*[zabbix_server---{zabbix_server}]
The waiting period starts. Zabbix server does not switch to the secondary node; the secondary node remains in standby.
root@ha1:~# date
Tue Feb 10 02:55:06 PM UTC 2026
root@ha1:~# zabbix_server -R ha_status
zabbix_server [2125]: Cannot perform runtime control command: Timeout while waiting for response
After waiting three minutes, nothing has changed
root@ha1:~# date
Tue Feb 10 02:58:06 PM UTC 2026
root@ha1:~# zabbix_server -R ha_status
zabbix_server [2145]: Cannot perform runtime control command: Timeout while waiting for response
No changes on the secondary node either
root@ha2:~# date
Tue Feb 10 03:00:14 PM UTC 2026
root@ha2:~# zabbix_server -R ha_status
Runtime commands can be executed only in active mode
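The manual date / ha_status checks above could be scripted so that each sample carries its own timestamp. This is a minimal sketch, not part of Zabbix: the function name poll_with_timestamps and its arguments (count, interval in seconds, then the command to run) are assumptions made for illustration.

```shell
# Hypothetical helper: run a command COUNT times, INTERVAL seconds apart,
# and prefix every output line with a UTC timestamp.
poll_with_timestamps() {
    count=$1
    interval=$2
    shift 2
    i=0
    while [ "$i" -lt "$count" ]; do
        ts=$(date -u '+%Y-%m-%d %H:%M:%S')
        # Merge stderr into stdout so error messages are timestamped too.
        "$@" 2>&1 | sed "s/^/$ts /"
        i=$((i + 1))
        # Sleep between samples, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Possible usage, sampling every 30 seconds for three minutes: poll_with_timestamps 6 30 zabbix_server -R ha_status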
Meanwhile, Zabbix server on ha1 writes these messages to its log
2001:20260210:150158.744 cannot send history syncer notification
2001:20260210:150216.760 cannot write to IPC socket: Broken pipe
2001:20260210:150216.760 cannot send history syncer notification
The service does not react to a restart; the processes can only be killed.
After killing the processes on ha1 and starting Zabbix server on ha1 again, the cluster switched to the second node
root@ha2:~# zabbix_server -R ha_status
Failover delay: 60 seconds
Cluster status:
   #  ID                        Name  Address              Status   Last Access
   1. cmlfdgn8f0001b8819kmay2tp ha1   192.168.196.23:10051 standby  3s
   2. cmlfdgub50001le82yj2ulgwr ha2   192.168.196.24:10051 active   2s
The server log file on ha2 contains the message
857:20260210:150823.947 "ha2" node switched to "active" mode
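To line up the moment of failover with the kill time, mode-switch messages like the one above can be extracted from the log. A sketch only: the function name ha_switch_events is hypothetical, and the pattern assumes the "PID:YYYYMMDD:HHMMSS.mmm" prefix and message wording seen in this report.

```shell
# Hypothetical helper: read a Zabbix server log on stdin and print
# "<date> <time> <node> <mode>" for each HA mode-switch message.
ha_switch_events() {
    sed -n 's/^ *[0-9]*:\([0-9]\{8\}\):\([0-9]\{6\}\)\.[0-9]* "\([^"]*\)" node switched to "\([^"]*\)" mode.*/\1 \2 \3 \4/p'
}
```

Possible usage, assuming the package default log path: ha_switch_events < /var/log/zabbix/zabbix_server.log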