[ZBX-21159] Zabbix HA Manager crashing Created: 2022 Jun 02  Updated: 2024 Apr 10  Resolved: 2022 Jul 11

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 6.0.5
Fix Version/s: 6.0.7rc1, 6.2.1rc1, 6.4.0alpha1, 6.4 (plan)

Type: Problem report Priority: Critical
Reporter: Chris Bateson Assignee: Andris Zeila
Resolution: Fixed Votes: 1
Labels: HA, crash, highavailability, selinux
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

RHEL 8.6
Zabbix 6.0.5


Issue Links:
Causes
Team: Team A
Sprint: Sprint 90 (Jul 2022)
Story Points: 1

 Description   

I have my system set to auto-update (a mistake, I know), and I believe this issue appeared when it picked up the 6.0.5 update. I'm running a single server, not in HA mode. It appears to be stuck in a loop: zabbix-server starts up, attempts to start the HA Manager, fails, and crashes; then zabbix-server starts up again, rinse and repeat.

Config
/etc/zabbix/zabbix_server.conf (Blank lines and comments removed)

LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=0
DebugLevel=4
PidFile=/var/run/zabbix/zabbix_server.pid
SocketDir=/var/run/zabbix
DBName=zabbix
DBUser=zabbix
DBPassword=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
StartPollers=25
StartPollersUnreachable=100
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
CacheSize=64M
Timeout=20
LogSlowQueries=3000
StatsAllowedIP=127.0.0.1 

Steps to reproduce:

  1. systemctl start zabbix-server

Result:
See log file:

8823:20220602:094806.426 Starting Zabbix Server. Zabbix 6.0.5 (revision 8da3e1f8419).
  8823:20220602:094806.426 ****** Enabled features ******
  8823:20220602:094806.426 SNMP monitoring:           YES
  8823:20220602:094806.426 IPMI monitoring:           YES
  8823:20220602:094806.427 Web monitoring:            YES
  8823:20220602:094806.427 VMware monitoring:         YES
  8823:20220602:094806.427 SMTP authentication:       YES
  8823:20220602:094806.427 ODBC:                      YES
  8823:20220602:094806.427 SSH support:               YES
  8823:20220602:094806.427 IPv6 support:              YES
  8823:20220602:094806.427 TLS support:               YES
  8823:20220602:094806.427 ******************************
  8823:20220602:094806.427 using configuration file: /etc/zabbix/zabbix_server.conf
  8823:20220602:094806.427 In zbx_load_modules()
  8823:20220602:094806.427 End of zbx_load_modules():SUCCEED
  8823:20220602:094806.427 In zbx_ipc_service_start() service:rtc
  8823:20220602:094806.427 In zbx_ipc_socket_open()
  8823:20220602:094806.427 End of zbx_ipc_socket_open():FAIL
  8823:20220602:094806.427 End of zbx_ipc_service_start():SUCCEED
  8823:20220602:094806.427 In zbx_db_get_database_type()
  8823:20220602:094806.427 In DBconnect() flag:0
  8823:20220602:094806.432 End of DBconnect():0
  8823:20220602:094806.432 query [txnlev:0] [select userid from users limit 1]
  8823:20220602:094806.432 there is at least 1 record in "users" table
  8823:20220602:094806.432 End of zbx_db_get_database_type():ZBX_DB_SERVER
  8823:20220602:094806.432 In init_database_cache()
  8823:20220602:094806.432 In zbx_mem_create() param:'HistoryCacheSize' size:16777216
  8823:20220602:094806.432 valid user addresses: [0x7f0437693170, 0x7f0438692ff0] total size: 16776832
  8823:20220602:094806.432 End of zbx_mem_create()
  8823:20220602:094806.432 In zbx_mem_create() param:'HistoryIndexCacheSize' size:4194304
  8823:20220602:094806.432 valid user addresses: [0x7f0437293180, 0x7f0437692ff0] total size: 4193904
  8823:20220602:094806.432 End of zbx_mem_create()
  8823:20220602:094806.432 In init_trend_cache()
  8823:20220602:094806.432 In zbx_mem_required_size() size:0 chunks_num:1 descr:'trend cache' param:'TrendCacheSize'
  8823:20220602:094806.432 End of zbx_mem_required_size() size:422
  8823:20220602:094806.432 In zbx_mem_create() param:'TrendCacheSize' size:4194304
  8823:20220602:094806.432 valid user addresses: [0x7f0436e93170, 0x7f0437292ff0] total size: 4193920
  8823:20220602:094806.432 End of zbx_mem_create()
  8823:20220602:094806.432 End of init_trend_cache()
  8823:20220602:094806.432 End of init_database_cache()
  8823:20220602:094806.432 In DBconnect() flag:0
  8823:20220602:094806.434 End of DBconnect():0
  8823:20220602:094806.434 query [txnlev:0] [select default_character_set_name,default_collation_name from information_schema.SCHEMATA where schema_name='zabbix']
  8823:20220602:094806.434 query [txnlev:0] [select count(*) from information_schema.`COLUMNS` where table_schema='zabbix' and data_type in ('text','varchar','longtext') and (character_set_name not in ('utf8','utf8mb3','utf8mb4') or collation_name not in ('utf8_bin','utf8mb3_bin','utf8mb4_bin'))]
  8823:20220602:094806.444 In DBconnect() flag:0
  8823:20220602:094806.444 End of DBconnect():0
  8823:20220602:094806.444 In zbx_dbms_version_info_extract()
  8823:20220602:094806.444 End of zbx_dbms_version_info_extract() version:80026
  8823:20220602:094806.444 In DBcheck_version()
  8823:20220602:094806.444 In DBconnect() flag:0
  8823:20220602:094806.445 End of DBconnect():0
  8823:20220602:094806.445 query [txnlev:0] [show tables like 'dbversion']
  8823:20220602:094806.446 query [txnlev:0] [select mandatory,optional from dbversion]
  8823:20220602:094806.446 current database version (mandatory/optional): 06000000/06000002
  8823:20220602:094806.447 required mandatory version: 06000000
  8823:20220602:094806.447 End of DBcheck_version():SUCCEED
  8823:20220602:094806.447 In DBconnect() flag:0
  8823:20220602:094806.448 End of DBconnect():0
  8823:20220602:094806.448 query [txnlev:0] [show columns from config like 'dbversion_status']
  8823:20220602:094806.450 query [txnlev:0] [show index from history where key_name='PRIMARY']
  8823:20220602:094806.451 In DBflush_version_requirements()
  8823:20220602:094806.451 query without transaction detected
  8823:20220602:094806.451 query [txnlev:0] [update config set dbversion_status='[{"database":"MySQL","current_version":"8.00.26","min_version":"5.07.28","max_version":"8.00.x","history_pk":1,"min_supported_version":"8.00.0","flag":0}]']
  8823:20220602:094806.452 End of DBflush_version_requirements()
  8823:20220602:094806.452 In DBcheck_double_type()
  8823:20220602:094806.452 In DBconnect() flag:0
  8823:20220602:094806.453 End of DBconnect():0
  8823:20220602:094806.453 query [txnlev:0] [select count(*) from information_schema.columns where table_schema='zabbix' and column_type='double' and ((lower(table_name)='trends' and (lower(column_name) in ('value_min', 'value_avg', 'value_max'))) or (lower(table_name)='history' and lower(column_name)='value'))]
  8823:20220602:094806.454 End of DBcheck_double_type()
  8823:20220602:094806.454 In DBconnect() flag:0
  8823:20220602:094806.455 End of DBconnect():0
  8823:20220602:094806.455 query [txnlev:0] [select configid,instanceid from config order by configid]
  8823:20220602:094806.456 In zbx_ha_start()
  8823:20220602:094806.456 In zbx_ipc_service_recv() timeout:1.000
  8824:20220602:094806.456 zbx_setproctitle() title:'ha manager'
  8824:20220602:094806.456 starting HA manager
  8824:20220602:094806.457 In zbx_ipc_service_start() service:haservice
  8824:20220602:094806.457 In zbx_ipc_socket_open()
  8824:20220602:094806.457 End of zbx_ipc_socket_open():FAIL
  8824:20220602:094806.457 End of zbx_ipc_service_start():SUCCEED
  8824:20220602:094806.457 In zbx_ipc_async_socket_open()
  8824:20220602:094806.457 In zbx_ipc_socket_open()
  8823:20220602:094807.457 End of zbx_ipc_service_recv():2
  8823:20220602:094807.458 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094808.459 End of zbx_ipc_service_recv():2
  8823:20220602:094808.459 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094809.460 End of zbx_ipc_service_recv():2
  8823:20220602:094809.460 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094810.461 End of zbx_ipc_service_recv():2
  8823:20220602:094810.461 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094811.461 End of zbx_ipc_service_recv():2
  8823:20220602:094811.461 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094812.462 End of zbx_ipc_service_recv():2
  8823:20220602:094812.462 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094813.464 End of zbx_ipc_service_recv():2
  8823:20220602:094813.464 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094814.465 End of zbx_ipc_service_recv():2
  8823:20220602:094814.465 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094815.465 End of zbx_ipc_service_recv():2
  8823:20220602:094815.465 In zbx_ipc_service_recv() timeout:1.000
  8823:20220602:094816.466 End of zbx_ipc_service_recv():2
  8823:20220602:094816.467 One child process died (PID:8824,exitcode/signal:9). Exiting ...
  8823:20220602:094816.467 End of zbx_ha_start():FAIL
  8823:20220602:094816.467 cannot start HA manager: timeout while waiting for HA manager registration 

Expected:
Zabbix server starts successfully.
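
For anyone checking whether they are hitting the same failure, here is a minimal sketch of a log check. The function names are hypothetical; the matched strings are copied from the log above, and the path in the usage comment is the LogFile value from the config.

```shell
# Hypothetical helpers; both read zabbix_server.log content on stdin.

# Succeeds if the log contains the HA manager registration timeout.
ha_manager_timed_out() {
    grep -q 'cannot start HA manager: timeout while waiting for HA manager registration'
}

# Prints "<pid> <exitcode-or-signal>" for every child the server reported as dead.
died_children() {
    sed -n 's/.*One child process died (PID:\([0-9]*\),exitcode\/signal:\([0-9]*\)).*/\1 \2/p'
}

# Example use:
#   ha_manager_timed_out < /var/log/zabbix/zabbix_server.log && echo "HA manager timeout"
#   died_children < /var/log/zabbix/zabbix_server.log
```

A match only confirms the symptom; the cause still has to be found elsewhere, for example in the audit log.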



 Comments   
Comment by Christoph Schmocker [ 2022 Jun 07 ]

The same issue here. After update from 6.0.4 to 6.0.5, the HA manager is starting without a config.

 32787:20220607:085316.471 Starting Zabbix Server. Zabbix 6.0.5 (revision 8da3e1f8419).
 32787:20220607:085316.471 ****** Enabled features ******
 32787:20220607:085316.471 SNMP monitoring:           YES
 32787:20220607:085316.471 IPMI monitoring:           YES
 32787:20220607:085316.471 Web monitoring:            YES
 32787:20220607:085316.471 VMware monitoring:         YES
 32787:20220607:085316.471 SMTP authentication:       YES
 32787:20220607:085316.471 ODBC:                      YES
 32787:20220607:085316.471 SSH support:               YES
 32787:20220607:085316.471 IPv6 support:              YES
 32787:20220607:085316.471 TLS support:               YES
 32787:20220607:085316.471 ******************************
 32787:20220607:085316.471 using configuration file: /etc/zabbix/zabbix_server.conf
 32787:20220607:085316.588 current database version (mandatory/optional): 06000000/06000002
 32787:20220607:085316.588 required mandatory version: 06000000
 32787:20220607:085316.625 database could be upgraded to use primary keys in history tables
 32788:20220607:085316.666 starting HA manager
 32787:20220607:085326.678 One child process died (PID:32788,exitcode/signal:9). Exiting ...
 32787:20220607:085326.678 cannot start HA manager: timeout while waiting for HA manager registration
 
Comment by Chung Yun Loo [ 2022 Jun 09 ]

I ran into the same issue on a CentOS 8 Stream server. While reviewing the SELinux audit messages, I found the following log entries:

UID="zabbix" GID="zabbix" EUID="zabbix" SUID="zabbix" FSUID="zabbix" EGID="zabbix" SGID="zabbix" FSGID="zabbix"
type=AVC msg=audit(1654803948.194:241440): avc:  denied  { connectto } for  pid=18382 comm="zabbix_server" path="/run/zabbix/zabbix_server_rtc.sock" scontext=system_u:system_r:zabbix_t:s0 tcontext=system_u:system_r:zabbix_t:s0 tclass=unix_stream_socket permissive=0
type=SYSCALL msg=audit(1654803948.194:241440): arch=c000003e syscall=42 success=no exit=-13 a0=e a1=7ffdb4ab0900 a2=6e a3=2 items=0 ppid=18381 pid=18382 auid=4294967295 uid=993 gid=990 euid=993 suid=993 fsuid=993 egid=990 sgid=990 fsgid=990 tty=(none) ses=4294967295 comm="zabbix_server" exe="/usr/sbin/zabbix_server_mysql" subj=system_u:system_r:zabbix_t:s0 key=(null)ARCH=x86_64 SYSCALL=connect AUID="unset" UID="zabbix" GID="zabbix" EUID="zabbix" SUID="zabbix" FSUID="zabbix" EGID="zabbix" SGID="zabbix" FSGID="zabbix"
type=SERVICE_STOP msg=audit(1654803948.290:241441): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=zabbix-server comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'UID="root" AUID="unset" 
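
Entries like the ones above can be picked out of raw audit records with a small filter. This is only a sketch: the function names are hypothetical, and it assumes the AVC record layout shown in this comment (a `denied  { permission }` clause followed by a `path="..."` field).

```shell
# Hypothetical helpers; both read raw audit records on stdin.

# Keep only AVC denials raised by the zabbix_server command.
zabbix_avc_denials() {
    grep 'type=AVC' | grep 'denied' | grep 'comm="zabbix_server"'
}

# Summarise each denial as "<permission> <path>" for quick reading.
zabbix_avc_summary() {
    zabbix_avc_denials |
        sed -n 's/.*denied  *{ *\([a-z_ ]*[a-z_]\) *} .*path="\([^"]*\)".*/\1 \2/p'
}

# Example use on an SELinux host (run as root):
#   zabbix_avc_summary < /var/log/audit/audit.log
#   ausearch -m avc -c zabbix_server | zabbix_avc_summary
```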

It's an SELinux access violation that occurs when zabbix server tries to connect to the socket file /run/zabbix/zabbix_server_rtc.sock. Passing the SELinux audit message through audit2why returned the following result:

[root@zabbix ~]# audit2why < selinux.log
type=AVC msg=audit(1654803948.194:241440): avc:  denied  { connectto } for  pid=18382 comm="zabbix_server" path="/run/zabbix/zabbix_server_rtc.sock" scontext=system_u:system_r:zabbix_t:s0 tcontext=system_u:system_r:zabbix_t:s0 tclass=unix_stream_socket permissive=0

    Was caused by:
    The boolean daemons_enable_cluster_mode was set incorrectly.
    Description:
    Allow daemons to enable cluster mode

    Allow access by executing:
    # setsebool -P daemons_enable_cluster_mode 1

So, as root, issue the following command to allow clustering in SELinux...

setsebool -P daemons_enable_cluster_mode 1

or using the "on" and "off" keywords:

setsebool -P daemons_enable_cluster_mode on

(The -P option writes the policy change to disk so it persists between reboots.)

Two additional SELinux policy changes are required to allow Apache HTTP Server to initiate network connections and connect to zabbix server, otherwise there's a connection error on the Zabbix dashboard:

setsebool -P httpd_can_network_connect 1
setsebool -P httpd_can_connect_zabbix 1

Might also need to restart zabbix server depending on your particular setup:

systemctl restart zabbix-server

To see the entire list of SELinux boolean values:

getsebool -a
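
To check only the three booleans mentioned above before changing anything, the `getsebool -a` output can be filtered. A minimal sketch follows; the function name is hypothetical and it assumes the usual `name --> on|off` output format.

```shell
# Hypothetical helper: print which of the named booleans are currently off.
# Arguments: boolean names to check. Stdin: output of `getsebool -a`.
booleans_off() {
    awk -v want="$*" '
        BEGIN {
            # Build a lookup table from the space-separated names.
            n = split(want, names, " ")
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        $2 == "-->" && $3 == "off" && ($1 in wanted) { print $1 }
    '
}

# Example use on an SELinux host:
#   getsebool -a | booleans_off daemons_enable_cluster_mode \
#       httpd_can_network_connect httpd_can_connect_zabbix
```

An empty result means all three are already on.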

It looks like the problem is from the upgraded SELinux policy packages. From my server's /var/log/dnf.log:

2022-06-09T01:24:45-0500 DEBUG Upgraded: selinux-policy-3.14.3-99.el8.noarch
2022-06-09T01:24:45-0500 DEBUG Upgraded: selinux-policy-targeted-3.14.3-99.el8.noarch
2022-06-09T01:24:45-0500 DEBUG Upgraded: zabbix-selinux-policy-6.0.5-1.el8.x86_64
Comment by Chris Bateson [ 2022 Jun 10 ]

You are correct, it looks like it was SELinux related. To be honest, I completely forgot I had enabled that, but I figured I'd check after reading your comment. Sure enough, I have it set to enforcing.

Comment by Jurijs Klopovskis [ 2022 Jun 29 ]

Released an updated zabbix-selinux-policy-6.0.6-2 package for RHEL 7 and 8.

A buggy %postun scriptlet was added in 6.0.5. It purged the installed zabbix_policy not only during package removal, as it should, but also during an update.

If you have the buggy 6.0.5-1 or 6.0.6-1 package installed, a direct update to 6.0.6-2 will not work, because the old buggy package will still purge zabbix_policy. You must first uninstall the old zabbix-selinux-policy package and then install the new one.

# dnf remove zabbix-selinux-policy
# dnf clean all
# dnf install zabbix-selinux-policy

During deinstallation of the buggy package, you may see the following message

libsemanage.semanage_direct_remove_key: Unable to remove module zabbix_policy at priority 400. (No such file or directory).
semodule:  Failed!

That's OK.

Upgrade from 6.0.4 and older should be OK.
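
To decide whether the remove-then-install procedure is needed, the installed package build can be compared against the two builds named above. This is a sketch only: the function name is hypothetical, and it assumes the version-release string looks like `6.0.5-1.el8`.

```shell
# Hypothetical helper: succeeds if the given version-release string is one of
# the builds described above as carrying the buggy %postun scriptlet.
is_buggy_policy_build() {
    case "$1" in
        6.0.5-1|6.0.5-1.*|6.0.6-1|6.0.6-1.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use:
#   vr=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' zabbix-selinux-policy)
#   is_buggy_policy_build "$vr" && echo "remove the package before installing 6.0.6-2"
# To see whether the policy module is currently loaded:
#   semodule -l | grep zabbix_policy
```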

Comment by Andris Zeila [ 2022 Jul 11 ]

Released ZBX-21159 in:

  • pre-6.0.7rc1 0ad5887976e
  • pre-6.2.1rc1 9803876667b
  • pre-6.4.0alpha1 74742edcb04