[ZBX-12984] Zabbix Server restarting continuously after upgrade to 3.4.3 Created: 2017 Nov 05  Updated: 2019 May 17  Resolved: 2017 Nov 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 3.4.3
Fix Version/s: None

Type: Incident report Priority: Critical
Reporter: Maksims Edelmans Assignee: Unassigned
Resolution: Commercial support required Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBX-12985 Too many open files Closed

 Description   

Hello,
We are facing an issue with our Zabbix Server after upgrading from 3.2.9 to 3.4.3: the server is restarting continuously.
At first we did not realize that the StartPreprocessors parameter needed tuning, and the preprocessor was consuming 100% of one CPU. After changing to StartPreprocessors=300, these lines appear in error_log:

zabbix_server [7327]: failed to open log file: [24] Too many open files
zabbix_server [7327]: failed to write [cannot accept incoming IPC connection: [24] Too many open files] into log file

With StartPreprocessors set to 100, Zabbix Server still restarts with the same error message, but only once every 15-20 minutes.
The following parameters have been changed, but that does not solve the issue:

[root@zabbix ~]# cat /proc/sys/fs/file-max
2097152
[root@zabbix ~]# cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
dev.raid.speed_limit_min = 10000
dev.raid.speed_limit_max = 500000
vm.swappiness=1
vm.vfs_cache_pressure=50
net.ipv4.ip_no_pmtu_disc=1
fs.file-max = 2097152
[root@zabbix ~]# vim /etc/security/limits.conf
* soft nproc 500000
* hard nproc 500000
* soft nofile 500000
* hard nofile 500000
[root@zabbix ~]# su - zabbix -c 'ulimit -aHS' -s '/bin/bash'
su: warning: cannot change directory to /var/lib/zabbix: No such file or directory
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256296
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 500000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
[root@zabbix ~]# su - root -c 'ulimit -aHS' -s '/bin/bash'
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256296
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 500000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 500000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

zabbix_server.conf:

LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=100
DebugLevel=3
PidFile=/var/run/zabbix/zabbix_server.pid
DBName=zabbix
DBUser=zabbix
StartPollers=750
StartPreprocessors=300
StartPollersUnreachable=200
StartTrappers=20
StartPingers=20
StartDiscoverers=250
StartTimers=3
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
HousekeepingFrequency=0
MaxHousekeeperDelete=1000
CacheSize=2048M
CacheUpdateFrequency=360
StartDBSyncers=2
HistoryCacheSize=512M
HistoryIndexCacheSize=128M
TrendCacheSize=128M
ValueCacheSize=512M
Timeout=30
UnreachablePeriod=360
UnavailableDelay=600
UnreachableDelay=120
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
LogSlowQueries=3000

The Zabbix system is currently an all-in-one installation:

[root@zabbix ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md127      280G  8.9G  272G   4% /
devtmpfs         32G     0   32G   0% /dev
tmpfs            32G     0   32G   0% /dev/shm
tmpfs            32G  8.8M   32G   1% /run
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md125      954G  179G  775G  19% /db
/dev/md126      2.0G  254M  1.8G  13% /boot
/dev/md2        1.6T  792G  783G  51% /mnt/lstorage
/dev/sdb3       200M     0  200M   0% /boot/efi2
/dev/sda1       200M  9.8M  191M   5% /boot/efi
tmpfs           6.3G     0  6.3G   0% /run/user/0

Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

[root@zabbix ~]#  free
              total        used        free      shared  buff/cache   available
Mem:       65696748    24664656    39955132      460668     1076960    39956428
Swap:       5915644           0     5915644

Looking for help ASAP!



 Comments   
Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 06 ]

I suspect that ulimit -n returns something too small for your configuration, check this and this warning. Sorry, I don't know how to increase it...
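
The ulimit output in the description comes from a login shell, and a daemon started by the init system may run with different limits. One way to see the limit that applies to the running server is to read /proc/<pid>/limits. The following is a minimal sketch, assuming Linux; the function names and the sample text are illustrative and not part of Zabbix:

```python
def _to_int(value):
    """Convert a limits field to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Raises ValueError if the row is missing.
    """
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            # Row layout: "Max open files  <soft>  <hard>  files"
            fields = line[len(prefix):].split()
            return _to_int(fields[0]), _to_int(fields[1])
    raise ValueError("no 'Max open files' row found")


def read_max_open_files(pid):
    """Read the open-files limits of a live process (Linux only)."""
    with open(f"/proc/{pid}/limits") as fh:
        return parse_max_open_files(fh.read())


# Hypothetical sample in the layout of /proc/<pid>/limits (values invented).
SAMPLE = """\
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max open files            1024                 4096                 files
"""

if __name__ == "__main__":
    print(parse_max_open_files(SAMPLE))
```

The PID to pass to read_max_open_files can be taken from the PidFile named in zabbix_server.conf. If the soft value is far below the 500000 shown by ulimit, the limits.conf change is not reaching the daemon.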

Also, the preprocessing worker is probably the first type of process in Zabbix that can utilize a CPU properly. Therefore:

  • there is no point in setting StartPreprocessors higher than the number of CPU cores on your Zabbix server;
  • it is OK for a preprocessing worker to use 100% of a CPU core; you don't need to "dilute" the load among hundreds of processes;
  • update your templates and start monitoring the preprocessing queue instead of preprocessing worker busyness.
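
The first point can be checked mechanically. This is a rough sketch, assuming the configuration file uses plain Key=Value lines as in the description; parse_conf and preprocessors_exceed_cores are hypothetical helper names, not Zabbix tools:

```python
import os


def parse_conf(text):
    """Parse 'Key=Value' lines of a zabbix_server.conf-style file into a dict."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def preprocessors_exceed_cores(conf, cores=None):
    """Return True if StartPreprocessors is set higher than the CPU core count."""
    if cores is None:
        cores = os.cpu_count() or 1
    configured = conf.get("StartPreprocessors")
    if configured is None:
        return False  # parameter not set, so the server default applies
    return int(configured) > cores


# Two lines copied from the configuration in the description.
SAMPLE = "StartPollers=750\nStartPreprocessors=300\n"

if __name__ == "__main__":
    print(preprocessors_exceed_cores(parse_conf(SAMPLE), cores=16))
```

With the reported value of 300 and a 16-core host the check returns True; with StartPreprocessors=16 it returns False.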
Comment by Vladislavs Sokurenko [ 2017 Nov 07 ]

Sorry, this tracker is for bug reports only.
Please use IRC, forums or other channels for community support - see https://www.zabbix.org/wiki/Getting_help for more detail.
Alternatively contact [email protected] for commercial support.

Comment by Vladislavs Sokurenko [ 2017 Nov 07 ]

Please try to identify which process consumes all the handles and whether you have any zombie processes. If it is Zabbix, feel free to reopen the issue.
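
One possible way to find the process holding the most handles is to count the entries under /proc/<pid>/fd. The sketch below assumes Linux and should be run as root so that every fd directory is readable; count_open_fds and top_consumers are illustrative names, and zombie detection is not covered:

```python
import os


def count_open_fds(proc_root="/proc"):
    """Return {pid: (name, fd_count)} for every process whose fd dir is readable."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        pid_dir = os.path.join(proc_root, entry)
        try:
            fd_count = len(os.listdir(os.path.join(pid_dir, "fd")))
            with open(os.path.join(pid_dir, "comm")) as fh:
                name = fh.read().strip()
        except OSError:
            # The process exited meanwhile, or we lack permission: skip it.
            continue
        result[int(entry)] = (name, fd_count)
    return result


def top_consumers(counts, n=10):
    """Return the n rows (pid, name, fd_count) with the highest fd counts."""
    rows = [(pid, name, fds) for pid, (name, fds) in counts.items()]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:n]


if __name__ == "__main__":
    for pid, name, fds in top_consumers(count_open_fds()):
        print(f"{fds:8d}  {pid:8d}  {name}")
```

Comparing the top count against the open-files limit of that process shows how close it is to the "[24] Too many open files" error.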

Comment by Maksims Edelmans [ 2017 Nov 07 ]

1. StartPreprocessors is now 16, which equals our core count.
2. Templates have been updated to monitor the preprocessing queue.
3. Cache parameters in zabbix_server.conf have also been tuned.

It seems to be OK now; the service no longer restarts continuously.
Thank you for your help.

P.S. I know that this tracker is for bug reports only. I posted my question here because I had to wait about 24 hours before I could confirm my registration on the forum and start posting there. Thank you.

Comment by Maksims Edelmans [ 2017 Nov 08 ]

I am not sure whether it is correct to post this here or to open another issue, but I think it is indeed an issue:

If I set DebugLevel=4, my preprocessing queue keeps growing:

32382 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 174, processed 0 values, idle 0.000000 sec during 5.021583 sec]

As soon as I change DebugLevel back to 3, everything gets processed:

4086 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 1, processed 4717 values, idle 4.971614 sec during 5.022361 sec]

Is it normal behaviour or not?
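
The two process titles above can be compared numerically. This is a small sketch that assumes the title format is exactly as quoted in this comment (other Zabbix versions may print it differently); parse_status and busy_fraction are hypothetical names:

```python
import re

# Pattern derived only from the two lines quoted above.
STATUS_RE = re.compile(
    r"preprocessing manager #(?P<num>\d+) "
    r"\[queued (?P<queued>\d+), processed (?P<processed>\d+) values, "
    r"idle (?P<idle>[\d.]+) sec during (?P<period>[\d.]+) sec\]"
)


def parse_status(line):
    """Extract counters from a preprocessing manager process title, or None."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {
        "queued": int(match["queued"]),
        "processed": int(match["processed"]),
        "idle_sec": float(match["idle"]),
        "period_sec": float(match["period"]),
    }


def busy_fraction(status):
    """Share of the reporting period the manager was not idle (0.0 to 1.0)."""
    return 1.0 - status["idle_sec"] / status["period_sec"]


DEBUG4 = ("32382 /usr/sbin/zabbix_server: preprocessing manager #1 "
          "[queued 174, processed 0 values, idle 0.000000 sec during 5.021583 sec]")
DEBUG3 = ("4086 /usr/sbin/zabbix_server: preprocessing manager #1 "
          "[queued 1, processed 4717 values, idle 4.971614 sec during 5.022361 sec]")

if __name__ == "__main__":
    for line in (DEBUG4, DEBUG3):
        status = parse_status(line)
        print(status["queued"], status["processed"], round(busy_fraction(status), 3))
```

For the DebugLevel=4 line the manager has no idle time at all during the period, while at DebugLevel=3 it is idle for almost the whole period.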

Comment by Rostislav Palivoda [ 2017 Nov 08 ]

Please open new request.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 08 ]

"Is it normal behaviour that Zabbix stops processing data when I ask him to sing and play banjo?"

Yes, it is expected that increasing DebugLevel reduces performance due to disk I/O and log file concurrency. DebugLevel=4 is definitely not for production use. Use it wisely; there are options that allow granular control of the logging level.

Comment by vladimir lopukhov [ 2019 May 17 ]

Good day! I had a similar problem on Zabbix Server 4.0 (CentOS 7, PostgreSQL 9.6). None of the methods above worked for me. What helped was the following: in the file '/usr/lib/systemd/system/zabbix-server.service', add this line to the [Service] section: LimitNOFILE=18192. Then run 'systemctl daemon-reload'. After that the server began to work without problems!
