[ZBX-12984] Zabbix Server restarting continuously after upgrade to 3.4.3 Created: 2017 Nov 05 Updated: 2019 May 17 Resolved: 2017 Nov 08 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 3.4.3 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Critical |
Reporter: | Maksims Edelmans | Assignee: | Unassigned |
Resolution: | Commercial support required | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Issue Links: |
|
Description |
Hello,

zabbix_server [7327]: failed to open log file: [24] Too many open files
zabbix_server [7327]: failed to write [cannot accept incoming IPC connection: [24] Too many open files] into log file

If changing StartPreprocessors to 100, Zabbix Server is still restarting with the same error message but once per 15-20 minutes.

[root@zabbix ~]# cat /proc/sys/fs/file-max
2097152

[root@zabbix ~]# cat /etc/sysctl.conf
# System default settings live in /usr/lib/sysctl.d/00-system.conf.
# To override those settings, enter new settings here, or in an /etc/sysctl.d/<name>.conf file
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
dev.raid.speed_limit_min = 10000
dev.raid.speed_limit_max = 500000
vm.swappiness=1
vm.vfs_cache_pressure=50
net.ipv4.ip_no_pmtu_disc=1
fs.file-max = 2097152

[root@zabbix ~]# vim /etc/security/limits.conf
* soft nproc 500000
* hard nproc 500000
* soft nofile 500000
* hard nofile 500000

[root@zabbix ~]# su - zabbix -c 'ulimit -aHS' -s '/bin/bash'
su: warning: cannot change directory to /var/lib/zabbix: No such file or directory
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256296
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 500000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

[root@zabbix ~]# su - root -c 'ulimit -aHS' -s '/bin/bash'
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256296
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 500000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 500000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

zabbix_server.conf:
LogFile=/var/log/zabbix/zabbix_server.log
LogFileSize=100
DebugLevel=3
PidFile=/var/run/zabbix/zabbix_server.pid
DBName=zabbix
DBUser=zabbix
StartPollers=750
StartPreprocessors=300
StartPollersUnreachable=200
StartTrappers=20
StartPingers=20
StartDiscoverers=250
StartTimers=3
SNMPTrapperFile=/var/log/snmptrap/snmptrap.log
HousekeepingFrequency=0
MaxHousekeeperDelete=1000
CacheSize=2048M
CacheUpdateFrequency=360
StartDBSyncers=2
HistoryCacheSize=512M
HistoryIndexCacheSize=128M
TrendCacheSize=128M
ValueCacheSize=512M
Timeout=30
UnreachablePeriod=360
UnavailableDelay=600
UnreachableDelay=120
AlertScriptsPath=/usr/lib/zabbix/alertscripts
ExternalScripts=/usr/lib/zabbix/externalscripts
LogSlowQueries=3000

Zabbix System is all-in-one currently:

[root@zabbix ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/md127      280G  8.9G  272G   4% /
devtmpfs         32G     0   32G   0% /dev
tmpfs            32G     0   32G   0% /dev/shm
tmpfs            32G  8.8M   32G   1% /run
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/md125      954G  179G  775G  19% /db
/dev/md126      2.0G  254M  1.8G  13% /boot
/dev/md2        1.6T  792G  783G  51% /mnt/lstorage
/dev/sdb3       200M     0  200M   0% /boot/efi2
/dev/sda1       200M  9.8M  191M   5% /boot/efi
tmpfs           6.3G     0  6.3G   0% /run/user/0

Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

[root@zabbix ~]# free
              total        used        free      shared  buff/cache   available
Mem:       65696748    24664656    39955132      460668     1076960    39956428
Swap:       5915644           0     5915644

Looking for help ASAP! |
Comments |
Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 06 ] |
I suspect that ulimit -n returns something too small for your configuration, check this and this warning. Sorry, I don't know how to increase it... Also preprocessing worker is probably the first type of process in Zabbix which can utilize CPU properly. Therefore:
|
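A possible way to check the limit that actually applies to the running daemon is sketched below. The `ulimit` output of a login shell (`su - zabbix`) reflects `/etc/security/limits.conf`, which is applied through PAM; if the server is started by systemd, the service takes its limits from the unit configuration instead, so the two values can differ. The helper name `nofile_limit` is made up for this illustration; it only reads the Linux file `/proc/<pid>/limits`.

```shell
# Print the soft and hard "Max open files" limits of a running process.
# Hypothetical helper; reads /proc/<pid>/limits (Linux only).
nofile_limit() {
    awk '/^Max open files/ {print "soft=" $4, "hard=" $5}' "/proc/$1/limits"
}

# Example (run as root; assumes at least one zabbix_server process exists):
#   nofile_limit "$(pgrep -o -x zabbix_server)"
```

If the values printed for the daemon are much lower than the 500000 shown by `ulimit`, the limits.conf setting is not reaching the service.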
Comment by Vladislavs Sokurenko [ 2017 Nov 07 ] |
Sorry, this tracker is for bug reports only. |
Comment by Vladislavs Sokurenko [ 2017 Nov 07 ] |
Please try to identify which process consumes all the handles, and whether you have any zombie processes. If it is Zabbix, feel free to reopen the issue. |
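A rough sketch of how this could be checked on Linux follows. It counts the entries under `/proc/<pid>/fd` for every process with a given name and lists zombie processes from `ps`. The function names `count_fds`, `fds_by_name` and `list_zombies` are invented for this example, and reading another user's `/proc/<pid>/fd` generally requires root.

```shell
# Count open file descriptors of one process (0 if the PID is gone or unreadable).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Print "<fd count> <pid>" for every process named exactly $1, highest count first.
fds_by_name() {
    for pid in $(pgrep -x "$1" 2>/dev/null); do
        printf '%s %s\n' "$(count_fds "$pid")" "$pid"
    done | sort -rn
}

# List zombie processes: state column starts with "Z".
list_zombies() {
    ps -eo stat,pid,ppid,comm | awk 'NR == 1 || $1 ~ /^Z/'
}

# Example (run as root):
#   fds_by_name zabbix_server | head -n 20
#   list_zombies
```

Comparing the highest counts with the process's "Max open files" limit shows which process is close to running out of descriptors.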
Comment by Maksims Edelmans [ 2017 Nov 07 ] |
1. StartPreprocessors is now 16, which is equal to our core count. It seems to be OK now, no continuous service restart anymore. P.S. I know that this tracker is for bug reports only. The reason I posted my question here was that I had to wait about 24 hours before I could confirm my registration on the forum and start posting there. Thank you. |
Comment by Maksims Edelmans [ 2017 Nov 08 ] |
I am not sure if it is correct to post it here or open another issue, but I think it is an issue indeed. If I set DebugLevel=4, then my preprocessing queue is always getting bigger:
32382 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 174, processed 0 values, idle 0.000000 sec during 5.021583 sec]
As soon as I change DebugLevel back to 3, everything is getting processed:
4086 /usr/sbin/zabbix_server: preprocessing manager #1 [queued 1, processed 4717 values, idle 4.971614 sec during 5.022361 sec]
Is it normal behaviour or not? |
Comment by Rostislav Palivoda [ 2017 Nov 08 ] |
Please open new request. |
Comment by Glebs Ivanovskis (Inactive) [ 2017 Nov 08 ] |
"Is it normal behaviour that Zabbix stops processing data when I ask him to sing and play banjo?" Yes, it is expected that increasing DebugLevel reduces performance due to disk I/O and log file concurrency. DebugLevel=4 is definitely not for production use. Use it wisely, there are options allowing granular control of logging level. |
Comment by vladimir lopukhov [ 2019 May 17 ] |
Good day! I had a similar problem on Zabbix Server 4.0 (CentOS 7, PostgreSQL 9.6). None of the above methods worked for me. The following helped: in the file '/usr/lib/systemd/system/zabbix-server.service', add the following line to the [Service] section: LimitNOFILE=18192. Then run 'systemctl daemon-reload'. After that the server began to work without problems! |
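As a variation on the change described above, the same setting can be placed in a systemd drop-in file instead of the packaged unit file, so that a package update does not overwrite it. The sketch below is only an illustration: the helper name `write_nofile_dropin` and the file name `limits.conf` are assumptions, while the unit name and the value 18192 are taken from the comment.

```shell
# Write a drop-in that sets LimitNOFILE for a unit.
# $1 = drop-in directory, $2 = descriptor limit (hypothetical helper).
write_nofile_dropin() {
    mkdir -p "$1" &&
    printf '[Service]\nLimitNOFILE=%s\n' "$2" > "$1/limits.conf"
}

# Example (run as root; the service must be restarted for the new limit to apply):
#   write_nofile_dropin /etc/systemd/system/zabbix-server.service.d 18192
#   systemctl daemon-reload
#   systemctl restart zabbix-server
```

The value itself should be sized to the number of server processes configured; 18192 is simply what worked in the case reported here.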