[ZBX-13164] Zabbix server eats all available RAM after upgrade to 3.4.4-2 Created: 2017 Dec 08  Updated: 2018 Jan 31  Resolved: 2018 Jan 31

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Problem report Priority: Critical
Reporter: Maciej Cichocki Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Centos7, Zabbix3.4.4-2, Docker


Attachments:

  • PNG File Screenshot from 2018-01-09 13-04-55.png
  • PNG File Screenshot from 2018-01-09 13-05-52.png
  • PNG File Screenshot from 2018-01-09 13-06-06.png
  • PNG File Screenshot from 2018-01-09 13-06-19.png
  • PNG File Screenshot.png
  • Text File ps_aux_zabbix.txt
  • Text File top.txt
  • File zabbix_ps_12_12.log
  • File zabbix_ps_13_12.log
  • File zabbix_server.conf
  • File zabbix_server.log
  • File zabbix_server_valgrind.log

 Description   

We are experiencing a very large memory leak at the moment with our Zabbix 3.4.4-2 installation. The leak is in the vicinity of gigabytes per hour (see the attached graph image).
Our Zabbix server runs in a Docker container managed by Kubernetes. Before the upgrade we were not experiencing such memory leaks.
We have 400 pollers, which are consuming all available RAM on our machine.
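
The poller and preprocessor pool sizes mentioned in this report are set in zabbix_server.conf; a quick way to confirm them is a one-liner such as the following (a sketch, assuming the standard package path for the configuration file):

# print the configured worker pool sizes
grep -E '^Start(Pollers|PollersUnreachable|Preprocessors)=' /etc/zabbix/zabbix_server.conf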



 Comments   
Comment by Vladislavs Sokurenko [ 2017 Dec 08 ]

Thank you for your report. Do you know which check is leaking? Is it due to preprocessing?

Comment by Maciej Cichocki [ 2017 Dec 08 ]

Unfortunately, I have no idea.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Dec 08 ]

First of all, which version have you upgraded from?

Are you sure that pollers are leaking memory?

Comment by Maciej Cichocki [ 2017 Dec 11 ]

First, there was an upgrade from version 3.0 to 3.4.2, and when we discovered these memory leaks we made another upgrade from 3.4.2 to 3.4.4.
Yes, I am sure that pollers are leaking memory.

Comment by Glebs Ivanovskis (Inactive) [ 2017 Dec 11 ]

Evidence?

Comment by Glebs Ivanovskis (Inactive) [ 2017 Dec 13 ]

Dear kimbarbel1, if this is still such a bad problem for you, please cooperate and provide more information so that we can start investigating the leak. It looks like you may be the only one experiencing it.

I'm moving this issue to the Need info state. The next step will be closing it as Cannot Reproduce if there is no reply from the reporter within two weeks.

Comment by Maciej Cichocki [ 2017 Dec 13 ]

I have attached the output of ps auxww from yesterday and today.
In these files you can see that the memory consumed by pollers is growing rapidly.
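
For reference, a loop along these lines can collect such snapshots over time (a sketch only; the five-minute interval and the log path are arbitrary choices, not taken from the attached files):

# append a timestamped snapshot of poller memory usage (RSS and VSZ in KiB) every 5 minutes
while true; do
    date >> /tmp/zabbix_poller_mem.log
    ps -C zabbix_server -o pid,rss,vsz,cmd | grep poller >> /tmp/zabbix_poller_mem.log
    sleep 300
done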

Comment by Vladislavs Sokurenko [ 2017 Dec 13 ]

Could you please try running the server under valgrind, then stop the server and upload the valgrind log?

For example:

valgrind --leak-check=full --trace-children=yes --track-origins=yes --max-stackframe=4000000 --read-var-info=yes --leak-resolution=high --log-file=/tmp/zabbix_server_valgrind.log ./sbin/zabbix_server -c /etc/zabbix/zabbix_server.conf --foreground
Comment by Glebs Ivanovskis (Inactive) [ 2017 Dec 13 ]

Thank you for the update! It seems pollers and unreachable pollers are to blame; their virtual memory size keeps growing.

What types of checks do you use (Zabbix agent, SNMP, JMX)? Can you try disabling some of them and see if this helps? It would also be nice to add some monitoring using the proc.mem[] item with filtering by process type (pollers, unreachable pollers, pingers, etc.), for example as sketched below.
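
For example, with a Zabbix agent running on the server host, such item keys could first be tested from the shell with zabbix_get (a sketch; the user name, agent address, command-line filter and memory type are assumptions that may need adjusting to your setup):

# total resident memory of zabbix_server poller processes, matched by their process title
zabbix_get -s 127.0.0.1 -k 'proc.mem[zabbix_server,zabbix,sum,"zabbix_server: poller",rss]'
# same idea for unreachable pollers
zabbix_get -s 127.0.0.1 -k 'proc.mem[zabbix_server,zabbix,sum,"zabbix_server: unreachable poller",rss]'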

Also, StartPreprocessors=500 sounds like overkill.

Comment by Maciej Cichocki [ 2017 Dec 14 ]

@Glebs
We are using Zabbix agent and SNMP checks; most of them (about 95%) are SNMP checks.

Required server performance, new values per second: 298.48

About 290 of those values per second come from SNMP checks.
Unfortunately, we cannot disable them. I will add a proc.mem[] item to monitor the pollers.
I have also changed StartPreprocessors to 50.

@Vladislavs I will run Zabbix under valgrind and get back to you with the log.

Comment by Maciej Cichocki [ 2017 Dec 21 ]

@Vladislavs

Hi Vladislavs, I have added my log from valgrind.

vso: Sorry, it does not show any leaks. Your environment is probably too big for valgrind to handle. If you could reproduce the issue on a smaller scale, then valgrind could be used.

Comment by Ingus Vilnis [ 2018 Jan 02 ]

Hi kimbarbel1,

For a more complete overview of the system, please attach the following information from your Zabbix server.
Performance graphs from Monitoring → Graphs → select your Zabbix server, showing a time period of 1 day:

  • CPU utilization
  • Memory usage
  • Zabbix cache usage, % free
  • Zabbix data gathering process busy %
  • Zabbix internal process busy %
  • Zabbix server performance

Configuration and log files:

  • Zabbix server configuration file zabbix_server.conf
  • Zabbix server log file zabbix_server.log

Command outputs:

  • top
  • ps aux | grep zabbix, as you have provided currently (see the sketch after this list).
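
A minimal sketch of how these command outputs could be captured non-interactively (the file names simply mirror the attachments later added to this issue):

# one batch iteration of top, plus the full Zabbix process list
top -b -n 1 > top.txt
# the [z]abbix pattern keeps the grep process itself out of the output
ps auxww | grep '[z]abbix' > ps_aux_zabbix.txt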

And by the way, Zabbix 3.4.5 is out. Not that it will immediately fix the problem you have, but it would be great to have the stats from that version.

Comment by Maciej Cichocki [ 2018 Jan 09 ]

Hi ingus.vilnis,

I have attached the data you requested (without the CPU utilization and Memory usage graphs).
The data is from version 3.4.4-2.
Tomorrow I will add new data, after the upgrade to 3.4.5-1.

Comment by Vladislavs Sokurenko [ 2018 Jan 11 ]

I am sorry, but it is not possible to reproduce the issue. Do you use any other checks besides SNMP? Does disabling those or moving them to a proxy help?

Comment by Rostislav Palivoda (Inactive) [ 2018 Jan 31 ]

No response after 1 month. Reopen if required.
