[ZBX-7009] Memory leak in SNMP trapper Created: 2013 Sep 15  Updated: 2017 May 30  Resolved: 2013 Dec 10

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.0.8rc1
Fix Version/s: None

Type: Incident report Priority: Critical
Reporter: Peter Vilhan Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: memoryleak, snmptraps
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian wheezy 7.1 64-bit, 8 GB RAM, 4-core CPU, running inside VMware vSphere 4.
Linux zabbixtv 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1+deb7u1 x86_64 GNU/Linux


Attachments: File dbcache.c.custom.patch     File snmptrapper.c.custom.patch     PNG File zab_orig.png     PNG File zab_top_2.png     PNG File zabbix_Server_after_12_hours.png     PNG File zabbix_server_free_mem.png    

 Description   

All of the RAM is eaten by the zabbix_server process responsible for SNMP traps.

Zabbix stats:
Parameter                                                  Value   Details
Zabbix server is running                                   Yes     localhost:10051
Number of hosts (monitored/not monitored/templates)        25      2 / 0 / 23
Number of items (monitored/disabled/not supported)         2776    2772 / 0 / 4
Number of triggers (enabled/disabled)[problem/unknown/ok]  43      43 / 0 [0 / 0 / 43]
Number of users (online)                                   9       2
Required server performance, new values per second         464.38  -

I use Zabbix to process SNMP traps received from a Cisco DCM9900. I have created a few hundred regular expressions that identify the source service named in each SNMP trap.

I mainly use calculated items, which use the stored regular expressions and contain formulas like:
count("DCM:snmptrap[\"@Rai_3_out\"]",5,"CC Error","like",10)
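To make the formula concrete, here is a minimal Python sketch of what such a count is expected to return. It assumes the arguments mean, in order: a period of 5 seconds, the pattern "CC Error", the operator "like" (substring match), and a time shift of 10 seconds. The function name `count_like`, the sample history and the exact window boundaries are illustrative assumptions, not Zabbix's implementation.

```python
def count_like(history, now, sec, pattern, time_shift=0):
    """Count values containing `pattern` within a `sec`-second window
    that ends `time_shift` seconds before `now`."""
    window_end = now - time_shift
    window_start = window_end - sec
    return sum(
        1
        for timestamp, value in history
        # Assumption: the window is open at the start and closed at the end.
        if window_start < timestamp <= window_end and pattern in value
    )

# Hypothetical trap history as (unix timestamp, trap text) pairs.
history = [
    (101, "Rai_3_out CC Error"),
    (103, "Rai_3_out CC Error"),
    (104, "Rai_3_out link up"),
    (109, "Rai_3_out CC Error"),
]

# Window is (100, 105]: two matching traps fall inside it.
result = count_like(history, now=115, sec=5, pattern="CC Error", time_shift=10)
print(result)
```

With a few hundred such items evaluated continuously, each evaluation scans recent trap history for its own pattern, which is the kind of load this report describes.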

A graph of free memory is attached. The peaks show where I restarted the Zabbix server.



 Comments   
Comment by richlv [ 2013 Sep 15 ]

please attach a graph showing zabbix_server memory usage, too

Comment by Peter Vilhan [ 2013 Sep 16 ]

zabbix memory consumption 5 hours after server restart.
From server log: 9144:20130915:175319.900 server #25 started snmp trapper #1
So we can see SNMP trapper is eating memory.

Sorry, I am really new to Zabbix, so I am not able to attach a zabbix_server memory consumption graph. Zabbix tells me that the simple check is not supported for item proc.mem[zabbix_server,zabbix,,]

Comment by richlv [ 2013 Sep 16 ]

if you check the zabbix server logfile, is process with id 9144 really snmp trapper ?

Comment by Peter Vilhan [ 2013 Sep 16 ]

Of course it is, here is the output from zabbix_Server.log. I have also attached a second screenshot from top, taken after 12 hours:
9106:20130915:175319.656 Starting Zabbix Server. Zabbix 2.0.9rc1 (revision 38431).
9106:20130915:175319.656 ****** Enabled features ******
9106:20130915:175319.657 SNMP monitoring: YES
9106:20130915:175319.657 IPMI monitoring: YES
9106:20130915:175319.657 WEB monitoring: YES
9106:20130915:175319.657 Jabber notifications: YES
9106:20130915:175319.657 Ez Texting notifications: YES
9106:20130915:175319.657 ODBC: NO
9106:20130915:175319.657 SSH2 support: YES
9106:20130915:175319.657 IPv6 support: YES
9106:20130915:175319.657 ******************************
9114:20130915:175319.868 server #1 started configuration syncer #1
9115:20130915:175319.868 server #2 started db watchdog #1
9116:20130915:175319.871 server #3 started poller #1
9120:20130915:175319.871 server #6 started poller #4
9125:20130915:175319.872 server #11 started trapper #3
9131:20130915:175319.874 server #15 started alerter #1
9117:20130915:175319.875 server #4 started poller #2
9106:20130915:175319.877 server #0 started [main process]
9119:20130915:175319.878 server #5 started poller #3
9121:20130915:175319.879 server #7 started poller #5
9126:20130915:175319.879 server #12 started trapper #4
9139:20130915:175319.880 server #21 started history syncer #2
9123:20130915:175319.880 server #9 started trapper #1
9132:20130915:175319.881 server #16 started housekeeper #1
9132:20130915:175319.881 executing housekeeper
9124:20130915:175319.882 server #10 started trapper #2
9136:20130915:175319.882 server #19 started discoverer #1
9135:20130915:175319.882 server #18 started http poller #1
9146:20130915:175319.883 server #27 started self-monitoring #1
9141:20130915:175319.883 server #23 started history syncer #4
9137:20130915:175319.883 server #20 started history syncer #1
9134:20130915:175319.884 server #17 started timer #1
9127:20130915:175319.884 server #13 started trapper #5
9140:20130915:175319.885 server #22 started history syncer #3
9129:20130915:175319.885 server #14 started icmp pinger #1
9142:20130915:175319.891 server #24 started escalator #1
9122:20130915:175319.893 server #8 started unreachable poller #1
9144:20130915:175319.900 server #25 started snmp trapper #1
9145:20130915:175319.901 server #26 started proxy poller #1
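The PID-to-process mapping asked about above can also be checked mechanically. The following is a small sketch, assuming the "started" lines always have the shape seen in this excerpt (PID, date, time, "server #N started <type> #<instance>"); the regular expression and the function name `pid_to_process` are illustrative and not part of Zabbix.

```python
import re

# Assumed line shape, taken from the log excerpt above:
#   <pid>:<YYYYMMDD>:<HHMMSS.mmm> server #<n> started <process type> #<instance>
STARTED = re.compile(
    r"^\s*(?P<pid>\d+):\d{8}:\d{6}\.\d{3} "
    r"server #(?P<num>\d+) started (?P<type>.+?) #(?P<instance>\d+)\s*$"
)

def pid_to_process(lines):
    """Map each PID to (process type, instance number) from 'started' log lines."""
    mapping = {}
    for line in lines:
        match = STARTED.match(line)
        if match:  # lines without a '#<instance>' suffix are skipped
            mapping[int(match["pid"])] = (match["type"], int(match["instance"]))
    return mapping

sample = [
    "9106:20130915:175319.877 server #0 started [main process]",
    "9123:20130915:175319.880 server #9 started trapper #1",
    "9144:20130915:175319.900 server #25 started snmp trapper #1",
    "9132:20130915:175319.881 executing housekeeper",
]
processes = pid_to_process(sample)
print(processes.get(9144))  # ('snmp trapper', 1)
```

Comparing the result with the PID shown in top confirms which worker is growing.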

Comment by Peter Vilhan [ 2013 Sep 16 ]

The patch files I have attached show the changes I have made to Zabbix. I needed to get the local time into the Timestamp column in Latest data (history_log table), because I use the count function to count the number of video errors in the last 5 seconds as received in SNMP traps.

Comment by richlv [ 2013 Sep 16 ]

sorry, we don't have the resources to support custom patches. please, revert all patches and repeat the test.

also, proc.mem is an agent item, not simple check. when testing with unpatched version, please, provide a graph, showing zabbix server memory usage
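Until an agent item is in place, resident memory of a single process can be sampled directly on Linux. This is a minimal sketch that reads the VmRSS line from /proc/<pid>/status; it is an independent check, not how proc.mem is computed, and the function names and the sample text are illustrative assumptions.

```python
import re

def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def read_rss_kb(pid):
    """Read the resident memory of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kb(handle.read())

# Hypothetical sample of /proc/<pid>/status content, for illustration only.
sample_status = "Name:\tzabbix_server\nVmSize:\t 1071104 kB\nVmRSS:\t  316416 kB\n"
rss_kb = parse_vm_rss_kb(sample_status)
print(rss_kb)  # 316416
```

Calling `read_rss_kb` for the SNMP trapper's PID at regular intervals and logging the values gives a growth curve that can be attached to the report.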

Comment by Peter Vilhan [ 2013 Sep 16 ]

Hello richlv,

thank you for your response and for the advice regarding proc.mem.

I have reverted all changes, so now I am running vanilla 2.0.9rc1.

As you can see in zab_orig.png, nothing has changed and the zabbix_server process is still eating memory. After 40 minutes, the zabbix_server process responsible for SNMP traps shows:
 PID USER   PR NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
1933 zabbix 20  0 1046m 309m 13m S  0,0  3,9 4:14.67 zabbix_server

What else can I do? I have tried valgrind, but it detached from the console after startup and did not show anything useful.

Thank you for your help.

Best regards,

Peter

Comment by richlv [ 2013 Sep 16 ]

thanks for testing unpatched version, we'll see how this develops

Comment by Peter Vilhan [ 2013 Dec 10 ]

Fixed in 2.2.0!
Thanks.

Peter

Comment by richlv [ 2013 Dec 10 ]

thanks. we didn't see this internally, so closing as 'cannot reproduce'
