[ZBX-11720] Memory leak in zabbix_agentd breaks vfs.fs.size, vfs.fs.inode and vfs.dir.size items if compiled with LeakSanitizer Created: 2017 Jan 20  Updated: 2024 Apr 10  Resolved: 2018 Jan 28

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 3.4.0alpha1
Fix Version/s: 3.4.7rc1, 4.0.0alpha3, 4.0 (plan)

Type: Problem report Priority: Trivial
Reporter: Andris Mednis Assignee: Andris Mednis
Resolution: Fixed Votes: 0
Labels: agent, memoryleak
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

GNU/Linux (Debian testing)


Team: Team A
Sprint: Sprint 25, Sprint 26
Story Points: 2

 Description   

1. Compile Zabbix with GCC 6.2 using the "-fsanitize=leak" option:

$ CFLAGS="-g -O2 -fsanitize=leak" ./configure --enable-server --enable-proxy --enable-agent --enable-ipv6 --with-net-snmp --with-unixodbc --with-libxml2 --with-libcurl --with-openipmi --with-ldap --with-postgresql --with-openssl --prefix=`pwd`

2. Create an item that the agent checks in a separate data gathering process (via zbx_execute_threaded_metric()) - it could be 'vfs.fs.size', 'vfs.fs.inode' or 'vfs.dir.size'.
3. Observe in zabbix_agentd.log:

   512:20170120:130846.116 Requested [vfs.dir.size[/home/zabbix34/ZBXNEXT-491/ZBXNEXT-491-2]]
   512:20170120:130846.116 In zbx_execute_threaded_metric() key:'vfs.dir.size'
   554:20170120:130846.118 executing in data process for key:'vfs.dir.size'

=================================================================
==554==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fd3022fdc7f in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/liblsan.so.0+0xcc7f)
    #1 0x55fe97829677 in zbx_malloc2 /home/zabbix34/ZBXNEXT-491/trunk/src/libs/zbxcommon/misc.c:467

SUMMARY: LeakSanitizer: 24 byte(s) leaked in 1 allocation(s).
   512:20170120:130846.168 End of zbx_execute_threaded_metric():1 'Data gathering process terminated with error.'
   512:20170120:130846.168 Sending back [ZBX_NOTSUPPORTED: Data gathering process terminated with error.]

4. To confirm where the leak comes from, make a small change and recompile:

Index: src/zabbix_agent/zabbix_agentd.c
===================================================================
--- src/zabbix_agent/zabbix_agentd.c	(revision 65211)
+++ src/zabbix_agent/zabbix_agentd.c	(working copy)
@@ -953,7 +953,7 @@
 	{
 		zbx_thread_args_t	*thread_args;
 
-		thread_args = (zbx_thread_args_t *)zbx_malloc(NULL, sizeof(zbx_thread_args_t));
+		thread_args = (zbx_thread_args_t *)malloc(sizeof(zbx_thread_args_t));
 
 		if (FAIL == get_process_info_by_thread(i + 1, &thread_args->process_type, &thread_args->process_num))
 		{

5. Now in zabbix_agentd.log:

  1361:20170120:132246.341 Requested [vfs.dir.size[/home/zabbix34/ZBXNEXT-491/ZBXNEXT-491-2]]
  1361:20170120:132246.341 In zbx_execute_threaded_metric() key:'vfs.dir.size'
  1504:20170120:132246.342 executing in data process for key:'vfs.dir.size'

=================================================================
==1504==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 96 byte(s) in 4 object(s) allocated from:
    #0 0x7fbb1ca4ac7f in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/liblsan.so.0+0xcc7f)
    #1 0x5641e090a140 in MAIN_ZABBIX_ENTRY /home/zabbix34/ZBXNEXT-491/trunk/src/zabbix_agent/zabbix_agentd.c:956

SUMMARY: LeakSanitizer: 96 byte(s) leaked in 4 allocation(s).
  1361:20170120:132246.371 End of zbx_execute_threaded_metric():1 'Data gathering process terminated with error.'
  1361:20170120:132246.371 Sending back [ZBX_NOTSUPPORTED: Data gathering process terminated with error.]


 Comments   
Comment by Andris Mednis [ 2017 Jan 20 ]

Although this memory leak might be harmless, it breaks the 'vfs.fs.size', 'vfs.fs.inode' and 'vfs.dir.size' items in trunk (they become NOTSUPPORTED) when the agent is compiled with the "-fsanitize=leak" option.
In older versions the memory leak is detected, but 'vfs.fs.size', 'vfs.fs.inode' and 'vfs.dir.size' still return a valid number.
If the "-fsanitize=leak" option is not used, the problem goes unnoticed and the agent works as expected.

Comment by Andris Mednis [ 2018 Jan 22 ]

Previously, if the child process created by zbx_execute_threaded_metric() completed with an error, only the message "Data gathering process terminated unexpectedly." or "Data gathering process terminated with error." was displayed in the frontend.
This fix appends the status value wstatus obtained from the waitpid() function (see man waitpid) to the message.
For example, if there was a memory leak, you could now see in the log file:

  5944:20180122:163452.925 End of zbx_execute_threaded_metric():SYSINFO_FAIL 'Data gathering process terminated with error: 5888.'
  5944:20180122:163452.925 Sending back [ZBX_NOTSUPPORTED: Data gathering process terminated with error: 5888.]

What is error 5888? In hexadecimal it is 1700, in binary it is 0001 0111 0000 0000. The upper 8 bits of this 16-bit value are 0001 0111, which is 23 in decimal.
https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer says that the default exit code from LeakSanitizer is 23 (although after reading "man waitpid" I expected this code to reside in the 8 least significant bits).
It is interesting to note that the exit code produced by LeakSanitizer when it detects a memory leak can be customized. For example, with

$ export LSAN_OPTIONS=exitcode=71

and restarting the agent, you might see

  5944:20180122:163452.925 End of zbx_execute_threaded_metric():SYSINFO_FAIL 'Data gathering process terminated with error: 18176.'

18176 in hexadecimal is 4700, and hexadecimal 47 is 71 in decimal, as expected.
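
The arithmetic above can also be done with the macros documented in "man waitpid" instead of by hand. The following is a minimal sketch, not code from the Zabbix sources: both function names are hypothetical and exist only for illustration. One helper forks a child that exits with a given code and returns the raw wstatus from waitpid(); the other decodes it with WIFEXITED() and WEXITSTATUS().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of Zabbix): fork a child that exits with */
/* 'code' and return the raw wstatus reported by waitpid(), or -1 if      */
/* fork() or waitpid() fails.                                             */
int	wstatus_of_child_exiting_with(int code)
{
	int	wstatus;
	pid_t	pid = fork();

	if (-1 == pid)
		return -1;

	if (0 == pid)
		_exit(code);	/* child: terminate immediately with the requested code */

	if (-1 == waitpid(pid, &wstatus, 0))
		return -1;

	return wstatus;
}

/* Hypothetical helper: decode a raw wstatus. Returns the child's exit    */
/* code if it terminated normally, or -1 otherwise (for example, if it    */
/* was killed by a signal).                                               */
int	exit_code_from_wstatus(int wstatus)
{
	return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

Calling exit_code_from_wstatus(wstatus_of_child_exiting_with(23)) gives 23. On Linux the raw wstatus in that case is typically 5888, which matches the value in the log above. The exact layout of wstatus is platform-specific, which is why the macros are preferable to shifting bits manually.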

Comment by Andris Mednis [ 2018 Jan 22 ]

The fix is required only on Unix and GNU/Linux platforms. Zabbix agent for Microsoft Windows is not affected.

Comment by Andris Mednis [ 2018 Jan 22 ]

Fixed in svn://svn.zabbix.com/branches/dev/ZBX-11720 (for 3.4)

Comment by Viktors Tjarve [ 2018 Jan 24 ]

Successfully tested.

Comment by Andris Mednis [ 2018 Jan 25 ]

Available in versions:

  • pre-3.4.7rc1 r77185
  • pre-4.0.0alpha3 (trunk) r77190