[ZBX-5243] AIX procfs parsing is reading truncated information about cmdline using external process Created: 2012 Jun 26 Updated: 2020 Jan 02 Resolved: 2014 May 08 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Agent (G) |
Affects Version/s: | 1.8.11, 1.8.13, 2.0.0 |
Fix Version/s: | 2.3.0 |
Type: | Incident report | Priority: | Minor |
Reporter: | Boris Manojlovic | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 1 |
Labels: | aix, items, patch | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
AIX 6.1.0.0 |
Attachments: | zabbix-2.2.3-removed-procfs-new.patch zabbix-2.2.3-removed-procfs.patch zabbix-2.x-TRUNK-r44251-removed-procfs-new.patch zabbix-2.x-TRUNK-r44251-removed-procfs.patch zabbix-removed-procfs-1.8.13.patch zabbix-removed-procfs-2.0.0.patch zabbix-removed-procfs-2.0.2.patch | ||||||||||||||||||||
Issue Links: |
|
Description |
procfs on AIX system truncates information about process command line to 80 characters, this patch removes this limitation with use of |
Comments |
Comment by Boris Manojlovic [ 2012 Sep 11 ] |
By mistake patch is not complete, as i had issues with creating patch that would be possible to apply on AIX, i had to copy package and manually amend source i have forgot to add definition of proccommargs in PROC_MEM function and compile would fail. corrected patch for 2.0.2 attached |
Comment by Boris Manojlovic [ 2012 Sep 11 ] |
amended patch |
Comment by Boris Manojlovic [ 2014 Apr 10 ] |
updated patch to apply cleanly |
Comment by Boris Manojlovic [ 2014 Apr 10 ] |
updated patch for latest trunk revision |
Comment by Boris Manojlovic [ 2014 Apr 10 ] |
here is example run (2 processes are because it sees itself (cmdline of proc.num part) and real process)
|
Comment by Aleksandrs Saveljevs [ 2014 Apr 11 ] |
As mentioned in the linked For more details on "struct psinfo", see http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.files/doc/aixfiles/proc.htm : char pr_fname[PRFNSZ]; /* last component of exec()ed pathname*/ char pr_psargs[PRARGSZ]; /* initial characters of arg list */ Here, PRFNSZ is 16 and PRARGSZ is 80. In "struct procsinfo", the definition for the program name field is as follows: char pi_comm[MAXCOMLEN+1]; /* (truncated) program name */ Here, MAXCOMLEN is 32, which is better. |
Comment by Aleksandrs Saveljevs [ 2014 Apr 11 ] |
While investigating this issue, I noticed one thing that might or might not be a problem. In /usr/include/procinfo.h, there is a definition of several macros and "struct procsinfo":
#define SSLEEP 1
#define SRUN 3
...
struct procsinfo
{
...
uint pi_state; /* process state -- from proc.h */
...
}
In /usr/include/sys/proc.h, there are the following macros: #define SNONE 0 /* slot is available */ #define SIDL 4 /* process is created */ #define SZOMB 5 /* process is dying */ #define SSTOP 6 /* process is stopped */ #define SACTIVE 7 /* process is active */ #define SSWAP 8 /* process is swapped */ So "pi_state" seems to refer to macros in proc.h and it is unclear whether "pi_state" can ever be SRUN or SSLEEP, which we use in our code. At least the item always returns zero: $ ./zabbix_agentd -t proc.num[] proc.num[] [u|60] $ ./zabbix_agentd -t proc.num[,,run] proc.num[,,run] [u|0] $ ./zabbix_agentd -t proc.num[,,sleep] proc.num[,,sleep] [u|0] To be investigated. asaveljevs Comment starting at line 32 at http://fossies.org/linux/misc/old/monitor-2.1.9.tar.gz:a/monitor-2.1.9/print_top.c might suggest that SSLEEP and SRUN are no longer used since AIX 4. |
Comment by Boris Manojlovic [ 2014 Apr 13 ] |
patch against 2.2.3 |
Comment by Boris Manojlovic [ 2014 Apr 14 ] |
i have amended my patch for getprocs handling and now it uses 64bit procsinfo64 struct which inside has pi_cpu member which when is bigger than zero can show if process has been working / running or it is idle with "0" cpu ticks
return (procsinfo->pi_state == SACTIVE && procsinfo->pi_cpu > 0) ? SUCCEED : FAIL
sleep
return (procsinfo->pi_state == SACTIVE && procsinfo->pi_cpu == 0) ? SUCCEED : FAIL;
and as you have stated pi_comm is limited to 32 chars but that i think should not be limiting factor in any case asaveljevs Are you sure procsinfo->pi_cpu gets reset when a process is put to sleep? Isn't that field an always increasing counter? steki Yes it is always reset (checked with debug printfs in patch) |
Comment by Boris Manojlovic [ 2014 Apr 14 ] |
corrected handling of num.proc[,,(run|sleep)] |
Comment by Aleksandrs Saveljevs [ 2014 Apr 14 ] |
Boris, your patch is to be implemented in 2.4 (trunk), so unless you have a reason to develop for 2.2 for your own purposes, you might wish to prepare patches for trunk. In any case, development is already going on in development branch svn://svn.zabbix.com/branches/dev/ZBX-5243 . So instead of preparing patches, you might just wish to follow that branch and add your suggestions as comments here. |
Comment by Boris Manojlovic [ 2014 Apr 14 ] |
trunk version of patch with corrected handling of proc.num[,,(run|sleep)] |
Comment by Aleksandrs Saveljevs [ 2014 Apr 14 ] |
As (hopefully) an improvement to Boris' suggestion, I have replaced getprocs() with getprocs64() and "struct procsinfo" with "struct procentry64". According to http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.basetechref/doc/basetrf1/getprocs.htm , it should be safe:
|
Comment by Aleksandrs Saveljevs [ 2014 Apr 14 ] |
Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-5243 . Work is based on Boris' patch. Thank you! |
Comment by Boris Manojlovic [ 2014 Apr 14 ] |
diff -u proc.c~ proc.c --- proc.c~ 2014-04-14 12:23:21.000000000 +0200 +++ proc.c 2014-04-14 14:52:53.000000000 +0200 @@ -34,9 +34,9 @@ switch (zbx_proc_stat) { case ZBX_PROC_STAT_RUN: - return SRUN == procentry->pi_state ? SUCCEED : FAIL; + return ( SACTIVE == procentry->pi_state && procentry->pi_cpu > 0) ? SUCCEED : FAIL; case ZBX_PROC_STAT_SLEEP: - return SSLEEP == procentry->pi_state ? SUCCEED : FAIL; + return ( SACTIVE == procentry->pi_state && procentry->pi_cpu == 0) ? SUCCEED : FAIL; case ZBX_PROC_STAT_ZOMB: return SZOMB == procentry->pi_state ? SUCCEED : FAIL; } example run: root@aixtest:/WORK/zabbix/ZBX-5243 # /opt/z/sbin/zabbix_agent -t 'proc.num[,,all]' && /opt/z/sbin/zabbix_agent -t 'proc.num[,,sleep]' && /opt/z/sbin/zabbix_agent -t 'proc.num[,,run]' proc.num[,,all] [u|96] proc.num[,,sleep] [u|92] proc.num[,,run] [u|4] root@aixtest:/WORK/zabbix/ZBX-5243 # |
Comment by dimir [ 2014 Apr 25 ] |
Successfully tested. Please review my changes in r44770, r44784. |
Comment by Aleksandrs Saveljevs [ 2014 Apr 25 ] |
Changes in r44770 and r44784 look good, thank you! |
Comment by Aleksandrs Saveljevs [ 2014 Apr 25 ] |
Fixed in pre-2.3.0 (trunk) r44790. |
Comment by Aleksandrs Saveljevs [ 2014 May 08 ] |
(1) This task broke compilation on Solaris, because <sys/procfs.h> is no longer included. Because the check for HAVE_SYS_PROCFS_H was only present in AIX code, we thought it was only needed there. Apparently, it is needed on Solaris, too. asaveljevs Previously, we did a check for <sys/procfs.h> in configure.ac and included <sys/procfs.h> as follows:
#ifdef HAVE_SYS_PROCFS_H
/* This is needed to access the correct procfs.h definitions */
# define _STRUCTURED_PROC 1
# include <sys/procfs.h>
#endif
In <sys/procfs.h> it says: /* * This definition is temporary. Structured proc is the preferred API, * and the older ioctl-based interface will be removed in a future version * of Solaris. Until then, by default, including <sys/procfs.h> will * provide the older ioctl-based /proc definitions. To get the structured * /proc definitions, either include <procfs.h> or define _STRUCTURED_PROC * to be 1 before including <sys/procfs.h>. */ #ifndef _STRUCTURED_PROC #define _STRUCTURED_PROC 0 #endif Based on the above and the fact that we only need the header in Solaris now, we now simply include <procfs.h> in Solaris' proc.c. asaveljevs Tested on Solaris 8 and Solaris 10. RESOLVED. |
Comment by dimir [ 2014 May 09 ] |
Successfully tested. |
Comment by Aleksandrs Saveljevs [ 2014 May 09 ] |
Fixed in pre-2.3.0 (trunk) r45231. |
Comment by Aleksandrs Saveljevs [ 2015 Apr 09 ] |
Boris' approach to distinguishing running and sleeping processes is considered in |