[ZBX-372] proc.num[] not working correctly for Solaris ZONES Created: 2008 May 08  Updated: 2024 Apr 10  Resolved: 2018 Aug 10

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: None
Fix Version/s: 3.0.21rc1, 3.4.13rc1, 4.0.0beta1, 4.0 (plan)

Type: Problem report
Priority: Major
Reporter: Patrick Waldispuhl
Assignee: Viktors Tjarve
Resolution: Fixed
Votes: 3
Labels: patch, proc.num, solaris
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Zabbix 1.5.2 Server and Agent
SunOS 5.10 Generic_127112-11 i86pc i386 i86pc
Solaris zones


Attachments: File zabbix-1.8.2-add_zone_support.patch     File zabbix-1.8.2-add_zone_support2.patch     File zabbix-1.8.5-add_zone_support.patch    
Issue Links:
Duplicate
Team: Team A
Sprint: Sprint 36, Sprint 38, Sprint 39, Sprint 40
Story Points: 1.5

 Description   

If I use proc.num[cron,root] as an item on a Solaris server that hosts a Solaris zone, the result is 2 (one cron daemon running on the server, one running in the zone).

Explanation:
The agent walks through the /proc directory and checks for the daemon.
The listing for a server hosting zones can look like this:

dr-x--x--x   5 root     root         864 May  8 15:06 24088
dr-x--x--x   5 834      843          864 May  8 15:06 24125
dr-x--x--x   5 834      843          864 May  8 15:06 24127
dr-x--x--x   5 834      843          864 May  8 15:06 24131
dr-x--x--x   5 805      805          864 May  8 15:20 25849
dr-x--x--x   5 805      805          864 May  8 15:28 26856
dr-x--x--x   5 805      805          864 May  8 15:30 27112
dr-x--x--x   5 805      805          864 May  8 15:40 28388
dr-x--x--x   5 805      805          864 May  8 15:51 29748
dr-x--x--x   5 root     root         864 May  8 15:52 29941
dr-x--x--x   5 917      207          864 May  8 15:53 5
dr-x--x--x   5 917      207          864 May  8 15:53 8
dr-x--x--x   5 805      805          864 May  8 15:53 37
dr-x--x--x   5 805      805          864 May  8 15:55 285
dr-x--x--x   5 805      805          864 May  8 15:57 538
dr-x--x--x   5 805      805          864 May  8 15:59 800
dr-x--x--x   5 805      805          864 May  8 16:01 1041
dr-x--x--x   5 root     root         864 May  8 16:02 1158

==> Here you can see that the user ID and group ID are either "real/local" user IDs or zone users represented by a number.

Explicitly searching for a daemon with user ID "root", for example, is no workaround, because the check for the user running the daemon is done in the subdirectory.

Can this be changed/corrected?

Thanx
Patrick



 Comments   
Comment by richlv [ 2009 Oct 01 ]

Do you still see this issue with the latest version?
What would be the best option here: ignoring processes running in the zones?

Comment by Patrick Waldispuhl [ 2009 Oct 02 ]

Hi,

I have not updated yet (waiting for 1.8), so yes, the problem is still there.
The best and easiest option would be to ignore the processes running in the zones. Another possibility would be to add a parameter for specifying the zone name.

I guess it could work similarly to the -Z switch of ps on Solaris.

    # ps -efZ | grep cron
    global root 1598 1 0 Nov 15 ? 1:32 /usr/sbin/cron
    bea-z-i1 root 7041 1 0 Nov 15 ? 55:50 /usr/sbin/cron
    bea-z-t1 root 10982 1 0 Nov 15 ? 51:28 /usr/sbin/cron
    soba-z-i root 8526 1 0 Mar 17 ? 0:02 /usr/sbin/cron
    merc-z-i root 29459 1 0 Feb 05 ? 17:15 /usr/sbin/cron
    sobalogi root 20557 1 0 Mar 17 ? 0:02 /usr/sbin/cron
    merc-z-d root 5477 1 0 Feb 09 ? 17:12 /usr/sbin/cron
    soba-z-t root 22144 1 0 Feb 19 ? 0:02 /usr/sbin/cron
    merc-z-p root 27810 1 0 Feb 05 ? 17:14 /usr/sbin/cron
    global zabbix 8467 8401 0 14:57:31 pts/9 0:00 grep cron
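
To illustrate the per-zone counting suggested here, the following is a minimal sketch that groups the `ps -efZ` lines quoted above by zone. It is not how the Zabbix agent works (the agent reads /proc directly); the function name `count_by_zone` and the rule for locating the command column are assumptions made for this example.

```python
import posixpath
from collections import Counter

# Output lines copied from the `ps -efZ | grep cron` listing above.
PS_OUTPUT = """\
global root 1598 1 0 Nov 15 ? 1:32 /usr/sbin/cron
bea-z-i1 root 7041 1 0 Nov 15 ? 55:50 /usr/sbin/cron
bea-z-t1 root 10982 1 0 Nov 15 ? 51:28 /usr/sbin/cron
soba-z-i root 8526 1 0 Mar 17 ? 0:02 /usr/sbin/cron
merc-z-i root 29459 1 0 Feb 05 ? 17:15 /usr/sbin/cron
sobalogi root 20557 1 0 Mar 17 ? 0:02 /usr/sbin/cron
merc-z-d root 5477 1 0 Feb 09 ? 17:12 /usr/sbin/cron
soba-z-t root 22144 1 0 Feb 19 ? 0:02 /usr/sbin/cron
merc-z-p root 27810 1 0 Feb 05 ? 17:14 /usr/sbin/cron
global zabbix 8467 8401 0 14:57:31 pts/9 0:00 grep cron
"""


def count_by_zone(ps_output, name, user=None):
    """Count processes called `name` per zone in `ps -efZ` style output."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # blank or truncated line
        zone, uid = fields[0], fields[1]
        # Assumption: STIME is one token when it is a clock time (14:57:31)
        # and two tokens when it is a date (Nov 15), which shifts the
        # position of the command column.
        cmd_index = 8 if ":" in fields[5] else 9
        if cmd_index >= len(fields):
            continue  # for example, a header line
        if posixpath.basename(fields[cmd_index]) != name:
            continue  # different executable, such as the grep itself
        if user is not None and uid != user:
            continue
        counts[zone] += 1
    return counts


counts = count_by_zone(PS_OUTPUT, "cron", user="root")
print(sum(counts.values()))  # cron processes across all zones
print(counts["global"])      # cron processes in the global zone only
```

For the listing above this prints 9 and then 1: an unfiltered count sees every zone's cron, while a zone-aware count reports only the one in "global".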

Greets
Patrick

Comment by richlv [ 2009 Oct 26 ]

information was provided

Comment by Takanori Suzuki [ 2010 Apr 11 ]

It still occurs with version 1.8.2.

"global" is the top-level OS, which manages the other zones.
"testzone" is a zone I created.
When there are 11 sshd processes in "global" and 1 sshd process in "testzone", zabbix_agentd returns "12".
See the following:
bash-3.00# ./src/zabbix_agent/zabbix_agentd -t proc.num[sshd]
proc.num[sshd] [u|12]

I think Zabbix should be able to detect which zone each process runs in.
It should also be able to detect all processes in the whole OS.
For example:

//detect all processes.
bash-3.00# ./src/zabbix_agent/zabbix_agentd -t proc.num[sshd]
proc.num[sshd] [u|12]

//detect processes in "global"
bash-3.00# ./src/zabbix_agent/zabbix_agentd -t proc.num[sshd,,,,zone,global]
proc.num[sshd,,,,zone,global] [u|11]

//detect processes in "testzone"
bash-3.00# ./src/zabbix_agent/zabbix_agentd -t proc.num[sshd,,,,zone,testzone]
proc.num[sshd,,,,zone,testzone] [u|1]

I made a patch for zabbix 1.8.2.

Comment by Takanori Suzuki [ 2010 Apr 11 ]

The patch extends the "proc.num[]" function.
I added a 5th and a 6th argument to set the "container type" and the "zone name".

The reason for adding the "container type" argument is that there are other container-based virtualization technologies,
and Zabbix needs the "container type" information to support them.
On Linux, there are lxc, openvz and vserver.
They are all container-based virtualization.
In particular, lxc is already merged into the upstream Linux kernel after 2.6.29.
http://virt.kernelnewbies.org/TechComparison
On *BSD, there is jail, as far as I know.

I use psinfo.pr_zoneid, which comes from /proc/PID/psinfo, to detect the zone ID.
As far as I checked, an lxc container can also be detected in a similar way by using /proc/PID/cgroup, though I didn't write code for lxc.
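
The /proc/PID/cgroup idea can be sketched roughly as below. This is not part of any attached patch. It relies on the Linux file format of one `hierarchy-ID:controller-list:cgroup-path` entry per line; the sample content, the `/lxc/testct` path, and the function names are hypothetical, and real cgroup layouts differ between LXC versions and distributions.

```python
# Hypothetical content of /proc/PID/cgroup for a process inside a container
# named "testct". Real paths depend on the container tooling in use.
SAMPLE_CGROUP = """\
4:cpu,cpuacct:/lxc/testct
1:name=systemd:/lxc/testct
"""


def cgroup_paths(text):
    """Return the set of cgroup paths listed in /proc/PID/cgroup content."""
    paths = set()
    for line in text.splitlines():
        # Split only twice so that a path containing ':' stays intact.
        parts = line.split(":", 2)
        if len(parts) == 3:
            paths.add(parts[2])
    return paths


def in_cgroup(text, prefix):
    """True if any listed cgroup path equals `prefix` or lies below it."""
    return any(
        path == prefix or path.startswith(prefix + "/")
        for path in cgroup_paths(text)
    )


print(cgroup_paths(SAMPLE_CGROUP))
print(in_cgroup(SAMPLE_CGROUP, "/lxc/testct"))
```

A process counter could apply such a check to each PID in the same place where the Solaris code compares pr_zoneid, assuming the container's cgroup path is known in advance.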

Comment by Takanori Suzuki [ 2010 May 02 ]

I made another patch, "zabbix-1.8.2-add_zone_support2.patch", which doesn't require the "container type".
The patched binary works as follows.

In this environment, there are 5 sshd processes in the "global" zone and 1 sshd process in the "testzone" zone.
//detect all processes.
bash-3.00# /usr/local/zabbix/bin/zabbix_agentd -t proc.num[sshd]
proc.num[sshd] [u|6]

//detect processes in "global"
bash-3.00# /usr/local/zabbix/bin/zabbix_agentd -t proc.num[sshd,,,,global]
proc.num[sshd,,,,global] [u|5]

//detect processes in "testzone"
bash-3.00# /usr/local/zabbix/bin/zabbix_agentd -t proc.num[sshd,,,,testzone]
proc.num[sshd,,,,testzone] [u|1]
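
The behaviour shown in these three calls can be modelled with a short sketch. It is an illustration only and not the patch itself (the agent is written in C): the `Proc` record, the `PROCS` table, and the naive comma split of the key (no quoting support) are assumptions made for this example. The 3rd and 4th parameters are parsed but ignored.

```python
import re
from typing import NamedTuple


class Proc(NamedTuple):
    name: str
    user: str
    zone: str


# Hypothetical process table matching the environment described above:
# 5 sshd processes in "global", 1 in "testzone", plus an unrelated cron.
PROCS = [Proc("sshd", "root", "global")] * 5 + [
    Proc("sshd", "root", "testzone"),
    Proc("cron", "root", "global"),
]


def parse_key(key):
    """Split 'proc.num[a,b,c]' into ('proc.num', ['a', 'b', 'c'])."""
    match = re.fullmatch(r"([\w.]+)\[(.*)\]", key)
    if match is None:
        raise ValueError(f"malformed item key: {key!r}")
    return match.group(1), match.group(2).split(",")


def proc_num(key, procs):
    """Count processes; an empty or missing parameter matches everything."""
    item, params = parse_key(key)
    if item != "proc.num":
        raise ValueError(f"unsupported item: {item!r}")
    params = params + [""] * (5 - len(params))
    name, user, _state, _cmdline, zone = params[:5]
    return sum(
        1
        for p in procs
        if (not name or p.name == name)
        and (not user or p.user == user)
        and (not zone or p.zone == zone)
    )


for key in ("proc.num[sshd]", "proc.num[sshd,,,,global]",
            "proc.num[sshd,,,,testzone]"):
    print(key, proc_num(key, PROCS))
```

With this table the three keys yield 6, 5 and 1, matching the output of the patched binary above. Leaving the zone parameter empty keeps the old behaviour of counting across all zones.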

Comment by Stéphane Lacasse [ 2011 Aug 03 ]

I modified the patch for the 1.8.5 agent.

Comment by Stéphane Lacasse [ 2012 Aug 30 ]

I just upgraded the Zabbix agents to the 1.8.13 binary on all our Solaris hosts, and I had completely forgotten about this issue. I'm a little bit disappointed that this patch has not made it into the trunk.

Does it have to be done by me or the initial reporter?

This is, in my view, a perfectly working solution, and it is a must for us since we monitor a lot of processes in each zone, including the global zone. Having to patch the agent each time is difficult since none of these Solaris hosts is used as a development platform.

Rant.

Comment by Marc [ 2013 Dec 28 ]

ZBXNEXT-1887 addresses the same problem with a different container virtualization technology.

Comment by Viktors Tjarve [ 2018 Jul 04 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-372

Comment by Viktors Tjarve [ 2018 Aug 06 ]

Released in:

  • 3.0.21rc1 r83505
  • 3.4.13rc1 r83506
  • 4.0.0alpha10 r83507
Comment by Martins Valkovskis [ 2018 Aug 07 ]

(3) [D] Documentation updates:

RESOLVED

<viktors.tjarve> Looks good, thanks. CLOSED
