[ZBX-9740] proc.num[<process>] returns zero for protected Windows processes Created: 2015 Jul 29  Updated: 2017 May 30  Resolved: 2015 Oct 16

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 2.4.4, 2.4.5
Fix Version/s: 2.0.16rc1, 2.2.11rc1, 2.4.7rc1, 3.0.0alpha3

Type: Incident report Priority: Major
Reporter: Vladimir Selivanov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: proc.num, windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: win2012, iis8


Attachments: PNG File agent version 2.4.5.png     PNG File agent version 2.4.7.rc1.png     Text File debug_log.txt     File zabbix_agentd.exe     File zabbix_agentd.exe    

 Description   

proc.num[<process>] returns zero for some running processes.

Strange situation when monitoring an MS IIS AppPool: the item
proc_info[w3wp.exe,all,<any param>]
returns a non-zero result, but the item
proc.num[w3wp.exe]
always returns zero.

For some other processes proc.num[] returns a non-zero result.

This is reproduced on the Windows Zabbix agent, versions 2.4.4 and 2.4.5.
The Zabbix agent runs under the LocalSystem account or an account with administrative rights.



 Comments   
Comment by Glebs Ivanovskis (Inactive) [ 2015 Aug 19 ]

Please clarify this contradiction:

proc.num[<process>] return zero for any working processes.

for any other processes proc.num[] returns not zero result.

Is w3wp.exe the only problematic process, or are there others? Which versions of the Zabbix agent and IIS are used, 32-bit or 64-bit?

More information on reproducing this issue would be appreciated. Perhaps simple instructions on how to get w3wp.exe running on my fresh virtual Windows 2012?

Comment by Vladimir Selivanov [ 2015 Aug 19 ]

Sorry, that was my mistake.

The correct text is:
proc.num[<process>] returns zero for some running processes.
and
for some other processes proc.num[] returns a non-zero result.

Please change the description.

I found other processes that have the same problem:
proc.num[taskhost.exe]
proc.num[csrss.exe]

I use 64-bit versions of Zabbix, Windows and IIS.

Comment by Oleksii Zagorskyi [ 2015 Aug 20 ]

As I recall, "w3wp.exe" is the binary of the MS IIS service, or something like that.
csrss.exe is also "close" to the Windows core, as it is one of the fundamental processes of the Windows OS, IMO.
Maybe there is some relation? Those processes may be more restricted for security reasons. Just an assumption.

For example, simple processes run by an end user are OK, but system services are not, etc.
We need to find a more exact pattern and/or confirm it on many different workstations.

Comment by Glebs Ivanovskis (Inactive) [ 2015 Aug 21 ]

Situation report. The only difference I see between the proc.num and proc_info code was introduced in the later stages of fixing ZBX-9143. The proc_info item checks the system version and uses PROCESS_QUERY_LIMITED_INFORMATION instead of PROCESS_QUERY_INFORMATION if possible (supported since Windows Server 2008); this allows the agent to get basic info (like the user of the process) even for protected processes. A similar check would be useful in proc.num as well; however, I don't believe it will fix the current issue, because the agent queries process info only if the second parameter of proc.num is not empty.
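
The access-right selection described above can be sketched as follows. This is only an illustration in Python, not the agent's C code: the function name pick_query_access and the use of the OS major version as the test are assumptions made for the example. The two constants are the documented Windows access-right values.

```python
# Documented Windows process access rights.
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_QUERY_LIMITED_INFORMATION = 0x1000


def pick_query_access(os_major_version: int) -> int:
    """Return the access right to request when opening a process.

    Hypothetical helper: the limited right exists starting with
    Windows Vista / Server 2008 (major version 6); older systems
    only know the full PROCESS_QUERY_INFORMATION right.
    """
    if os_major_version >= 6:
        return PROCESS_QUERY_LIMITED_INFORMATION
    return PROCESS_QUERY_INFORMATION
```

Requesting the limited right asks for less, so opening a protected process is more likely to succeed.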

A more realistic reason might be this:

A simple error-reporting function, printError, displays the reason for any failures, which usually result from security restrictions. For example, OpenProcess fails for the Idle and CSRSS processes because their access restrictions prevent user-level code from opening them.

(https://msdn.microsoft.com/en-us/library/windows/desktop/ms686701%28v=vs.85%29.aspx)

Voland, please, try to run proc.num and proc_info multiple times with different intervals for the same process. Does proc.num always return zero? Does proc_info always return non-zero value?

One more option is to rewrite proc.num and proc_info for Windows Server 2012 using the newer process monitoring mechanism from Microsoft: https://msdn.microsoft.com/en-us/library/windows/desktop/dn457825%28v=vs.85%29.aspx

Comment by Glebs Ivanovskis (Inactive) [ 2015 Aug 25 ]

I've attached a development version (built from pre-2.4.7rc1 sources) of the 64-bit agent with no difference in how proc.num and proc_info query the process list.
Voland, please give it a try. Let's see if it helps.

Comment by Vladimir Selivanov [ 2015 Aug 27 ]

I have updated the Zabbix agent to version 2.4.7rc1.
For the item «proc.num[w3wp.exe]» the situation has not changed.
See the log fragment and graphs.
The update interval for both items (proc.num[w3wp.exe] and proc_info[w3wp.exe,wkset,sum]) is 60 seconds.
proc.num[w3wp.exe] always returns zero, and proc_info[w3wp.exe,wkset,sum] always returns a non-zero value.
Graphs attached.

P.S. In version 2.4.7rc1 the check "system.cpu.util[3,system,avg1]" is not supported.

Fragment of log
1588:20150827:115612.719 Zabbix Agent stopped. Zabbix 2.4.5 (revision 53233).
10696:20150827:115646.673 Starting Zabbix Agent [as-msk-n0245]. Zabbix 2.4.7rc1 (revision 10000).
10696:20150827:115646.688 using configuration file: c:\zabbix_agent\conf\zabbix_agentd.conf
10696:20150827:115646.688 agent #0 started [main process]
12588:20150827:115646.688 agent #2 started listener #1
10096:20150827:115646.688 agent #3 started listener #2
6316:20150827:115646.688 agent #1 started [collector]
10732:20150827:115646.688 agent #4 started listener #3
6584:20150827:115646.688 agent #5 started active checks #1
6584:20150827:115646.766 active check "system.cpu.util[1,system,avg1]" is not supported: Collector is not started.
6584:20150827:115646.766 active check "system.cpu.util[3,system,avg1]" is not supported: Collector is not started.

Comment by Glebs Ivanovskis (Inactive) [ 2015 Aug 27 ]

Many thanks for your cooperation! A negative result is still a result. I will rewrite proc.num and proc_info using the Windows Server 2012 specific process snapshotting mechanism and then upload a new binary. Stay tuned!

Regarding system.cpu.util, there were some changes to it in ZBX-9456 according to the ChangeLog. I shall take a look.

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 03 ]

Actually, that new process snapshotting is not meant for getting the list of processes running in the system, so we cannot use it as a basis for proc.num and proc_info.

The binary I attached and you tested has no difference in how proc.num and proc_info capture and process the list of running processes, so we can't imagine any reason why they return different results on your system. The only way to find out is to build an agent with additional logging and test it on your system. Voland, would you agree to run an agent with more logging on your system?

Comment by Vladimir Selivanov [ 2015 Sep 04 ]

What additional logs do you want to collect?
Once I have this information, I agree to run the agent with more logging.
My security department needs this information.

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 04 ]

New binary attached. Please run it with DebugLevel=4. There is no need to test it for two hours; just make sure that the problematic proc.num[] key and the corresponding proc_info[] key were executed at least once. This binary additionally logs the list of processes running on your system as seen by the proc.num[] and proc_info[] items. Like this:

  2136:20150902:052803.110 Requested [proc.num[csrss.exe]]
  2136:20150902:052803.110 PROC_NUM: '[System Process]' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'System' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'smss.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'csrss.exe' COUNT
  2136:20150902:052803.110 PROC_NUM: 'csrss.exe' COUNT
  2136:20150902:052803.110 PROC_NUM: 'wininit.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'winlogon.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'services.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'lsass.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'dwm.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'spoolsv.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'IpOverUsbSvc.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'svchost.exe' NOT COUNT
  2136:20150902:052803.110 PROC_NUM: 'wlms.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'taskhostex.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'explorer.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'msdtc.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'notepad.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'notepad.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'cmd.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'conhost.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'cmd.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'conhost.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'powershell.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'conhost.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'cmd.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'conhost.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'TrustedInstaller.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'TiWorker.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'zabbix_agentd.exe' NOT COUNT
  2136:20150902:052803.133 PROC_NUM: 'zabbix_get.exe' NOT COUNT
  2136:20150902:052803.133 Sending back [2]
...
  1636:20150902:052807.907 Requested [proc_info[csrss.exe]]
  1636:20150902:052807.907 PROC_INFO: '[System Process]' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'System' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'smss.exe' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'csrss.exe' COUNT
  1636:20150902:052807.907 PROC_INFO: 'csrss.exe' COUNT
  1636:20150902:052807.907 PROC_INFO: 'wininit.exe' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'winlogon.exe' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'services.exe' NOT COUNT
  1636:20150902:052807.907 PROC_INFO: 'lsass.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'dwm.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'spoolsv.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'IpOverUsbSvc.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'svchost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'wlms.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'taskhostex.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'explorer.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'msdtc.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'notepad.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'notepad.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'cmd.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'conhost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'cmd.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'conhost.exe' NOT COUNT
  1636:20150902:052807.924 PROC_INFO: 'powershell.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'conhost.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'cmd.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'conhost.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'TrustedInstaller.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'TiWorker.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'zabbix_agentd.exe' NOT COUNT
  1636:20150902:052807.939 PROC_INFO: 'zabbix_get.exe' NOT COUNT
  1636:20150902:052807.939 Sending back [2016.000000]

These are the logs we are interested in. If there is something secret in your logs, show just these bits. If there is something secret about the list of processes, tell us just the difference between proc.num[] and proc_info[] logs.
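
Extracting just the difference between the two lists can be sketched as below. This is only an illustration: it assumes the log lines follow exactly the PROC_NUM/PROC_INFO pattern of the sample above, and the function names summarize and differences are hypothetical.

```python
import re
from collections import Counter

# Matches the debug lines shown in the sample, for example:
#   2136:20150902:052803.110 PROC_NUM: 'csrss.exe' COUNT
LINE_RE = re.compile(r"(PROC_NUM|PROC_INFO): '([^']*)' (NOT COUNT|COUNT)\s*$")


def summarize(log_text):
    """Tally (process name, verdict) pairs separately for each item."""
    seen = {"PROC_NUM": Counter(), "PROC_INFO": Counter()}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            item, name, verdict = match.groups()
            seen[item][(name, verdict)] += 1
    return seen


def differences(log_text):
    """Return the entries seen more often by one item than by the other."""
    seen = summarize(log_text)
    # Counter subtraction keeps only the positive remainders.
    only_num = seen["PROC_NUM"] - seen["PROC_INFO"]
    only_info = seen["PROC_INFO"] - seen["PROC_NUM"]
    return only_num, only_info
```

Two empty counters mean both items saw the same processes and reached the same verdicts. The tallies are aggregated over the whole text passed in, so the excerpt should contain one request of each kind.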

Comment by Vladimir Selivanov [ 2015 Sep 10 ]

log file attached
glebs.ivanovskis

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 14 ]

Thank you, Voland!

I see that process lists are identical and both of them have COUNT next to 'w3wp.exe'. Seems like your problem is solved!

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 15 ]

Fixed for 2.4 in development branch svn://svn.zabbix.com/branches/dev/ZBX-9740 r55354 (last commit is actually a rollback to r55128).
Fixed for 2.2 in development branch svn://svn.zabbix.com/branches/dev/ZBX-9740-22 r55710 (backported r55128 and r55704 changes).

Checking the Windows version and opening processes with the PROCESS_QUERY_LIMITED_INFORMATION access level seems to solve the issue, mysteriously, since the query was inside an inactive if branch.

Version 2.0 uses a completely different method of counting processes (ZBX-9143 fixed 2.2 and above).

Comment by Aleksandrs Saveljevs [ 2015 Sep 22 ]

(1) Unrelated to this task, but r55704 removes --new-nodeid option from "shortopts", which was forgotten in ZBXNEXT-1343.

wiper CLOSED

Comment by Andris Zeila [ 2015 Sep 28 ]

Successfully tested.
Also fixed some aging formatting issues, please review r55806

glebs.ivanovskis Reviewed. Looks much nicer! CLOSED

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 28 ]

Fixed in:

  • pre-2.2.11rc1 r55811
  • pre-2.4.7rc1 r55813
  • pre-3.0.0alpha3 (trunk) r55814

Comment by Glebs Ivanovskis (Inactive) [ 2015 Sep 29 ]

Documented in:

sasha CLOSED
