[ZBX-7448] Zabbix-agent on windows return empty string on simultaneous request of same user parameter Created: 2013 Nov 27  Updated: 2017 May 30  Resolved: 2014 Apr 25

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 2.0.8
Fix Version/s: 2.3.0

Type: Incident report Priority: Major
Reporter: Dmitry Samsonov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: userparameters, windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2003


Attachments: Text File zabbix_agentd.log    
Issue Links:
Duplicate

 Description   

How to reproduce.
Create a user parameter: UserParameter=test,echo OK
Request the user parameter a few times in parallel: zabbix_get -s host -k test

Result in our environment:
process1:
process2:OK
process3:
process4:
process5:OK
process6:OK
process7:
process8:OK
process9:

There is no such problem with built-in items (like agent.ping and so on).
There is no such problem on Linux.
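
The reproduction steps above can be scripted. The following is a minimal Python sketch, not part of the original report: it starts the same command several times at once and collects what each process printed. The helper name run_parallel, the default of 9 requests and the 10-second timeout are illustrative assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(cmd, count=9, timeout=10):
    """Run `cmd` `count` times concurrently; return each stripped stdout in order."""
    def run_once(_):
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return result.stdout.strip()

    # One worker per request, so all commands start at roughly the same time.
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(run_once, range(count)))
```

For this issue the call would be run_parallel(["zabbix_get", "-s", "host", "-k", "test"]), where "host" is the placeholder used in the steps above; an empty string in the returned list corresponds to an empty "processN:" line in the result.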



 Comments   
Comment by richlv [ 2013 Nov 27 ]

could this just be a case of it being slower on windows ?
if you increase startagents, does it improve the situation ?

Comment by Dmitry Samsonov [ 2013 Nov 27 ]

(1) I have increased StartAgents from default 3 to 29 - result is the same. Most of the time I get 3-4 correct responses (from 10 requests).

One more problem occurs when I increase StartAgents above 29 - the Zabbix agent doesn't start at all. Service start fails with "Error 1067: The process terminated unexpectedly". In the Zabbix agent log I get "16536:20131127:175257.767 One thread has terminated unexpectedly (code:4294967295). Exiting ..."
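
A side note on the log line above: the value 4294967295 is 0xFFFFFFFF, which is how -1 looks when a 32-bit exit code is printed as unsigned. The short Python sketch below only shows that arithmetic; the helper name as_signed_32 is hypothetical, and the conversion does not by itself say why the thread terminated.

```python
def as_signed_32(code):
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    code &= 0xFFFFFFFF
    return code - 0x100000000 if code >= 0x80000000 else code


# The exit code from the agent log, read as a signed value.
print(as_signed_32(4294967295))
```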

nikolajs.agafonovs added this to documentation describing limitations of "StartAgent=xx". https://www.zabbix.com/documentation/2.4/manual/appendix/config/zabbix_agentd_win RESOLVED

<richlv> note "Actual limit depends on system setup." is extremely vague and results in more questions than answers

nikolajs.agafonovs it seems this limit is hardcoded in windows development headers. Documentation corrected and simplified RESOLVED

<richlv> i'm not sure which issue triggered updates to other win agent pages, but i guess this one. if so, it changed max limit for StartAgents starting with 1.8 manual pages to 62. why so ?

nikolajs.agafonovs this limit is hardcoded in windows headers. Documentation corrected. RESOLVED

wiper Actually the number of active servers plus number of passive agent threads must be less than 64. I updated documentation, please review.
REOPENED

martins-v Reviewed. RESOLVED.
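
The rule wiper states above (active servers plus passive agent threads must be less than 64) can be written as a one-line check. This is a minimal Python sketch, not code from the agent: the function and constant names are hypothetical, and it models only the documented rule. The startup failure reported above 29 agents is not something this check accounts for.

```python
WINDOWS_THREAD_LIMIT = 64  # value quoted in this thread; assumed fixed here


def agent_threads_fit(start_agents, active_servers):
    """True if passive agent threads plus active servers stay below the limit."""
    return start_agents + active_servers < WINDOWS_THREAD_LIMIT


# With one active server, 62 passive agents fit and 63 do not.
print(agent_threads_fit(62, 1), agent_threads_fit(63, 1))
```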

Comment by Dmitry Samsonov [ 2013 Dec 03 ]

Can you confirm, this is a bug?

Comment by Oleg Ivanivskyi [ 2014 Mar 24 ]

From system log:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Service Control Manager" Guid="{555908d1-a6d7-4695-8e1e-26931d2012f4}" EventSourceName="Service Control Manager" />
    <EventID Qualifiers="49152">7034</EventID>
    <Version>0</Version>
    <Level>2</Level>
    <Task>0</Task>
    <Opcode>0</Opcode>
    <Keywords>0x8080000000000000</Keywords>
    <TimeCreated SystemTime="2014-03-24T18:34:52.974951700Z" />
    <EventRecordID>5751</EventRecordID>
    <Correlation />
    <Execution ProcessID="484" ThreadID="536" />
    <Channel>System</Channel>
    <Computer>WIN-BS768P0N4TA</Computer>
    <Security />
  </System>
  <EventData>
    <Data Name="param1">Zabbix Agent</Data>
    <Data Name="param2">4</Data>
  </EventData>
</Event>

Comment by Oleg Ivanivskyi [ 2014 Mar 24 ]

Zabbix agent log file with debug mode 4.

Comment by Nikolajs Agafonovs (Inactive) [ 2014 Apr 04 ]

Dmitry wrote: "Request user parameter few times in parallel: zabbix_get -s host -k test"

What does "in parallel" mean? Did you run multiple scripts with zabbix_get in parallel?

Comment by Dmitry Samsonov [ 2014 Apr 04 ]

It was just a test case.
The real situation is 2 Zabbix servers with the same hosts and the same items.
If both servers request the same item from the same host at the same time, then one server can get an empty answer.

Comment by Nikolajs Agafonovs (Inactive) [ 2014 Apr 07 ]

(2)

Info from http://stackoverflow.com/questions/17472389/how-to-increase-the-maximum-number-of-child-processes-that-can-be-spawned-by-a-w

One can also read http://blogs.msdn.com/b/ntdebugging/archive/2007/01/04/desktop-heap-overview.aspx

Problem with "StartAgents=xx" comes from Windows internal setup.

If you find that you cannot open more than about 100 total processes, even on a very large RAM server, you may have run into a limit of the Windows "desktop heap size".

The problem is that service sessions under windows (where the services run) have less of this "desktop heap" space available for creating windows.

The short version is:
--Services get smaller desktop heaps than interactive sessions.
--Desktop heap size limits the number of windows
--Each sub-server creates one or more “windows” even if we can’t see them.

This affects the desktop heap of all services. Do not make it larger than necessary or you will push the system to consume more resource and you may bump up against problems in the total available desktop heap size.

Workaround:
Edit the registry value: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SessionManager\SubSystems\Windows

You will see a string like: %SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,20480,768 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ServerDll=sxssrv,4 ProfileControl=Off MaxRequestThreads=16

The critical bit is: SharedSection=1024,20480,768

The second number (20480) is the size for interactive sessions. The third number (768) is the size of non-interactive (services) sessions. Note how the third number is 26x smaller than the second. Experimentally, we found that changing this to:

SharedSection=1024,20480,2048

Increased the project limit from 106 to 270, almost perfectly scaling with the heap size. Pick a value that reflects the maximum number of projects that you expect to be opened simultaneously by all users on the system. Do not make this value larger than necessary, and no larger than 8192, as each service in your system will consume more of a precious resource.

<richlv> this text seems to come from some other source, which is not credited.

nikolajs.agafonovs Thread limit in windows is hardcoded to a value of 64. Added a check for creating too many processes in windows. RESOLVED in r44275

<richlv> hmm, my comment was about lack of attribution...

nikolajs.agafonovs yes it is. but we just limit maximum of created agents. RESOLVED

wiper CLOSED

<richlv> guys, sorry, this won't do
we have copied a sizable text and have not attributed it properly. that is not nice. REOPENED.

<richlv> the comment has been edited to link to an msdn page. can you please show where on the linked page this text appears ?

<nikolajs.agafonovs> it appears in stackoverflow.com (link added), and as general guideline to this theme in msdn.com

nikolajs.agafonovs link to source of text added. RESOLVED

<richlv> that does seem to match, thanks -> CLOSED
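
For readers who want to inspect the SharedSection entry quoted in subissue (2), the following minimal Python sketch pulls the numbers out of such a string and computes the ratio between the second and third values. It works on a shortened copy of the string from the comment rather than reading the registry, and the function and variable names are hypothetical.

```python
import re


def parse_shared_section(value):
    """Return the comma-separated SharedSection numbers as a tuple of ints."""
    match = re.search(r"SharedSection=(\d+(?:,\d+)*)", value)
    if match is None:
        raise ValueError("no SharedSection entry found")
    return tuple(int(n) for n in match.group(1).split(","))


# Shortened copy of the registry string quoted in the comment.
sample = (
    r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
    "SharedSection=1024,20480,768 Windows=On SubSystemType=Windows"
)
_, interactive, services = parse_shared_section(sample)
print(interactive, services, round(interactive / services, 1))
```

The ratio comes out at roughly 26.7, in line with the "26x smaller" remark in the quoted text.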

Comment by Nikolajs Agafonovs (Inactive) [ 2014 Apr 07 ]

As for "on Windows return empty string...", you can experiment with the "Timeout=xx" parameter in zabbix_agentd.win.conf to allow less or more processing time for requests.

Comment by Dmitry Samsonov [ 2014 Apr 07 ]

Results are returned immediately, it's not timeout.

As for the agent count - we are not using more than 100. As I wrote before, the problem occurs at 30 agents, and 29 works without problems (tested on a few servers).

Comment by Oleg Ivanivskyi [ 2014 Apr 07 ]

Could you check the "Zabbix data gathering process busy %" graph from "Template App Zabbix Server"? Are any pollers overloaded?

Comment by Dmitry Samsonov [ 2014 Apr 07 ]

Oleg, I've performed the testing using zabbix_get (see this issue's description), so the server part can be excluded, can't it?

Comment by Oleg Ivanivskyi [ 2014 Apr 07 ]

I think not. Apply the "Template App Zabbix Server" template and check the pollers' load.
For details, see:
http://blog.zabbix.com/monitoring-how-busy-zabbix-processes-are/457/

Comment by Dmitry Samsonov [ 2014 Apr 08 ]

Ok.
We have a maximum of 2.73% pollers busy on our busiest server.

Comment by Oleg Ivanivskyi [ 2014 Apr 08 ]

Thank you!

Do you monitor the host directly from the Zabbix server or through a proxy?
Do you see any related errors in the Zabbix server or agent log files? Could you check?

Comment by Dmitry Samsonov [ 2014 Apr 08 ]

Yes, directly by server.
No errors in agent and server logs.

Comment by Oleg Ivanivskyi [ 2014 Apr 08 ]

I can't reproduce.

Could you show a screenshot of the item settings (https://www.zabbix.com/documentation/2.0/manual/config/items/item#configuration) and the history for it (https://www.zabbix.com/documentation/_detail/2.0/manual/web_interface/values.png?id=2.0%3Amanual%3Aweb_interface%3Afrontend_sections%3Amonitoring%3Alatest_data )?

Comment by richlv [ 2014 Apr 12 ]

please see subissues (1) and (2)

nikolajs.agafonovs all commented out. RESOLVED

Comment by Andris Zeila [ 2014 Apr 24 ]

Successfully tested, please see the error message changes in r44763

nikolajs.agafonovs error message is ok. CLOSED

Comment by Nikolajs Agafonovs (Inactive) [ 2014 Apr 24 ]

Available in pre-2.3.0 (trunk) r44776

Comment by richlv [ 2014 Apr 24 ]

(3) r44776 updated bin/win32/zabbix_agentd.exe but not the 64-bit version - was that intended ?

nikolajs.agafonovs win64 binary added to trunk r44787. RESOLVED

sasha CLOSED

Comment by richlv [ 2014 Apr 25 ]

subissue (2) seems to need careful checking

nikolajs.agafonovs source of text added. RESOLVED

Generated at Fri Apr 04 13:20:25 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.