[#ZBX-9750] Agent in active mode hangs at times waiting for item results.

[ZBX-9750] Agent in active mode hangs at times waiting for item results. Created: 2015 Aug 03 Updated: 2017 May 30 Resolved: 2016 Nov 14
Status:	Closed
Project:	ZABBIX BUGS AND ISSUES
Component/s:	Agent (G)
Affects Version/s:	2.4.4
Fix Version/s:	None

Type:	Incident report
Priority:	Critical
Reporter:	Avi
Assignee:	Unassigned
Resolution:	Duplicate
Votes:
Labels:	activeagent, delay, timeout
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified
Environment:	Windows SBS 2011, SBS 2008 R2..
Attachments:	zabbix-problem.txt

Issue Links:
Duplicate
duplicates	~~ZBX-9781~~	stale NFS stops agent operations	Closed

Description

This happens with the agent in active mode.

The agent gathers item values and buffers them until it sends them to the server.
Sometimes it encounters an item that hangs; attached is a snippet of the log file in debug mode that shows the hang while trying to get an item.
The agent just freezes and stops working for about 10 minutes.

This causes delays and, most importantly, makes it unreliable to tell when the server is down (I use a trigger for "unreachable after 2/3 minutes" and am getting multiple false positives because the agent just hangs).

I believe a piece of code needs to be added to tell the agent to keep processing in the background while waiting for a value, something like DoEvents in VB.
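
The proposal above can be sketched roughly as follows: run each check in a worker thread and stop waiting after a timeout, so that one stuck item does not block the rest. This is only a minimal Python illustration of the idea, not Zabbix agent code (the agent is written in C); `run_with_timeout` and `slow_check` are hypothetical names introduced here.

```python
import queue
import threading
import time


def run_with_timeout(check, timeout_s):
    """Run check() in a worker thread and return (ok, value).

    If check() does not deliver a value within timeout_s seconds,
    return (False, None) instead of blocking the caller. The worker is
    a daemon thread: it is abandoned, not killed. A check that raises
    never delivers a value, so it is also reported as (False, None).
    """
    result = queue.Queue(maxsize=1)
    worker = threading.Thread(target=lambda: result.put(check()), daemon=True)
    worker.start()
    try:
        return True, result.get(timeout=timeout_s)
    except queue.Empty:
        return False, None


def slow_check():
    """Hypothetical stand-in for an item that hangs (e.g. a stalled disk)."""
    time.sleep(0.5)
    return 42
```

With this helper, `run_with_timeout(slow_check, 0.05)` gives up after 50 ms and returns `(False, None)`, while a fast check such as `run_with_timeout(lambda: 7, 1.0)` returns `(True, 7)`. An abandoned thread still occupies resources until the underlying call returns, so a real implementation would need to limit how many such workers can pile up.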

This occurs on many of the servers I monitor; this is the first time I could isolate it with a debug log file.

Thanks.

Comments

Comment by richlv [ 2015 Aug 03 ]

Looks like it might be delaying on vfs.fs.size[G:,free] - what kind of filesystem / physical disk is that?
If you query that key with zabbix_get, does it always return quickly?
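
One way to answer that question is to query the key repeatedly and time each call. The sketch below is a minimal Python example under stated assumptions: `zabbix_get` is on the PATH, the agent accepts passive checks, and `127.0.0.1` is a placeholder for the agent's address. `time_command`, `probe` and `ZABBIX_GET_CMD` are hypothetical names; `-s` and `-k` are the host and key options of `zabbix_get`.

```python
import subprocess
import time

# Placeholder host address; replace with the monitored machine's address.
ZABBIX_GET_CMD = ["zabbix_get", "-s", "127.0.0.1", "-k", "vfs.fs.size[G:,free]"]


def time_command(cmd, timeout_s):
    """Run cmd once; return (elapsed_seconds, output).

    output is the stripped stdout, or None if the command did not
    finish within timeout_s seconds.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        output = proc.stdout.strip()
    except subprocess.TimeoutExpired:
        output = None
    return time.monotonic() - start, output


def probe(cmd, runs, pause_s, timeout_s):
    """Run cmd `runs` times, pausing pause_s seconds between runs."""
    results = []
    for i in range(runs):
        results.append(time_command(cmd, timeout_s))
        if i + 1 < runs:
            time.sleep(pause_s)
    return results
```

For example, `probe(ZABBIX_GET_CMD, runs=20, pause_s=30, timeout_s=60)` collects twenty timings over about ten minutes; an occasional entry that is much slower than the rest, or has `None` as output, would point at the disk query rather than at the agent's scheduling.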

Comment by Avi [ 2015 Aug 03 ]

I haven't tried yet (will report back later). The drive is mounted over iSCSI, which is why I also believe that is the problem.
But that should not matter: at times there could be issues of any kind, whether it's NTFS, iSCSI or something else.
The agent should ignore them and keep moving, and at least send the pings if nothing else.
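
The "at least send the pings" behaviour can be illustrated by giving the heartbeat its own timer, independent of the item checks. This is a minimal Python sketch of that idea only; it says nothing about how the Zabbix agent is actually structured, and `start_heartbeat` and `send_ping` are hypothetical names.

```python
import threading
import time


def start_heartbeat(send_ping, interval_s):
    """Call send_ping() every interval_s seconds on a background thread.

    Returns a threading.Event; set it to stop the heartbeat.
    """
    stop = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set,
        # so the loop pings on every timeout and exits when stopped.
        while not stop.wait(interval_s):
            send_ping()

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

Because the heartbeat runs on its own thread, it keeps firing while the main thread is blocked in a slow check; calling `set()` on the returned event stops it.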

The machine has another monitoring system installed that does not have these hiccups.

Comment by Aleksandrs Saveljevs [ 2015 Aug 14 ]

Related issue: ~~ZBX-9781~~ (NFS).

Comment by Aleksandrs Saveljevs [ 2016 Nov 14 ]

This should be fixed already together with ~~ZBX-9781~~, so closing as a duplicate.

Comment by Aleksandrs Saveljevs [ 2016 Nov 14 ]

If it is not a secret, which other monitoring system worked for you?

Comment by Avi [ 2016 Dec 02 ]

It was a long time ago, I don't remember...
Does the solution you implemented ensure that such issues will not arise in the future with any other kind of item?
In other words, did you fix this at the core or just for those specific items?

Comment by Aleksandrs Saveljevs [ 2016 Dec 05 ]

The solution implemented in ~~ZBX-9781~~ only concerns vfs.fs.size[] and vfs.fs.inode[] (on all platforms), but the same approach can be reused for other items, if need be.

