[ZBX-9750] Agent in active mode hangs at times waiting for item results. Created: 2015 Aug 03  Updated: 2017 May 30  Resolved: 2016 Nov 14

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 2.4.4
Fix Version/s: None

Type: Incident report Priority: Critical
Reporter: Avi Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: activeagent, delay, timeout
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows SBS 2011, SBS 2008 R2


Attachments: Text File zabbix-problem.txt    
Issue Links:
Duplicate
duplicates ZBX-9781 stale NFS stops agent operations Closed

 Description   

This is in active agent.

The agent gathers item values and buffers them until it sends them to the server.
Sometimes it encounters an item that hangs. Attached is a snippet of the log file in debug mode that shows the hang while trying to get an item.
The agent just freezes and stops working for about 10 minutes.

This causes delays and, most importantly, makes it unreliable to tell when the server is down (I use a trigger for "unreachable after 2/3 minutes" and am getting multiple false positives because the agent just hangs).

I believe a piece of code needs to be added to tell the agent to keep processing in the background while waiting for a value, something like DoEvents in VB.
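
To illustrate the idea, here is a minimal Python sketch of collecting each item in a worker thread with a deadline, so one stalled item cannot block the rest. It is not the agent's actual code (the agent is written in C). The names `run_with_timeout` and `collect`, the timeout values and the fake checks are all assumptions made up for this example.

```python
import queue
import threading
import time


def run_with_timeout(check, timeout):
    """Run check() in a daemon thread.

    Returns ("ok", value), ("error", exception) or ("timeout", None).
    """
    result = queue.Queue(maxsize=1)

    def worker():
        try:
            result.put(("ok", check()))
        except Exception as exc:  # report the failure instead of losing it
            result.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        return result.get(timeout=timeout)
    except queue.Empty:
        # The worker is still blocked; stop waiting and move on.
        return ("timeout", None)


def collect(checks, timeout):
    """Collect every item, giving each one at most `timeout` seconds."""
    return {key: run_with_timeout(check, timeout) for key, check in checks.items()}


if __name__ == "__main__":
    # Hypothetical checks: the sleep stands in for a stalled disk query.
    checks = {
        "vfs.fs.size[G:,free]": lambda: time.sleep(3),
        "agent.ping": lambda: 1,
    }
    print(collect(checks, timeout=0.5))
```

Note that the stalled thread is only abandoned here, not stopped. A real implementation would more likely run the check in a separate process that can be killed once the deadline passes.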

This occurs on many of the servers I monitor; this is the first time I could isolate it with a debug log file.

Thanks.



 Comments   
Comment by richlv [ 2015 Aug 03 ]

Looks like it might be delaying on vfs.fs.size[G:,free]. What kind of filesystem / physical disk is that?
If you query that key with zabbix_get, does it always return quickly?
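
One way to check this is to time the query over several runs. The following is a rough Python sketch, assuming zabbix_get is on the PATH and the agent accepts passive checks from the machine running it. The address 127.0.0.1, the run count, the timeout and the helper names `time_command` and `print_report` are placeholders made up for this example.

```python
import subprocess
import time

# Placeholder agent address; -s (host) and -k (item key) are zabbix_get options.
ZABBIX_GET_CMD = ["zabbix_get", "-s", "127.0.0.1", "-k", "vfs.fs.size[G:,free]"]


def time_command(cmd, runs=5, timeout=30.0):
    """Run cmd `runs` times and return a list of (seconds, outcome) tuples."""
    results = []
    for _ in range(runs):
        start = time.monotonic()
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
            if proc.returncode == 0:
                outcome = proc.stdout.strip()
            else:
                outcome = f"exit {proc.returncode}"
        except subprocess.TimeoutExpired:
            outcome = "timed out"
        except OSError as exc:  # for example, the executable was not found
            outcome = f"failed to start: {exc}"
        results.append((time.monotonic() - start, outcome))
    return results


def print_report(cmd, runs=5, timeout=30.0):
    """Print one line per run: elapsed seconds and the returned value or error."""
    for seconds, outcome in time_command(cmd, runs, timeout):
        print(f"{seconds:7.2f}s  {outcome}")
```

Calling `print_report(ZABBIX_GET_CMD)` while the problem is occurring should show whether individual queries stall; a run that takes far longer than the others would point at the disk query itself.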

Comment by Avi [ 2015 Aug 03 ]

I haven't tried yet (will report back later). The drive is mounted over iSCSI, which is why I also believe it's the problem.
But that should not matter: at times there could be issues of any kind, whether it's NTFS, iSCSI or something else.
The agent should ignore them and keep moving, and at least send the pings if nothing else.

The machine has another monitoring system installed that does not have these hiccups.

Comment by Aleksandrs Saveljevs [ 2015 Aug 14 ]

Related issue: ZBX-9781 (NFS).

Comment by Aleksandrs Saveljevs [ 2016 Nov 14 ]

This should be fixed already together with ZBX-9781, so closing as a duplicate.

Comment by Aleksandrs Saveljevs [ 2016 Nov 14 ]

If it is not a secret, which other monitoring system worked for you?

Comment by Avi [ 2016 Dec 02 ]

It was a long time ago, I don't remember...
Does the solution you implemented guarantee that such issues will not arise in the future with other kinds of items?
That is, did you fix this at the core or just for those specific items?

Comment by Aleksandrs Saveljevs [ 2016 Dec 05 ]

The solution implemented in ZBX-9781 only concerns vfs.fs.size[] and vfs.fs.inode[] (on all platforms), but the same approach can be reused for other items, if need be.

Generated at Fri Apr 26 01:14:53 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.