[ZBX-7585] zabbix stopped gathering data Created: 2013 Dec 26 Updated: 2017 May 30 Resolved: 2016 Apr 28 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 2.2.1 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Minor |
Reporter: | sles | Assignee: | Unassigned |
Resolution: | Cannot Reproduce | Votes: | 1 |
Labels: | timeout | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Centos 6/x86-64 |
Attachments: | zabbix_server.log.bz2 zabbix_server.log.gz zabbix_server.log.gz | ||||||||
Issue Links: |
|
Description |
Hello! I just noticed that there has been no data from one of the hosts for more than 10 minutes, and found this in the log:
24613:20131226:134300.808 Zabbix agent item "net.if.out[eth1,bytes]" on host "inetgw-nsk" failed: first network error, wait for 15 seconds
I noticed this at 13:55 or so, so I restarted Zabbix server and immediately got:
28701:20131226:135742.513 resuming Zabbix agent checks on host "inetgw-nsk": connection restored
This is definitely a bug: even if there was a network problem, far more than 15 seconds had passed... |
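To put a number on the gap described above, here is a minimal Python sketch that reads the timestamps from the two log lines quoted in the description and computes the elapsed time between them. The helper name `parse_stamp` is made up for illustration, and the sketch assumes the `PID:YYYYMMDD:HHMMSS.mmm message` layout seen in the quoted lines.

```python
from datetime import datetime

def parse_stamp(line):
    """Return the timestamp of a 'PID:YYYYMMDD:HHMMSS.mmm message' log line."""
    # Split off only the first two colons; the message itself may contain more.
    _pid, date, rest = line.split(":", 2)
    clock = rest.split(" ", 1)[0]
    return datetime.strptime(date + clock, "%Y%m%d%H%M%S.%f")

# The two lines quoted in the description of this issue.
error_line = ('24613:20131226:134300.808 Zabbix agent item "net.if.out[eth1,bytes]" '
              'on host "inetgw-nsk" failed: first network error, wait for 15 seconds')
resume_line = ('28701:20131226:135742.513 resuming Zabbix agent checks '
               'on host "inetgw-nsk": connection restored')

gap = parse_stamp(resume_line) - parse_stamp(error_line)
gap_seconds = gap.total_seconds()
print(f"{gap_seconds:.3f} s between first network error and resume")
```

For these two lines the difference is a little under 15 minutes (13:43:00.808 to 13:57:42.513), compared with the 15 second wait announced in the first message.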
Comments |
Comment by Marc [ 2013 Dec 26 ] |
This doesn't sound like a bug report. |
Comment by richlv [ 2013 Dec 26 ] |
this might be very hard to trace down - i don't see anything that could be done right now. next time this happens, you could try stracing poller processes and seeing whether any of them is stuck - and if so, on what. |
Comment by sles [ 2013 Dec 27 ] |
Marc, not a bug report? richlv, I'll definitely trace it. |
Comment by Marc [ 2013 Dec 27 ] |
"Problem" might appear more appropriate to me than "bug" |
Comment by sles [ 2013 Dec 27 ] |
I'm sure this is a bug. Period. |
Comment by Oleksii Zagorskyi [ 2013 Dec 27 ] |
Sles, it would have been correct to provide zabbix_server.conf's UnreachablePeriod and UnavailableDelay from the very beginning. Check this page: https://www.zabbix.com/documentation/2.2/manual/appendix/items/unreachability You can find some useful points there. |
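As a rough illustration of how these settings interact, here is a simplified Python model of the retry schedule after a first network error. It is a sketch under assumptions, not the server's actual logic: the function name `retry_schedule` and the `horizon` parameter are hypothetical, the default values (UnreachableDelay=15, UnreachablePeriod=45, UnavailableDelay=60) are the 2.2 defaults as I understand them, and the exact behaviour at the period boundary may differ in the real server.

```python
def retry_schedule(unreachable_delay=15, unreachable_period=45,
                   unavailable_delay=60, horizon=300):
    """Return the seconds (after the first network error) at which a host is rechecked.

    Simplified model: retry every unreachable_delay seconds while the host is
    'unreachable', then every unavailable_delay seconds once it is 'unavailable'.
    """
    checks = []
    t = unreachable_delay
    # Phase 1: host is "unreachable", retried at the short interval.
    while t <= unreachable_period and t <= horizon:
        checks.append(t)
        t += unreachable_delay
    # Phase 2: host is treated as "unavailable", retried at the long interval.
    t = (checks[-1] if checks else 0) + unavailable_delay
    while t <= horizon:
        checks.append(t)
        t += unavailable_delay
    return checks

schedule = retry_schedule()
# Longest wait between two consecutive rechecks in this model.
max_gap = max(b - a for a, b in zip(schedule, schedule[1:]))
print(schedule, max_gap)
```

In this model the longest wait between two rechecks equals UnavailableDelay, so with default settings a recovered host would be expected to resume within about a minute, provided a poller is free to perform the check.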
Comment by sles [ 2013 Dec 27 ] |
Oleksii Zagorskyi, they are set to the defaults. |
Comment by sles [ 2013 Dec 31 ] |
Hello! I just upgraded Zabbix agent on several hosts, and on all of them got about 10 minutes of no data:
14580:20131231:075602.521 Zabbix agent item "net.if.in[eth0,bytes]" on host "inetgw-nsk" failed: first network error, wait for 15 seconds
14571:20131231:080655.216 Zabbix agent item "vfs.fs.size[/,used]" on host "ast-ngdu2.xnet.belkam.com" failed: first network error, wait for 15 seconds
etc. Why such a long timeout? Thank you! |
Comment by Marc [ 2013 Dec 31 ] |
Did you already:
|
Comment by sles [ 2014 Jan 02 ] |
partially 1. yes |
Comment by sles [ 2014 Jan 02 ] |
ok. about 4: I just rebooted one of the servers. I will turn debug on later (quite busy right now) and try to reproduce. now the connection is restored:
23283:20140102:075108.042 Zabbix agent item "net.if.in[eth0,bytes]" on host "asterisk.p98.belkam.com" failed: first network error, wait for 15 seconds
as you can see, SNMP started far earlier. Thank you! |
Comment by sles [ 2014 Jan 02 ] |
full log with debug 4 |
Comment by sles [ 2014 Jan 02 ] |
sorry, I uploaded it several times; I got a JavaScript error for some reason |
Comment by richlv [ 2016 Apr 22 ] |
and how many unreachable pollers were there ? maybe this was a duplicate of |
Comment by sles [ 2016 Apr 25 ] |
Hello! I don't remember how many pollers there were. Now, on 3.0.2 we have Now there are no such events for agent devices, but I still get these errors for some very old devices using SNMP:
19155:20160425:102417.012 SNMP agent item "ifInOctets41" on host "p98a-cc6006-1.p98.belkam.com" failed: first network error, wait for 15 seconds
I guess they are not fast enough to respond. I'm very sorry for asking this question, but are there any SNMP timeout options for Zabbix server? |
Comment by Aleksandrs Saveljevs [ 2016 Apr 25 ] |
Yes, since |
Comment by richlv [ 2016 Apr 26 ] |
also haven't seen any graphs on the busy pollers & unreachable pollers. were they all busy, maybe ? |
Comment by sles [ 2016 Apr 28 ] |
Aleksandrs, thank you for pointing me to the SNMP timeout settings, I'll try this. richlv, if you don't see any graphs, this is because you didn't ask for them |
Comment by Aleksandrs Saveljevs [ 2016 Apr 28 ] |
Thanks for info! Closing as "Cannot Reproduce" then. |