[ZBX-8427] Fuzzytime suddenly very wrong, we think due to delayed lastclock in dbsync, proxy Created: 2014 Jul 02 Updated: 2017 May 30 Resolved: 2014 Jul 02 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 1.8.20 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Major |
Reporter: | Steve mushero | Assignee: | Unassigned |
Resolution: | Duplicate | Votes: | 0 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Zabbix 1.8.20 server |
Issue Links: |
|
Description |
We have a very busy server (about 500 NVPS) with a busy housekeeper. Randomly, but very often, our host time check, which uses fuzzytime(120), fires as a Problem even though the affected host is fine and its time is exact. This happens both with direct server->agent data collection and via proxies. We have added logging and instrumentation to the code, and we think the cause is the point at which fuzzytime() is processed: the DB gets busy and backed up, so the dbcache / sync system is delayed, maybe by minutes. When the DB sync code, which also evaluates functions and sets triggers, finally gets around to the check, now() is much later than the time the data was collected, so fuzzytime() reports a difference because now - lastvalue is very large. Another possibility, which we considered first, is that the data comes from a proxy and is badly delayed over the Internet: the proxy collects data at 05:00 but the Zabbix server doesn't receive it until 05:05, and thus raises a trigger. But we see this problem even on hosts that are not monitored through proxies, so we suspect a time delay between collection and trigger evaluation. We always assumed a trigger is evaluated as soon as data arrives or is polled, but looking at the code this is clearly not the case: it seems to wait until the DB sync code runs before evaluating functions and setting triggers, so any delay in that process is deadly. So, can you confirm when and how lastclock is set for direct and proxied data, and which time fuzzytime() compares against when there can be a delay in proxy data and/or in the DB sync code on a busy system? |
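To make the suspected mechanism concrete, here is a minimal sketch of the arithmetic. It is not Zabbix code: the `Sample` type and both function names are hypothetical, and it only models our assumption that fuzzytime(N) yields 1 when the item value (the host's Unix time) is within N seconds of the reference time and 0 otherwise. The 300 second delay matches the 05:00 / 05:05 proxy example above.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    value: int  # host's reported Unix time (the item value)
    clock: int  # Unix time at which the value was collected


def fuzzytime_at_evaluation(sample: Sample, now: int, tolerance: int) -> int:
    """Hypothetical model: compare host time with server time at evaluation."""
    return 1 if abs(now - sample.value) <= tolerance else 0


def fuzzytime_at_collection(sample: Sample, tolerance: int) -> int:
    """Hypothetical alternative: compare host time with the collection clock."""
    return 1 if abs(sample.clock - sample.value) <= tolerance else 0


collected_at = 1_404_277_200  # arbitrary example timestamp
sample = Sample(value=collected_at, clock=collected_at)  # host clock is exact

# Evaluated 2 seconds after collection: within the 120 second tolerance.
prompt = fuzzytime_at_evaluation(sample, now=collected_at + 2, tolerance=120)
# Evaluated 300 seconds after collection (busy DB sync or slow proxy link).
delayed = fuzzytime_at_evaluation(sample, now=collected_at + 300, tolerance=120)
# Same sample compared against its own collection clock instead of now().
by_clock = fuzzytime_at_collection(sample, tolerance=120)

print(prompt, delayed, by_clock)  # 1 0 1
```

Under these assumptions, a host with a perfectly accurate clock fails the check whenever evaluation lags collection by more than the tolerance, which is the behaviour we are seeing.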
Comments |
Comment by richlv [ 2014 Jul 02 ] |
closing as a duplicate of ZBX-4500 |
Comment by Steve mushero [ 2014 Jul 02 ] |
But can someone tell us how it works, so that we can compensate and make it better? When is lastclock set and by whom, and what happens when data comes from a proxy? And is the data processed when it is received by the server or when it is synced to the DB? |
Comment by richlv [ 2014 Jul 02 ] |
this tracker is for bug reports - please use zabbix irc, forums and other channels for community support. see https://www.zabbix.org/wiki/Getting_help for more detail |