[ZBX-5149] Proxy doesn't include unsupported items in a list of active checks after 'refresh_unsupported' interval expires Created: 2012 Jun 08 Updated: 2017 May 30 Resolved: 2012 Nov 02 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P) |
Affects Version/s: | 2.0.0 |
Fix Version/s: | 2.0.4rc1, 2.1.0 |
Type: | Incident report | Priority: | Major |
Reporter: | Sergei Turchanov | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 2 |
Labels: | active, agent, check, items, proxy, unsupported | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Linux, x86_64, CentOS |
Issue Links: |
|
Description |
In src/zabbix_server/trapper/active.c, the functions 'send_list_of_active_checks' and 'send_list_of_active_checks_json' are supposed to include unsupported items in the list of active checks for a host after the CONFIG_REFRESH_UNSUPPORTED interval expires. Relevant lines from the SQL query: 'select ... from items where ... or (i.status=%d and i.lastclock+%d<=%d) ...'. The problem is that the .lastclock field of the 'items' table is always NULL, i.e. the proxy doesn't update the .lastclock field when it receives item values from agents. To verify this, I deleted the proxy database and let the proxy create a new, empty one. After some time I ran the query 'select count(*) from items where lastclock is not NULL' and got '0'. |
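The effect of a NULL lastclock on a condition of this shape can be reproduced in isolation. The following is a minimal sketch, not the Zabbix code: it uses an in-memory SQLite database, a simplified three-column 'items' table, and assumed numeric values for the item statuses (0 for active, 3 for not supported). In SQL, NULL + 600 <= now evaluates to NULL rather than true, so a not-supported item whose lastclock was never written is never selected. Setting lastclock to 0 makes the comparison succeed.

```python
import sqlite3

# Assumed values for illustration only; the real constants live in the Zabbix sources.
ITEM_STATUS_ACTIVE = 0
ITEM_STATUS_NOTSUPPORTED = 3
REFRESH_UNSUPPORTED = 600  # seconds


def eligible_items(conn, now):
    """Return itemids matched by a condition shaped like the one quoted above."""
    rows = conn.execute(
        "select itemid from items "
        "where status = ? or (status = ? and lastclock + ? <= ?) "
        "order by itemid",
        (ITEM_STATUS_ACTIVE, ITEM_STATUS_NOTSUPPORTED, REFRESH_UNSUPPORTED, now),
    )
    return [row[0] for row in rows]


def demo():
    """Compare the selected items before and after replacing NULL lastclock with 0."""
    conn = sqlite3.connect(":memory:")
    # Simplified, hypothetical schema: only the columns the condition needs.
    conn.execute(
        "create table items (itemid integer primary key, status integer, lastclock integer)"
    )
    conn.executemany(
        "insert into items values (?, ?, ?)",
        [
            (1, ITEM_STATUS_ACTIVE, None),
            (2, ITEM_STATUS_NOTSUPPORTED, None),  # lastclock never written
            (3, ITEM_STATUS_NOTSUPPORTED, 1000),  # lastclock populated
        ],
    )
    now = 1000 + REFRESH_UNSUPPORTED
    before = eligible_items(conn, now)
    conn.execute("update items set lastclock = 0 where lastclock is null")
    after = eligible_items(conn, now)
    conn.close()
    return before, after
```

Calling demo() shows item 2 missing from the first list and present in the second, which matches the behaviour described in this report.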
Comments |
Comment by Łukasz Jernaś [ 2012 Oct 02 ] |
This hit us lately as a serious issue, due to MySQL monitoring getting disabled when the database server is down, for doing backups, etc. Almost all of our hosts are monitored via proxies... |
Comment by michael chan [ 2012 Oct 04 ] |
Same problem here. Proxied checks often become unsupported due to timeouts, etc. and never get re-enabled. I have to go in every day and fix this manually, which is unacceptable for a monitoring system. And all my checks are done via proxies, since this is best practice. |
Comment by Sergei Turchanov [ 2012 Oct 05 ] |
Michael, please try my work-around to see if it works for you. Go to the database configured for your proxy and execute: update items set lastclock = 0 where lastclock is NULL; After that, re-enabling of unsupported items should work again, although you need to execute this statement every time new hosts or items are added for monitoring through that proxy. |
Comment by michael chan [ 2012 Oct 05 ] |
This does fix the issue - will it cause any issues if the above is done periodically, e.g. done daily via a cronjob? |
Comment by Sergei Turchanov [ 2012 Oct 08 ] |
It shouldn't cause any issues. Agents periodically ask the proxy for a list of active checks. The above-mentioned functions send_list_of_active_checks{,_json} do not include metrics that became unsupported in that list unless sufficient time has passed ("Refresh unsupported items" configuration option, 600 seconds by default). To do so, the field lastclock in the table items holds a timestamp of when the proxy retrieved successfully(?) a metric from an agent. Given that timestamp and the current time, you can measure the interval after which an unsupported metric must be re-included in the list of active checks. P.S. If you use sqlite as your proxy db backend, you may need to implement retry logic in the cronjob, because sqlite sometimes reports 'Database is locked' when zabbix-proxy updates the database at the same time. |
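The retry logic mentioned in the P.S. could look roughly like the sketch below. It assumes an SQLite proxy database and uses Python's standard sqlite3 module. The function name, the number of attempts, and the delay are made up for illustration, and the database path has to be supplied by whoever runs it.

```python
import sqlite3
import time


def reset_null_lastclock(db_path, attempts=5, delay=2.0):
    """Set lastclock to 0 where it is NULL, retrying while the database is locked.

    Hypothetical helper for the work-around above; returns the number of rows updated.
    """
    for attempt in range(1, attempts + 1):
        # timeout makes SQLite itself wait a little for a lock before giving up
        conn = sqlite3.connect(db_path, timeout=5.0)
        try:
            with conn:  # commits on success, rolls back on error
                cur = conn.execute(
                    "update items set lastclock = 0 where lastclock is null"
                )
                return cur.rowcount
        except sqlite3.OperationalError as exc:
            # SQLite reports a busy database as "database is locked".
            # Any other error, or a lock on the last attempt, is re-raised.
            if "locked" not in str(exc).lower() or attempt == attempts:
                raise
            time.sleep(delay)
        finally:
            conn.close()
```

A cronjob could call reset_null_lastclock with the path to the proxy database file. A run that finds nothing to fix simply returns 0.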
Comment by Alexander Vladishev [ 2012 Nov 02 ] |
Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-5149 |
Comment by dimir [ 2012 Nov 13 ] |
Successfully tested. We decided to fix it by adding an item's "lastclock" value to the cache. So, the proxy still won't update "lastclock" in the database but will instead keep it in the cache. The cached "lastclock" will also be available on the server side. This is a very important fix. Before it, if you had a proxy (any, active or passive) working with an active agent and any of your items went NOT_SUPPORTED, they would never come back from that status (unless you, e.g., set "lastclock" to 0 in the proxy DB manually or changed the item status manually). |
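As a rough illustration of the idea (not the actual Zabbix implementation, which is written in C), keeping "lastclock" in memory instead of the database might be sketched as follows. The class and method names are hypothetical. Treating an item with no recorded value as due for a retry is an assumption of this sketch, not something stated in this ticket.

```python
REFRESH_UNSUPPORTED = 600  # seconds, the default mentioned earlier in this ticket


class LastclockCache:
    """Hypothetical in-memory store of the last value timestamp per item."""

    def __init__(self):
        self._lastclock = {}  # itemid -> timestamp of the last received value

    def record_value(self, itemid, clock):
        """Remember when a value for this item was last received."""
        self._lastclock[itemid] = clock

    def retry_due(self, itemid, now, refresh=REFRESH_UNSUPPORTED):
        """Return True if an unsupported item should be offered to the agent again."""
        # Assumption: an item with no cached timestamp counts as lastclock 0,
        # so it is always due once 'now' reaches the refresh interval.
        lastclock = self._lastclock.get(itemid, 0)
        return lastclock + refresh <= now
```

With this shape the comparison no longer depends on a database column that may be NULL.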
Comment by Alexander Vladishev [ 2012 Nov 21 ] |
Fixed in versions pre-2.0.4 r31543 and pre-2.1.0 (trunk) r31547. |