[ZBX-5149] Proxy doesn't include unsupported items in a list of active checks after 'refresh_unsupported' interval expires Created: 2012 Jun 08  Updated: 2017 May 30  Resolved: 2012 Nov 02

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P)
Affects Version/s: 2.0.0
Fix Version/s: 2.0.4rc1, 2.1.0

Type: Incident report Priority: Major
Reporter: Sergei Turchanov Assignee: Unassigned
Resolution: Fixed Votes: 2
Labels: active, agent, check, items, proxy, unsupported
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Linux, x86_64, CentOS


Issue Links:
Duplicate

 Description   

In src/zabbix_server/trapper/active.c, the functions 'send_list_of_active_checks' and 'send_list_of_active_checks_json' are supposed to include unsupported items in the list of active checks for a host after the CONFIG_REFRESH_UNSUPPORTED interval expires. The relevant part of the SQL query is: 'select ... from items where ... or (i.status=%d and i.lastclock+%d<=%d) ...'.

The problem is that the .lastclock field of the 'items' table is always NULL, i.e. the proxy doesn't update .lastclock when it receives item values from agents.

To check this, I deleted the proxy database and let the proxy create a new, empty one. After some time I ran the query 'select count(*) from items where lastclock is not NULL' and got '0'.
The work-around for this issue is to manually set the .lastclock field to some value in the past, e.g. 0. After that, re-enabling of unsupported items starts working again.



 Comments   
Comment by Łukasz Jernaś [ 2012 Oct 02 ]

This hit us recently as a serious issue: MySQL monitoring gets disabled whenever the database server is down (for backups, etc.). Almost all of our hosts are monitored via proxies...

Comment by michael chan [ 2012 Oct 04 ]

Same problem here. Proxied checks often become unsupported due to timeouts, etc., and never get re-enabled. I have to go in and fix this manually every day, which is unacceptable for a monitoring system. And all my checks are done via proxies, since this is best practice.

Comment by Sergei Turchanov [ 2012 Oct 05 ]

Michael, please try my work-around to see if it works for you. Go to the database configured for your proxy and execute:

update items set lastclock = 0 where lastclock is NULL;

After that, re-enabling of unsupported items should work again, although you need to execute this statement every time new hosts or items are added for monitoring through that proxy.

Comment by michael chan [ 2012 Oct 05 ]

This does fix the issue. Will it cause any problems if the above is done periodically, e.g. daily via a cron job?

Comment by Sergei Turchanov [ 2012 Oct 08 ]

It shouldn't cause any issues. Agents periodically ask the proxy for a list of active checks. The above-mentioned functions send_list_of_active_checks{,_json} do not include metrics that have become unsupported in that list until sufficient time has passed (the "Refresh unsupported items" configuration option, 600 seconds by default). For this purpose, the lastclock field in the items table holds the timestamp at which the proxy last successfully(?) retrieved a metric from an agent. Given that timestamp and the current time, you can compute the interval after which an unsupported metric must be re-included in the list of active checks.
Something has apparently gone wrong in the Zabbix 2.x series: lastclock is not updated anymore, so any new items that appear in the proxy database now have NULL in that field. SQL arithmetic treats NULL specially: 'i.lastclock+%d<=%d' evaluates to NULL, which the WHERE clause treats as false. So if you set lastclock to some value in the past (0 is fine), you short-circuit the logic that postpones sending unsupported items, and those items will always be included in the list.
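
The NULL behaviour described above can be reproduced outside Zabbix. The following is a minimal sketch, not the actual query from active.c: it assumes a cut-down 'items' table with only itemid, status and lastclock, uses SQLite in memory, and treats the status codes and the helper name listed_items as illustrative.

```python
import sqlite3

# Illustrative values; the real query takes these as %d parameters.
ITEM_STATUS_ACTIVE = 0
ITEM_STATUS_NOTSUPPORTED = 3
REFRESH_UNSUPPORTED = 600  # seconds, the default mentioned above


def listed_items(conn, now):
    """Return the itemids selected by a simplified form of the condition."""
    rows = conn.execute(
        "select itemid from items"
        " where status = ? or (status = ? and lastclock + ? <= ?)"
        " order by itemid",
        (ITEM_STATUS_ACTIVE, ITEM_STATUS_NOTSUPPORTED, REFRESH_UNSUPPORTED, now),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items (itemid integer primary key, status integer, lastclock integer)"
)
conn.executemany(
    "insert into items values (?, ?, ?)",
    [
        (1, ITEM_STATUS_ACTIVE, None),           # supported item
        (2, ITEM_STATUS_NOTSUPPORTED, None),     # unsupported, lastclock is NULL
        (3, ITEM_STATUS_NOTSUPPORTED, 1000),     # unsupported, lastclock is set
    ],
)

now = 2000
# Item 2 is skipped: NULL + 600 <= 2000 yields NULL, which WHERE treats as false.
before = listed_items(conn, now)

# The work-around from this ticket.
conn.execute("update items set lastclock = 0 where lastclock is null")
after = listed_items(conn, now)

print(before, after)
```

Item 2 only appears in the second result, after its lastclock has been changed from NULL to 0.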

P.S. If you use SQLite as your proxy DB backend, you may need to implement retry logic in the cron job, because SQLite sometimes reports 'Database is locked' when zabbix-proxy updates the database at the same time.
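
One possible shape for that retry logic, as a sketch only: the function name reset_null_lastclock, the number of attempts and the delay are assumptions, and the database path has to be the one configured for your proxy.

```python
import sqlite3
import time


def reset_null_lastclock(db_path, attempts=5, delay=1.0):
    """Set lastclock to 0 where it is NULL, retrying while the DB is locked.

    Returns the number of rows updated. The name and defaults are hypothetical.
    """
    for attempt in range(1, attempts + 1):
        # timeout makes SQLite itself wait a few seconds for the lock first
        conn = sqlite3.connect(db_path, timeout=5)
        try:
            with conn:  # commits on success, rolls back on error
                cur = conn.execute(
                    "update items set lastclock = 0 where lastclock is null"
                )
                return cur.rowcount
        except sqlite3.OperationalError as exc:
            # Re-raise anything that is not a lock error, or the final failure
            if "locked" not in str(exc) or attempt == attempts:
                raise
            time.sleep(delay)
        finally:
            conn.close()
```

A cron job could call reset_null_lastclock() with the path to the proxy's SQLite file and exit non-zero if the call still raises after the last attempt.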

Comment by Alexander Vladishev [ 2012 Nov 02 ]

Fixed in the development branch svn://svn.zabbix.com/branches/dev/ZBX-5149

Comment by dimir [ 2012 Nov 13 ]

Successfully tested.

We decided to fix it by adding an item's "lastclock" value to the cache. So the proxy still won't update "lastclock" in the database but will keep it in the cache instead. The cached "lastclock" will also be available on the server side.

This is a very important fix. Before it, if you had a proxy (any, active or passive) working with an active agent and any of your items went NOT_SUPPORTED, it would never come back from that status (unless you, e.g., set "lastclock" to 0 in the proxy DB manually or changed the item status manually).
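
To illustrate the idea of keeping "lastclock" in a cache rather than in the database, here is a rough sketch. It is not Zabbix's cache code: the class LastclockCache, its methods, and the choice to treat an item with no cached value as immediately due are all assumptions made for the example.

```python
class LastclockCache:
    """Hypothetical in-memory map of itemid -> time of the last received value."""

    def __init__(self, refresh_unsupported=600):
        self.refresh_unsupported = refresh_unsupported
        self._lastclock = {}

    def record_value(self, itemid, clock):
        """Remember when a value for this item was last received."""
        self._lastclock[itemid] = clock

    def unsupported_is_due(self, itemid, now):
        """Decide whether an unsupported item goes back into the active list."""
        if itemid not in self._lastclock:
            # Assumption: nothing cached yet means there is no reason to wait
            return True
        return self._lastclock[itemid] + self.refresh_unsupported <= now


cache = LastclockCache()
cache.record_value(1, 1000)
print(cache.unsupported_is_due(1, 1500))  # still inside the 600 s interval
print(cache.unsupported_is_due(1, 1600))  # interval has expired
```

Because the timestamp lives in memory, a missing value is an explicit case in the code instead of a NULL that silently fails the SQL comparison.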

Comment by Alexander Vladishev [ 2012 Nov 21 ]

Fixed in versions pre-2.0.4 r31543 and pre-2.1.0 (trunk) r31547.
