[ZBXNEXT-1249] Remove "Not supported" status of items Created: 2012 Jun 03 Updated: 2020 Jul 18 |
|
Status: | Reopened |
Project: | ZABBIX FEATURE REQUESTS |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | None |
Fix Version/s: | None |
Type: | Change Request | Priority: | Trivial |
Reporter: | Vladimir Berezhnoy | Assignee: | Unassigned |
Resolution: | Unresolved | Votes: | 37 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Description |
Currently items can become "not supported" for various reasons like "script returned nothing" or wrong value, effectively auto-disabling the items permanently. Very often wrong values are returned due to temporary problems. Instead the items should remain Enabled and should continue to run as scheduled without being disabled. |
Comments |
Comment by richlv [ 2012 Jun 03 ] |
that also works as a safeguard measure so that a few misbehaving items (think timeouts) do not bring down the whole monitoring system. |
Comment by richlv [ 2012 Jun 03 ] |
also assuming this calls for ignoring problems on the proxy & server side, but not for any logic change on the agent side |
Comment by Łukasz Jernaś [ 2013 Sep 08 ] |
Well, a max retries parameter for items would be great. So before it gets unsupported we try to run it X times with the normal interval. Disabling them on proxies for 10 minutes is bad. The new internal notifications should help, as a wrapper can be written enabling them on problematic hosts. but still... |
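A minimal sketch of the "max retries" idea from the comment above, in Python. The `ItemState` class, the `max_retries` setting and the `record_poll` function are hypothetical names used for illustration only; they are not part of Zabbix.

```python
from dataclasses import dataclass


@dataclass
class ItemState:
    """Hypothetical per-item bookkeeping; not a real Zabbix structure."""
    max_retries: int = 3     # assumed per-item setting (the "X" in the comment)
    failures: int = 0        # consecutive failed polls so far
    supported: bool = True


def record_poll(state: ItemState, ok: bool) -> ItemState:
    """Update the item after one poll taken at its normal interval."""
    if ok:
        # Any successful poll clears the failure streak.
        state.failures = 0
        state.supported = True
    else:
        state.failures += 1
        # Give up only once the retry budget is exhausted.
        if state.failures > state.max_retries:
            state.supported = False
    return state
```

With `max_retries=2`, the item would stay supported through the first failure and two retries, and would only be marked unsupported on the third consecutive failed retry.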
Comment by Ebonweaver [ 2013 Nov 25 ] |
I'll add a me too here. I have an increasing number of things I can't monitor for no good reason: after months of use and no changes, Zabbix has simply decided that something is not supported. 50% of my webservers suddenly have no network traffic data because the item is not supported, and yet it obviously is supported, since it was working fine up until now and works fine on the other systems. My expectation would be that enabling it would force it out of the not supported status back to working, but it does not. This has happened with network, cpu, memory, etc. on various systems at various random times for no reason in every version of the software I have run since 2.0. |
Comment by Ebonweaver [ 2013 Nov 25 ] |
I would also add that priority of trivial is not accurate since this is crippling to the whole point of the software. After enough time I'll have to change to another product as this one will simply be unable to monitor anything. |
Comment by Marco Hofmann [ 2013 Nov 27 ] |
Hello, |
Comment by Bruno Galindro da Costa [ 2013 Dec 04 ] |
Agree. I need a solution for this too. When we click on the "not supported" message in the web frontend, Zabbix could enable the item again, not disable it. |
Comment by david fallin [ 2013 Dec 23 ] |
+1. For me, the problem was easily identified and I was able to just unlink the template and re-link it, problem solved. |
Comment by Christoph [ 2014 Jan 24 ] |
+1. This is very important for me as well. |
Comment by AndreaConsadori [ 2014 Feb 24 ] |
+1 |
Comment by Filipe Paternot [ 2014 Mar 06 ] |
Perhaps we need a better approach to the problem. I manage a pretty big setup (500k items) and the not supported item situation is good enough. It gives Zabbix resilience in difficult times, when scripts fail, timeouts happen, and other things we see on large networks occur. What I miss with this item status is the ability to manually recover a failed item. For numerous reasons I might want to force an item collection after an issue, so this feature would be good. Maybe if we disable it and then re-enable, or force-enable? |
Comment by richlv [ 2014 Mar 07 ] |
inability to re-enable unsupported items is a functional regression in 2.2, it was possible in 2.0 and earlier. unfortunately, nothing on the roadmap to improve this |
Comment by Lory Pamos [ 2014 Apr 03 ] |
Hey, @alexei! |
Comment by Marco Hofmann [ 2014 Apr 03 ] |
Hi @Lory Pamos, |
Comment by Michael Sphar [ 2014 Apr 17 ] |
Items going Not Supported for transient problems is an issue for me too. Too often there are several-minute gaps just because a particular item sometimes times out, often on a device where I have no ability to control or fix the cause of the sporadic timeouts. I'd rather not make ALL unsupported items refresh every 60s, as there are definitely valid cases where, if an item isn't working, I don't want to keep re-checking everything continuously. What I would really like is a way to override the behavior on a per-item basis: to be able to set an individual item to never go unsupported no matter the response. The new use case I just ran into is polling via SNMP for the health status of IBM IMMs (remote management devices). There are OIDs that don't exist when there is no error condition, but during error conditions instances are created which contain strings detailing the error condition. I'd like to be able to capture the error condition strings when there is an error, for alerting purposes, but if I try to poll those OIDs they go unsupported. So an error might occur, but it will be X minutes before the items get refreshed and tried again. |
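One way the per-item override requested above could be resolved, as a rough Python sketch. The `never_unsupported` flag and the `item_refresh` override are assumptions made for illustration; only the global "Refresh unsupported items" interval is mentioned in this thread as an existing setting.

```python
from typing import Optional


def retry_delay(item_interval: int,
                global_refresh: int,
                never_unsupported: bool = False,
                item_refresh: Optional[int] = None) -> int:
    """Seconds until a failed item is polled again.

    item_interval     -- the item's normal polling interval
    global_refresh    -- the global "Refresh unsupported items" interval
    never_unsupported -- hypothetical per-item flag: keep the normal schedule
    item_refresh      -- hypothetical per-item override of the global interval
    """
    if never_unsupported:
        # The item never leaves its normal schedule, whatever it returned.
        return item_interval
    if item_refresh is not None:
        # A per-item value takes precedence over the global setting.
        return item_refresh
    return global_refresh
```

Under these assumptions, an SNMP item polled every 30 seconds with `never_unsupported=True` would be retried after 30 seconds, while all other items would keep following the global interval.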
Comment by Chris Christensen [ 2014 May 30 ] |
Being able to override per item certainly seems like it could be helpful. (Michael Sphar++) Another idea would be to support a more sophisticated backoff (something like exponential backoff); where it could be set globally, but the first polling error checks again at double the defined polling interval, then double again, etc... It would also be nice to expose when the item is scheduled to go back into a supported state (although this could currently be calculated by the timestamp in the log+Admin config); something like the LLD rules (xref: |
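The exponential backoff suggested above might look roughly like this in Python. The function name and the `max_delay` cap are assumptions; the cap of 600 seconds simply mirrors the 10-minute delay mentioned elsewhere in this thread.

```python
def backoff_delay(interval: int, failures: int, max_delay: int = 600) -> int:
    """Seconds to wait before the next poll of an item.

    interval  -- the item's normal polling interval, in seconds
    failures  -- consecutive failed polls (0 means the item is healthy)
    max_delay -- hypothetical upper bound on the backoff, in seconds
    """
    if failures <= 0:
        return interval
    # Double the wait for every consecutive failure, up to the cap,
    # but never poll more often than the normal interval.
    return max(interval, min(interval * 2 ** failures, max_delay))
```

For a 60-second item this gives waits of 120 and 240 seconds after the first and second failure, then 480, and stays at 600 seconds from the fourth failure onward.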
Comment by Chris Christensen [ 2014 May 30 ] |
Yet another option would be to expose the "Receiving notification on unsupported items" feature (newly available in 2.2) as an item+trigger, so that the unsupported state can be alarmed/acted on like any other item. I mention this because today the effective workaround is to set up a trigger that fires when an item has no data (nodata) for a certain timeframe; this seems like a waste of timer resources, since it is already known that the item is not supported. |
Comment by Otheus [ 2015 Jun 01 ] |
1. This should not be marked "trivial". Should be a much higher status. 2. Rename feature as << Better user-management for "Not supported" status of items >>
|
Comment by Aleksandrs Saveljevs [ 2015 Jun 01 ] |
Regarding (3), information on why items are not supported by the agent is provided in GUI since Zabbix 2.4.0 (see |
Comment by David Ribeiro Lopes [ 2016 Mar 31 ] |
I believe this can be closed because there is now an option to "Refresh unsupported items" on the Administration > General > Other tab. You can define after how many seconds Zabbix should recheck an item once it has been set to unsupported. |
Comment by richlv [ 2016 Apr 01 ] |
that option has been there almost forever, and unfortunately it does not solve the problem this issue asks to deal with |
Comment by David Ribeiro Lopes [ 2016 Apr 01 ] |
The issue says the item becomes permanently disabled. It does not become permanently disabled: if it runs fine on the next "Refresh unsupported items" check, it becomes enabled again. At least this is what literally just happened to me. Does it work differently with external checks? |
Comment by richlv [ 2016 Apr 01 ] |
i guess "permanently" is either the wrong word or some misunderstanding - but the default delay of 10 minutes is not good either, especially as there is no way to force item polling anymore |
Comment by Jacek konieczny [ 2016 Oct 20 ] |
The way Zabbix handles 'unsupported' items is very annoying indeed. It might be that an item cannot be monitored for some reason, but when the reason is gone the monitoring should be resumed. |
Comment by Tony den Haan [ 2017 Jul 13 ] |
I think what we need is a "force retry" option, in template as well as host. Lowering the global retry interval might cause unwanted issues. |
Comment by Ben Hub [ 2019 Jun 30 ] |
It seems to me that "permanent" is a very good description of the problem. In my case, I have set Refresh unsupported items to 60s. I also had it at 5m and 10m before. But no matter how I configure it, the items marked as not supported don't get re-enabled. Even disabling and re-enabling the item does not help. Pressing check now does not help. Restarting the Zabbix Server does not help. I cannot get the item to work again. The affected item is configured in a template. This template is applied to my host. The item is an external check and calls a local sh script. Due to a misconfiguration, this script did not return valid values and the item ended up with the following message: Item preprocessing step #1 failed: cannot convert value "..." of type "string" from boolean format: invalid value format. I also saw this error in the server log file, but only once. |
Comment by Glebs Ivanovskis [ 2020 Jul 18 ] |
I guess |