[ZBX-7887] calculated item referencing an unsupported item says the item is not found Created: 2014 Feb 28 Updated: 2017 May 30 Resolved: 2014 Apr 25 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 2.2.2 |
Fix Version/s: | 2.2.4rc1, 2.3.0 |
Type: | Incident report | Priority: | Minor |
Reporter: | Aleksandrs Saveljevs | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 1 |
Labels: | calculateditems, notsupported | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Issue Links: |
|
Description |
Suppose we have a calculated item that references another item. Everything goes well until that another item becomes unsupported. In this case, the calculated item becomes unsupported, too, with the error message similar to "Cannot evaluate function [last()]: item [Zabbix server:trapper.item] not found". This is misleading, because the item exists and it used to work well. |
Comments |
Comment by Anton Samets [ 2014 Mar 01 ] |
+1, at 2.0 there is no such bug. for servers, which are in disabled state. Is it normal, that calculated items became unsupported on disabled host? |
Comment by Aleksandrs Saveljevs [ 2014 Mar 07 ] |
By itself, no item of any type should become unsupported if its host is disabled. However, note that disabling a host does not have an effect immediately - until server's configuration cache is updated, items for this host are still being processed. So it might be that your calculated item was still being processed after you disabled a host. However, since the calculated item probably references another item from the same host and these references are currently checked based on the data from the database directly (where the host is disabled, while in configuration cache it is still enabled), it might have been that the calculated item encountered an item from a disabled host and for this reason became unsupported. This is a bug and, ideally, this should be fixed, too. |
Comment by Igors Homjakovs (Inactive) [ 2014 Mar 10 ] |
Fixed in svn://svn.zabbix.com/branches/dev/ZBX-7887 |
Comment by Aleksandrs Saveljevs [ 2014 Mar 10 ] |
(1) The branch does not compile: checks_calculated.c: In function ‘calcitem_evaluate_expression’: checks_calculated.c:198:330: error: expected ‘)’ before numeric constant igorsh RESOLVED in r43328. asaveljevs CLOSED |
Comment by Aleksandrs Saveljevs [ 2014 Mar 11 ] |
(2) In general, the implemented approach does not really solve the problem. It does add a message for referencing not supported items, but the problem remains for disabled items. Also, it has a problem that if we reference an active item on a disabled host, we would now evaluate the calculated item, whereas we should not. There are two ways I see to fix it. The easier solution would be to somehow reuse the following statements in zbx_evaluate_item_functions(): if (ITEM_STATUS_DISABLED == item.status) { func->error = zbx_dsprintf(func->error, "Item disabled for function: {%s:%s.%s(%s)}", item.host_name, item.key, func->function, func->parameter); } else if (ITEM_STATUS_NOTSUPPORTED == item.status) { func->error = zbx_dsprintf(func->error, "Item not supported for function: {%s:%s.%s(%s)}", item.host_name, item.key, func->function, func->parameter); } else if (HOST_STATUS_NOT_MONITORED == host_status) { func->error = zbx_dsprintf(func->error, "Host disabled for function: {%s:%s.%s(%s)}", item.host_name, item.key, func->function, func->parameter); } These seem to perform all the necessary checks for the purposes of calculated items. Another approach, deemed better after discussion with sasha, is to gather information about referenced items from configuration cache. Function DCconfig_get_items_by_keys() might be helpful in this regard. However, we do not keep information about disabled items or disabled hosts there, so it will not be possible to provide a detailed error message. A generic one like "item X is either disabled or belongs to a disabled host" would be OK. asaveljevs Let's see whether we can do the configuration cache solution. REOPENED. igorsh RESOLVED in r43767. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 27 ] |
(3) The following things are now unused and should be removed:
igorsh RESOLVED in r43798. asaveljevs function_t.found in checks_calculated.c is unused, too. REOPENED. igorsh RESOLVED in r44038. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 31 ] |
(4) You have split function DCconfig_get_items_by_keys() in two: one that searches based on "proxy_hostid, host" pair and another one which searches by just "host". The first utilizes index "config->hosts_ph", the second works with "config->hosts_h". Why don't we leave only the second index and, for the purposes of searching by "proxy_hostid, host", we just compare that "host->proxy_hostid" in the found host matches our search criteria? asaveljevs In other words, I am concerned about duplicating code and increasing memory consumption just for this single purpose. The suggestion above does not seem to be as straightforward to do, though, because "config->hosts_ph" indexes not only by "proxy_hostid, host", but also by "status", which can be one of HOST_STATUS_MONITORED, HOST_STATUS_PROXY_ACTIVE, HOST_STATUS_PROXY_PASSIVE. In other words, pointers to proxies are also stored there and hosts and proxies can have the same name. asaveljevs Perhaps we could index by "host, status"? igorsh RESOLVED in r43952. asaveljevs Diff: Index: src/libs/zbxdbhigh/proxy.c =================================================================== --- src/libs/zbxdbhigh/proxy.c (revision 43951) +++ src/libs/zbxdbhigh/proxy.c (revision 43952) @@ -2055,10 +2055,13 @@ keys[i].key = values[i].key; } - DCconfig_get_items_by_keys_ph(items, proxy_hostid, keys, errcodes, values_num); + DCconfig_get_items_by_keys(items, keys, errcodes, values_num); for (i = 0; i < values_num; i++) { + if (proxy_hostid != items[i].host.proxy_hostid) + continue; + if (SUCCEED != errcodes[i]) continue; asaveljevs It seems to make sense to check for "proxy_hostid" after "errcodes[i]" is known to be successful. REOPENED. igorsh RESOLVED in r44039. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 31 ] |
(5) There seems to be a problem with evaluate_macro_function() function. Previously, it would evaluate functions for disabled hosts and disabled items. Now, such functions would not evaluate. Is this good? asaveljevs If it is good, then string duplication is not needed: keys.host = zbx_strdup(NULL, host); keys.key = zbx_strdup(NULL, key); asaveljevs Simple pointer assignment is enough. asaveljevs Variables "items", "keys" and "errcodes" should be named in singular form there. igorsh RESOLVED in r43979. asaveljevs Since we are no longer doing zbx_strdup(), zbx_free() is not necessary either. Please see r44021 where that is fixed. asaveljevs The reason I did the following in dbconfig.c was as a result of discussion with sasha. We decided that we can assume that a host with a given ID does not become a proxy later, or vice versa: - if (0 == found || 0 != strcmp(host->host, row[2])) + if (HOST_STATUS_MONITORED == status && (0 == found || 0 != strcmp(host->host, row[2]))) asaveljevs RESOLVED. igorsh CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 31 ] |
(6) In function zbx_evaluate_item_functions() error message "Cannot evaluate function [%s(%s)]: item [%s] not found" might be confusing, same as the original issue we are solving. Might be better to say something like "Cannot evaluate function [%s(%s)]: item [%s] not found in configuration cache" or otherwise clarify the real reason. asaveljevs Also, it seems we cannot reference "items[i].key_orig" there, because item was not found. igorsh RESOLVED in r43981. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 31 ] |
(7) This is similar to (6), but in function calcitem_evaluate_expression(): if (SUCCEED != errcodes[i]) { zbx_snprintf(error, max_error_len, "Cannot evaluate function [%s(%s)]" ": item [%s:%s] not found", f->func, f->params, f->host, f->key); ret = FAIL; break; } asaveljevs Also, why do we set "ret" to FAIL instead of NOTSUPPORTED? igorsh RESOLVED in r43981. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Mar 31 ] |
(8) Please take a look at stylistic fixes in r43897. igorsh Thank you. Looks good. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Apr 24 ] |
(9) In evaluate_macro_function(), DCconfig_clean_items() should be called to clean up after DCconfig_get_items_by_keys(). asaveljevs Similarly, in calcitem_evaluate_expression(). asaveljevs Back to evaluate_macro_function(), error message "Function [%s:%s.%s(%s)] not found." might not be accurate anymore. Previously we queried the database, but now we ask the configuration cache. So it should be something like the following here, too: zbx_snprintf(error, max_error_len, "Cannot evaluate function [%s(%s)]:" " item [%s:%s] does not exist, is disabled or belongs to a disabled host", f->func, f->params, f->host, f->key); asaveljevs Similarly, in zbx_evaluate_item_functions(), DCconfig_clean_items() should be called after DCconfig_get_items_by_itemids(). igorsh RESOLVED in r44816. asaveljevs In evaluate_macro_function(), DCconfig_clean_items() was not called if item was not found. While currently cleaning up is not required in this case, in general it is a good idea to do it anyway, in case something changes in the future (we discussed this with sasha recently). asaveljevs Also, I changed the message back to approximately how it was in r44877. I figured that this is a debugging message and such a level of detail, as proposed earlier, is not necessary. RESOLVED. igorsh Thank you. Looks good. CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Apr 24 ] |
(10) Do we intend to solve asaveljevs On one hand, if we do that in that other issue, it would be good, because this is a significant amount of work and is better done separately. asaveljevs On the other hand, if we do that in that other issue, by merging this issue we introduce a regression into a stable branch until that issue is fixed. asaveljevs After discussion wish sasha, we decided to fix this issue in |
Comment by Aleksandrs Saveljevs [ 2014 Apr 24 ] |
(11) Defines ITEM_UNITS_LEN and ITEM_UNITS_LEN_MAX seem unused and should be removed. igorsh RESOLVED in r44782. asaveljevs CLOSED. |
Comment by Aleksandrs Saveljevs [ 2014 Apr 24 ] |
(12) Please take a look at r44773. It fixes a bit of style during code review. igorsh Thank you. Looks good. CLOSED. |
Comment by Igors Homjakovs (Inactive) [ 2014 Apr 28 ] |
Fixed in 2.2.4rc1 r44890 and 2.3.0 (trunk) r44896. |