[ZBX-8149] In DM master server is sending mails with *UNKNOWN* values Created: 2014 Apr 24 Updated: 2017 May 30 Resolved: 2015 Feb 02 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 2.2.2 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Major |
Reporter: | Karol Pucynski | Assignee: | Unassigned |
Resolution: | Won't fix | Votes: | 5 |
Labels: | dm, notifications | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: | zabbix_server_node_info.patch zabbix_server_node_info_v2.patch | ||||||||
Issue Links: |
|
Description |
We have configured DM. The slave server is sending e-mail alerts independently. Data and configuration are synchronized between nodes.
Mail from Master server: Event: Priv login failed: PROBLEM
Mail from Slave server: Status: PROBLEM Event ID: 2002000000015442 Alert Details: Last Item Value:
Where could the problem be? |
Comments |
Comment by richlv [ 2014 Apr 24 ] |
with nodes being removed in zabbix 2.4, this is unlikely to be looked into, sorry |
Comment by Karol Pucynski [ 2014 Apr 25 ] |
As far as I know Zabbix 2.2 is LTS release. |
Comment by Anton Samets [ 2014 Apr 25 ] |
Probably |
Comment by Karol Pucynski [ 2014 Apr 25 ] |
|
Comment by Aleksandrs Saveljevs [ 2014 Apr 25 ] |
This might be the same issue as |
Comment by Karol Pucynski [ 2014 Apr 28 ] |
The Zabbix frontend is showing data properly - it seems to be a bug only in the Zabbix server e-mail handling... |
Comment by Karol Pucynski [ 2014 Apr 28 ] |
It also affects version 2.2.3 |
Comment by Giovanni Lovato [ 2014 Jun 03 ] |
Will this be looked into? We just upgraded a complex distributed architecture with several levels of hierarchy and we will stick with 2.2 for a while since 2.4 and 2.6 won't support multi-level DM. |
Comment by Karol Pucynski [ 2014 Jul 03 ] |
Is there any info about this issue? I think many people are now stuck on 2.2.X since the next LTS release is far away... |
Comment by Christian Wolff [ 2014 Nov 26 ] |
After an update from 2.0.13 to 2.2.7 we seem to have the same issue with our node setup. Please have a look at it! Thanks! Please also add this bug to the known issues section: https://www.zabbix.com/documentation/2.2/manual/installation/known_issues |
Comment by Christian Wolff [ 2014 Nov 27 ] |
We did some further debugging and here is an example from one of our alarms. Maybe this helps, or someone has an idea about this one.
14937:20141127:112216.388 In substitute_simple_macros() data:'XXX : {HOST.HOST}: {TRIGGER.NAME}: {TRIGGER.STATUS} ({TRIGGER.SEVERITY}) = {ITEM.LASTVALUE}'
14937:20141127:112216.388 In DBget_trigger_value()
14937:20141127:112216.388 In get_N_itemid() expression:'{200200000007468}=0' N_functionid:1
14937:20141127:112216.388 In get_N_functionid() expression:'{200200000007468}=0' N_functionid:1
14937:20141127:112216.388 get_N_functionid() functionid:200200000007468
14937:20141127:112216.388 End of get_N_functionid():SUCCEED
14937:20141127:112216.388 End of get_N_itemid():FAIL
14937:20141127:112216.388 End of DBget_trigger_value():FAIL
14937:20141127:112216.388 cannot resolve macro '{HOST.HOST}'
14937:20141127:112216.388 In substitute_simple_macros() data:'Puppet Agent not running'
14937:20141127:112216.388 In DBitem_lastvalue()
14937:20141127:112216.388 In get_N_itemid() expression:'{200200000007468}=0' N_functionid:1
14937:20141127:112216.388 In get_N_functionid() expression:'{200200000007468}=0' N_functionid:1
14937:20141127:112216.388 get_N_functionid() functionid:200200000007468
14937:20141127:112216.388 End of get_N_functionid():SUCCEED
14937:20141127:112216.388 End of get_N_itemid():FAIL
14937:20141127:112216.388 End of DBitem_lastvalue():FAIL
14937:20141127:112216.388 cannot resolve macro '{ITEM.LASTVALUE}'
14937:20141127:112216.388 End substitute_simple_macros() data:'XXX: *UNKNOWN*: Puppet Agent not running: OK (Information) = *UNKNOWN*' |
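For anyone checking whether their own setup is hit by the same thing, the trace above can be scanned for the "cannot resolve macro" lines. This is only a rough sketch: the function name `unresolved_macros` is made up for illustration, and the sample text is a shortened copy of the log lines quoted in this comment.

```python
import re

# Shortened excerpt of the debug log quoted above.
SAMPLE_LOG = """\
14937:20141127:112216.388 End of get_N_itemid():FAIL
14937:20141127:112216.388 End of DBget_trigger_value():FAIL
14937:20141127:112216.388 cannot resolve macro '{HOST.HOST}'
14937:20141127:112216.388 End of DBitem_lastvalue():FAIL
14937:20141127:112216.388 cannot resolve macro '{ITEM.LASTVALUE}'
"""

# Matches the quoted macro name in a "cannot resolve macro" line.
MACRO_RE = re.compile(r"cannot resolve macro '(\{[^}]+\})'")


def unresolved_macros(log_text):
    """Return the macros reported as unresolvable, in log order (hypothetical helper)."""
    return MACRO_RE.findall(log_text)


macros = unresolved_macros(SAMPLE_LOG)
print(macros)
```

If the list is non-empty for alerts about items that belong to another node, the symptoms match what is reported here.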
Comment by Leo Antunes [ 2014 Dec 01 ] |
This seems to be caused by the configuration caching mechanism. We came up with the attached patch and it seems to solve the problem. A bit of background: during the generation of the alarm messages, the "config" structure is used to look for the macros in DBget_trigger_value(). This structure is however only populated with information concerning the current node. That means: if the node is generating alarms for remote nodes, it can't find information about host/item in the "config" structure, leading to the observed symptoms. |
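To make the explanation above a bit more concrete, here is a toy model of the failure mode. It is not Zabbix code and not the attached patch: `ConfigCache`, `expand` and the item dictionaries are invented for illustration, and the assumption is simply that the cache holds only items owned by the local node.

```python
UNKNOWN = "*UNKNOWN*"


class ConfigCache:
    """Toy stand-in for the "config" structure (hypothetical, not the real one)."""

    def __init__(self, local_node, items):
        # Assumption: only items owned by the local node end up in the cache.
        self.items = {i["itemid"]: i for i in items if i["node"] == local_node}

    def lastvalue(self, itemid):
        item = self.items.get(itemid)
        return item["lastvalue"] if item else None


def expand(template, cache, itemid):
    """Replace {ITEM.LASTVALUE}, falling back to *UNKNOWN* on a cache miss."""
    value = cache.lastvalue(itemid)
    return template.replace("{ITEM.LASTVALUE}", UNKNOWN if value is None else value)


ITEMS = [
    {"itemid": 1, "node": 1, "lastvalue": "1"},  # item owned by the master
    {"itemid": 2, "node": 2, "lastvalue": "0"},  # item owned by a child node
]

master = ConfigCache(local_node=1, items=ITEMS)
local_msg = expand("value = {ITEM.LASTVALUE}", master, 1)
remote_msg = expand("value = {ITEM.LASTVALUE}", master, 2)
print(local_msg)
print(remote_msg)
```

In this model the master resolves the macro for its own item, but an alert for the child node's item misses the cache and falls back to *UNKNOWN*, which is the pattern seen in the mails.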
Comment by Christian Wolff [ 2014 Dec 01 ] |
Thanks again for the good work Leo! |
Comment by Leo Antunes [ 2014 Dec 02 ] |
Guess I spoke too soon. Just updated the (still dirty) patch with a second try, but it seems the patch creates a new problem: the .nodata() trigger function started triggering on our patched server, even though data is coming in for the affected items. Unfortunately, we're considering whether to completely abandon the DM setup and just use proxies (or even independent servers), so I'm not sure I'll be able to dedicate any more time trying to ferret out this side-effect. |
Comment by richlv [ 2014 Dec 02 ] |
note that upgrade to 2.4 will automatically convert child nodes into proxies (as dm is removed in 2.4) |
Comment by Christian Wolff [ 2014 Dec 02 ] |
Is the historical data from the child nodes still accessible after an upgrade? |
Comment by richlv [ 2014 Dec 02 ] |
such discussion is out of scope for this issue. any documentation related to the dm removal task should be handled in |
Comment by richlv [ 2015 Feb 02 ] |
with nodes being removed since 2.4, this issue is unlikely to be looked into - closing |