[ZBXNEXT-2355] provide an ability to later understand why alerts were skipped during maintenance, why duplicated event created |
Created: | 2014 Jun 25 | Updated: | 2024 Mar 27 | Resolved: | 2016 Dec 20 |
Status: | Closed |
Project: | ZABBIX FEATURE REQUESTS |
Component/s: | Server (S) |
Affects Version/s: | 2.2.3, 2.3.2 |
Fix Version/s: | None |
Type: | Change Request | Priority: | Critical |
Reporter: | Oleksii Zagorskyi | Assignee: | Unassigned |
Resolution: | Duplicate | Votes: | 7 |
Labels: | maintenance, troubleshooting, usability | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Issue Links: |
|
Description |
(1) Currently there is no way at all to find out why particular events did not generate alerts. Suppose that a week after a host maintenance (note: my memory is limited to 3 days only) I investigate the host's events: I cannot tell why Zabbix did not send an alert for a particular problem event.
(2) To resolve this and give Zabbix users clear information, I suggest implementing a feature that helps explain why alerts were missing AND also provides a hint that the next identical OK|Problem event was generated by the timer according to internal Zabbix routines (the "generate_events" function). Where this could be stored, I don't know. The "events" table doesn't have suitable columns. |
Comments |
Comment by Oleksii Zagorskyi [ 2015 Dec 08 ] |
This issue is additionally driving Zabbix people crazy in ZBX-9432 |
Comment by Chris Christensen [ 2015 Dec 08 ] |
Agree ^ see the comment thread ~Dec 8th in ZBX-9432 for a case that shows duplicate OK events being generated around maintenance (and also no way to see in/out of maint status in Zabbix - all lookups were done from other system logs calling the maint API. Logging and/or UI improvements would definitely be helpful.) |
Comment by Oleksii Zagorskyi [ 2015 Dec 09 ] |
The case with 3 OK events (see ZBX-9432) should be additionally tested after development. |
Comment by richlv [ 2016 Jan 20 ] |
host going in/out of maintenance could be registered in internal events (but there's no way to nicely view those, asked in ZBXNEXT-2170), or maybe in the audit log (although audit log mostly deals with changes in the config, not runtime status like maintenance) |
Comment by Oleksii Zagorskyi [ 2016 Jan 20 ] |
Rich's idea is very good! As for the timestamp of the last host PROBLEM event (on top of events): what is more usable for a NOC, the timestamp of when a server went into shutdown (and didn't come back alive) or the timestamp of the maintenance end (the duplicated event)? Another related discussion is ZBXNEXT-2141. I think it should be widely discussed. |
Comment by Oleksii Zagorskyi [ 2016 Jan 27 ] |
Another issue where duplicated event after maintenance is misleading - |
Comment by richlv [ 2016 Apr 15 ] |
zalex_ua And it was indeed implemented that way. No more duplicated events after maintenance, starting from 3.2 |
Comment by Oleksii Zagorskyi [ 2016 Nov 09 ] |
Another use case - one wants to prepare some report (sort of "Triggers top 100" report) based on "events" table and would want to take maintenance periods (active in the past, with data collection) into account ... |
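The report use case above can be illustrated with a minimal sketch. It assumes event records and maintenance windows have already been exported into memory; the `Event` and `MaintenanceWindow` types and their fields are hypothetical and do not mirror the actual Zabbix database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical flattened event record, not the real "events" table layout.
    hostid: int
    clock: int  # Unix timestamp of the event
    name: str


@dataclass(frozen=True)
class MaintenanceWindow:
    # Hypothetical record of a past maintenance period for one host.
    hostid: int
    start: int  # inclusive Unix timestamp
    end: int    # exclusive Unix timestamp


def in_maintenance(event, windows):
    """Return True if the event falls inside any window for its host."""
    return any(
        w.hostid == event.hostid and w.start <= event.clock < w.end
        for w in windows
    )


def split_by_maintenance(events, windows):
    """Split events into (outside_maintenance, during_maintenance)."""
    outside, during = [], []
    for e in events:
        (during if in_maintenance(e, windows) else outside).append(e)
    return outside, during
```

A report could then count only the `outside` list, or show both lists side by side so that events raised during maintenance stay visible but are clearly marked.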
Comment by Oleksii Zagorskyi [ 2016 Dec 20 ] |
We found an older, similar request: ZBXNEXT-1150 |