Type: Incident report
Resolution: Won't fix
Priority: Major
Fix Version/s: None
Affects Version/s: 2.0.17, 2.2.12, 3.0.2
Component/s: Frontend (F), Proxy (P), Server (S)
Labels:

If you try to use actions for internal events, you will find them annoying, especially for triggers, and you will most likely stop using them.
After using them for some time I have a few comments to share; see below.

Currently, if some agent (zabbix/snmp/etc) is stopped, the host becomes unavailable after ~54 seconds ((15+timeout)*3), and all its triggers (except those with time-based functions) are marked as unknown with the error message "Agent is unavailable."
The function "update_triggers_status_to_unknown" is responsible for that. It also generates events and, if a corresponding action exists, alerts like "Unknown <Trigger name>".
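
The ~54 second figure follows from the formula quoted above. A minimal sketch of the arithmetic, assuming a timeout of 3 seconds (the value implied by the ~54 s result); the function and parameter names are hypothetical, not Zabbix identifiers:

```python
def seconds_until_unavailable(timeout: int, retry_delay: int = 15, attempts: int = 3) -> int:
    """Delay before a host is flagged unavailable, per (15 + timeout) * 3.

    retry_delay and attempts are illustrative names for the constants 15 and 3
    in the formula quoted in this report.
    """
    return (retry_delay + timeout) * attempts


print(seconds_until_unavailable(timeout=3))  # 54
```

A larger timeout only stretches the delay; it does not change the fact that every affected trigger then produces its own Unknown event.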

Suppose I'm a zabbix admin and control everything, so I've started to use internal events to send alerts.
I have zabbix agent host(s) with 5000 triggers (I have seen that on some installations) for items with update intervals of 60, 600, 3600 and 86400 seconds.
Someone stops the agent for 1 minute and I immediately get 5000 "Unknown <Trigger name>" alerts, which is not funny at all.

I don't know why I got these 5000 alerts, so after a minute I go to the frontend and see that those triggers have the "Agent is unavailable." error.
I had to visit the frontend because an alert cannot say why the trigger is unknown (~~ZBXNEXT-3140~~).
At this point I think: hmm, I already control how I judge that particular host's availability, with a dedicated trigger "agent.ping.nodata(5m)=1".
Then I start getting alerts like "Normal <Trigger name>", and I keep getting them for the next 24 hours, until all 5000 items have been polled again.
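
To see why the Normal alerts trickle in for up to 24 hours, here is a rough worst-case model. It assumes each trigger leaves Unknown only when its item is next polled, which can be up to one full update interval after the host is reachable again. The split of the 5000 triggers across intervals is an illustrative assumption, not measured data, and the function names are hypothetical:

```python
# Illustrative split of the 5000 triggers across the update intervals named
# in the report; the per-interval counts are assumptions, not measured data.
TRIGGERS_PER_INTERVAL = {60: 2000, 600: 1500, 3600: 1000, 86400: 500}


def recovered_by(elapsed_s: int, triggers_per_interval: dict) -> int:
    """Worst-case number of triggers back to Normal after elapsed_s seconds.

    A trigger counts as recovered once a full update interval of its item
    has passed since the host became reachable again.
    """
    return sum(count for interval, count in triggers_per_interval.items()
               if interval <= elapsed_s)


def worst_case_recovery_s(triggers_per_interval: dict) -> int:
    """Time until the slowest item has been polled again."""
    return max(triggers_per_interval)


for elapsed in (60, 600, 3600, 86400):
    print(elapsed, recovered_by(elapsed, TRIGGERS_PER_INTERVAL))
```

Under these assumptions the last "Normal <Trigger name>" alert can arrive a full day after a one-minute outage.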

Another aspect: what if another zabbix admin/user (who doesn't get the internal alerts) configures triggers for this host a few hours later?
Yes, he/she will notice that some triggers (for items with a long update interval) have a red X icon with the "Agent is unavailable." error while, at the same time, the host's ZBX icon is green!
That is misleading!

Also, what if I have other similar agents and receive Unknown alerts for them mixed with Normal alerts from the previous host?
Such a huge and mixed flow of alerts is impossible to track effectively, so I disable the action for internal trigger events altogether.
As a result, most people do not use the internal monitoring and implement their own solutions to monitor internal items and especially triggers.

One more detail if you use zabbix proxies:
If an agent is monitored by a proxy, only the host status is updated while its triggers stay Normal, which may mislead. Reported as ~~ZBX-10766~~.

While I can see some small sense in switching triggers to Unknown (a gray ? icon is visible in the Monitoring menu for Unknown triggers), it produces more problems than benefits, so I suggest reworking it.

Possible solutions, the better one first:

1. After long doubts, I suggest not switching triggers to Unknown (based on host availability) at all, because host availability is not an internal thing; it should be monitored with a regular item/trigger by regular zabbix users!

2. All of the below:

fix ~~ZBX-10766~~ for consistency;
add a new condition for internal actions that can filter out those "host availability" events (Unknown and Normal), for example as requested in ZBXNEXT-3273;
automatically switch Unknown triggers (with the error "Agent is unavailable.") back to Normal when the host becomes available, without waiting for the next item update interval.
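
The last two points of solution 2 can be sketched as a small model. This is not Zabbix code: the Trigger and InternalEvent classes, the cause field and both function names are hypothetical, and only the error text "Agent is unavailable." comes from this report.

```python
from dataclasses import dataclass
from typing import List

AGENT_UNAVAILABLE = "Agent is unavailable."  # error text quoted in this report


@dataclass
class Trigger:  # hypothetical model, not a Zabbix structure
    name: str
    state: str = "Normal"  # "Normal" or "Unknown"
    error: str = ""


@dataclass
class InternalEvent:  # hypothetical model of an internal trigger event
    trigger_name: str
    new_state: str
    cause: str  # error text that explains the state change


def restore_on_host_available(triggers: List[Trigger]) -> List[InternalEvent]:
    """Switch triggers that are Unknown only because the agent was
    unavailable back to Normal, without waiting for the next item poll."""
    events = []
    for trigger in triggers:
        if trigger.state == "Unknown" and trigger.error == AGENT_UNAVAILABLE:
            trigger.state, trigger.error = "Normal", ""
            events.append(InternalEvent(trigger.name, "Normal", AGENT_UNAVAILABLE))
    return events


def events_to_alert_on(events: List[InternalEvent]) -> List[InternalEvent]:
    """Action condition: drop events caused purely by host availability."""
    return [event for event in events if event.cause != AGENT_UNAVAILABLE]
```

Keeping the cause on the event is what lets one condition filter both the Unknown and the matching Normal event, even after the trigger's own error text has been cleared.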

For now I simply comment out:

DBbegin();
process_events();
DBcommit();

in the "update_triggers_status_to_unknown" function.

Assignee: Antons Sincovs
Reporter: Oleksii Zagorskyi
Votes: 0
Watchers: 2
Created: 2016 May 09 14:41
Updated: 2022 Oct 08 11:11
Resolved: 2019 Mar 26 08:50
