[ZBXNEXT-2452] "Multiple PROBLEM events generation" and Timer process
Created: 2014 Sep 09  Updated: 2025 Apr 21

Status: Open
Project: ZABBIX FEATURE REQUESTS
Component/s: Server (S)
Affects Version/s: 2.0.12, 2.2.6, 3.0.13, 3.2.10, 3.4.4, 4.0.32, 5.0.14, 5.4.3
Fix Version/s: None

Type: Change Request
Priority: Major
Reporter: Constantin Oshmyan
Assignee: Unassigned
Resolution: Unresolved
Votes: 43
Labels: multiple, timer, triggers, usability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Trigger with "Multiple PROBLEM events generation" in combination with timer-related functions: nodata(), date(), dayofmonth(), dayofweek(), time(), now()


Attachments: ZBXNEXT-2452-6.0.diff, ZBXNEXT-2452-6.4.diff, ZBXNEXT-2452-6.5.diff, image-2025-04-21-12-10-49-978.png
Issue Links:
Duplicate
is duplicated by ZBX-23796 Excsessive trigger event creation whi... Closed
Sub-task

 Description   

According to documentation:

If time-based functions (nodata(), date(), dayofmonth(), dayofweek(), time(), now()) are used in the expression, the trigger is recalculated every 30 seconds by a Zabbix timer process. If both time-based and non-time-based functions are used in an expression, it is recalculated when a new value is received and every 30 seconds.

This is good practice; however, there is a problem if a trigger has the "Multiple PROBLEM events generation" option turned ON. In this case the timer process can generate a "trigger goes to the PROBLEM state" event every 30 seconds (whenever all conditions are TRUE, even if no new data has arrived).

My suggestion is the following: all trigger calculations performed by the timer process should be made as if the "Multiple PROBLEM events generation" option were OFF, regardless of its actual setting. In other words: if all conditions are TRUE and the trigger is already in the PROBLEM state, no new event should be generated. At the same time, if any condition becomes FALSE while the trigger is in the PROBLEM state, the trigger should be closed (i.e. a "trigger goes to the OK state" event must be generated).
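The proposed rule can be sketched as a small decision function. This is only an illustration of the logic described above, not Zabbix server code; the names `TriggerValue`, `event_to_generate` and all parameters are hypothetical.

```python
from enum import Enum


class TriggerValue(Enum):
    OK = 0
    PROBLEM = 1


def event_to_generate(previous, expression_is_true,
                      multiple_problem_events, recalculated_by_timer):
    """Return "PROBLEM", "OK", or None when no event should be generated.

    Hypothetical model of the suggestion in this ticket, not the real
    Zabbix implementation.
    """
    # Proposed change: recalculations made by the timer process behave as if
    # "Multiple PROBLEM events generation" were OFF, whatever its real setting.
    if recalculated_by_timer:
        multiple_problem_events = False

    if expression_is_true:
        if previous is TriggerValue.OK or multiple_problem_events:
            return "PROBLEM"
        return None  # already in PROBLEM and repeated events are not allowed

    # Expression is FALSE: close the trigger if it was in the PROBLEM state.
    return "OK" if previous is TriggerValue.PROBLEM else None
```

With this model, a timer recalculation of a trigger that is already in PROBLEM produces no event, while a recalculation caused by a new value still produces a repeated PROBLEM event when the option is ON.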

The combination of timer-related trigger functions and the "Multiple PROBLEM events generation" option is not very widespread. However, when this combination is used, it causes problems that are difficult to understand. The most typical problem is double or multiple alerts (some examples: ZBX-8114, ZBX-4732, ZBX-6170).



 Comments   
Comment by Constantin Oshmyan [ 2014 Sep 09 ]

Example 1: Using "nodata()" function to close trigger automatically by time-out

Task

A trigger is needed for the Windows Application log: if a new message with the severity "Error" or "Critical" appears, it should be forwarded to the administrator by e-mail with a maximum delay of 1 minute.

Item (in the appropriate Template)

Key: eventlog[Application,,"Error|Critical",,,100,]
Type: "Zabbix agent (active)"
Type of information: "Log"
Update interval (sec) = 30

Trigger (in the same Template)

Name: New error message in Windows Application log
Expression:

({Template OS Windows:eventlog[Application,,"Error|Critical",,,100,].logseverity(0)}=4 | {Template OS Windows:eventlog[Application,,"Error|Critical",,,100,].logseverity(0)}=9 ) & {Template OS Windows:eventlog[Application,,"Error|Critical",,,100,].nodata(30)}#1

Multiple PROBLEM events generation: ON
Note 1: the "nodata(30)}#1" condition is used to close this trigger automatically (30 seconds is the minimum value for the "nodata()" function).
Note 2: the "Multiple PROBLEM events generation" option is necessary: otherwise some events can be missed if they arrive one after another.

Action

Action has the following logic:
IF
Trigger value = "PROBLEM"
and
Trigger name like "message in Windows"
THEN
Send message to users: Admin1, Admin2, ..., AdminN
ENDIF

Results

If several error messages appear within the given interval (30 seconds), all of them are delivered successfully. However, the last message is delivered twice: the first time when new data is received from the agent, and the second time when it is generated by the timer process. If the timeout for the nodata() function is longer, the last message is repeated every 30 seconds: for example, with a 10-minute timeout (so that an operator has a chance to see it in the web console) it is delivered 21 times.
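The counts above follow from simple arithmetic. A minimal sketch, assuming the timer recalculates the trigger exactly every 30 seconds and fires timeout/30 times before nodata() expires (the function and constant names are hypothetical):

```python
TIMER_INTERVAL_SECONDS = 30  # recalculation period quoted from the documentation


def deliveries_of_last_message(nodata_timeout_seconds,
                               timer_interval_seconds=TIMER_INTERVAL_SECONDS):
    """Estimate how many times the last log message is delivered."""
    # One PROBLEM event when the value arrives from the agent ...
    on_new_value = 1
    # ... plus one per timer recalculation while nodata() has not yet expired.
    timer_repeats = nodata_timeout_seconds // timer_interval_seconds
    return on_new_value + timer_repeats


print(deliveries_of_last_message(30))   # nodata(30)
print(deliveries_of_last_message(600))  # nodata(600), i.e. 10 minutes
```

For nodata(30) this gives the double delivery described above, and for a 10-minute timeout it gives 1 + 600/30 = 21 deliveries.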

Comment by Aleksandrs Saveljevs [ 2014 Sep 09 ]

The first and second item references in your trigger seem to be identical. You might wish to simplify that.

Comment by Constantin Oshmyan [ 2014 Sep 09 ]

asaveljevs, thank you! I've fixed this example (the logseverity values are different: ERROR and CRITICAL).

Example 2: using "time()" function as an additional condition clause

Initial Task

It is necessary to monitor the log file of some application for error messages (lines containing the "ERROR" string) in order to notify the appropriate administrator.

Item

log[/var/log/myApp/myApp.log,ERROR,,,skip,]

Trigger

{Host:log[/var/log/myApp/myApp.log,ERROR,,,skip,].str(ERROR)}=1

As in example 1, the "Multiple PROBLEM events generation" option should be enabled to avoid missing some messages.

Result

It works OK. However, every night (between midnight and 02:00) the database performs an offline backup, which causes some error messages in the log that can be ignored.
So, the administrator wants to exclude any error messages during this period.

Modified Task

It is necessary to monitor this log file for error messages only after 02:00 AM.

Modified Trigger

{Host:log[/var/log/myApp/myApp.log,ERROR,,,skip,].str(ERROR)}=1 & {Host:log[/var/log/myApp/myApp.log,ERROR,,,skip,].time(0)}>020000

Result

Despite the minimal change, the result is very different. If an error occurs in this log file after 02:00 AM, the "Trigger goes to PROBLEM state" event will be generated every 30 seconds by the timer process; in this example, for the rest of the day until midnight...
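To give a sense of scale, the number of repeated events can be estimated as follows. This is a rough sketch that assumes the trigger expression stays TRUE from the moment the error is seen until midnight and that the timer fires exactly every 30 seconds; the function name is hypothetical.

```python
from datetime import datetime, timedelta


def timer_events_until_midnight(error_seen_at, timer_interval_seconds=30):
    """Estimate the PROBLEM events generated by the timer before midnight."""
    next_midnight = datetime.combine(
        error_seen_at.date() + timedelta(days=1), datetime.min.time()
    )
    remaining_seconds = (next_midnight - error_seen_at).total_seconds()
    return int(remaining_seconds // timer_interval_seconds)


# Worst case for this example: an error right at 02:00 leaves 22 hours.
print(timer_events_until_midnight(datetime(2014, 9, 9, 2, 0, 0)))
```

Under these assumptions, a single error at 02:00 leads to 22 * 3600 / 30 = 2640 events by midnight.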

Comment by Oleg Ivanivskyi [ 2014 Sep 10 ]

Looks like a regex to find A and not B on a line could be a workaround in the example 2. Of course, it will not help in the first example.

Comment by Constantin Oshmyan [ 2015 Jan 26 ]

"Looks like a regex to find A and not B on a line could be a workaround in the example 2. Of course, it will not help in the first example."

I agree that in some cases it's possible. If the monitored log file includes a clearly formatted timestamp, for example the Zabbix server log:

  7294:20141227:100121.813 SNMP agent item "ifNumber" on host "CiscoV-SW1" failed: first network error, wait for 15 seconds

then the trigger expression could be reformulated to use regexp() instead of the time() function, something like the following:

{Host:log[/tmp/zabbix_server.log,error,,,skip,].str(error)}=1 & {Host:log[/tmp/zabbix_server.log,error,,,skip,].regexp([0-9]*:[0-9]{8}:0[01][0-9]{4}\.[0-9]{3})}#1

I.e. "if the timestamp in this record of log file has the hour==00 or hour==01, then ignore".
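The pattern itself can be checked outside Zabbix. The sketch below uses Python's `re` module, which is an assumption made for illustration only: Zabbix evaluates regexp() with its own regular expression engine, and the two lines with hours 00 and 02 are made-up samples in the same format as the real log line above.

```python
import re

# Same pattern as in the trigger expression: PID, date, then a time whose
# hour is 00 or 01, followed by milliseconds.
NIGHT_TIMESTAMP = re.compile(r"[0-9]*:[0-9]{8}:0[01][0-9]{4}\.[0-9]{3}")


def is_backup_window_record(line):
    """Return True if the record's timestamp has hour 00 or 01."""
    return NIGHT_TIMESTAMP.search(line) is not None


real_line = ('  7294:20141227:100121.813 SNMP agent item "ifNumber" on host '
             '"CiscoV-SW1" failed: first network error, wait for 15 seconds')
made_up_night_line = "  7294:20141227:003015.120 made-up error during backup"
made_up_early_line = "  7294:20141227:020000.000 made-up error after backup"

for line in (real_line, made_up_night_line, made_up_early_line):
    print(is_backup_window_record(line), line.strip()[:24])
```

Only the record stamped 00:30:15 is recognised as falling into the backup window; the records at 10:01:21 and 02:00:00 are not, so they would still raise the trigger.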

However, in other cases it is difficult or impossible to use just a regexp. For example, in the Windows Event logs the timestamp is a separate field; many Java applications have multi-line error messages (where the timestamp and the message text are on different lines); some applications write their timestamps in a format that can also occur in the message text, etc. After all, using the time() function is simply easier to understand.

Comment by Oleksii Zagorskyi [ 2016 Sep 19 ]

ZBXNEXT-1604 is related a bit

Comment by Constantin Oshmyan [ 2017 Dec 05 ]

Unfortunately, all new versions still have this problem.
Please add 3.0.x, 3.2.x and 3.4.x to the "Affects Version/s:" header (I'm the reporter of this issue, but I have no permission to do that).
Thanks in advance.

Comment by Victor [ 2018 Nov 01 ]

Agreed, this is a very useful feature! Voted.
If it is never going to be implemented, please provide a proper way to alarm on every occurrence of consecutive lines in a log file (with no duplicates).

Comment by Constantin Oshmyan [ 2019 Oct 14 ]

Just a reminder that this problem is still relevant. It still exists in v4.0.x (and, probably, 4.2 and 4.4 as well).

Comment by Constantin Oshmyan [ 2021 Aug 06 ]

Just one more reminder about the importance of this problem. It still exists in version 5.0 (LTS) and, probably, in 5.2 and 5.4 as well.
Also, novice users regularly run into this problem, and it is discussed on the Zabbix forum (at least the Russian-language one: link1, link2, link3, link4).

Comment by Constantin Oshmyan [ 2022 Jun 02 ]

Reminder: this problem is still relevant.
New LTS version 6.0.x is still affected.

Comment by Marcel Renner [ 2022 Jul 13 ]

+1 Voted as well! For example, we would simply like to have the successful and failed logins (from each login) as a separate info event in Zabbix. Therefore only multiple event generation is practicable to not miss any events. But the event should close automatically after X minutes. Due to the mentioned issue this can't be implemented, which makes eventlog, log, logrt and snmptrap quite useless (at least for simple info events that don't send a resolved notification). With this workaround there is probably a way, but Zabbix should offer something more user friendly.

FYI, 6.2.x is still affected.

Comment by Dimitri Bellini [ 2022 Sep 16 ]

Hi DevTeam,
as already mentioned in the thread, we still suffer from this "problem" on Zabbix 6.2.x.
Are there any ideas for fixing this kind of scenario in the upcoming 6.4 or 7.0?
Thanks so much

Comment by Constantin Oshmyan [ 2023 Oct 10 ]

alexei, just a reminder, as discussed at Zabbix Summit 2023.
Unfortunately, this problem is one of the most annoying and long-lived ones.

Comment by Vladislavs Sokurenko [ 2023 Oct 11 ]

Possible fix to explore as a starting point (not released yet):
ZBXNEXT-2452-6.0.diff
ZBXNEXT-2452-6.4.diff
ZBXNEXT-2452-6.5.diff

Comment by Constantin Oshmyan [ 2023 Oct 11 ]

vso, which version of Zabbix is your possible fix for, please?

Comment by Vladislavs Sokurenko [ 2023 Oct 11 ]

constantin.oshmyan, I attached patches for both 6.4 and 7.0; the fix is as per your request in the description.

Comment by Constantin Oshmyan [ 2023 Oct 11 ]

vso, great, thanks! Can this patch be applied also to the v6.0 (the current LTS version)?

Comment by dimir [ 2023 Oct 11 ]

It's only a quick research thing to try out, if you have a chance. For a proper thing this needs to be code-reviewed, tested, documented and so on.

Another thing that needs to be considered here is a possible regression for those who need an alarm every 30 seconds. Maybe a checkbox should be added. Anyway, we wanted to quickly check how complicated the fix might be here and, if you can try it, maybe become aware of possible side effects/regressions early.

Comment by Vladislavs Sokurenko [ 2023 Oct 11 ]

constantin.oshmyan sure, added a patch for 6.0: ZBXNEXT-2452-6.0.diff. It is safe to use unless, as dimir mentioned, someone also needs the old behaviour for some triggers.

Comment by Constantin Oshmyan [ 2023 Oct 11 ]

Thank you, guys!
I'll try it right after finding a workaround for ZBXNEXT-8014, which is a high priority for me at the moment...

Comment by Constantin Oshmyan [ 2023 Oct 20 ]

vso, dimir I've tried this patch in the test environment with the current v6.0.22 release; it works great!
Thank you very much!

"Another thing that needs to be considered here is a possible regression for those that need an every 30-second alarm."

Yes, I understand your doubts.
However, this regression is probably theoretical only.
I cannot imagine a real situation where such behaviour would be useful, especially taking into account that this interval (30 seconds) is not configurable by the user.
Additionally, I remember that the combination ("Multiple PROBLEM events generation" mode + time-based trigger functions) has never been supported; however, I cannot find confirmation of that in the documentation now (it is possible that this came up in our discussions with Zabbix tech. support).
In any case, such modification should be documented in Release Notes, of course.

Comment by dimir [ 2023 Oct 21 ]

Thanks constantin.oshmyan! What you're saying makes sense. We'll discuss it internally and let you know.

Comment by Constantin Oshmyan [ 2024 Mar 26 ]

"We'll discuss it internally and let you know."

dimir, what are the results of these discussions? Are there any chances this feature could be implemented in the v7.0 (nearest LTS version), or will it be postponed for a few more years?

Comment by Constantin Oshmyan [ 2024 May 03 ]

what are the results of these discussions? Are there any chances this feature could be implemented in the v7.0 (nearest LTS version), or will it be postponed for a few more years?

Comment by Constantin Oshmyan [ 2024 Oct 04 ]

Oh, we can celebrate the 10-year anniversary of this ticket!
As I understand from the proposed patches, technically this issue could be solved quite simply (several lines of code).
I'm still surprised that it is taking so long...

Comment by Chintan Jain [ 2025 Apr 21 ]

What is the solution for this? I am facing the same problem: once the alert is resolved, it sends multiple actions. Version 7.
