[ZBX-19981] Nested Trigger Functions May Not Always Work as Expected Created: 2021 Sep 19 Updated: 2022 Sep 27 Resolved: 2022 Sep 27 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 5.4.4 |
Fix Version/s: | None |
Type: | Problem report | Priority: | Trivial |
Reporter: | Ryan Eberly | Assignee: | Zabbix Support Team |
Resolution: | Cannot Reproduce | Votes: | 0 |
Labels: | functions, server, triggers | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Rocky Linux 8.4.2105, Zabbix Version 5.4.4 |
Attachments: |
(2 image attachments) |
Description |
When using the nested trigger functions sum() and trendcount() in a trigger expression, it appears that the time shift in the expression is ignored. This is my trigger expression:
sum(trendcount(/test/test.test,3h:now/h-3h))>0
Steps to reproduce:
I've set the timezone on my system and the Zabbix Frontend to both be on UTC time and the current time is 19:55 UTC (Zulu). This should mean that the trend values used for determining if the above trigger is a PROBLEM or OKAY would be from the 1400, 1500, and 1600 hours. Without a timeshift parameter in the expression, the last 3 hours of trends would be 1700, 1800 and 1900.
[root@LAPTOP ~]# date
When I run the following script, the trigger reports as okay in the web frontend, but the expression above should actually yield a result of "1":
[root@LAPTOP ~]# for min in `seq 0 5`; do c=$(date --date="2021-09-19T*14*:${min}0:00" +%s); echo test test.test $c 95; done > file.out; zabbix_sender -i file.out -T -z 127.0.0.1
However, when I change the timestamp generated in the above script to fall in the 1700, 1800 or 1900 hour, the trigger shows as a PROBLEM in the web frontend, which seems to indicate that the time shift is ignored in the trigger expression:
[root@LAPTOP ~]# for min in `seq 0 5`; do c=$(date --date="2021-09-19T*17*:${min}0:00" +%s); echo test test.test $c 95; done > file.out; zabbix_sender -i file.out -T -z 127.0.0.1
Is the timeshift being ignored or is there something else that is happening here?
|
Comments |
Comment by Ryan Eberly [ 2021 Sep 19 ] |
Sorry for the formatting issue in the original post. When I BOLDED the numbers 14 and 17 it added asterisks. If you attempt to reproduce the behavior don't include asterisks in the BASH (i.e. for min in `seq 0 5`; do c=$(date --date="2021-09-19T14:${min}0:00" +%s); echo test test.test $c 95; done > file.out; zabbix_sender -i file.out -T -z 127.0.0.1) |
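For anyone reproducing this, here is a minimal sketch of what the loop writes to file.out, with the hour as a parameter. It assumes GNU date and forces UTC with -u (the report states the system is on UTC). gen_lines is a hypothetical helper name used only for illustration; it is not part of Zabbix.

```shell
#!/usr/bin/env bash
# Sketch of the generator loop from the comment above.
# gen_lines is a hypothetical helper, not a Zabbix tool.
gen_lines() {
    local hour=$1   # two-digit hour of 2021-09-19, e.g. 14 or 17
    local min ts
    for min in 0 1 2 3 4 5; do
        # -u forces UTC so the epoch does not depend on the local timezone
        ts=$(date -u --date="2021-09-19T${hour}:${min}0:00" +%s)
        # Line layout used with zabbix_sender -T: host key timestamp value
        echo "test test.test ${ts} 95"
    done
}

# Six values at 14:00, 14:10, ..., 14:50 UTC
gen_lines 14
```

Redirecting the output to file.out and sending it with zabbix_sender -i file.out -T -z 127.0.0.1, as in the original command, should reproduce the first test; gen_lines 17 gives the second.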
Comment by Ryan Eberly [ 2021 Sep 20 ] |
I realize that sum of a count is redundant and isn't an appropriate use case but here are some other examples that yield the same result:
The triggers in the screenshot did not go into a PROBLEM state until I ran the last line of the next screenshot, which demonstrates that the time shift is not working in these cases:
|
Comment by Ryan Eberly [ 2021 Oct 23 ] |
After reviewing this in more depth, I'm wondering if I misinterpreted the manual, or if the manual's wording needs to be modified for additional clarity. The manual at https://www.zabbix.com/documentation/current/manual/config/triggers/expression#time_shift says that time shifts begin with "now", which represents the current time. But after reviewing the Zabbix server logs I noticed that for each received metric, in the evaluate_function2() function, the clock range in the database query is based on the clock value of the received metric and not actually "now". This is an important distinction.

For example, you can see below that the timestamp of the metric is 2021.10.22 15:05:18, which is an epoch of 1634915118, but the clock range used in the get_values functions shifts back from 1634915118 and not from "now":

28621:20211023:020051.512 In evaluate_function2() function:'avg(/test/thing.key.rate[test],6h:now-6h)' ts:'2021.10.22 15:05:18 000000000'

I took this one step further and redid the test with a trigger that looks at both 6h and 6h:now-6h. I first sent metrics with clock values from the last 6 hours, and then sent metrics with clock values from the 6 hours before that (between 12 and 6 hours ago), essentially creating a scenario where the metric values are received "out of order". As a result the trigger doesn't succeed, because the last value received was from about 7 hours ago.
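To make the distinction concrete, here is a rough sketch of the window that 6h:now-6h would cover if "now" is taken to be the metric's own timestamp, as the log line suggests. It assumes GNU date; the variable names and the fmt helper are illustrative only, and how the server treats the exact window boundaries is not something this sketch tries to model.

```shell
#!/usr/bin/env bash
# Sketch: window for "6h:now-6h" when "now" is the metric's clock value.
metric_ts=1634915118            # 2021.10.22 15:05:18 UTC, taken from the server log
period=$((6 * 3600))            # the "6h" period
shift_back=$((6 * 3600))        # the "-6h" time shift

window_end=$((metric_ts - shift_back))
window_start=$((window_end - period))

# fmt is a hypothetical helper: epoch seconds to a UTC timestamp (GNU date)
fmt() { date -u -d "@$1" '+%Y-%m-%d %H:%M:%S'; }

echo "metric clock: $(fmt "$metric_ts")"
echo "window start: $(fmt "$window_start")"
echo "window end:   $(fmt "$window_end")"
```

With the metric above, the window runs from 2021-10-22 03:05:18 to 2021-10-22 09:05:18 UTC, regardless of the wall-clock time at which the server processes the value.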
Is it possible to update the manual to clarify that "now" is actually the clock value of the current metric? Also, is it expected that metrics will always be received in time order by Zabbix server? We primarily use Trappers because of the nature of our architecture, where we have a lot of firewalls and secluded networks. As a result we have our own code that collects metrics and sends them back to a Zabbix server, where we have some more code to parse the files and push the metrics into Zabbix. It's certainly possible that metrics could be received out of time order. Is there any way to work around this scenario from a Zabbix standpoint? Could there be an option in the future to allow users to configure the definition of "now" from within the web frontend Administration as it pertains to trigger evaluation? Is there a scenario where this would not be a good option?
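One possible mitigation on the sending side, offered only as a sketch and not as a Zabbix feature: sort each input file by its timestamp column before handing it to zabbix_sender, so that within a batch the newest value is the last one sent. This assumes the four-column "host key timestamp value" layout with no spaces inside the host or key fields, and it does nothing for ordering across separate batches. sort_by_clock is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: order sender input lines by their clock value (third field).
# sort_by_clock is a hypothetical helper, not part of Zabbix.
sort_by_clock() {
    # -n numeric, -k3,3 third field only, -s keep input order for equal clocks
    sort -s -n -k3,3 "$@"
}

# Three sample lines arriving out of time order
printf '%s\n' \
    'test test.test 1634915118 95' \
    'test test.test 1634871918 95' \
    'test test.test 1634893518 95' | sort_by_clock
```

In a pipeline like the one described above, the sorted output would be written to the file that zabbix_sender -i reads with -T.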
|
Comment by Ryan Eberly [ 2022 Sep 27 ] |
Not an issue. Documentation is in the manual and I just missed it. |