Type: Incident report
Resolution: Unresolved
Priority: Trivial
Fix Version/s: None
Affects Version/s: 6.0.25
Component/s: Templates (T)
Labels: None

We are using the official template to monitor Nginx. https://www.zabbix.com/integrations/nginx#nginx_plus_http

It collects data using the Nginx API (https://nginx.org/en/docs/http/ngx_http_api_module.html) every minute.

In this case, the cumulative "received" value is preprocessed with "Change per second": Zabbix takes the difference between two consecutive polls (one minute apart) and divides it by the elapsed time, which converts a running total into a rate.

Here is the item that extracts the received value from the JSON that Nginx returns:

nginx.stream.server_zones.received.rate{#NAME}

Preprocessing

JSON Path: $.received
Change per second
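
As a rough illustration of the two preprocessing steps above, the sketch below extracts "received" from two API responses and applies the documented "Change per second" formula, (value - prev_value) / (time - prev_time). The payloads, timestamps, and the function name change_per_second are made up for the example; this is not the Zabbix implementation.

```python
import json


def change_per_second(prev_value, prev_clock, value, clock):
    """Rate between two samples: (value - prev_value) / (clock - prev_clock)."""
    elapsed = clock - prev_clock
    if elapsed <= 0 or value < prev_value:
        # No usable rate: identical timestamps or the counter went backwards.
        return None
    return (value - prev_value) / elapsed


# Hypothetical API payloads polled 60 seconds apart; only "received" is used.
sample_a = json.loads('{"received": 1000}')
sample_b = json.loads('{"received": 61000}')

rate = change_per_second(sample_a["received"], 0, sample_b["received"], 60)
print(rate)  # 1000.0
```

The result is only meaningful if the counter grows continuously between polls; a counter that jumps all at once shows up as a single large rate.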

We keep seeing values, calculated by Zabbix from the data returned by the Nginx API, that are impossible.

For example, these are some values that we collected back in December, when we opened a case with Nginx support:

2023-12-03 16:03:04	0
2023-12-03 16:02:04	0
2023-12-03 16:01:04	0
2023-12-03 16:00:04	973733642604 (almost 1 terabit/s)
2023-12-03 15:59:04	373300445226
2023-12-03 15:58:04	8216678288
2023-12-03 15:57:04	0
2023-12-03 15:56:04	0
2023-12-03 15:55:04	6217956416
2023-12-03 15:54:04	0
2023-12-03 15:53:04	0
2023-12-03 15:52:04	0
2023-12-03 15:51:04	0
2023-12-03 15:50:04	0
2023-12-03 15:49:04	5481749343
2023-12-03 15:48:04	263

Basically, the reported values are 0 for many minutes, and then there are huge spikes. We therefore suspected that the Nginx API does not update this "bytes_received" or "bytes_sent" value in real time, and instead only updates it when a connection is closed or on some similar event.

Nginx support confirmed our suspicion. Here is their reply:

Based on description we have now, our guess is that the Zabbix metric refers to the api "received" metric in the stream server zone, and the answer is that the metric is only updated when closing a connection, it is by design.
The "change per second" note from the Zabbix documentation and the fact that the "received" metric in the stream server zone is only updated when closing a connection, when put together, lead us to think that the spikes are possible, especially when closing a long-lived connection that existed for a long time (many seconds) but reported its received metric only upon closure => in that second you may observe a spike though there was no corresponding amount of traffic in that second.

For example, here is an access log record:
——

167.238.31.19 [27/Nov/2023:00:15:31 -0500] TCP 200 17091 1784907579 2502.670"10.66.116.120:80" "1784907579" "17091" "0.001"
——
Where, as per the access log format for the record, $bytes_received = 1784907579 and $session_time = 2502.670

So for the whole 2502 seconds period there will be only 1 metric point at the end with the value 1784907579.

Nginx confirmed here that "the metric is only updated when closing a connection".
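
To put numbers on the access log record quoted above, here is a small worked calculation. It assumes the whole byte total lands in a single 60-second poll (the update interval we use), which is a simplification of what Zabbix actually sees.

```python
bytes_received = 1784907579  # $bytes_received from the log record
session_time = 2502.670      # $session_time from the log record, in seconds
poll_interval = 60           # assumed Zabbix update interval, in seconds

# Rate that "Change per second" reports when the total arrives in one poll.
spike_rate = bytes_received / poll_interval

# Average rate if the bytes are attributed to the whole session instead.
session_rate = bytes_received / session_time

print(round(spike_rate))                    # bytes/s seen in the spike
print(round(session_rate))                  # bytes/s averaged over the session
print(round(spike_rate / session_rate, 1))  # 41.7
```

Under these assumptions the spike is roughly 29.7 MB/s, while the session averaged about 0.71 MB/s, so the reported rate overstates the real one by a factor of about 42 for that minute and understates it (as 0) for every other minute of the session.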

Deriving a transfer rate by averaging these values is never going to be really accurate, unfortunately.

The only simple option I can see, since we are relying on the Nginx API, is to increase the update interval (the default seems to be 1 minute) to a much longer period, such as one hour, so that each spike is averaged over a larger window. The alternative would be to somehow obtain the "session_time" and use it to divide the value more accurately.
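
As a sketch of the "session_time" idea, the function below spreads a session's byte total evenly over the minutes the session was open. It assumes the values come from the access log ($bytes_received, $session_time, and the close timestamp in seconds), and that throughput was uniform during the session, which may not be true. The name spread_over_session is hypothetical.

```python
from collections import defaultdict


def spread_over_session(close_ts, total_bytes, session_time, bucket=60):
    """Attribute total_bytes evenly across the buckets the session overlapped.

    close_ts and session_time are in seconds; returns {bucket_start: bytes}.
    """
    out = defaultdict(float)
    if session_time <= 0:
        # Zero-length session: everything belongs to the closing bucket.
        out[int(close_ts // bucket) * bucket] += total_bytes
        return dict(out)

    rate = total_bytes / session_time  # assumed constant bytes per second
    t = close_ts - session_time        # session start
    while t < close_ts:
        bucket_start = int(t // bucket) * bucket
        # Stop at the bucket boundary or at the close, whichever is first.
        seg_end = min(bucket_start + bucket, close_ts)
        out[bucket_start] += rate * (seg_end - t)
        t = seg_end
    return dict(out)


# A 120-second session that closed at t=120 and transferred 600 bytes.
print(spread_over_session(120, 600, 120))  # {0: 300.0, 60: 300.0}
```

Dividing each bucket's byte count by the bucket length would then give a per-minute rate without the artificial spike at connection close.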

Assignee: Facundo Vilarnovo

Reporter: Connor McBrine-Ellis

Votes: 0

Watchers: 1

Created: 2024 Jan 19 23:40

Updated: 2024 Jan 22 09:05
