[ZBX-9640] Incorrect timestamp using zabbix_sender result in case of slow network Created: 2015 Jun 15  Updated: 2017 May 30  Resolved: 2016 Feb 10

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 2.2.9, 2.4.5
Fix Version/s: 2.4.8rc1, 3.0.0rc1

Type: Incident report Priority: Blocker
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Fixed Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by ZBX-9939 wrong events (ok|problem) for item(tr... Closed

 Description   

Zabbix Server/proxy adds a few seconds in case of slow network between zabbix_sender and Zabbix proxy/server. It adds amount of seconds between start and finish data receiving:

#tc qdisc add dev lo root netem delay 750ms

# time echo "test_host test.trap 1426067013 test_value" | zabbix_sender -z localhost -T -i -
info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000177"
sent: 1; skipped: 0; total: 1

real	0m4.507s
user	0m0.000s
sys	0m0.000s

+--------+------------+------------------------------------------------------------------------------------------+-----------+
| itemid | clock      | value                                                                                    | ns        |
+--------+------------+------------------------------------------------------------------------------------------+-----------+
|  24018 | 1426067017 | 1434356187                                                                               |         0 |

So in final case it adds about 4 seconds, which was spent to send data before.



 Comments   
Comment by Andris Zeila [ 2016 Jan 20 ]

Zabbix adjusts timestamps of the received values by the difference of server and client (agent, sender, proxy) timestamps. To ensure correct adjustements those timestamps must be taken as close as possible - after connection has been estabilished (after connect() on client side and accept() on server side).

It appears only active proxy correctly adds timestamp taken after connection has been made. In all other places clients adds timestamp taken before connection and server compares it with timestamp taken after data has been received.

Comment by Andris Zeila [ 2016 Jan 21 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-9640

Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ]

(1) Should we document at least in the zabbix.org the meaning of the value timestamps?

Every value retrieved by the Zabbix system has a timestamp. The timestamp represents the time when the Zabbix system got the value.

Zabbix compnents can be run on separate machines with unsynchronized clocks. Each component runs in its own time. When the values are processed by the server their timestamps must be in the server's time. However initially:

  • when the active agent gets the values their timestamps are in active agent's time;
  • the timestamps for polled items (including passive agent, SNMP, JMX, IPMI, simple checks) are in server's or proxy's time (depending on which component polled the item);
  • the timestamps for SNMP trapper items are in server's or proxy's time (depending on which component received the trap);
  • the timestamps provided via the zabbix_sender should be in the correct time for the item (more on this later).

When the values are passed between components their timestamps must be adjusted in order to be in the time of the receiving component.

On the receiving component:

adjusted_value_timestamp = received_value_timestamp + adjustment

The adjustment value is calculated on the receiving component per each packet that contains values from another component. Each packet is transmitted in a separate TCP connection. These packets contain timestamp of when the packet was sent from the component (in the sending components time). If we ignore the network latency then the difference between receiving time and sending time is the difference between clocks on the machines running the components:

adjustment = incoming_connection_accepted_timestamp - received_packet_sending_timestamp

The adjustment calculation is slightly different when server gets values from passive proxy (it is polled by server):

adjustment = connection_to_proxy_completed_timestamp - received_packet_sending_timestamp

Note that as the value travels trough Zabbix system its timestamp can be adjusted twice. First, in proxy, when it is passed from active agent to proxy. Second, in server, when it is passed from proxy to server.

Summary

The complete rules are as follows:

Format:
  sender -> receiver: value_timestamp_formula; adjustment_timestamp_formula.

Rules:
  AA -> S :    TS_S  = TS_AA + ADJ;     ADJ = ACCPT_TS - PKT_TS.
  AA -> PP:    TS_PP = TS_AA + ADJ;     ADJ = ACCPT_TS - PKT_TS.
  AA -> AP:    TS_AP = TS_AA + ADJ;     ADJ = ACCPT_TS - PKT_TS.
  PP -> S :    TS_S  = TS_PP + ADJ;     ADJ = CONN_TS - PKT_TS.
  AP -> S :    TS_S  = TS_AP + ADJ;     ADJ = ACCPT_TS - PKT_TS.
  PA -> S :    TS_S  = VAL_RECVD_TS;    n/a.
  PA -> AP:    TS_AP = VAL_RECVD_TS;    n/a.
  PA -> PP:    TS_PP = VAL_RECVD_TS;    n/a.

Legend:
  TS_X        : value's timestamp in X's time
  ADJ         : adjustment
  ACCPT_TS    : timestamp (in receiver's time) after incoming connection was accepted on the receiver
  CONN_TS     : timestamp (in receiver's time) after outgoing connection was established from receiver to sender
  PKT_TS      : timestamp (in sender's time) just before sending the packet to the receiver (e.g. after connect() or accept() in case of PP)
  VAL_RECVD_TS: timestamp of the current time (in receiver's time) when the value was received

  AA: active agent
  PA: passive agent
  AP: active proxy
  PP: passive proxy
  S : server

zabbix_sender

Packet timestamp cannot be currently added when specifying host, item, value on the command line. In this case no adjustments will be made by the receiver.

With -T and input file (which can be stdin) the packet timestamp will be added and adjustments will be made in receiver. In this case one must be careful if providing values for multiple hosts (which might have unsynchronized clocks) within the same packet as the same adjustment will be made for all received items.

sandis.neilands Perhaps this is too much detail to add to the documentation. CLOSED.

Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ]

(2) Take a look at process_pinger_hosts(). Contrary to other polled items we record the timestamp of the pinged item before we get its value. Is this by design or by accident? Should we record the timestamp after getting the pinged item's value?

sandis.neilands After discussion with sasha we decided to leave it as is. CLOSED.

Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ]

(3) -T option of zabbix_sender must be included in the usage for single key/value use case and its help must be update in the program and manual page.

wiper it would be nice to add timestamp option for single key/value, but for now we will keep backwards compatibility by not sending 'packet' timestamp if we are sending single key/value.
RESOLVED in r58110.

sandis.neilands Now with -r the "clock" object is added twice.
REOPENED.

wiper RESOLVED in r58155

sandis.neilands CLOSED.

Comment by Sandis Neilands (Inactive) [ 2016 Feb 02 ]

Tested with * -> S, * -> AP -> S, * -> PP -> S (where * = AA | PA | sender). Introduced delays with tc command up to 1 second. In addition tested also with lost packets (iptables - example command below).

sudo iptables -A INPUT -p tcp --dport 10051 -m statistic --mode random --probability 0.3 -j DROP

Findings:

  • in my environment increasing delays to more than 1 seconds leads to dropped connections;
  • the timestamps can still be out of sync. In my tests - item's timestamp on server is up to two seconds after the actual item sending time in AA -> AP | PP -> S scenario with slow links between components.

Successfully tested.

Comment by Andris Zeila [ 2016 Feb 02 ]

Released in:

  • pre-2.4.8rc1 r58167
  • pre-3.0.0rc1 r58173
Comment by richlv [ 2016 Feb 02 ]

could you please reopen (1) ? it actually looks perfect for http://zabbix.org/wiki/Docs/Developer_documentation

wiper Actually I was still looking what and where could be documented.

Comment by Andris Zeila [ 2016 Feb 09 ]

Added additional information about time synchronisation to:

sasha Excellent! CLOSED

Comment by viktorkho [ 2016 Feb 18 ]

Is it possible to fix this bug in 2.2 LTS branch, where it was foutd?

Generated at Fri Apr 26 03:34:05 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.