[ZBX-9640] Incorrect timestamp using zabbix_sender result in case of slow network Created: 2015 Jun 15 Updated: 2017 May 30 Resolved: 2016 Feb 10 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Proxy (P), Server (S) |
Affects Version/s: | 2.2.9, 2.4.5 |
Fix Version/s: | 2.4.8rc1, 3.0.0rc1 |
Type: | Incident report | Priority: | Blocker |
Reporter: | Alexey Pustovalov | Assignee: | Unassigned |
Resolution: | Fixed | Votes: | 1 |
Labels: | None | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Issue Links: |
|
Description |
Zabbix Server/proxy adds a few seconds in case of slow network between zabbix_sender and Zabbix proxy/server. It adds amount of seconds between start and finish data receiving: #tc qdisc add dev lo root netem delay 750ms # time echo "test_host test.trap 1426067013 test_value" | zabbix_sender -z localhost -T -i - info from server: "processed: 1; failed: 0; total: 1; seconds spent: 0.000177" sent: 1; skipped: 0; total: 1 real 0m4.507s user 0m0.000s sys 0m0.000s +--------+------------+------------------------------------------------------------------------------------------+-----------+ | itemid | clock | value | ns | +--------+------------+------------------------------------------------------------------------------------------+-----------+ | 24018 | 1426067017 | 1434356187 | 0 | So in final case it adds about 4 seconds, which was spent to send data before. |
Comments |
Comment by Andris Zeila [ 2016 Jan 20 ] |
Zabbix adjusts timestamps of the received values by the difference of server and client (agent, sender, proxy) timestamps. To ensure correct adjustements those timestamps must be taken as close as possible - after connection has been estabilished (after connect() on client side and accept() on server side). It appears only active proxy correctly adds timestamp taken after connection has been made. In all other places clients adds timestamp taken before connection and server compares it with timestamp taken after data has been received. |
Comment by Andris Zeila [ 2016 Jan 21 ] |
Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-9640 |
Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ] |
(1) Should we document at least in the zabbix.org the meaning of the value timestamps? Every value retrieved by the Zabbix system has a timestamp. The timestamp represents the time when the Zabbix system got the value. Zabbix compnents can be run on separate machines with unsynchronized clocks. Each component runs in its own time. When the values are processed by the server their timestamps must be in the server's time. However initially:
When the values are passed between components their timestamps must be adjusted in order to be in the time of the receiving component. On the receiving component: adjusted_value_timestamp = received_value_timestamp + adjustment The adjustment value is calculated on the receiving component per each packet that contains values from another component. Each packet is transmitted in a separate TCP connection. These packets contain timestamp of when the packet was sent from the component (in the sending components time). If we ignore the network latency then the difference between receiving time and sending time is the difference between clocks on the machines running the components: adjustment = incoming_connection_accepted_timestamp - received_packet_sending_timestamp The adjustment calculation is slightly different when server gets values from passive proxy (it is polled by server): adjustment = connection_to_proxy_completed_timestamp - received_packet_sending_timestamp Note that as the value travels trough Zabbix system its timestamp can be adjusted twice. First, in proxy, when it is passed from active agent to proxy. Second, in server, when it is passed from proxy to server. Summary The complete rules are as follows: Format: sender -> receiver: value_timestamp_formula; adjustment_timestamp_formula. Rules: AA -> S : TS_S = TS_AA + ADJ; ADJ = ACCPT_TS - PKT_TS. AA -> PP: TS_PP = TS_AA + ADJ; ADJ = ACCPT_TS - PKT_TS. AA -> AP: TS_AP = TS_AA + ADJ; ADJ = ACCPT_TS - PKT_TS. PP -> S : TS_S = TS_PP + ADJ; ADJ = CONN_TS - PKT_TS. AP -> S : TS_S = TS_AP + ADJ; ADJ = ACCPT_TS - PKT_TS. PA -> S : TS_S = VAL_RECVD_TS; n/a. PA -> AP: TS_AP = VAL_RECVD_TS; n/a. PA -> PP: TS_PP = VAL_RECVD_TS; n/a. Legend: TS_X : value's timestamp in X's time ADJ : adjustment ACCPT_TS : timestamp (in receiver's time) after incoming connection was accepted on the receiver CONN_TS : timestamp (in receiver's time) after outgoing connection was established from receiver to sender PKT_TS : timestamp (in sender's time) just before sending the packet to the receiver (e.g. after connect() or accept() in case of PP) VAL_RECVD_TS: timestamp of the current time (in receiver's time) when the value was received AA: active agent PA: passive agent AP: active proxy PP: passive proxy S : server zabbix_sender Packet timestamp cannot be currently added when specifying host, item, value on the command line. In this case no adjustments will be made by the receiver. With -T and input file (which can be stdin) the packet timestamp will be added and adjustments will be made in receiver. In this case one must be careful if providing values for multiple hosts (which might have unsynchronized clocks) within the same packet as the same adjustment will be made for all received items. sandis.neilands Perhaps this is too much detail to add to the documentation. CLOSED. |
Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ] |
(2) Take a look at process_pinger_hosts(). Contrary to other polled items we record the timestamp of the pinged item before we get its value. Is this by design or by accident? Should we record the timestamp after getting the pinged item's value? sandis.neilands After discussion with sasha we decided to leave it as is. CLOSED. |
Comment by Sandis Neilands (Inactive) [ 2016 Jan 29 ] |
(3) -T option of zabbix_sender must be included in the usage for single key/value use case and its help must be update in the program and manual page. wiper it would be nice to add timestamp option for single key/value, but for now we will keep backwards compatibility by not sending 'packet' timestamp if we are sending single key/value. sandis.neilands Now with -r the "clock" object is added twice. wiper RESOLVED in r58155 sandis.neilands CLOSED. |
Comment by Sandis Neilands (Inactive) [ 2016 Feb 02 ] |
Tested with * -> S, * -> AP -> S, * -> PP -> S (where * = AA | PA | sender). Introduced delays with tc command up to 1 second. In addition tested also with lost packets (iptables - example command below). sudo iptables -A INPUT -p tcp --dport 10051 -m statistic --mode random --probability 0.3 -j DROP Findings:
Successfully tested. |
Comment by Andris Zeila [ 2016 Feb 02 ] |
Released in:
|
Comment by richlv [ 2016 Feb 02 ] |
could you please reopen (1) ? it actually looks perfect for http://zabbix.org/wiki/Docs/Developer_documentation wiper Actually I was still looking what and where could be documented. |
Comment by Andris Zeila [ 2016 Feb 09 ] |
Added additional information about time synchronisation to:
sasha Excellent! CLOSED |
Comment by viktorkho [ 2016 Feb 18 ] |
Is it possible to fix this bug in 2.2 LTS branch, where it was foutd? |