Type: Change Request
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 3.2.0alpha1
Component/s: Agent (G), Proxy (P), Server (S)
Labels:
- loss
- network
- performance
- tcp

This issue covers low-connection-number-high-data-throughput cases (namely sender -> server/proxy, active agent -> server/proxy, proxy -> server communications). Related issue for high-connection-number-low-data-throughput cases is ZBXNEXT-3214.

Initially, both agent and proxy were designed to send the same data repeatedly until they got a reply confirming that the data had reached its destination and been processed. Then ~~ZBX-2285~~ changed their behaviour to the opposite: they assumed that once data was written to the socket, the server would eventually receive and process it. Later (I couldn't find the exact point in time) the proxy's behaviour was switched back to prevent data loss, and now ZBX-10176 effectively requests to do the same for the agent.

Neither of these approaches is ideal. When the client relies fully on the TCP implementation, data may be lost if the server hits a timeout while reading data from the TCP buffer, or even earlier if something goes wrong at the TCP implementation level. When the client always waits for a reply from the server, it may hit a timeout on its own side if the server is busy. This leads to the client retransmitting data and the server reprocessing it, which makes it really complicated for Zabbix to recover from network downtimes.

Ideally client should send a short request message and wait for server to respond before sending data itself. Server response should contain an estimate of how much data it will be able to process. Client should send the desired amount of data and wait for server to confirm that data was successfully received and processed.
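The exchange proposed above could look roughly like the following in-memory sketch. It is an illustration only: the class and method names (`Server`, `Client`, `handle_request`, `handle_data`, `send_once`) and the fixed `capacity` are assumptions made for this example, and nothing here reflects the actual Zabbix protocol or code.

```python
from collections import deque


class Server:
    """Hypothetical receiver that advertises how much it can process."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed per-exchange processing estimate
        self.processed = []

    def handle_request(self, pending):
        # Step 1 reply: estimate how many values can be processed right now.
        return min(pending, self.capacity)

    def handle_data(self, values):
        # Step 2 reply: process the batch and confirm how many were accepted.
        self.processed.extend(values)
        return len(values)


class Client:
    """Hypothetical sender that keeps values until the server confirms them."""

    def __init__(self, values):
        self.buffer = deque(values)

    def send_once(self, server):
        # Short request first: announce how much data is waiting.
        quota = server.handle_request(len(self.buffer))
        # Send only as much as the server said it can take.
        batch = [self.buffer[i] for i in range(quota)]
        acked = server.handle_data(batch)
        # Drop values from the buffer only after they are confirmed.
        for _ in range(acked):
            self.buffer.popleft()
        return acked


if __name__ == "__main__":
    client = Client(range(10))
    server = Server(capacity=4)
    while client.buffer:
        client.send_once(server)
    print(server.processed)
```

Because the client removes values only after confirmation and never sends more than the advertised amount, a busy server can slow the client down by advertising a smaller amount instead of letting it time out and retransmit.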

This involves changes to Zabbix protocols. Since daemons will be spending more time communicating (or waiting for replies) timeouts will need some adjustment. This will negatively affect performance, so it's worth implementing a sort of connection manager at the same time.

Is duplicated by: ZBX-14341 Passively collected items may get many more values than one per check (Closed)

Assignee: Unassigned
Reporter: Glebs Ivanovskis (Inactive)
Votes: 1
Watchers: 1
Created: 2016 May 18 18:02
Updated: 2018 May 16 14:33
