[ZBX-10191] Passive agent fails to respond if key length is a multiple of ZBX_STAT_BUF_LEN Created: 2015 Dec 23  Updated: 2018 Jan 09  Resolved: 2018 Jan 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G), Proxy (P), Server (S)
Affects Version/s: 2.2.12rc1, 2.4.8rc1, 3.0.0alpha6
Fix Version/s: None

Type: Problem report Priority: Major
Reporter: Glebs Ivanovskis (Inactive) Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: agent, compatibility, hang, network, passive, protocols, timeout
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates ZBXNEXT-3581 Drop plain text protocol, make ZBXD\1... Closed
is duplicated by ZBX-13250 Connection timeout if received exactl... Closed

 Description   

According to https://www.zabbix.com/documentation/3.0/manual/appendix/items/activepassive, a server/proxy request for a passive item looks like:

<item key>\n

To keep newer servers/proxies compatible with older agent versions, passive item requests are not preceded by <HEADER> and <DATALEN>.

The agent reads data from the socket buffer in blocks into a static buffer. If a read operation leaves the static buffer only partially filled, the agent assumes the data has ended and stops reading. If the agent was able to fill the whole buffer, it assumes there is more data (which is not always true!) and continues reading.

When the agent wants more data from the socket but the server/proxy has already sent everything and is waiting for a response, nothing happens until the server/proxy wait is interrupted by a timeout.
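
The stall can be illustrated with a small simulation. This is only a sketch, not the agent's actual code: `STAT_BUF_LEN`, `WouldBlock`, `read_request` and `hangs` are hypothetical names, the buffer is shrunk to 8 bytes for readability, and a blocking read on an exhausted socket is modelled as an exception. The exact boundary in the real agent may be off by one from what is shown here.

```python
STAT_BUF_LEN = 8  # tiny stand-in for ZBX_STAT_BUF_LEN (assumption, for readability)


class WouldBlock(Exception):
    """Marks the point where a real blocking read would stall until timeout."""


def read_request(stream: bytes, buf_len: int = STAT_BUF_LEN) -> bytes:
    """Read `stream` block by block, stopping at the first partially filled block."""
    received = b""
    offset = 0
    while True:
        if offset >= len(stream):
            # The peer has sent everything and is already waiting for a reply.
            raise WouldBlock(f"stalled after {len(received)} bytes")
        block = stream[offset:offset + buf_len]
        offset += len(block)
        received += block
        if len(block) < buf_len:
            return received  # partial block: assume the data has ended


def hangs(key: str, buf_len: int = STAT_BUF_LEN) -> bool:
    """True if a plain-text request '<item key>\\n' would stall the reader."""
    try:
        read_request(key.encode() + b"\n", buf_len)
        return False
    except WouldBlock:
        return True


print(hangs("agent.ping"))  # request is 11 bytes: 8 + 3, last block is partial
print(hangs("abcdefg"))     # request is 8 bytes: exactly one full block
```

In this model only requests whose total length is an exact multiple of the block size stall, which is why adding or removing a single character from the key makes the problem disappear.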

This was discovered while working on ZBX-8914, which is likely to change this behaviour a bit, so the exact key length needed to reproduce the timeout error may be a multiple of ZBX_STAT_BUF_LEN, a multiple of (ZBX_STAT_BUF_LEN - 1), N * (ZBX_STAT_BUF_LEN - 1) - 1, or something like that.
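
Since the exact boundary is uncertain, a quick way to find keys worth testing is to check a key length against all three candidate patterns named above. The helper below is a hypothetical sketch; the value 2048 for ZBX_STAT_BUF_LEN is an assumption and should be checked against the agent source being tested.

```python
ASSUMED_STAT_BUF_LEN = 2048  # assumption: verify against the actual agent source


def matching_patterns(key: str, buf_len: int = ASSUMED_STAT_BUF_LEN) -> list:
    """Return the candidate boundary patterns that the key length matches."""
    n = len(key.encode())
    if n == 0:
        return []
    matches = []
    if n % buf_len == 0:
        matches.append("N * buf_len")
    if n % (buf_len - 1) == 0:
        matches.append("N * (buf_len - 1)")
    if (n + 1) % (buf_len - 1) == 0:
        matches.append("N * (buf_len - 1) - 1")
    return matches


# Keys of these lengths are candidates for reproducing the timeout.
for length in (2046, 2047, 2048):
    print(length, matching_patterns("k" * length))
```

A key that matches none of the patterns is not proven safe; the check only narrows down which lengths to try first.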

Ways to fix this problem:

  • user-level - change the length of the item key by adding an extra meaningless symbol;
  • code-level - change the length of the item key by adding an extra meaningless symbol;
  • admin-level - recompile the agent with a larger ZBX_STAT_BUF_LEN to reduce the probability of this issue occurring;
  • code-level - start all requests with a header (i.e. drop support for older agents) or find a way to distinguish "buffer is full and there is more data to read" from "buffer is full and there is no more data to read" (non-blocking calls and EWOULDBLOCK come to mind, but encryption complicates everything).
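
The header-based option removes the guesswork because the reader learns the payload length up front. The sketch below assumes the framing described in the Zabbix protocol documentation (the 5-byte `ZBXD\1` header followed by an 8-byte little-endian data length); `frame` and `read_framed` are hypothetical helper names, and the stream is modelled as an in-memory byte string rather than a socket.

```python
import struct

HEADER = b"ZBXD\x01"   # <HEADER>
PREFIX_LEN = len(HEADER) + 8  # header plus 8-byte <DATALEN>


def frame(payload: bytes) -> bytes:
    """Prefix the payload with <HEADER> and a little-endian 64-bit <DATALEN>."""
    return HEADER + struct.pack("<Q", len(payload)) + payload


def read_framed(stream: bytes) -> bytes:
    """Extract exactly <DATALEN> bytes; no 'is the buffer full?' heuristic needed."""
    if len(stream) < PREFIX_LEN or stream[:len(HEADER)] != HEADER:
        raise ValueError("missing or incomplete header")
    (length,) = struct.unpack("<Q", stream[len(HEADER):PREFIX_LEN])
    payload = stream[PREFIX_LEN:PREFIX_LEN + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


# A payload whose size is an exact multiple of any block size is no longer special:
request = frame(b"k" * 2048)
print(len(read_framed(request)))
```

With an explicit length the reader stops after the announced number of bytes regardless of how they line up with its internal buffer, at the cost of breaking compatibility with agents that expect the plain-text request.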


 Comments   
Comment by Alexander Vladishev [ 2018 Jan 09 ]

Closed as duplicate of ZBXNEXT-3581.
