According to https://www.zabbix.com/documentation/3.0/manual/appendix/items/activepassive, a server/proxy request for a passive item looks like:
<item key>\n
To keep newer servers/proxies compatible with older agent versions, passive item requests are not preceded by <HEADER> and <DATALEN>.
The agent reads data from the socket buffer in blocks into a static buffer. If a read operation leaves the static buffer only partially filled, the agent assumes the data has ended and stops reading. If the read filled the whole buffer, the agent assumes there is more data (which is not always true!) and continues reading.
When the agent wants more data from the socket but the server/proxy has already sent everything and is waiting for a response... nothing happens until the server/proxy's wait is interrupted by a timeout.
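The read heuristic and the resulting stall can be reproduced in a small simulation. This is only a sketch in Python, not the agent's actual C code: BUF_LEN is a tiny hypothetical stand-in for ZBX_STAT_BUF_LEN, read_request and try_request are made-up names, and a local socket pair plays the roles of server/proxy and agent. It also assumes the whole request is already in the socket buffer when the agent starts reading.

```python
import socket

BUF_LEN = 8  # hypothetical stand-in for ZBX_STAT_BUF_LEN (assumption, not the real value)


def read_request(sock, buf_len=BUF_LEN):
    """Read in blocks of buf_len; stop at the first partially filled block."""
    data = b""
    while True:
        chunk = sock.recv(buf_len)
        data += chunk
        if len(chunk) < buf_len:
            # Partial block (or EOF): assume the request has ended.
            return data
        # Full block: assume more data follows, which is not always true.


def try_request(request, timeout=0.2):
    """Send a request over a local socket pair; return what the 'agent' read,
    or None if it stalled waiting for data that never arrives."""
    server, agent = socket.socketpair()
    try:
        agent.settimeout(timeout)  # stands in for the timeout that ends the stall
        server.sendall(request)    # the sender gives everything, then waits
        try:
            return read_request(agent)
        except socket.timeout:
            return None
    finally:
        server.close()
        agent.close()


if __name__ == "__main__":
    print(try_request(b"agent.ping\n"))    # 11 bytes: 8 + 3, read completes
    print(try_request(b"k" * 7 + b"\n"))   # 8 bytes: exactly one full block, stalls
```

In this simplified model a request stalls whenever its length is an exact multiple of BUF_LEN; the exact condition in the real agent may differ.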
This was discovered while working on ZBX-8914, which is likely to change this behaviour a bit, so the exact key length needed to reproduce the timeout error may be a multiple of ZBX_STAT_BUF_LEN, or a multiple of (ZBX_STAT_BUF_LEN - 1), or N * (ZBX_STAT_BUF_LEN - 1) - 1, or something like that.
Ways to fix this problem:
- user-level - change the length of the item key by adding an extra meaningless symbol;
- admin-level - recompiling the agent with a larger ZBX_STAT_BUF_LEN will reduce the probability of this issue occurring;
- code-level - start all requests with a header (i.e. drop support for older agents) or think of a way to distinguish the situations "buffer is full and there is data to read" and "buffer is full and there is no more data to read" (non-blocking calls and EWOULDBLOCK come to mind, but encryption complicates everything).
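The non-blocking idea from the last item could look roughly like the sketch below. It is written in Python for brevity and is not a proposed patch: BUF_LEN and read_request_nonblocking are hypothetical names, and the approach assumes that any remaining request data has already arrived when the check is made. Data still in flight would be mistaken for the end of the request, and an encrypted connection would need a different check, so treat it as an illustration of the EWOULDBLOCK test only.

```python
import socket

BUF_LEN = 8  # hypothetical stand-in for ZBX_STAT_BUF_LEN (assumption)


def read_request_nonblocking(sock, buf_len=BUF_LEN):
    """Read in blocks; after a full block, check without blocking
    whether any more data is actually waiting in the socket."""
    data = b""
    while True:
        chunk = sock.recv(buf_len)  # ordinary blocking read
        data += chunk
        if len(chunk) < buf_len:
            return data  # partial block or EOF: request has ended

        # Full block: peek at the socket in non-blocking mode.
        old_timeout = sock.gettimeout()
        sock.setblocking(False)
        try:
            pending = sock.recv(1, socket.MSG_PEEK)
        except BlockingIOError:
            # EWOULDBLOCK/EAGAIN: buffer was full, but there is no more data.
            return data
        finally:
            sock.settimeout(old_timeout)  # restore the previous blocking mode
        if not pending:
            return data  # peer closed the connection
        # Otherwise more data is waiting; continue with the next block.
```

With this check, a request whose length is an exact multiple of BUF_LEN is returned immediately instead of waiting for a timeout.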