[ZBX-8048] Proxy truncates executable script. Created: 2014 Apr 08  Updated: 2017 May 30  Resolved: 2014 Jun 11

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 2.0.9
Fix Version/s: 2.3.0

Type: Incident report Priority: Major
Reporter: Juris Miščenko (Inactive) Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: dbpatches, proxy, trivial
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

The field length limit for 'Executable script' parameters of ssh, telnet and database monitoring items is set to 64kB on the server, but the proxy truncates this field to 2kB.



 Comments   
Comment by Juris Miščenko (Inactive) [ 2014 Apr 08 ]

Implemented in svn://svn.zabbix.com/branches/dev/ZBX-8048

Comment by Juris Miščenko (Inactive) [ 2014 Apr 08 ]

(1) Details regarding the size change and the issues with excessively large execute scripts should be documented accordingly.

jurism RESOLVED.

Comment by Alexander Vladishev [ 2014 May 19 ]

(2) Please review my changes in r45607.

jurism Changes reviewed. RESOLVED.

Comment by Juris Miščenko (Inactive) [ 2014 May 21 ]

Fix merged in 2.3.0 (trunk) r45693

Comment by Aleksandrs Saveljevs [ 2014 May 21 ]

(3) Unfortunately, I could not reproduce the behavior as mentioned in the documentation:

Although the 'Executed script' field is limited to 64kB in length, an excessively long script might cause the item to fail due to the unpredictable behavior of the receiving daemon. This affects the sending and receiving of the script data. Tests have shown that around the 4kB length scripts are relatively safe. Beyond that items become increasingly unstable to the point of being unusable. This is dependent on the daemon providing the service not Zabbix itself. Caution should be taken if the items are meant for providing consistent and reliable data.

I used the following script for a test:

echo 1
echo 2
echo 3
...
echo 2500

This is 23893 characters in length. It executes perfectly on SSH on Linux, FreeBSD 4.2, FreeBSD 7.3, FreeBSD 8.2 and Solaris 10. It also executes perfectly on Telnet on FreeBSD 4.2 and Solaris 10.
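
For reference, the test script above can be regenerated and its length checked with a short Python sketch. The function name build_echo_script is hypothetical, and the sketch assumes every line, including the last, ends with a newline, which is what yields the 23893 figure.

```python
def build_echo_script(count):
    """Return a script of 'echo 1' .. 'echo <count>', one command per line."""
    # Each line ends with a newline, including the last one.
    return "".join("echo %d\n" % i for i in range(1, count + 1))


script = build_echo_script(2500)
print(len(script))  # 23893
```

The result stays below the roughly 32 kB libssh2 packet limit mentioned in this comment, which is consistent with the script executing without problems.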

According to src/transport.c:774 in libssh2, there is indeed a limitation of around 32 kB on the maximum packet size, so it refuses to send more than that. However, I could not reproduce the behavior as described in the quote.

Could you please show examples of such unstable behavior over 4 kB?

jurism To reproduce this, I created an SSH agent item on a host and populated its executable script with the uname(1) command, giving it the 'print all information' flag. A buffer for this can be created with a quick Python one-liner from the shell, such as this:

python -c 's="uname -a;"; print(2**16 // 4 // len(s) * s)'

This will produce 16kB worth of data containing 1820 calls to uname(1). This caused various results, depending on the target sshd and operating system. For a more varied and error-prone approach, replace the `uname -a' call with a call to w(1). The size of this command's output depends on the count of active login records, so there is greater variety. Alternatively, you can simply cut the `uname' buffer in half each time, as I did, to get a rough checkpoint at regular intervals across the data size range.
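
The halving approach described here could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the 16 kB starting size comes from the one-liner, and the 1 kB lower bound is an arbitrary assumption.

```python
def build_payload(command, target_bytes):
    """Repeat `command` as many whole times as fit in `target_bytes`."""
    return command * (target_bytes // len(command))


def halving_payloads(command, start_bytes, min_bytes):
    """Yield (target size, payload) pairs, halving the target each step."""
    size = start_bytes
    while size >= min_bytes:
        yield size, build_payload(command, size)
        size //= 2


if __name__ == "__main__":
    # Assumption: start at 16 kB (2**16 // 4) and stop at 1 kB.
    for size, payload in halving_payloads("uname -a;", 2**16 // 4, 1024):
        print(size, len(payload), payload.count("uname -a;"))
```

The first payload has 1820 repetitions of the command, matching the count given above; each payload can then be pasted into the item's executable script field for one test run.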

These tests yield different results depending on the size of the data sent, but often we either cannot read the SSH reply, the read or write is interrupted by a syscall (SIGALRM timeout), or we cannot read the banner of the sshd service, among other failures.

asaveljevs I have tried executing the sequence of "uname -a" produced by your Python script. It works perfectly on OpenBSD 3.9 and OpenBSD 5.4 (the new ones), but fails due to timeout after 3 seconds on FreeBSD 4.2 and NetBSD 5.0 (the old ones). This is expected, because the last two machines are much slower, and is not strange behavior.

asaveljevs Please describe more test cases that I could try to see the strange behavior. REOPENED.

asaveljevs Based on what you showed me in the office, I understand that you throw a bunch of often-checked SSH items that stress the remote system so that it stops responding properly. However, this is expected and the same would happen with any server system. For instance, if you throw a bunch of items at a Zabbix agent that take a long time to process, it will also run out of listeners and will not respond for a period of time.

asaveljevs So my suggestion would be to either revert the documentation update that says that the problem only depends on the number of characters in the script, or prove that it is indeed true.

asaveljevs One thing that we could document, though, is that the size of the SSH script is limited to around 32 kB. The exact number has to be checked in the libssh2 sources, as noted above.
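
A pre-flight length check along these lines could flag scripts likely to hit that limit. This is a sketch under stated assumptions: the 32 kB value is the approximate figure from this thread, not a confirmed libssh2 constant, and all names are hypothetical.

```python
# Assumption: "around 32 kB" as discussed above; not an exact libssh2 value.
APPROX_SSH_SCRIPT_LIMIT = 32 * 1024


def script_size_bytes(script, encoding="utf-8"):
    """Return the size of the script in bytes once encoded."""
    return len(script.encode(encoding))


def may_exceed_ssh_limit(script, limit=APPROX_SSH_SCRIPT_LIMIT):
    """Return True if the encoded script is larger than the assumed limit."""
    return script_size_bytes(script) > limit
```

For example, the 16 kB "uname -a;" buffer would pass this check, while a script filling the 64kB field would not.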

jurism Updated documentation to reflect the issue in a more generic way and note the 32kB data length limit of ssh executable scripts. RESOLVED.

sasha I suggest not writing specific restrictions of the SSH library in our documentation, since they can change.

  • it must be written as warning with exclamation mark
  • "ssh library" => "libssh2 library"
  • 32kB => around 32kB

And please remove: "Also, processing time intense scripts should be used with caution as response time limits apply and this should be accounted for. Increasing the Timeout parameter in the server and/or agent configuration might mitigate instability in the data being returned.".

We do not recommend increasing this parameter, since it can reduce server performance.

jurism Documentation updated. RESOLVED.

sasha REOPENED. It should be removed for Telnet checks.

jurism Documentation updated. RESOLVED.

sasha CLOSED
