[ZBX-23404] vfs.file.get[] causes agent crash on HP-UX 11.23 Created: 2023 Sep 13  Updated: 2024 Apr 10  Resolved: 2023 Oct 26

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 6.0.22rc1, 7.0.0alpha5
Fix Version/s: 6.0.23rc1, 6.4.8rc1, 7.0.0alpha7, 7.0 (plan)

Type: Problem report Priority: Trivial
Reporter: Andris Mednis Assignee: Andris Mednis
Resolution: Fixed Votes: 0
Labels: agent, crash, hp-ux
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

HP-UX 11iv2 (11.23) on Itanium


Attachments: d.patch, vsnprintf_test.c, vsnprintf_test2.c, vsnprintf_test3.c, zbx_snprintf_alloc_zbx_strlog_alloc_tests2.patch.txt.gz
Team: Team C
Sprint: Sprint 105 (Oct 2023)
Story Points: 3

 Description   

Steps to reproduce:

  1. zabbix_agentd -t vfs.file.get[/etc/passwd]

Result:

$ sbin/zabbix_agentd -t vfs.file.get[/etc/passwd]                
zabbix_agentd [378]: ERROR: Got signal [signal:11(SIGSEGV),reason:2,refaddr:00000000]. Crashing ...
zabbix_agentd [378]: ERROR: ====== Fatal information: ======
zabbix_agentd [378]: ERROR: program counter not available for this architecture
zabbix_agentd [378]: ERROR: === Registers: ===
zabbix_agentd [378]: ERROR: register dump not available for this architecture
zabbix_agentd [378]: ERROR: backtrace is not available for this platform
zabbix_agentd [378]: ERROR: === Memory map: ===
zabbix_agentd [378]: ERROR: memory map not available for this platform
zabbix_agentd [378]: ERROR: ================================

Expected:
A valid result, not a crash.



 Comments   
Comment by Andris Mednis [ 2023 Sep 13 ]

The statement "Zabbix agent 5.0.38rc1 works as expected" is only partially correct: it merely does not crash on "zabbix_agentd -p".
The problem could be specific to HP-UX 11iv2 (11.23), as ZBX-20417 shows that vfs.file.get[] works normally on HP-UX 11iv3.

The crash is caused by the HP-UX vsnprintf(), which does not conform to standards.

The crash happens when Zabbix agent calls get_dir_names(), which calls canonicalize_path(), which calls zbx_snprintf_alloc(), which contains:

*alloc_len = vsnprintf(NULL, 0, fmt, args) + 2; /* '\0' + one byte to prevent the operation retry */

vsnprintf(NULL, 0, fmt, args) causes the crash (the call works as documented on other operating systems).
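
For illustration, the C99 length query that zbx_snprintf_alloc() relies on can be isolated as below. This is a minimal sketch, not the attached vsnprintf_test.c and not Zabbix code; the helper name c99_formatted_len() is hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper (not from the Zabbix sources): returns the number of */
/* bytes the formatted string needs, excluding the terminating '\0'.        */
/* It relies on the C99 rule that vsnprintf(NULL, 0, ...) is a valid length */
/* query; this is the call reported to crash on HP-UX 11.23.                */
static int	c99_formatted_len(const char *fmt, ...)
{
	va_list	args;
	int	len;

	va_start(args, fmt);
	len = vsnprintf(NULL, 0, fmt, args);
	va_end(args);

	return len;
}
```

On a C99-conforming libc, c99_formatted_len("%s", "/etc/passwd") yields 11, and the caller then allocates that many bytes plus room for the terminator.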

https://stackoverflow.com/questions/619497/heap-corruption-in-hp-ux provides some insight into the problem on HP-UX 11.11 and a possible workaround.

It is not clear whether the problem can be solved by applying vendor patches for HP-UX (I think a commercial support contract is required).

Comment by Andris Mednis [ 2023 Sep 29 ]

Attachment vsnprintf_test.c demonstrates the crash on vsnprintf(NULL, 0, ...).
Attachment vsnprintf_test2.c demonstrates what vsnprintf(buf, sizeof(buf), ...) returns depending on buffer size.
Attachment vsnprintf_test3.c demonstrates a possible workaround.
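
One general way to avoid the NULL/0 call is to format into a real buffer and grow it until the output fits. The sketch below shows that idea only; it is not the content of vsnprintf_test3.c and not necessarily the fix that was merged. The name format_alloc(), the initial size and the 1 MiB cap are assumptions chosen for illustration.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats into a malloc()ed buffer without ever calling */
/* vsnprintf(NULL, 0, ...). Returns the string (caller frees) or NULL.        */
static char	*format_alloc(const char *fmt, ...)
{
	size_t	size = 64;	/* arbitrary starting size (assumption) */
	char	*buf = NULL;

	for (;;)
	{
		va_list	args;
		char	*tmp;
		int	rv;

		if (NULL == (tmp = (char *)realloc(buf, size)))
		{
			free(buf);
			return NULL;
		}
		buf = tmp;

		va_start(args, fmt);
		rv = vsnprintf(buf, size, fmt, args);
		va_end(args);

		/* C99 signals truncation with rv >= size. Older implementations may */
		/* instead return a negative value or at most size - 1, so only a    */
		/* result strictly below size - 1 is accepted as "fits".             */
		if (0 <= rv && (size_t)rv < size - 1)
			return buf;

		if (size > ((size_t)1 << 20))	/* arbitrary cap, also ends real errors */
		{
			free(buf);
			return NULL;
		}

		size *= 2;
	}
}
```

The conservative "strictly below size - 1" test costs one extra retry when the output exactly fills the buffer, but it does not depend on which return-value convention the platform follows.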

Comment by dimir [ 2023 Oct 03 ]

I do not understand how it could be that the issue is not happening on 5.0.38rc1:

$ git describe --tags
5.0.38rc1
$ grep actually src/libs/zbxcommon/str.c -B7 -A3
void    zbx_snprintf_alloc(char **str, size_t *alloc_len, size_t *offset, const char *fmt, ...)
{
        va_list args;
        size_t  avail_len, written_len;
retry:
        if (NULL == *str)
        {
                /* zbx_vsnprintf() returns bytes actually written instead of bytes to write, */
                /* so we have to use the standard function                                   */
                va_start(args, fmt);
                *alloc_len = vsnprintf(NULL, 0, fmt, args) + 2; /* '\0' + one byte to prevent the operation retry */

And the master:

$ grep actually src/libs/zbxcommon/common_str.c -B11 -A2
void    zbx_snprintf_alloc(char **str, size_t *alloc_len, size_t *offset, const char *fmt, ...)
{
        va_list args;
        size_t  avail_len, written_len;
retry:
        if (NULL == *str)
        {
                int     rv;


                va_start(args, fmt);


                /* zbx_vsnprintf() returns bytes actually written instead of bytes to write, */
                /* so we have to use the standard function                                   */
                if (0 > (rv = zbx_vsnprintf_check_len(fmt, args))) 

To me there is basically no difference?

<andris> My statement about 5.0.38rc1 was based on the observation that 5.0.38rc1 does not crash on "zabbix_agentd -p". zbx_snprintf_alloc() in 5.0.38rc1 will crash on vsnprintf(NULL, 0, fmt, args).

<dimir> There. So, in which version did the problem appear? According to my calculations, it's this one, which was released in 2.1.0.

<andris> This can be an explanation:

$ man vsnprintf
...
STANDARDS
fprintf(), printf(), sprintf(), snprintf(), vprintf(), vfprintf(), vsprintf(), vsnprintf(): POSIX.1‐2001, POSIX.1‐2008, C99.
...
Concerning the return value of snprintf(), SUSv2 and C99 contradict each other: when snprintf() is called with size=0 then SUSv2 stipulates an unspecified return value less than 1, while C99 allows str to be NULL in this case, and gives the return value (as always) as the number of characters that would have been written in case the output string has been large enough. POSIX.1‐2001 and later align their specification of snprintf() with C99.

It looks like HP-UX 11.31 provides a C99-style vsnprintf(), whereas HP-UX 11.23 provides a kind of pre-C99-style vsnprintf().
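
The size=0 difference described in the man page excerpt can be probed at run time. The sketch below passes a real buffer instead of NULL; whether even this call is safe on HP-UX 11.23 is not established in this ticket, so treat it as an assumption. Both function names are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical probe: calls vsnprintf() with size 0 but a valid, non-NULL */
/* buffer pointer and returns whatever the implementation reports.         */
static int	probe_size0_rv(const char *fmt, ...)
{
	char	buf[1];
	va_list	args;
	int	rv;

	va_start(args, fmt);
	rv = vsnprintf(buf, 0, fmt, args);
	va_end(args);

	return rv;
}

/* Returns 1 if size=0 yields the would-be length (C99 behaviour), 0 if the */
/* result is anything else, e.g. the "less than 1" value allowed by SUSv2.  */
static int	vsnprintf_is_c99_style(void)
{
	return 10 == probe_size0_rv("%s", "0123456789");
}
```

A check like this could run once at startup or in a configure-time test to decide whether the length-query shortcut can be used at all.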

Comment by dimir [ 2023 Oct 03 ]

Actually, vfs.file.get does not even exist in 5.0. It was added in 6.0.

Comment by Andris Mednis [ 2023 Oct 10 ]

Available in versions:
