ZABBIX BUGS AND ISSUES
  1. ZABBIX BUGS AND ISSUES
  2. ZBX-10626

agent crashes in collector after proc.cpu.util[] is requested on Solaris 8

    Details

      Description

      Suppose we start the agent and request proc.cpu.util[]:

      $ bin/zabbix_get -s 127.0.0.1 -p 23050 -k proc.cpu.util[zabbix_agentd]
      zabbix_get [29276]: Check access restrictions in Zabbix agent configuration
      

      Investigation shows that the agent crashes with the following debug log:

       29078:20160405:225645.401 Requested [proc.cpu.util[zabbix_agentd]]
       29078:20160405:225645.403 In procstat_add()
       29078:20160405:225645.403 In zbx_dshm_realloc() shmid:-1 size:21704
       29078:20160405:225645.403 In procstat_copy_data()
       29078:20160405:225645.403 End of procstat_copy_data()
       29078:20160405:225645.403 End of zbx_dshm_realloc():SUCCEED shmid:605
       29078:20160405:225645.404 End of procstat_add()
       29077:20160405:225645.750 __zbx_zbx_setproctitle() title:'collector [processing data]'
       29077:20160405:225645.750 In update_cpustats()
       29077:20160405:225645.750 End of update_cpustats()
       ...
       29077:20160405:225645.766 DEBUG: util_local = ffbee948
       29077:20160405:225645.766 DEBUG: procstat_snapshot = 0
       29077:20160405:225645.767 DEBUG: procstat_snapshot_num = 0
       29077:20160405:225645.767 DEBUG: sizeof(zbx_procstat_util_t) = 32
       29077:20160405:225645.767 DEBUG: inside procstat_util_compare, u1 = ffbee948, u2 = 0
       29077:20160405:225645.767 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0]. Crashing ...
       29077:20160405:225645.767 ====== Fatal information: ======
       29077:20160405:225645.767 program counter not available for this architecture
       29077:20160405:225645.768 === Registers: ===
       29077:20160405:225645.768 register dump not available for this architecture
       29077:20160405:225645.768 === Backtrace: ===
       29077:20160405:225645.768 backtrace not available for this platform
       29077:20160405:225645.768 === Memory map: ===
       29077:20160405:225645.768 memory map not available for this platform
       29077:20160405:225645.769 ================================
       29076:20160405:225645.773 One child process died (PID:29077,exitcode/signal:1). Exiting ...
      

      The crash happens in procstat_calculate_cpu_util_for_queries() in the second call to bsearch:

      /* find the process utilization data in last snapshot */
      putil = (zbx_procstat_util_t *)bsearch(&util_local, procstat_snapshot, procstat_snapshot_num,
      		sizeof(zbx_procstat_util_t), procstat_util_compare);
      

      As can be seen in the debug log above, the "procstat_snapshot" variable is NULL. The following small program shows that, when the second argument to bsearch() is NULL on Solaris 8, then the comparison function is called with one of the arguments being NULL (which our procstat_util_compare() function does not handle and crashes):

      #include <stdio.h>
      #include <stdlib.h>
      
      static int	compare(const void *p1, const void *p2)
      {
      	printf("p1 = %p, p2 = %p\n", p1, p2);
      	return 0;
      }
      
      int	main()
      {
      	int	a;
      	void	*p;
      
      	p = bsearch(&a, NULL, 0, sizeof(a), compare);
      
      	printf("p = %p\n", p);
      
      	return 0;
      }
      
      $ gcc bsearch.c -o bsearch -Wall -Wextra
      $ ./bsearch
      p1 = ffbefb4c, p2 = 0
      p = 0
      

      Compiling the same program on Linux gives the following warning (note that the comparison function is not called at runtime):

      $ gcc bsearch.c -o bsearch -Wall -Wextra
      bsearch.c: In function 'main':
      bsearch.c:15:2: warning: null argument where non-null required (argument 2) [-Wnonnull]
        p = bsearch(&a, NULL, 0, sizeof(a), compare);
        ^
      $ ./bsearch 
      p = (nil)
      

        Activity

        Hide
        Aleksandrs Saveljevs added a comment -

        We should also check calls to bsearch() in other places.

        On Solaris 9, the comparison function does not seem to be called when the second argument to bsearch() is NULL.

        Show
        Aleksandrs Saveljevs added a comment - We should also check calls to bsearch() in other places. On Solaris 9, the comparison function does not seem to be called when the second argument to bsearch() is NULL.
        Hide
        Glebs Ivanovskis added a comment - - edited

        POSIX says:

        The bsearch() function shall search an array of nel objects, the initial element of which is pointed to by base, for an element that matches the object pointed to by key. The size of each element in the array is specified by width. If the nel argument has the value zero, the comparison function pointed to by compar shall not be called and no match shall be found.

        However, I wasn't able to check this part in pre-2001 editions. Solaris 8 came out in 2000 and may not comply to POSIX-2001 standard. Nevertheless, it should obey C99 and C89. Here is what they say about conparison function:

        The implementation shall ensure that the second argument of the comparison function (when called from bsearch), or both arguments (when called from qsort), are pointers to elements of the array.1) The first argument when called from bsearch shall equal key.

        1) That is, if the value passed is p, then the following expressions are always nonzero:

        ((char *)p - (char *)base) % size == 0
        (char *)p >= (char *)base
        (char *)p < (char *)base + nmemb * size
        
        Show
        Glebs Ivanovskis added a comment - - edited POSIX says : The bsearch() function shall search an array of nel objects, the initial element of which is pointed to by base , for an element that matches the object pointed to by key . The size of each element in the array is specified by width . If the nel argument has the value zero, the comparison function pointed to by compar shall not be called and no match shall be found. However, I wasn't able to check this part in pre-2001 editions. Solaris 8 came out in 2000 and may not comply to POSIX-2001 standard. Nevertheless, it should obey C99 and C89. Here is what they say about conparison function: The implementation shall ensure that the second argument of the comparison function (when called from bsearch ), or both arguments (when called from qsort ), are pointers to elements of the array. 1) The first argument when called from bsearch shall equal key . 1) That is, if the value passed is p , then the following expressions are always nonzero: (( char *)p - ( char *)base) % size == 0 ( char *)p >= ( char *)base ( char *)p < ( char *)base + nmemb * size
        Hide
        Glebs Ivanovskis added a comment -

        Fix for 3.0 is available in development branch svn://svn.zabbix.com/branches/dev/ZBX-10626 revision 59342. Simply defined zbx_bsearch() wrapper which will return NULL for arrays of zero elements as modern POSIX requests.

        Show
        Glebs Ivanovskis added a comment - Fix for 3.0 is available in development branch svn://svn.zabbix.com/branches/dev/ZBX-10626 revision 59342. Simply defined zbx_bsearch() wrapper which will return NULL for arrays of zero elements as modern POSIX requests.
        Hide
        Andris Zeila added a comment -

        Successfully tested

        Show
        Andris Zeila added a comment - Successfully tested
        Hide
        Glebs Ivanovskis added a comment -

        Fixed in pre-3.0.2rc1 r59358, pre-3.1.0 (trunk) r59359.

        Show
        Glebs Ivanovskis added a comment - Fixed in pre-3.0.2rc1 r59358, pre-3.1.0 (trunk) r59359.

          People

          • Assignee:
            Unassigned
            Reporter:
            Aleksandrs Saveljevs
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: