  1. ZABBIX BUGS AND ISSUES
  2. ZBX-21842

zabbix-server hangs up on internal java gateway monitoring


    • Problem report
    • Resolution: Unresolved
    • Trivial
    • None
    • 6.0.9
    • Server (S)
    • None

      Steps to reproduce:

      1. Set DebugLevel=4
      2. Add zabbix[java,,ping] and/or zabbix[java,,version] items
      3. Start zabbix-server

       

      Result:

      zabbix-server sometimes hangs.

      The output in zabbix_server.log stops after lines like these:

       

       20229:20221101:150551.868 In get_values()
       20229:20221101:150551.868 In DCconfig_get_poller_items() poller_type:5
       20229:20221101:150551.868 End of DCconfig_get_poller_items():1
       20229:20221101:150551.868 In substitute_key_macros_impl() data:'zabbix[java,,ping]'
       20229:20221101:150551.868 End of substitute_key_macros_impl():SUCCEED data:'zabbix[java,,ping]'
       20229:20221101:150551.868 In get_value() key:'zabbix[java,,ping]'

      After that, zabbix-server cannot be stopped with "systemctl stop zabbix-server".

       

       

      Expected:
      zabbix-server keeps running without hanging.

       

      Cause:

      I attached gdb to the hung process. The backtrace is:

       

      (gdb) bt
      #0  0x00007f15f3aa481d in __lll_lock_wait () from /lib64/libpthread.so.0
      #1  0x00007f15f3a9dac9 in pthread_mutex_lock () from /lib64/libpthread.so.0
      #2  0x000056096e046cab in __zbx_mutex_lock (filename=filename@entry=0x56096e15305a "log.c", line=line@entry=264,
          mutex=<optimized out>) at mutexs.c:441
      #3  0x000056096dfc7391 in lock_log () at log.c:264
      #4  0x000056096dfc8225 in __zbx_zabbix_log (level=1,
          fmt=0x56096e160cb0 "Got signal [signal:%d(%s),reason:%d,refaddr:%p]. Crashing ...") at log.c:434
      #5  0x000056096e03b70d in fatal_signal_handler (sig=<optimized out>, siginfo=<optimized out>, context=0x7ffc7a49e280)
          at sighandler.c:58
      #6  <signal handler called>
      #7  0x00007f15f194ec75 in __strlen_avx2 () from /lib64/libc.so.6
      #8  0x00007f15f18eac6f in vfprintf () from /lib64/libc.so.6
      #9  0x00007f15f19bf1ac in __vfprintf_chk () from /lib64/libc.so.6
      #10 0x000056096dfc8324 in vfprintf (__ap=0x7ffc7a49ee38, __fmt=0x56096e128af8 "In %s() jmx_endpoint:'%s' num:%d",
          __stream=0x56096f9ca520) at /usr/include/bits/stdio2.h:130
      #11 __zbx_zabbix_log (level=level@entry=4, fmt=fmt@entry=0x56096e128af8 "In %s() jmx_endpoint:'%s' num:%d")
          at log.c:459
      #12 0x000056096deebce1 in get_values_java (request=request@entry=0 '\000', items=items@entry=0x7ffc7a4b4a40,
          results=results@entry=0x7ffc7a4b2a40, errcodes=errcodes@entry=0x7ffc7a4b1634, num=num@entry=1) at checks_java.c:133
      #13 0x000056096deec264 in get_value_java (request=request@entry=0 '\000', item=item@entry=0x7ffc7a4b4a40,
          result=result@entry=0x7ffc7a4b2a40) at checks_java.c:121
      #14 0x000056096dee9df1 in get_value_internal (item=item@entry=0x7ffc7a4b4a40, result=result@entry=0x7ffc7a4b2a40)
          at checks_internal.c:420
      #15 0x000056096dee6da3 in get_value (add_results=<optimized out>, result=0x7ffc7a4b2a40, item=0x7ffc7a4b4a40)
          at poller.c:287
      #16 zbx_check_items (items=0x7ffc7a4b4a40, errcodes=0x7ffc7a4b2840, num=1, results=0x7ffc7a4b2a40,
          add_results=<optimized out>, poller_type=<optimized out>) at poller.c:704
      #17 0x000056096dee7343 in get_values (poller_type=poller_type@entry=5 '\005', nextcheck=nextcheck@entry=0x7ffc7a4bb9c0)
          at poller.c:807
      #18 0x000056096dee78dd in poller_thread (args=args@entry=0x7ffc7a4bbaa0) at poller.c:973
      #19 0x000056096e046f4b in zbx_thread_start (handler=0x56096dee7790 <poller_thread>, thread_args=0x7ffc7a4bbaa0,
          thread=0x56096f9f0940) at threads.c:124
      #20 0x000056096dece4a9 in server_startup (listen_sock=0x7ffc7a4bbc50, rtc=0x7ffc7a4bbb90,
          ha_failover=0x56096e4bc110 <ha_failover_delay>, ha_stat=0x56096e4bc114 <ha_status>) at server.c:1783
      #21 0x000056096decff6b in MAIN_ZABBIX_ENTRY (flags=flags@entry=0) at server.c:2111
      #22 0x000056096e03a875 in daemon_start (allow_root=<optimized out>, user=<optimized out>, flags=0) at daemon.c:463
      #23 0x000056096dec66bb in main (argc=3, argv=0x56096f9c0b40) at server.c:1149

      The SIGSEGV is triggered by this debug log call in get_values_java() in src/zabbix_server/poller/checks_java.c:

       

       

               zabbix_log(LOG_LEVEL_DEBUG, "In %s() jmx_endpoint:'%s' num:%d", __func__, items[0].jmx_endpoint, num);

      The SIGSEGV signal handler is then called and tries to lock the log file mutex, but the mutex is already held by the interrupted (non-signal-handler) code, which was in the middle of writing the debug message.

      As a result, the process deadlocks on itself.

      Looking at the code, I found that the "jmx_endpoint" member of the "items" variable is read without having been initialized when internal Java gateway monitoring items (zabbix[java,,ping] and zabbix[java,,version]) are processed.

      In addition, zabbix-server sometimes writes garbage to the log, for example:

       19878:20221101:150535.905 In get_values_java() jmx_endpoint:'<F3>^O^^<FA>H<8B>G^XH<8B>8<E9><A0><FF><FF><FF><F3>^O^^<FA>AWI<89><CF>AVA<89><D6>AUI<89><F5>ATUSH<89><FB>H<83><EC>xL<89>DdH<8B>^D%(' num:1

      This shows that "jmx_endpoint" is not initialized correctly.

       

      Fix:

      To fix the problem, the "item" variable should be zero-initialized before it is filled in:

      diff --git a/src/zabbix_server/poller/poller.c b/src/zabbix_server/poller/poller.c
      index 58338b6d91..9439348179 100644
      --- a/src/zabbix_server/poller/poller.c
      +++ b/src/zabbix_server/poller/poller.c
      @@ -791,6 +791,7 @@ static int  get_values(unsigned char poller_type, int *nextcheck)
      
              zabbix_log(LOG_LEVEL_DEBUG, "In %s()", __func__);
      
      +       memset(&item, 0, sizeof(item));
              items = &item;
              num = DCconfig_get_poller_items(poller_type, &items); 
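
The effect of the memset() can be illustrated outside of Zabbix. This is a sketch under assumptions: demo_item_t is a hypothetical, heavily simplified stand-in for the real item structure, and demo_fill_internal_item() only imitates a lookup that leaves "jmx_endpoint" untouched for internal items. The NULL check in demo_format_debug() is an extra illustration and is not part of the patch above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* hypothetical, simplified stand-in for the real item structure */
typedef struct
{
	char		key[64];
	const char	*jmx_endpoint;	/* only meaningful for JMX items */
}
demo_item_t;

/* imitates a config lookup that fills only the fields an internal item needs */
void	demo_fill_internal_item(demo_item_t *item)
{
	snprintf(item->key, sizeof(item->key), "%s", "zabbix[java,,ping]");
	/* jmx_endpoint is deliberately left untouched */
}

/* formats the debug line, substituting a placeholder for a NULL endpoint */
int	demo_format_debug(char *buf, size_t size, const demo_item_t *items, int num)
{
	const char	*endpoint = (NULL != items[0].jmx_endpoint ? items[0].jmx_endpoint : "(null)");

	return snprintf(buf, size, "In %s() jmx_endpoint:'%s' num:%d", "get_values_java", endpoint, num);
}

/* zero first, then fill, then log: the order the patch establishes */
int	demo_prepare_and_format(char *buf, size_t size)
{
	demo_item_t	item;

	assert(NULL != buf);

	/* without this line item.jmx_endpoint would hold whatever was on the stack */
	memset(&item, 0, sizeof(item));
	demo_fill_internal_item(&item);

	return demo_format_debug(buf, size, &item, 1);
}
```

With the memset() in place, the pointer is a well-defined NULL on common platforms rather than stack garbage, so the log call has something it can test for instead of dereferencing an arbitrary address.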

            zabbix.dev Zabbix Development Team
            kento.takahashi Kento Takahashi