[ZBX-13544] Zabbix Agent Crashes when using Regex with 'Log' item Created: 2018 Feb 27  Updated: 2024 Apr 10  Resolved: 2018 Mar 25

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Agent (G)
Affects Version/s: 3.4.7
Fix Version/s: 3.4.8rc1, 4.0.0alpha5, 4.0 (plan)

Type: Incident report Priority: Major
Reporter: Samarth Assignee: Michael Veksler
Resolution: Fixed Votes: 0
Labels: crash, logmonitoring, regex
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Mac OSX 10.13.3

Attachments: File extra_log.patch    
Team: Team C
Sprint: Sprint 28, Sprint 29, Sprint 30
Story Points: 1

 Description   

Steps to reproduce:
Configure the Log item to monitor a log file as follows:

log[/Library/FileMaker Server/Logs/Event.log,Warning]

The Zabbix Agent crashes every time this command is run.

Here is the crash log:

60483:20180226:145814.062 Starting Zabbix Agent [67x100.filemaker.com]. Zabbix 3.4.7 (revision 77720).
60483:20180226:145814.062 **** Enabled features ****
60483:20180226:145814.063 IPv6 support:          YES
60483:20180226:145814.063 TLS support:           YES
60483:20180226:145814.063 **************************
60483:20180226:145814.063 using configuration file: /usr/local/etc/zabbix/zabbix_agentd.conf
60483:20180226:145814.065 agent #0 started [main process]
60484:20180226:145814.065 agent #1 started [collector]
60485:20180226:145814.066 agent #2 started [listener #1]
60486:20180226:145814.066 agent #3 started [listener #2]
60487:20180226:145814.067 agent #4 started [listener #3]
60488:20180226:145814.067 agent #5 started [active checks #1]
60488:20180226:145814.073 no active checks on server [17.184.100.87:10051]: host [67x100.filemaker.com] not monitored
60488:20180226:150014.218 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x42]. Crashing ...
60488:20180226:150014.218 ====== Fatal information: ======
60488:20180226:150014.218 program counter not available for this architecture
60488:20180226:150014.219 === Registers: ===
60488:20180226:150014.219 register dump not available for this architecture
60488:20180226:150014.219 === Backtrace: ===
60488:20180226:150014.219 13: 0   zabbix_agentd                       0x0000000103b73794 zbx_log_fatal_info + 168
60488:20180226:150014.219 12: 1   zabbix_agentd                       0x0000000103b73bda fatal_signal_handler + 27
60488:20180226:150014.219 11: 2   libsystem_platform.dylib            0x00007fff68683f5a _sigtramp + 26
60488:20180226:150014.220 10: 3   libdyld.dylib                       0x00007fff68402392 dyld_stub_binder + 282
60488:20180226:150014.220 9: 4   zabbix_agentd                       0x0000000103b7000a regexp_sub + 117
60488:20180226:150014.220 8: 5   zabbix_agentd                       0x0000000103b706d6 regexp_match_ex_regsub + 38
60488:20180226:150014.220 7: 6   zabbix_agentd                       0x0000000103b70616 regexp_sub_ex + 578
60488:20180226:150014.220 6: 7   zabbix_agentd                       0x0000000103b62e46 process_logrt + 6565
60488:20180226:150014.220 5: 8   zabbix_agentd                       0x0000000103b5e28a active_checks_thread + 3278
60488:20180226:150014.221 4: 9   zabbix_agentd                       0x0000000103b72b46 zbx_thread_start + 32
60488:20180226:150014.221 3: 10  zabbix_agentd                       0x0000000103b649a0 MAIN_ZABBIX_ENTRY + 670
60488:20180226:150014.221 2: 11  zabbix_agentd                       0x0000000103b7304d daemon_start + 441
60488:20180226:150014.221 1: 12  zabbix_agentd                       0x0000000103b64f88 main + 869
60488:20180226:150014.221 0: 13  libdyld.dylib                       0x00007fff68402115 start + 1
60488:20180226:150014.221 === Memory map: ===
60488:20180226:150014.221 memory map not available for this platform
60488:20180226:150014.222 ================================
60483:20180226:150014.222 One child process died (PID:60488,exitcode/signal:1). Exiting ...
60483:20180226:150014.223 cannot remove shared memory for collector: [22] Invalid argument
60483:20180226:150014.223 Zabbix Agent stopped. Zabbix 3.4.7 (revision 77720).
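
Each agent log line above starts with the PID of the process that wrote it, followed by a date and a time with milliseconds. Below is a minimal C sketch, assuming that `PID:YYYYMMDD:HHMMSS.mmm message` layout, that splits a line into fields so the crashing PID (60488, the "active checks #1" process) can be picked out. `agent_log_line` and `parse_agent_log_line` are hypothetical names, not Zabbix functions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper types (not part of Zabbix): one parsed agent log line
 * of the assumed form "PID:YYYYMMDD:HHMMSS.mmm message". */
typedef struct
{
	long		pid;
	char		date[9];	/* YYYYMMDD + terminating NUL */
	char		time[11];	/* HHMMSS.mmm + terminating NUL */
	const char	*msg;		/* points into the original line */
}
agent_log_line;

/* Returns 0 on success, -1 if the line does not match the assumed layout
 * (for example a continuation line without a PID prefix). */
int	parse_agent_log_line(const char *line, agent_log_line *out)
{
	int	consumed = 0;

	assert(NULL != line && NULL != out);

	/* %n records how many characters were consumed, so the message can be
	 * returned as a pointer without copying it */
	if (3 != sscanf(line, "%ld:%8[0-9]:%10[0-9.] %n", &out->pid, out->date, out->time, &consumed))
		return -1;

	out->msg = line + consumed;

	return 0;
}
```

Comparing the parsed PID of the "Got signal" line with the PIDs in the "agent #N started" lines is one way to see which agent process received the SIGSEGV.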

Side Notes:

  • The crash only happens with log[], not with log.count[].
  • If the regex field is left empty, the crash doesn't happen and the item works as expected.
  • PCRE is installed on the Mac.


 Comments   
Comment by Samarth [ 2018 Feb 27 ]

Debug Level = 3

Comment by Viktors Tjarve [ 2018 Feb 27 ]

Hi Samarth,
Would you be willing to help us test this case with a patched version of Zabbix agent (with more log information around the potential place where the crash is happening)? So far I have not been able to reproduce this crash, and with the information we currently have it is not clear what could be causing it.

Also please check if you are using the correct version of PCRE library. Note that you need exactly PCRE (v8.x). PCRE2 (v10.x) library is not supported.

Comment by Samarth [ 2018 Feb 27 ]

Hi Viktors,
Sure, I can help you test the patch. I am using PCRE 8.41 installed using Homebrew. What debug level do you want me to set?

Comment by Glebs Ivanovskis (Inactive) [ 2018 Feb 27 ]

I did some searching and found this.

Can you show the output of the following command?

$ jtool -lazy_bind zabbix_agentd

Comment by Samarth [ 2018 Feb 28 ]

Hi Glebs,

Here is the symbol output:

lazy bind information:
dylib symbol
libSystem.B.dylib __NSGetArgv
libSystem.B.dylib ___assert_rtn
libSystem.B.dylib ___bzero
libSystem.B.dylib ___error
libSystem.B.dylib ___maskrune
libSystem.B.dylib ___memcpy_chk
libSystem.B.dylib ___memmove_chk
libSystem.B.dylib ___stack_chk_fail
libSystem.B.dylib ___tolower
libSystem.B.dylib ___toupper
libSystem.B.dylib ___vsnprintf_chk
libSystem.B.dylib __exit
libSystem.B.dylib _accept
libSystem.B.dylib _access
libSystem.B.dylib _alarm
libSystem.B.dylib _atexit
libSystem.B.dylib _atof
libSystem.B.dylib _atoi
libSystem.B.dylib _backtrace
libSystem.B.dylib _backtrace_symbols
libSystem.B.dylib _bind
libSystem.B.dylib _bsearch
libSystem.B.dylib _calloc
libSystem.B.dylib _chdir
libSystem.B.dylib _clock_gettime
libSystem.B.dylib _close
libSystem.B.dylib _closedir
libSystem.B.dylib _closelog
libSystem.B.dylib _connect
libSystem.B.dylib _dlclose
libSystem.B.dylib _dlerror
libSystem.B.dylib _dlopen
libSystem.B.dylib _dlsym
libSystem.B.dylib _dup
libSystem.B.dylib _dup2
libSystem.B.dylib _execl
libSystem.B.dylib _exit
libSystem.B.dylib _fclose
libSystem.B.dylib _fcntl
libSystem.B.dylib _fflush
libSystem.B.dylib _fgets
libSystem.B.dylib _fileno
libSystem.B.dylib _fopen
libSystem.B.dylib _fork
libSystem.B.dylib _fprintf
libSystem.B.dylib _fputc
libSystem.B.dylib _free
libSystem.B.dylib _fscanf
libSystem.B.dylib _gethostbyaddr
libSystem.B.dylib _gethostbyname
libSystem.B.dylib _getloadavg
libSystem.B.dylib _getmntinfo$INODE64
libSystem.B.dylib _getpeername
libSystem.B.dylib _getpid
libSystem.B.dylib _getprotobynumber
libSystem.B.dylib _getpwnam
libSystem.B.dylib _getservbyport
libSystem.B.dylib _gettimeofday
libSystem.B.dylib _getuid
libSystem.B.dylib _host_page_size
libSystem.B.dylib _host_statistics
libSystem.B.dylib _hstrerror
libiconv.2.dylib _iconv
libiconv.2.dylib _iconv_close
libiconv.2.dylib _iconv_open
libSystem.B.dylib _inet_addr
libSystem.B.dylib _inet_aton
libSystem.B.dylib _inet_ntoa
libSystem.B.dylib _initgroups
libSystem.B.dylib _kill
libSystem.B.dylib _listen
libSystem.B.dylib _localtime
libSystem.B.dylib _localtime_r
libSystem.B.dylib _lseek
libSystem.B.dylib _lstat$INODE64
libSystem.B.dylib _mach_host_self
libSystem.B.dylib _malloc
libSystem.B.dylib _memchr
libSystem.B.dylib _memcmp
libSystem.B.dylib _memcpy
libSystem.B.dylib _memmove
libSystem.B.dylib _memset
libSystem.B.dylib _mktime
libSystem.B.dylib _open
libSystem.B.dylib _opendir$INODE64
libSystem.B.dylib _openlog
libSystem.B.dylib _pipe
libSystem.B.dylib _printf
libSystem.B.dylib _putchar
libSystem.B.dylib _puts
libSystem.B.dylib _qsort
libSystem.B.dylib _read
libSystem.B.dylib _readdir$INODE64
libSystem.B.dylib _realloc
libSystem.B.dylib _recv
libSystem.B.dylib _recvfrom
libSystem.B.dylib _regcomp
libSystem.B.dylib _regerror
libSystem.B.dylib _regexec
libSystem.B.dylib _regfree
libSystem.B.dylib _remove
libSystem.B.dylib _rename
libresolv.9.dylib _res_9_dn_expand
libresolv.9.dylib _res_9_dn_skipname
libresolv.9.dylib _res_9_init
libresolv.9.dylib _res_9_mkquery
libresolv.9.dylib _res_9_send
libSystem.B.dylib _select$1050
libSystem.B.dylib _semctl
libSystem.B.dylib _semget
libSystem.B.dylib _semop
libSystem.B.dylib _sendto
libSystem.B.dylib _setegid
libSystem.B.dylib _seteuid
libSystem.B.dylib _setgid
libSystem.B.dylib _setpgid
libSystem.B.dylib _setsid
libSystem.B.dylib _setsockopt
libSystem.B.dylib _setuid
libSystem.B.dylib _shmat
libSystem.B.dylib _shmctl
libSystem.B.dylib _shmdt
libSystem.B.dylib _shmget
libSystem.B.dylib _shutdown
libSystem.B.dylib _sigaction
libSystem.B.dylib _signal
libSystem.B.dylib _sigprocmask
libSystem.B.dylib _sleep
libSystem.B.dylib _socket
libSystem.B.dylib _sscanf
libSystem.B.dylib _stat$INODE64
libSystem.B.dylib _statvfs
libSystem.B.dylib _strcasecmp
libSystem.B.dylib _strchr
libSystem.B.dylib _strcmp
libSystem.B.dylib _strdup
libSystem.B.dylib _strerror
libSystem.B.dylib _strlen
libSystem.B.dylib _strncmp
libSystem.B.dylib _strrchr
libSystem.B.dylib _strstr
libSystem.B.dylib _sysconf
libSystem.B.dylib _sysctl
libSystem.B.dylib _syslog$DARWIN_EXTSN
libSystem.B.dylib _time
libSystem.B.dylib _times
libSystem.B.dylib _umask
libSystem.B.dylib _uname
libSystem.B.dylib _unlink
libSystem.B.dylib _vfprintf
libSystem.B.dylib _wait
libSystem.B.dylib _waitpid
libSystem.B.dylib _write

Comment by Viktors Tjarve [ 2018 Feb 28 ]

Hi Samarth,
Please apply the patch that I have added in Attachments, recompile and run the agent again. You can leave DebugLevel=3. Then attach the log here. Thanks

Comment by Samarth [ 2018 Mar 01 ]

Hi Viktors,

The crash stopped. I am now getting a 'cannot compile regular expression' error, and the item status has changed to 'Not Supported' in the Zabbix server UI.

Here is the log:

47046:20180228:160210.519 **** Enabled features ****
47046:20180228:160210.519 IPv6 support: NO
47046:20180228:160210.519 TLS support: NO
47046:20180228:160210.519 **************************
47046:20180228:160210.519 using configuration file: /usr/local/etc/zabbix_agentd.conf
47046:20180228:160210.519 agent #0 started [main process]
47047:20180228:160210.520 agent #1 started [collector]
47048:20180228:160210.520 agent #2 started listener #1
47049:20180228:160210.521 agent #3 started listener #2
47050:20180228:160210.521 agent #4 started listener #3
47051:20180228:160210.521 agent #5 started active checks #1
47051:20180228:160410.672 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, case_sensitive = 1, output_template: , output:(null)
47051:20180228:160410.673 not a global regexp
47051:20180228:160410.673 In regexp_match_ex_regsub()
47051:20180228:160410.673 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, output_template: , flags: 2, out:(null)
47051:20180228:160410.673 flags |= REG_NOSUB
47051:20180228:160410.674 goto compile
47051:20180228:160410.674 regcomp() faild error:16, invalid argument to regex routine
47051:20180228:160410.674 End of regexp_sub() FAIL
47051:20180228:160410.674 End of regexp_match_ex_regsub(), ret = -1
47051:20180228:160410.674 End of regexp_sub_ex() ret: -1
47051:20180228:160440.708 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, case_sensitive = 1, output_template: , output:(null)
47051:20180228:160440.709 not a global regexp
47051:20180228:160440.709 In regexp_match_ex_regsub()
47051:20180228:160440.709 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, output_template: , flags: 2, out:(null)
47051:20180228:160440.710 flags |= REG_NOSUB
47051:20180228:160440.710 goto compile
47051:20180228:160440.710 regcomp() faild error:16, invalid argument to regex routine
47051:20180228:160440.710 End of regexp_sub() FAIL
47051:20180228:160440.710 End of regexp_match_ex_regsub(), ret = -1
47051:20180228:160440.710 End of regexp_sub_ex() ret: -1
47051:20180228:160510.744 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, case_sensitive = 1, output_template: , output:(null)
47051:20180228:160510.744 not a global regexp
47051:20180228:160510.745 In regexp_match_ex_regsub()
47051:20180228:160510.745 In regexp_sub_ex()string: 2018-02-08 15:53:05.073 -0800 Information 743 Starting FileMaker Server processes..., pattern: Warning, output_template: , flags: 2, out:(null)
47051:20180228:160510.745 flags |= REG_NOSUB
47051:20180228:160510.745 goto compile
47051:20180228:160510.745 regcomp() faild error:16, invalid argument to regex routine
47051:20180228:160510.745 End of regexp_sub() FAIL
47051:20180228:160510.746 End of regexp_match_ex_regsub(), ret = -1
47051:20180228:160510.746 End of regexp_sub_ex() ret: -1
47051:20180228:160510.746 active check "log[/Library/FileMaker Server/Logs/Event.log,Warning]" is not supported: cannot compile regular expression
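
The debug output above ends in `regcomp()` returning error 16 with the text "invalid argument to regex routine". For comparison, here is a minimal sketch of compiling and running the same kind of pattern through the plain POSIX API in `<regex.h>`, with the error text obtained from `regerror()`. This is an illustration only: `line_matches` is a hypothetical helper, the flag choice is an assumption, and it is not the agent's own code (per the comment above, the agent requires PCRE 8.x).

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper (not Zabbix code): test whether `line` matches
 * `pattern` using the POSIX regex API.
 * Returns 1 on match, 0 on no match, -1 if the pattern cannot be compiled;
 * in the last case a human-readable reason is written to `err`. */
int	line_matches(const char *line, const char *pattern, char *err, size_t err_len)
{
	regex_t	re;
	int	rc;

	/* REG_NOSUB: only a yes/no answer is needed, no capture groups */
	rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);

	if (0 != rc)
	{
		/* translate the numeric error code into text */
		regerror(rc, &re, err, err_len);
		return -1;
	}

	rc = regexec(&re, line, 0, NULL, 0);
	regfree(&re);

	/* any non-zero regexec() result is treated as "no match" in this sketch */
	return 0 == rc ? 1 : 0;
}
```

Called with the pattern `Warning` and the "Information 743" line from the log, this helper should report no match (0) rather than a compile error.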

Comment by Samarth [ 2018 Mar 01 ]

A side note:
With the patch, Zabbix agent crashes when using log.count[]:

Log:
48362:20180228:162135.174 TLS support: NO
48362:20180228:162135.174 **************************
48362:20180228:162135.174 using configuration file: /usr/local/etc/zabbix_agentd.conf
48362:20180228:162135.175 agent #0 started [main process]
48363:20180228:162135.175 agent #1 started [collector]
48364:20180228:162135.175 agent #2 started listener #1
48366:20180228:162135.176 agent #4 started listener #3
48365:20180228:162135.176 agent #3 started listener #2
48367:20180228:162135.176 agent #5 started active checks #1
48367:20180228:162335.316 Got signal [signal:11(SIGSEGV),reason:1,refaddr:0x0]. Crashing ...
48367:20180228:162335.317 ====== Fatal information: ======
48367:20180228:162335.317 program counter not available for this architecture
48367:20180228:162335.317 === Registers: ===
48367:20180228:162335.317 register dump not available for this architecture
48367:20180228:162335.317 === Backtrace: ===
48367:20180228:162335.318 11: 0 zabbix_agentd 0x00000001059ddc48 zbx_log_fatal_info + 168
48367:20180228:162335.318 10: 1 zabbix_agentd 0x00000001059de0bb fatal_signal_handler + 27
48367:20180228:162335.318 9: 2 libsystem_platform.dylib 0x00007fff59e52f5a _sigtramp + 26
48367:20180228:162335.318 8: 3 ??? 0x0000000000000000 0x0 + 0
48367:20180228:162335.319 7: 4 zabbix_agentd 0x00000001059cb3c3 process_logrt + 8003
48367:20180228:162335.319 6: 5 zabbix_agentd 0x00000001059c5f88 active_checks_thread + 3368
48367:20180228:162335.319 5: 6 zabbix_agentd 0x00000001059dcd90 zbx_thread_start + 32
48367:20180228:162335.319 4: 7 zabbix_agentd 0x00000001059ccebf MAIN_ZABBIX_ENTRY + 751
48367:20180228:162335.319 3: 8 zabbix_agentd 0x00000001059dd47b daemon_start + 443
48367:20180228:162335.319 2: 9 zabbix_agentd 0x00000001059cd4e4 main + 900
48367:20180228:162335.320 1: 10 libdyld.dylib 0x00007fff59bd1115 start + 1
48367:20180228:162335.320 0: 11 ??? 0x0000000000000001 0x0 + 1
48367:20180228:162335.320 === Memory map: ===
48367:20180228:162335.320 memory map not available for this platform
48367:20180228:162335.320 ================================
48362:20180228:162335.321 One child process died (PID:48367,exitcode/signal:1). Exiting ...
48362:20180228:162335.321 cannot remove shared memory for collector: [22] Invalid argument
48362:20180228:162335.321 Zabbix Agent stopped. Zabbix 3.4.7 (revision 77720).

Comment by Michael Veksler [ 2018 Mar 22 ]

Available in:

  • 3.4.8rc1 r78907
  • pre-4.0.0alpha5 (trunk) r78908

Comment by richlv [ 2018 Nov 26 ]

The previous comment says "With the patch, Zabbix agent crashes when using log.count[]", but the issue is closed as fixed - what happened here?
MVekslers The issue was fixed. The problem occurred only on Mac OS.

<richlv> Thank you Michael, but that link only leads back to this issue. Looks like a secret comment again?

MVekslers The link targets the previous comment, which is publicly available.

<richlv> Ah, now a comment from March is visible, showing the fix versions.
There are likely more secret comments about the svn branch with the fix and details on it, but at least now it's clear that something got fixed.
Thank you Michael.

Generated at Fri Apr 26 19:55:14 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.