[ZBX-22658] Zabbix Server crashes on start SIGSEGV (libnss_winbind causes memory corruption) Created: 2023 Apr 12 Updated: 2024 Apr 10 Resolved: 2023 Nov 10 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 6.4.0, 6.4.1 |
Fix Version/s: | None |
Type: | Problem report | Priority: | Trivial |
Reporter: | Ellerhold IT | Assignee: | Aleksejs Sestakovs (Inactive) |
Resolution: | Fixed | Votes: | 0 |
Labels: | crash | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Debian Bullseye |
Sprint: | Sprint 99 (Apr 2023), Sprint 100 (May 2023), Sprint 101 (Jun 2023), Sprint 102 (Jul 2023), Sprint 103 (Aug 2023) |
Description |
Steps to reproduce:
Result:
Expected:
|
Comments |
Comment by Vladislavs Sokurenko [ 2023 Apr 13 ] |
Backtrace for easier searching:
2343901:20230412:092014.484 === Backtrace: ===
2343901:20230412:092014.485 14: /usr/sbin/zabbix_server(zbx_backtrace+0x3c) [0x561ca588d18c]
2343901:20230412:092014.485 13: /usr/sbin/zabbix_server(zbx_log_fatal_info+0x285) [0x561ca588d515]
2343901:20230412:092014.485 12: /usr/sbin/zabbix_server(+0x2edb56) [0x561ca588db56]
2343901:20230412:092014.485 11: /lib/x86_64-linux-gnu/libpthread.so.0(+0x13140) [0x7fc1af142140]
2343901:20230412:092014.485 10: /lib/x86_64-linux-gnu/libcrypto.so.1.1(RAND_DRBG_generate+0x165) [0x7fc1aef032e5]
2343901:20230412:092014.485 9: /lib/x86_64-linux-gnu/libcrypto.so.1.1(RAND_DRBG_bytes+0x81) [0x7fc1aef034c1]
2343901:20230412:092014.485 8: /lib/x86_64-linux-gnu/libssl.so.1.1(SSL_CTX_new+0x3f1) [0x7fc1af078381]
2343901:20230412:092014.485 7: /usr/sbin/zabbix_server(zbx_tls_init_child+0x150) [0x561ca57e96f0]
2343901:20230412:092014.485 6: /usr/sbin/zabbix_server(poller_thread+0x82) [0x561ca5645dc2]
2343901:20230412:092014.485 5: /usr/sbin/zabbix_server(zbx_thread_start+0x20) [0x561ca57dc180]
2343901:20230412:092014.485 4: /usr/sbin/zabbix_server(+0x8b846) [0x561ca562b846]
2343901:20230412:092014.485 3: /usr/sbin/zabbix_server(MAIN_ZABBIX_ENTRY+0xc9b) [0x561ca562d08b]
2343901:20230412:092014.485 2: /usr/sbin/zabbix_server(main+0x229) [0x561ca5621eb9]
2343901:20230412:092014.485 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea) [0x7fc1ae88ed0a]
2343901:20230412:092014.485 0: /usr/sbin/zabbix_server(_start+0x2a) [0x561ca56290aa]
2343901:20230412:092014.485 === Memory map: === |
Comment by Alex Kalimulin [ 2023 Apr 13 ] |
How did you install Samba 4.17/4.18? The latest Bullseye has only 4.13 available in the packages. |
Comment by Vladislavs Sokurenko [ 2023 Apr 13 ] |
Please also show output of: |
Comment by Ellerhold IT [ 2023 Apr 13 ] |
Samba 4.17 is available via the bullseye-backports repository. Samba 4.18 is available via this repository: http://www.corpit.ru/mjt/packages/samba/ This is maintained by Michael Tokarev - the official samba maintainer of debian. |
Comment by Ellerhold IT [ 2023 Apr 13 ] |
zabbix_server -V:
zabbix_server (Zabbix) 6.4.1
Revision 546e284fd7c 3 April 2023, compilation time: Apr 3 2023 06:55:08
Copyright (C) 2023 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <https://www.gnu.org/licenses/>.
This is free software: you are free to change and redistribute it according to the license. There is NO WARRANTY, to the extent permitted by law.
This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit (http://www.openssl.org/).
Compiled with OpenSSL 1.1.1k 25 Mar 2021
Running with OpenSSL 1.1.1n 15 Mar 2022 |
Comment by Alex Kalimulin [ 2023 Apr 13 ] |
Cannot reproduce so far, even with samba from the aforementioned repo. Any hints that could help to pinpoint the issue? E.g. special settings, config parameters, custom compiled libs, etc. |
Comment by Ellerhold IT [ 2023 Apr 13 ] |
I've got these packages installed:
ii libssl1.1:amd64 1.1.1n-0+deb11u4 amd64 Secure Sockets Layer toolkit - shared libraries
ii libxmlsec1-openssl:amd64 1.2.31-1 amd64 Openssl engine for the XML security library
ii libzstd1:amd64 1.4.8+dfsg-2.1 amd64 fast lossless compression algorithm
ii openssl 1.1.1n-0+deb11u4 amd64 Secure Sockets Layer toolkit - cryptographic utility
ii ssl-cert 1.1.0+nmu1 all simple debconf wrapper for OpenSSL
ii zabbix-agent 1:6.4.1-1+debian11 amd64 Zabbix network monitoring solution - agent
ii zabbix-server-mysql 1:6.4.1-1+debian11 amd64 Zabbix network monitoring solution - server (MySQL)
ii zabbix-sql-scripts 1:6.4.1-1+debian11 all Zabbix network monitoring solution - sql-scripts
ii libldb2:amd64 2:2.7.2+samba4.18.1+dfsg-1~exp1 amd64 LDAP-like embedded database - shared library
ii python3-ldb 2:2.7.2+samba4.18.1+dfsg-1~exp1 amd64 Python 3 bindings for LDB
ii python3-samba 2:4.18.1+dfsg-1~exp1 amd64 Python 3 bindings for Samba
ii samba 2:4.18.1+dfsg-1~exp1 amd64 SMB/CIFS file, print, and login server for Unix
ii samba-ad-provision 2:4.18.1+dfsg-1~exp1 all Samba files needed for AD domain provision
ii samba-common 2:4.18.1+dfsg-1~exp1 all common files used by both the Samba server and client
ii samba-common-bin 2:4.18.1+dfsg-1~exp1 amd64 Samba common files used by both the server and the client
ii samba-dsdb-modules:amd64 2:4.18.1+dfsg-1~exp1 amd64 Samba Directory Services Database
ii samba-libs:amd64 2:4.18.1+dfsg-1~exp1 amd64 Samba core libraries
ii samba-vfs-modules:amd64 2:4.18.1+dfsg-1~exp1 amd64 Samba Virtual FileSystem plugins
ii libnss-winbind:amd64 2:4.18.1+dfsg-1~exp1 amd64 Samba nameservice integration plugins
ii libpam-winbind:amd64 2:4.18.1+dfsg-1~exp1 amd64 Windows domain authentication integration plugin
ii libwbclient0:amd64 2:4.18.1+dfsg-1~exp1 amd64 Samba winbind client library
ii winbind 2:4.18.1+dfsg-1~exp1 amd64 service to resolve user and group information from Windows NT servers
ii libcurl3-gnutls:amd64 7.74.0-1.3+deb11u7 amd64 easy-to-use client-side URL transfer library (GnuTLS flavour)
ii libgnutls30:amd64 3.7.1-5+deb11u3 amd64 GNU TLS library - main runtime library
We've set `TLSConnect=psk` and `TLSAccept=psk` in the agent config, and `TLSCAFile`, `TLSCertFile` and `TLSKeyFile` are set in zabbix_server.conf. No self-compiled libs or packages. |
Comment by Ellerhold IT [ 2023 Apr 14 ] |
Please find the zabbix_server.conf attached. Nothing out of the ordinary I think. The TLS* keys are used! |
Comment by Aleksejs Sestakovs (Inactive) [ 2023 May 15 ] |
Hi EllerholdIT, I was not able to reproduce the issue.
No server crash observed. Can you reproduce the issue on a standalone machine? |
Comment by Norbert Püschel [ 2023 Jun 28 ] |
I now have the same problem with CentOS 8 Stream and current patches. I have tracked the problem to nss_winbind. With Samba installed, your /etc/nsswitch.conf will typically have entries like:
passwd: files winbind systemd
group: files winbind systemd
The problem is with the group entry. I have found a temporary workaround; you can add the line:
initgroups: files systemd
to nsswitch.conf, and the problem goes away. This suggests that a call to getgrouplist() is crashing the server with winbind enabled. The number of groups imported from Windows AD is typically enormous, so the reason for the crash might be that you get too many results. It might also be that the nss_winbind module takes too long to deliver results. I have not yet tested whether reducing the Samba parameter `winbind expand groups` has an influence on this behaviour. I have no idea what has changed in Samba 4.18 to cause this; there is nothing in the release notes. |
Comment by Vladislavs Sokurenko [ 2023 Jun 28 ] |
Does compiling from source help? |
Comment by Norbert Püschel [ 2023 Jun 28 ] |
Compilation of what? Zabbix? Samba? The crash occurs both with Zabbix Server 6.4.3 and 6.4.4, so I do not see how compilation would change that. Anyway, this is one of my production systems, so I will not do experiments with that. I can live with the workaround for now. I suggest you look for places in Zabbix server where getgrouplist() is called explicitly or implicitly: int getgrouplist(const char *user, gid_t group, gid_t *groups, int *ngroups); getgrouplist() is called internally by initgroups() AFAIK. Anyway, the crash is probably not Zabbix's fault. I do not see any calls to getgrouplist() in the source code, only initgroups(), which in turn is implemented in glibc. So this might actually be a bug in glibc. Anyway, if you want to reproduce the bug, I guess you need to make sure that the Active Directory you connect your Samba to has enough groups to trigger the error.
|
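For anyone who wants to exercise the NSS group lookup outside Zabbix, the following is a minimal sketch of a getgrouplist() call with a growing buffer. The helper name `count_user_groups` is hypothetical and not part of Zabbix or Samba; the sketch assumes glibc, where a too-small buffer makes getgrouplist() return -1 and store the required count in `ngroups`.

```c
#define _DEFAULT_SOURCE   /* exposes getgrouplist() in glibc */
#include <assert.h>
#include <grp.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of Zabbix): return how many groups NSS
 * reports for `user`, or -1 if memory allocation fails.
 * `primary` is the user's primary GID; getgrouplist() always includes it.
 */
int count_user_groups(const char *user, gid_t primary)
{
    int capacity = 16;
    gid_t *groups = NULL;

    assert(user != NULL);

    for (;;) {
        int ngroups = capacity;
        gid_t *tmp = realloc(groups, (size_t)capacity * sizeof(*groups));

        if (tmp == NULL) {
            free(groups);
            return -1;
        }
        groups = tmp;

        /* Goes through the NSS sources from the "initgroups:" line of
         * nsswitch.conf, or from the "group:" line if that is absent. */
        if (getgrouplist(user, primary, groups, &ngroups) != -1) {
            free(groups);
            return ngroups;
        }

        /* Buffer too small: glibc stored the required count in ngroups. */
        capacity = (ngroups > capacity) ? ngroups : capacity * 2;
    }
}
```

Calling `count_user_groups("zabbix", gid)` with the account's primary GID from a small test program, with and without the `initgroups: files systemd` line, would show whether winbind is consulted and how many groups it returns. It is not confirmed here that such a standalone call reproduces the crash.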
Comment by Vladislavs Sokurenko [ 2023 Jun 28 ] |
Thanks for looking into this. initgroups() is called when Zabbix is launched as a daemon, to change the user from root. Launching the Zabbix agent manually could also be a workaround, but in that case this indeed does not look like something that can be fixed in Zabbix. |
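The usual shape of such a privilege drop can be sketched as follows. This is an illustration of the common POSIX pattern, not Zabbix's actual code; the function name `drop_privileges` is hypothetical.

```c
#define _DEFAULT_SOURCE   /* exposes initgroups() in glibc */
#include <assert.h>
#include <grp.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not Zabbix's actual code: switch the process
 * from root to `user`. Returns 0 on success, -1 on any failure.
 */
int drop_privileges(const char *user)
{
    const struct passwd *pw;
    uid_t uid;
    gid_t gid;

    assert(user != NULL);

    pw = getpwnam(user);          /* NSS lookup via the "passwd:" line */
    if (pw == NULL)
        return -1;                /* unknown user, or lookup error */

    /* Copy the IDs out of the static buffer before further NSS calls. */
    uid = pw->pw_uid;
    gid = pw->pw_gid;

    /* Supplementary groups first: this call goes through the
     * "initgroups:" / "group:" NSS sources, winbind included. */
    if (initgroups(user, gid) != 0)
        return -1;

    /* Group before user: after setuid() the GID can no longer be changed. */
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;

    return 0;
}
```

In this pattern the process calls into the NSS modules while it is still root, which matches the observation that restricting the `initgroups:` line to `files systemd` avoids the crash.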
Comment by Glebs Ivanovskis [ 2023 Jun 28 ] |
The backtrace shows that it crashes in the OpenSSL libraries following the zbx_tls_init_child() call. I don't think initgroups() is the root cause, given that it worked fine with a different version of Samba. My guess would be that it is a symbol binding conflict similar to ZBX-12159. |
Comment by Norbert Püschel [ 2023 Jun 28 ] |
Maybe you can put a note about the initgroups workaround in the documentation. When I ran into the problem yesterday, I stumbled upon this bug entry by pure luck, as the connection with the Samba version is not so obvious. |
Comment by Kacper [ 2023 Aug 29 ] |
I did a git bisect on Samba today and found the commit that introduced this segfault in Zabbix. It's commit 642a4452ce5b3333c50e41e54bc6ca779686ecc3 "nsswitch: leverage TLS if available in favour over global locking". The question is why switching to thread-local storage in winbind causes a segfault in Zabbix. This issue also exists in 6.0 LTS and is not limited to Debian. The same issue can be observed with Samba 4.18 on RHEL too. |
Comment by Ellerhold IT [ 2023 Aug 30 ] |
The Zabbix agent works for us; it's just the server that segfaults in our setup. |
Comment by Glebs Ivanovskis [ 2023 Aug 30 ] |
kacper, can you share exact details how you did the bisect? |
Comment by Kacper [ 2023 Aug 30 ] |
cyclone, sure.
The first bad commit will end up being 642a4452ce5b3333c50e41e54bc6ca779686ecc3. To confirm, revert commit 7545e2c (this is the commit right after 642a445 for wb_common.c) and 642a445, recompile Samba and watch Zabbix server not crash. |
Comment by Glebs Ivanovskis [ 2023 Aug 30 ] |
kacper, thanks! I wonder if it is essential for Samba to be running in order to crash Zabbix? |
Comment by Kacper [ 2023 Aug 30 ] |
cyclone, whether samba is running or not makes no difference. |
Comment by Norbert Püschel [ 2023 Aug 30 ] |
As I wrote before, the crash is caused by winbind through the NSS integration when initgroups() is called. smbd and nmbd play no role in this. |
Comment by Kacper [ 2023 Aug 30 ] |
Interestingly, running zabbix server in foreground mode (-f) or under a gdb session does not trigger a crash. |
Comment by Kacper [ 2023 Sep 06 ] |
Upstream bug report tracking this is at https://bugzilla.samba.org/show_bug.cgi?id=15464 |
Comment by Kacper [ 2023 Sep 27 ] |
This bug has now been fixed in Samba 4.18.7 released today. Those on Samba 4.19 will have to wait until October 16 as that's the release date of 4.19.1 containing the bug fix. We can go ahead and close this bug report here. |
Comment by Ellerhold IT [ 2023 Sep 27 ] |
Thanks for your investigation, and sorry that I opened this issue in the wrong project. I posted a bug report to the Samba mailing list at the same time, but sadly no one responded. Once 4.19.1 is released, I'll test it and confirm whether it works. |
Comment by Michael Veksler [ 2023 Nov 10 ] |
Closing this for now. If you have any more questions, feel free to reopen this ticket! |