-
Incident report
-
Resolution: Won't fix
-
Critical
-
None
-
7.0.2
-
None
-
S24-W34/35
Our Zabbix server suddenly started crashing right after startup.
It then restarts endlessly.
We tracked this down to the processing of a large incoming proxy message.
The system runs Debian 12.6 with Zabbix server 7.0.2 (revision d1b0c3308ce).
Attached are:
- the Zabbix server config (with sensitive values masked)
- the DebugLevel=4 log of the crashing process, with the PSKs and some client names masked
The relevant part seems to be here:
141115:20240819:174955.196 zbx_setproctitle() title:'trapper #5 [processed data in 0.026269 sec, waiting for connection]'
141115:20240819:174955.196 In zbx_tls_accept()
141115:20240819:174955.197 zbx_psk_server_cb() requested PSK identity "XXXXXXXXXX"
141115:20240819:174955.215 End of zbx_tls_accept():SUCCEED (established TLSv1.3 TLS_CHACHA20_POLY1305_SHA256)
141115:20240819:174955.215 zbx_setproctitle() title:'trapper #5 [processing data]'
141115:20240819:174955.215 In zbx_ipc_async_socket_recv() timeout:0
141115:20240819:174955.215 End of zbx_ipc_async_socket_recv():0
141115:20240819:174955.231 trapper got '{"request":"proxy data","host":"Zabbix proxy Digmesa Ipsach","session":"f00c704ec5b34d5ab03ff49faa2f0c32","interface availability":[{"interfaceid":28,"available":1,"error":""},{"interfaceid":377,"available":1,"error":""},{"interfaceid":515,"available":1,"error":""},{"interface
141115:20240819:174955.232 In recv_proxy_data()
141115:20240819:174955.235 Proxy "XXXXXXXXXXXX" version 6.0.30 is outdated, only data collection and remote execution is available with server version 7.0.2.
141115:20240819:174955.235 In zbx_hc_check_proxy() proxyid:10130
141115:20240819:174955.235 End of zbx_hc_check_proxy():SUCCEED
141115:20240819:174955.235 In zbx_process_proxy_data()
141115:20240819:174955.236 zbx_process_proxy_data() flag_win:7/0 flag:48 proxy_status:0 period_end:1724082595 delay:1320 timestamp:1724082595 lastaccess:1724081275 proxy_delay:5159 more:1
141115:20240819:174955.239 Got signal [signal:11(SIGSEGV),reason:1,refaddr:(nil)]. Crashing ...
141115:20240819:174955.241 ====== Fatal information: ======
141115:20240819:174955.241 Program counter: 0x7f00cd63fea6
141115:20240819:174955.241 === Registers: ===
141115:20240819:174955.241 r8 = 7f004a626fa4 = 139639224692644 = 139639224692644
141115:20240819:174955.242 r9 = 0 = 0 = 0
141115:20240819:174955.242 r10 = 0 = 0 = 0
141115:20240819:174955.242 r11 = 20 = 32 = 32
141115:20240819:174955.242 r12 = 7f004a626fa4 = 139639224692644 = 139639224692644
141115:20240819:174955.242 r13 = 7f004a626fa8 = 139639224692648 = 139639224692648
141115:20240819:174955.242 r14 = 7f004a626f98 = 139639224692632 = 139639224692632
141115:20240819:174955.242 r15 = 55d9d4f57870 = 94394069121136 = 94394069121136
141115:20240819:174955.242 rdi = 0 = 0 = 0
141115:20240819:174955.243 rsi = 55d9d4f57870 = 94394069121136 = 94394069121136
141115:20240819:174955.243 rbp = 55d9d4f57848 = 94394069121096 = 94394069121096
141115:20240819:174955.243 rbx = 3 = 3 = 3
141115:20240819:174955.243 rdx = 7f004a626f98 = 139639224692632 = 139639224692632
141115:20240819:174955.243 rax = 0 = 0 = 0
141115:20240819:174955.243 rcx = 30 = 48 = 48
141115:20240819:174955.243 rsp = 7fff51f605c8 = 140734568465864 = 140734568465864
141115:20240819:174955.244 rip = 7f00cd63fea6 = 139641422610086 = 139641422610086
141115:20240819:174955.244 efl = 10283 = 66179 = 66179
141115:20240819:174955.245 csgsfs = 2b000000000033 = 12103423998558259 = 12103423998558259
141115:20240819:174955.245 err = 4 = 4 = 4
141115:20240819:174955.245 trapno = e = 14 = 14
141115:20240819:174955.245 oldmask = 0 = 0 = 0
141115:20240819:174955.245 cr2 = 0 = 0 = 0
141115:20240819:174955.245 === Backtrace: ===
141115:20240819:174955.247 18: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_backtrace+0x3b) [0x55d9d38e7ccb]
141115:20240819:174955.247 17: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_log_fatal_info+0x17d) [0x55d9d38e7f3d]
141115:20240819:174955.248 16: /usr/sbin/zabbix_server: trapper #5 [processing data](+0x3c35f6) [0x55d9d38e85f6]
141115:20240819:174955.248 15: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x7f00cd5d4050]
141115:20240819:174955.248 14: /lib/x86_64-linux-gnu/libc.so.6(+0xa7ea6) [0x7f00cd63fea6]
141115:20240819:174955.248 13: /usr/sbin/zabbix_server: trapper #5 [processing data](+0x265853) [0x55d9d378a853]
141115:20240819:174955.248 12: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_dc_set_interfaces_availability+0x61) [0x55d9d3796901]
141115:20240819:174955.248 11: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_process_proxy_data+0x23db) [0x55d9d38abe5b]
141115:20240819:174955.248 10: /usr/sbin/zabbix_server: trapper #5 [processing data](recv_proxy_data+0x315) [0x55d9d363acf5]
141115:20240819:174955.248 9: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_trapper_process_request_server+0xe4) [0x55d9d363a7b4]
141115:20240819:174955.248 8: /usr/sbin/zabbix_server: trapper #5 [processing data](+0x1093f8) [0x55d9d362e3f8]
141115:20240819:174955.248 7: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_trapper_thread+0x422) [0x55d9d362f2c2]
141115:20240819:174955.248 6: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_thread_start+0x20) [0x55d9d381c600]
141115:20240819:174955.248 5: /usr/sbin/zabbix_server: trapper #5 [processing data](+0xcf44a) [0x55d9d35f444a]
141115:20240819:174955.249 4: /usr/sbin/zabbix_server: trapper #5 [processing data](MAIN_ZABBIX_ENTRY+0xe55) [0x55d9d35f5885]
141115:20240819:174955.249 3: /usr/sbin/zabbix_server: trapper #5 [processing data](main+0x37e) [0x55d9d35e9b5e]
141115:20240819:174955.249 2: /lib/x86_64-linux-gnu/libc.so.6(+0x2724a) [0x7f00cd5bf24a]
141115:20240819:174955.249 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f00cd5bf305]
141115:20240819:174955.249 0: /usr/sbin/zabbix_server: trapper #5 [processing data](_start+0x21) [0x55d9d35f0bb1]
141115:20240819:174955.249 === Memory map: ===
141115:20240819:174955.249 55d9d3525000-55d9d35e2000 r--p 00000000 fe:01 152010 /usr/sbin/zabbix_server
141115:20240819:174955.250 55d9d35e2000-55d9d391b000 r-xp 000bd000 fe:01 152010 /usr/sbin/zabbix_server
141115:20240819:174955.250 55d9d391b000-55d9d3a4e000 r--p 003f6000 fe:01 152010 /usr/sbin/zabbix_server
141115:20240819:174955.250 55d9d3a4e000-55d9d3a52000 r--p 00528000 fe:01 152010 /usr/sbin/zabbix_server
141115:20240819:174955.250 55d9d3a52000-55d9d3b51000 rw-p 0052c000 fe:01 152010 /usr/sbin/zabbix_server
141115:20240819:174955.250 55d9d3b51000-55d9d3b63000 rw-p 00000000 00:00 0
141115:20240819:174955.254 55d9d4ea2000-55d9d4ee4000 rw-p 00000000 00:00 0 [heap]
141115:20240819:174955.254 55d9d4ee4000-55d9d4f06000 rw-p 00000000 00:00 0 [heap]
141115:20240819:174955.254 55d9d4f06000-55d9d5494000 rw-p 00000000 00:00 0 [heap]
141115:20240819:174955.254 7f00498b0000-7f00498f0000 rw-s 00000000 00:01 35258375 /SYSV00000000 (deleted)
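For triage, the backtrace in this log can be reduced to the innermost named server function above the crash site. The following is only a minimal sketch, assuming the frame layout shown in the excerpt; the helper names `parse_frames` and `innermost_named_caller` are hypothetical and are not part of Zabbix.

```python
import re

# One backtrace frame as printed in the crash log, for example:
#   12: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_dc_set_interfaces_availability+0x61) [0x55d9d3796901]
#   14: /lib/x86_64-linux-gnu/libc.so.6(+0xa7ea6) [0x7f00cd63fea6]
FRAME_RE = re.compile(
    r"\b(?P<index>\d+): (?P<module>/\S+?)(?::[^(]*)?"
    r"\((?P<symbol>[^)+]*)\+(?P<offset>0x[0-9a-f]+)\) \[(?P<addr>0x[0-9a-f]+)\]"
)

# Frames that belong to crash reporting itself (names taken from the log excerpt).
REPORTING_SYMBOLS = {"zbx_backtrace", "zbx_log_fatal_info"}


def parse_frames(log_text):
    """Return backtrace frames as dicts, in the order they appear in the log."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        frame = match.groupdict()
        frame["index"] = int(frame["index"])
        frames.append(frame)
    return frames


def innermost_named_caller(frames):
    """Return the first frame that has a symbol name and is not a reporting helper."""
    for frame in frames:
        if frame["symbol"] and frame["symbol"] not in REPORTING_SYMBOLS:
            return frame
    return None


# A few frames copied from the log excerpt in this report.
sample = """
141115:20240819:174955.247 18: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_backtrace+0x3b) [0x55d9d38e7ccb]
141115:20240819:174955.247 17: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_log_fatal_info+0x17d) [0x55d9d38e7f3d]
141115:20240819:174955.248 14: /lib/x86_64-linux-gnu/libc.so.6(+0xa7ea6) [0x7f00cd63fea6]
141115:20240819:174955.248 13: /usr/sbin/zabbix_server: trapper #5 [processing data](+0x265853) [0x55d9d378a853]
141115:20240819:174955.248 12: /usr/sbin/zabbix_server: trapper #5 [processing data](zbx_dc_set_interfaces_availability+0x61) [0x55d9d3796901]
"""

frames = parse_frames(sample)
caller = innermost_named_caller(frames)
print(caller["symbol"], caller["offset"])
```

For the frames in this report, the result points at `zbx_dc_set_interfaces_availability`, which calls an unnamed server function that in turn faults inside libc. This matches the `refaddr:(nil)` in the signal line, which indicates an access to address 0.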
After changing the PSK of the affected proxy to an invalid value (so that its data is no longer processed), the server stays up and running.
However, all data from that proxy is then no longer ingested or processed.
The JSON data of the failing proxy message is about 136300 bytes in size.
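To narrow down which part of such a message is unusually large, a captured payload could be summarized per top-level key. This is a rough sketch, assuming the complete JSON has been saved separately (the log excerpt in this report is truncated). The function name `summarize_proxy_payload` and the sample payload are hypothetical; only the key names are taken from the log.

```python
import json


def summarize_proxy_payload(raw):
    """Return the payload size in bytes and the length of each top-level list."""
    data = json.loads(raw)
    return {
        "bytes": len(raw.encode("utf-8")),
        "list_sizes": {
            key: len(value) for key, value in data.items() if isinstance(value, list)
        },
    }


# Shortened, made-up payload that uses only key names visible in the log excerpt.
sample = (
    '{"request":"proxy data","host":"example proxy",'
    '"interface availability":['
    '{"interfaceid":28,"available":1,"error":""},'
    '{"interfaceid":377,"available":1,"error":""}]}'
)

summary = summarize_proxy_payload(sample)
print(summary)
```

Because the backtrace passes through `zbx_dc_set_interfaces_availability`, the number of entries under "interface availability" is the first thing worth checking in the real payload.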