[ZBX-15171] Zabbix Server Crash Created: 2018 Nov 15  Updated: 2024 Apr 10  Resolved: 2018 Nov 20

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 4.0.1, 4.2.0alpha1, 4.2 (plan)
Fix Version/s: 4.0.2rc1, 4.2.0alpha1, 4.2 (plan)

Type: Problem report Priority: Critical
Reporter: Leonardo Maza Assignee: Alex Kalimulin
Resolution: Fixed Votes: 0
Labels: crash, discovery, odbc, zabbix_server
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu 16.04 and Ubuntu 18.04, MySQL Server


Attachments: zabbix_server.log
Issue Links:
Causes
caused by ZBX-10576 Restructure unixODBC related code Closed
Duplicate
is duplicated by ZBX-15180 Segfault on zabbix proxy Closed
Team: Team A
Sprint: Sprint 46, Nov 2018
Story Points: 0.125

 Description   

Steps to reproduce:

  1. Unknown; I have several logs.

Result:
See log file...
See memory dump...
Expected:
See screenshot...

 

More data:

Sometimes when Zabbix server crashes, it restarts automatically, but sometimes it cannot restart. In those cases, when I try to stop Zabbix server, it does not stop. When I list the running processes named zabbix (ps xaf | grep zabbix), a Zabbix pinger is still running. I then kill it (kill -9 PID) and can start Zabbix normally.

 



 Comments   
Comment by Vladislavs Sokurenko [ 2018 Nov 15 ]

Can you please provide output of:
ps xaf | grep zabbix

Does increasing CacheSize help with the crash?

Comment by Vladislavs Sokurenko [ 2018 Nov 15 ]

I think the issue with the hanging process is solved in ZBX-15027.

Comment by Vladislavs Sokurenko [ 2018 Nov 15 ]

Backtrace

 11992:20181114:153744.306 === Backtrace: ===
 11992:20181114:153744.307 17: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](zbx_backtrace+0x44) [0x5570b7651ce4]
 11992:20181114:153744.307 16: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](zbx_log_fatal_info+0x153) [0x5570b7651f8f]
 11992:20181114:153744.307 15: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](+0xff300) [0x5570b7652300]
 11992:20181114:153744.307 14: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890) [0x7f221844e890]
 11992:20181114:153744.307 13: /lib/x86_64-linux-gnu/libc.so.6(+0xb1646) [0x7f221504b646]
 11992:20181114:153744.307 12: /lib/x86_64-linux-gnu/libc.so.6(__strdup+0xe) [0x7f22150379ae]
 11992:20181114:153744.307 11: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](zbx_strdup2+0x4f) [0x5570b76680b7]
 11992:20181114:153744.307 10: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](zbx_odbc_query_result_to_lld_json+0x3e1) [0x5570b75ddb67]
 11992:20181114:153744.308 9: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](get_value_db+0x28e) [0x5570b75af549]
 11992:20181114:153744.308 8: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](+0x4bade) [0x5570b759eade]
 11992:20181114:153744.308 7: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](+0x4e028) [0x5570b75a1028]
 11992:20181114:153744.308 6: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](poller_thread+0x1a1) [0x5570b75a224f]
 11992:20181114:153744.308 5: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](zbx_thread_start+0x32) [0x5570b765faf2]
 11992:20181114:153744.308 4: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](MAIN_ZABBIX_ENTRY+0x9c4) [0x5570b758b486]
 11992:20181114:153744.308 3: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](daemon_start+0x315) [0x5570b765142d]
 11992:20181114:153744.308 2: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](main+0x305) [0x5570b758aaac]
 11992:20181114:153744.308 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7) [0x7f2214fbbb97]
 11992:20181114:153744.308 0: /usr/sbin/zabbix_server: poller #9 [got 4 values in 0.167859 sec, getting values](_start+0x2a) [0x5570b7589c7a]

Possible code fragment that is missing NULL check:

	while (NULL != (row = zbx_odbc_fetch(query_result)))
	{
		zbx_json_addobject(&json, NULL);

		for (i = 0; i < query_result->col_num; i++)
		{
			char	*value = NULL;

			value = zbx_strdup(value, row[i]);
			zbx_replace_invalid_utf8(value);
			zbx_json_addstring(&json, macros.values[i], value, ZBX_JSON_TYPE_STRING);
			zbx_free(value);
		}

		zbx_json_close(&json);
	}

Patch:

Index: src/zabbix_server/odbc/odbc.c
===================================================================
--- src/zabbix_server/odbc/odbc.c	(revision 86893)
+++ src/zabbix_server/odbc/odbc.c	(working copy)
@@ -611,6 +611,9 @@
 		{
 			char	*value = NULL;
 
+			if (NULL == row[i])
+				continue;
+
 			value = zbx_strdup(value, row[i]);
 			zbx_replace_invalid_utf8(value);
 			zbx_json_addstring(&json, macros.values[i], value, ZBX_JSON_TYPE_STRING);
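
The backtrace is consistent with a NULL column value reaching strdup(). As an illustration of the guard added by the patch, here is a minimal standalone sketch. It is not Zabbix code: `dup_non_null_columns` and its parameters are hypothetical names, and plain malloc() stands in for the Zabbix allocation wrappers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Zabbix): copies every non-NULL column of
 * a fetched row into out[], leaving out[i] == NULL for SQL NULL columns.
 * Returns the number of columns copied, or -1 on allocation failure. */
static int	dup_non_null_columns(const char *const *row, int col_num, char **out)
{
	int	i, copied = 0;

	assert(NULL != row && NULL != out);

	for (i = 0; i < col_num; i++)
		out[i] = NULL;

	for (i = 0; i < col_num; i++)
	{
		size_t	len;

		/* same guard as in the patch: never duplicate a NULL pointer */
		if (NULL == row[i])
			continue;

		len = strlen(row[i]) + 1;

		if (NULL == (out[i] = malloc(len)))
		{
			/* release what was copied so far */
			while (0 < i--)
			{
				free(out[i]);
				out[i] = NULL;
			}

			return -1;
		}

		memcpy(out[i], row[i], len);
		copied++;
	}

	return copied;
}
```

For a row such as {"10267", NULL, "abc"} the helper copies two columns and leaves the middle slot NULL, so the caller decides how a NULL column is reported instead of crashing on it.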
Comment by Leonardo Maza [ 2018 Nov 15 ]

Hi, thanks for the quick response.

I have now increased CacheSize from 256M to 512M:

    # CacheSize=8M
    #CacheSize=256M
    CacheSize=512M

If the Zabbix server crashes again, I will let you know.

Below is the ps xaf output, preceded by the last lines of the Zabbix server log from when the crash occurred.

Log output:

54136:20181115:043744.698 One child process died (PID:54199,exitcode/signal:1). Exiting ...
zabbix_server [54136]: Error waiting for process with PID 54199: [10] No child processes

^C

First I tried to stop zabbix-server after the crash; the command never finished, so I cancelled it:
seacadm@seac-roy-mon01:~$ sudo service zabbix-server stop
[sudo] password for seacadm:
^C

Output of ps xaf | grep zabbix. I always find an icmp pinger still running, and I can only kill it with kill -9:
seacadm@seac-roy-mon01:~$ ps xaf | grep zabbix
41053 pts/0 T 0:00 | _ sudo nano /etc/zabbix/zabbix_server.conf
41069 pts/0 T 0:00 | | _ nano /etc/zabbix/zabbix_server.conf
56026 pts/0 S+ 0:00 | _ grep --color=auto zabbix
1861 ? S 0:00 /usr/sbin/zabbix_agentd -c /etc/zabbix/zabbix_agentd.conf
1864 ? S 0:31 _ /usr/sbin/zabbix_agentd: collector [idle 1 sec]
1865 ? S 0:08 _ /usr/sbin/zabbix_agentd: listener #1 [waiting for connection]
1866 ? S 0:09 _ /usr/sbin/zabbix_agentd: listener #2 [waiting for connection]
1867 ? S 0:09 _ /usr/sbin/zabbix_agentd: listener #3 [waiting for connection]
1868 ? S 0:07 _ /usr/sbin/zabbix_agentd: active checks #1 [getting list of active checks]
54271 ? S 0:00 /usr/sbin/zabbix_server: icmp pinger #9 [pinging hosts]
seacadm@seac-roy-mon01:~$ sudo kill -9 54271

 

Then I can start zabbix server again
seacadm@seac-roy-mon01:~$ sudo service zabbix-server start

 

 

Comment by Vladislavs Sokurenko [ 2018 Nov 15 ]

Yes, the leftover process is fixed under ZBX-15027. The crash will be fixed under this issue; thank you very much for your report!

Comment by Leonardo Maza [ 2018 Nov 15 ]

Excellent news about ZBX-15027!

Increasing CacheSize did not help with the crash.

For now, until the new release, I am working around this issue with the following script, run from crontab. Thanks.

 

- Crontab

*/2 * * * * /home/seacadm/restart_zabbix_crash.sh > /home/seacadm/logRestartZabbix.log 2>&1

 

- Script

#!/bin/bash

# Restart Zabbix server if its log file has not been written to recently.
MAXAGE=120 # seconds
LOGFILE="/var/log/zabbix/zabbix_server.log"
fileAge=$(( $(date +%s) - $(stat -c '%Y' "$LOGFILE") ))

echo "Last write to the log file was ${fileAge} seconds ago"

if [ "$fileAge" -gt "$MAXAGE" ]; then
   echo 'restart zabbix'
   # kill any leftover zabbix_server processes (for example a stuck icmp pinger)
   ps -ef | grep zabbix_server | grep -v grep | awk '{print $2}' | xargs -r kill -9
   service zabbix-server start
fi
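
The core of this workaround is a single test: how long ago was the log file last modified? Purely as an illustration, the same check can be sketched in C with stat(). `file_age_seconds` and `log_is_stale` are hypothetical helper names; the log path and 120-second threshold in the usage note come from the script above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: stores in *age the number of seconds between the
 * file's last modification and 'now'. Returns 0 on success, -1 if the file
 * cannot be examined. */
static int	file_age_seconds(const char *path, time_t now, long *age)
{
	struct stat	st;

	assert(NULL != path && NULL != age);

	if (0 != stat(path, &st))
		return -1;

	*age = (long)difftime(now, st.st_mtime);

	return 0;
}

/* Returns 1 if the file was last modified more than max_age seconds before
 * 'now', 0 otherwise (including when the file cannot be examined). */
static int	log_is_stale(const char *path, time_t now, long max_age)
{
	long	age;

	if (0 != file_age_seconds(path, now, &age))
		return 0;

	return age > max_age ? 1 : 0;
}
```

A caller would use it as `log_is_stale("/var/log/zabbix/zabbix_server.log", time(NULL), 120)` and trigger the restart only when it returns 1. A missing log file is treated as "not stale" in this sketch, which is a design choice rather than something the script specifies.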

Comment by Glebs Ivanovskis [ 2018 Nov 15 ]

Good job, vso!

Caused by ZBX-10576. Sorry, my bad.

Comment by Vladislavs Sokurenko [ 2018 Nov 15 ]

Thanks cyclone, I updated the "caused by" link.

Comment by Glebs Ivanovskis [ 2018 Nov 16 ]

Please also add Proxy into Component/s.

vso done

cyclone Thanks!

Comment by Vladislavs Sokurenko [ 2018 Nov 20 ]

In 3.0, an empty string is returned for NULL values.

Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" already exists.
Cannot create item: item with the same key "trap1[""]" alrea

Log:

 31836:20181120:143337.720 json '{"data":[{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROX
Y_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":null},{"{#PROXY_HOSTID}":"10267"}]}'
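
The two outputs in this comment differ in how a SQL NULL column is serialised: as an empty string (the 3.0 behaviour described above, which appears to make every NULL row produce the same item key) or as JSON null (as in the log). The sketch below shows both options side by side. It is an assumption-laden illustration, not the Zabbix implementation: `format_lld_pair` and `enum null_policy` are hypothetical, and no JSON escaping is performed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical policy switch, for illustration only. */
enum null_policy { NULL_AS_EMPTY_STRING, NULL_AS_JSON_NULL };

/* Hypothetical helper: writes one LLD object, {"macro":"value"} or
 * {"macro":null}, into buf. Returns the snprintf() result. The macro name
 * and value are assumed to need no JSON escaping. */
static int	format_lld_pair(char *buf, size_t size, const char *macro, const char *value,
		enum null_policy policy)
{
	assert(NULL != buf && NULL != macro);

	if (NULL == value)
	{
		if (NULL_AS_JSON_NULL == policy)
			return snprintf(buf, size, "{\"%s\":null}", macro);

		/* older behaviour: a NULL column becomes an empty string */
		value = "";
	}

	return snprintf(buf, size, "{\"%s\":\"%s\"}", macro, value);
}
```

With `NULL_AS_JSON_NULL`, a NULL column stays distinguishable from a genuinely empty string, so the consumer of the discovery data can treat the two cases differently.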
Comment by Alex Kalimulin [ 2018 Nov 20 ]

Fixed in:

  • pre-4.0.2rc1 r87075
  • pre-4.2.0alpha1 r87076