[ZBX-10485] zabbix_server 3.0.1 crashing on database upgrade of 2.2.5 database Created: 2016 Mar 02  Updated: 2017 May 30  Resolved: 2016 Mar 07

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.4.6, 3.0.1
Fix Version/s: 2.4.8rc1, 3.0.2rc1, 3.2.0alpha1

Type: Incident report Priority: Major
Reporter: Todd Blake Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: crash, upgrade
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

RHEL 6.5


Attachments: File bugreport.tar.gz     File bugreport3.tar.gz    

 Description   

On startup against a 2.2.5 database, the zabbix_server crashes during the database upgrade process. Subsequent attempts crash as well.

Attached:

config.log - from compilation
objdump.txt - objump as requested in the zabbix server log
zabbix_server3.log - the server log
zabbix_server.conf - the server.conf with authentication details redacted



 Comments   
Comment by Aleksandrs Saveljevs [ 2016 Mar 02 ]

Thank you! It seems to fail in DBpatch_2030094().

Could you please attach a DebugLevel=4 log of the upgrade process, along with the output of "select triggerid,expression from triggers" query?

Comment by Todd Blake [ 2016 Mar 02 ]

Certainly, attached in bugreport2.tar.gz

Comment by Aleksandrs Saveljevs [ 2016 Mar 02 ]

Could you please check whether you have attached the correct log file? It is full of lines about active proxies.

Comment by Todd Blake [ 2016 Mar 02 ]

oops , attached bugreport3.tar.gz

Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 02 ]

Can the first entry

+-----------+---------------------------------------------------------+
| triggerid | expression                                              |
+-----------+---------------------------------------------------------+
|         0 |                                                         |

cause

static int	DBpatch_2030094(void)
{
	...
		for (p = row[1]; '\0' != *p; p++)

to dereference p which stores NULL?

Comment by Todd Blake [ 2016 Mar 02 ]

For what it's worth, deleting that orphaned trigger does the trick.

mysql> SELECT triggerid,expression from triggers where expression='';
+-----------+------------+
| triggerid | expression |
+-----------+------------+
|         0 |            |
+-----------+------------+
1 row in set (0.04 sec)

mysql> delete from triggers where triggerid=0;
Query OK, 1 row affected (0.01 sec)
 25508:20160302:102726.414 completed 97% of database upgrade
 25508:20160302:102726.419 completed 98% of database upgrade
 25508:20160302:102726.421 completed 99% of database upgrade
 25508:20160302:102726.424 completed 100% of database upgrade
 25508:20160302:102726.424 database upgrade fully completed
Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 03 ]

Could you please show the output of the following command?

show create table triggers;

There should be NOT NULL constraint for expression.

Comment by Todd Blake [ 2016 Mar 03 ]

Pasted below.

Isn't an empty string different than a NULL though? The select shows an empty string.

mysql> show create table triggers\G
*************************** 1. row ***************************
       Table: triggers
Create Table: CREATE TABLE `triggers` (
  `triggerid` bigint(20) unsigned NOT NULL,
  `expression` varchar(2048) NOT NULL DEFAULT '',
  `description` varchar(255) NOT NULL DEFAULT '',
  `url` varchar(255) NOT NULL DEFAULT '',
  `status` int(11) NOT NULL DEFAULT '0',
  `value` int(11) NOT NULL DEFAULT '0',
  `priority` int(11) NOT NULL DEFAULT '0',
  `lastchange` int(11) NOT NULL DEFAULT '0',
  `comments` text NOT NULL,
  `error` varchar(128) NOT NULL DEFAULT '',
  `templateid` bigint(20) unsigned DEFAULT NULL,
  `type` int(11) NOT NULL DEFAULT '0',
  `state` int(11) NOT NULL DEFAULT '0',
  `flags` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`triggerid`),
  KEY `triggers_1` (`status`),
  KEY `triggers_2` (`value`),
  KEY `triggers_3` (`templateid`),
  CONSTRAINT `c_triggers_1` FOREIGN KEY (`templateid`) REFERENCES `triggers` (`triggerid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql> select triggerid,CONCAT("'",expression,"'") as expression from triggers where triggerid=0;
+-----------+------------+
| triggerid | expression |
+-----------+------------+
|         0 | ''         |
+-----------+------------+
1 row in set (0.00 sec)

mysql>
Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 03 ]

Thank you very much for your cooperation!

I guessed the entry causing crash, but I was wrong about the exact place where the crash happens.

static int	DBpatch_2030094(void)
{
	...
	char		..., *expr = NULL, ...;
	...
/* loop breaks on the very first iteration when p is empty string */
		for (p = row[1]; '\0' != *p; p++)
		{
			...
		}
/* taking else-branch */
		if (2048 < expr_offset && 2048 /* TRIGGER_EXPRESSION_LEN */ < zbx_strlen_utf8(expr))
		{
			...
		}
		else if (0 != strcmp(row[1], expr))	/* expr is still NULL, CRASH!!! */
		{
			...
		}
	...
}
Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 03 ]

Fix for version 2.4 is available in development branch svn://svn.zabbix.com/branches/dev/ZBX-10485-24 revision 58833.

Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 03 ]

Just a side note, thoughts aloud.

This is the second crash I investigate which is related to passing NULL to standard strcmp() function. We use zbx_malloc() instead of standard malloc() which is very convenient because we don't have to check for NULL return value every time we use malloc(). Maybe we could wrap strcmp() to check for NULL parameters automatically? I don't mean making zbx_strcmp() fully NULL-tolerant, but knowing exact places where NULL was unintentionally passed to strcmp() could save a lot of time investigating crashes.

Comment by Alexander Vladishev [ 2016 Mar 05 ]

(1) an alternative solution was implemented in r58856. Please take a look.

glebs.ivanovskis I have thought of this variant too. Personally, I have no preference. Changes look fine.
CLOSED

Comment by Glebs Ivanovskis (Inactive) [ 2016 Mar 07 ]

Fixed in pre-2.4.8rc1 r58863, pre-3.0.2rc1 r58864, pre-3.1.0 (trunk) r58865.

Comment by Sandis Neilands (Inactive) [ 2016 Mar 09 ]

This issue was also found by Coverity: CID 118937.

Generated at Fri Apr 26 02:40:06 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.