[ZBX-8754] zabbix crash in escalator in is_uint_n_range() Created: 2014 Sep 15  Updated: 2017 May 30  Resolved: 2014 Sep 19

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 2.2.6
Fix Version/s: 2.0.14rc1, 2.2.7rc1, 2.4.2rc1, 2.5.0

Type: Incident report
Priority: Critical
Reporter: Sam Rudge
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: crash
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Ubuntu Precise (12.04.4 LTS), mysql 5.5.22-0ubuntu1


Attachments: dissasembly.txt.gz, zabbix-crash-debug.txt

 Description   

When starting zabbix-server, the server crashes.

See attached debug log and disassembly listing for details.

Zabbix will not start at all.



 Comments   
Comment by Aleksandrs Saveljevs [ 2014 Sep 15 ]

Backtrace for easier searching:

48083:20140915:155237.127 === Backtrace: ===
48083:20140915:155237.127 9: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations](print_fatal_info+0xae) [0x466fae]
48083:20140915:155237.127 8: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations]() [0x4673d1]
48083:20140915:155237.127 7: /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f67ff8464a0]
48083:20140915:155237.127 6: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations](is_uint_n_range+0x26) [0x46d066]
48083:20140915:155237.127 5: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations]() [0x43fffa]
48083:20140915:155237.127 4: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations](main_escalator_loop+0x625) [0x4408d5]
48083:20140915:155237.127 3: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations](MAIN_ZABBIX_ENTRY+0x932) [0x4191c2]
48083:20140915:155237.127 2: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations](daemon_start+0x1b2) [0x466862]
48083:20140915:155237.127 1: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) [0x7f67ff83176d]
48083:20140915:155237.127 0: /usr/sbin/zabbix_server: escalator [processed 0 escalations in 0.000000 sec, processing escalations]() [0x41405d]
Comment by Andris Zeila [ 2014 Sep 16 ]

The logs suggest that the alerts table contains records with a NULL userid and a non-NULL mediatypeid. Could you check this by running

select count(*) from alerts where mediatypeid is not null and userid is null;

(or a similar query) in MySQL?

This is not a normal situation, so could you tell us whether anything specific was done before the crash? For example, was Zabbix upgraded from an older version, or were there any issues (such as corruption) with the database?

Comment by Sam Rudge [ 2014 Sep 17 ]

Hi,

Yes, that query returns 20 results. Before the crash happened I was creating LLD rules on templates. Before that I had been making changes to the alert actions, though I think the crash started happening after that.

Zabbix hadn't been upgraded, and all the changes were made via the frontend or API. I can't see any other corruption in the database and have run a check on all the tables. This is a new server that isn't yet in use, as we're migrating our old Zabbix server to it.

-Sam

Comment by Andris Zeila [ 2014 Sep 17 ]

Thanks. One more thing: could you please post the alerts table structure (show create table alerts)?

Comment by Sam Rudge [ 2014 Sep 19 ]
CREATE TABLE `alerts` (
  `alertid` bigint(20) unsigned NOT NULL,
  `actionid` bigint(20) unsigned NOT NULL,
  `eventid` bigint(20) unsigned NOT NULL,
  `userid` bigint(20) unsigned DEFAULT NULL,
  `clock` int(11) NOT NULL DEFAULT '0',
  `mediatypeid` bigint(20) unsigned DEFAULT NULL,
  `sendto` varchar(100) NOT NULL DEFAULT '',
  `subject` varchar(255) NOT NULL DEFAULT '',
  `message` text NOT NULL,
  `status` int(11) NOT NULL DEFAULT '0',
  `retries` int(11) NOT NULL DEFAULT '0',
  `error` varchar(128) NOT NULL DEFAULT '',
  `esc_step` int(11) NOT NULL DEFAULT '0',
  `alerttype` int(11) NOT NULL DEFAULT '0',
  PRIMARY KEY (`alertid`),
  KEY `alerts_1` (`actionid`),
  KEY `alerts_2` (`clock`),
  KEY `alerts_3` (`eventid`),
  KEY `alerts_4` (`status`,`retries`),
  KEY `alerts_5` (`mediatypeid`),
  KEY `alerts_6` (`userid`),
  CONSTRAINT `c_alerts_1` FOREIGN KEY (`actionid`) REFERENCES `actions` (`actionid`) ON DELETE CASCADE,
  CONSTRAINT `c_alerts_2` FOREIGN KEY (`eventid`) REFERENCES `events` (`eventid`) ON DELETE CASCADE,
  CONSTRAINT `c_alerts_3` FOREIGN KEY (`userid`) REFERENCES `users` (`userid`) ON DELETE CASCADE,
  CONSTRAINT `c_alerts_4` FOREIGN KEY (`mediatypeid`) REFERENCES `media_type` (`mediatypeid`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Comment by Andris Zeila [ 2014 Sep 19 ]

Thanks. Currently we are at a loss as to how records with a NULL userid and a non-NULL mediatypeid could have been inserted into or updated in the alerts table. For now you can safely remove them (delete from alerts where mediatypeid is not null and userid is null;), and if it happens again, maybe you will have more detailed steps on how it happened.

Meanwhile, we will fix the server so that it does not crash in such a situation.
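
To illustrate the kind of guard this implies, here is a minimal sketch in C. It is not the actual Zabbix patch. It assumes the crash came from a NULL column value (the MySQL C client returns a NULL pointer for an SQL NULL field) being passed to a string-to-integer parser that dereferences its argument at once. The names sketch_parse_uint64, sketch_parse_nullable_id, SKETCH_OK and SKETCH_FAIL are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for this sketch only. */
#define SKETCH_OK    0
#define SKETCH_FAIL -1

/* Hypothetical stand-in for a decimal string parser. It reads *str
 * immediately, so calling it with a NULL pointer would crash. */
int sketch_parse_uint64(const char *str, uint64_t *value)
{
    uint64_t result = 0;

    if ('\0' == *str)
        return SKETCH_FAIL;             /* an empty string is not a number */

    for (; '\0' != *str; str++)
    {
        unsigned int digit;

        if (*str < '0' || *str > '9')
            return SKETCH_FAIL;         /* reject non-digit characters */

        digit = (unsigned int)(*str - '0');

        if (result > (UINT64_MAX - digit) / 10)
            return SKETCH_FAIL;         /* value would overflow 64 bits */

        result = result * 10 + digit;
    }

    *value = result;
    return SKETCH_OK;
}

/* Guarded wrapper: treat a NULL column (SQL NULL) as a parse failure
 * instead of handing the NULL pointer to the parser. */
int sketch_parse_nullable_id(const char *column, uint64_t *value)
{
    if (NULL == column)
        return SKETCH_FAIL;

    return sketch_parse_uint64(column, value);
}
```

A caller that reads userid from a result row could skip or log the alert when sketch_parse_nullable_id() returns SKETCH_FAIL, rather than letting the process terminate.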

Comment by Andris Zeila [ 2014 Sep 19 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-8754

Comment by Andris Zeila [ 2014 Oct 08 ]

Released in:

  • pre-2.0.14rc1 r49622
  • pre-2.2.7rc1 r49623
  • pre-2.4.2rc2 r49650
  • pre-2.5.0 r49654