[ZBX-1018] Zabbix server 1.6.5 hangs after PostgreSQL database restart
Created: 2009 Aug 19  Updated: 2017 May 30  Resolved: 2010 Mar 09

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 1.6.5
Fix Version/s: 1.8.2, 1.9.0 (alpha)

Type: Incident report
Priority: Critical
Reporter: Emir Imamagic
Assignee: Unassigned
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: CentOS 5, Zabbix 1.6.5

Issue Links:
Duplicate
duplicates ZBX-18 PostgreSQL monitoring [Watchdog] Closed

 Description   

Hello,

We just noticed that Zabbix server 1.6.5 hangs after a PostgreSQL database restart. In previous versions the server would simply die after noticing that the database was gone. In the latest version it instead keeps running and floods the log with messages saying that the query failed or that the result is NULL. Here is an example from the zabbix_server log:
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=32272]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_uint where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_uint where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_str where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=44687]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_uint where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_uint where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_str where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=44766]
13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=27572]
13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=27572]

(Note the timestamps: every entry above was logged within the same second.)
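
The timestamps can be checked mechanically rather than by eye. The following is a minimal sketch, assuming log lines carry the pid:YYYYMMDD:HHMMSS prefix seen in the excerpt; the function name failures_per_second is hypothetical and not part of Zabbix.

```python
import re
from collections import Counter

# Matches the "pid:YYYYMMDD:HHMMSS [Z3005] Query failed:" prefix from the excerpt.
LINE_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}) \[Z3005\] Query failed:")


def failures_per_second(lines):
    """Count [Z3005] query failures per (date, time) timestamp."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group(2), match.group(3))] += 1
    return counts


# Three lines copied from the log excerpt in this report.
sample = [
    "13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=32272]",
    "13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=32272]",
    "13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=32272]",
]

for (date, clock), count in failures_per_second(sample).items():
    print(date, clock, count)
```

A large count for a single timestamp suggests that the server retries queries in a tight loop without pausing or reconnecting.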

The server continued behaving like this even after the database came back online. Things returned to normal only after the Zabbix server was restarted. I assume the problem is that the server does not reconnect to the database. This probably affects only the latest version because that is where PostgreSQL error handling was implemented.
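
The reconnection behaviour the reporter expects can be illustrated with a small sketch. This is not the fix committed in r10669 and does not use the Zabbix or libpq APIs. ConnectionLost, FakeConnection, make_connect and execute_with_reconnect are all hypothetical names, and the database outage is simulated in memory.

```python
import time


class ConnectionLost(Exception):
    """Hypothetical error raised when the database backend is unreachable."""


class FakeConnection:
    """Stand-in for a real driver handle; `alive` simulates the backend state."""

    def __init__(self, alive):
        self.alive = alive

    def execute(self, sql):
        if not self.alive:
            raise ConnectionLost("server closed the connection")
        return f"ok: {sql}"


def make_connect(failures):
    """Return a connect() that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def connect():
        if state["left"] > 0:
            state["left"] -= 1
            raise ConnectionLost("database still restarting")
        return FakeConnection(alive=True)

    return connect


def execute_with_reconnect(conn, connect, sql, max_attempts=5, delay=0.0):
    """Run sql; after a failure, discard the handle and reconnect before retrying.

    Returns (result, connection) so the caller keeps the fresh handle.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            if conn is None:
                conn = connect()
            return conn.execute(sql), conn
        except ConnectionLost:
            conn = None  # never reuse a handle that has already failed
            if attempt == max_attempts:
                raise
            time.sleep(delay)  # back off instead of retrying in a tight loop


# The old handle is dead (database restarted); two reconnect attempts fail first.
stale = FakeConnection(alive=False)
result, fresh = execute_with_reconnect(stale, make_connect(2), "select 1")
print(result)
```

The key design choice in this sketch is that a failed handle is dropped and replaced, with a delay between attempts, rather than being reused for every subsequent query.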



 Comments   
Comment by Aleksandrs Saveljevs [ 2010 Mar 09 ]

Fixed in pre-1.8.2 in r10669.
