ZABBIX BUGS AND ISSUES / ZBX-1018

Zabbix server 1.6.5 hangs after PostgreSQL database restart

Details

    • Type: Incident report
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.6.5
    • Fix Version/s: 1.8.2, 1.9.0 (alpha)
    • Component/s: Server (S)
    • Labels: None
    • Environment: CentOS 5, Zabbix 1.6.5

    Description

      Hello,

      We just noticed that Zabbix 1.6.5 hangs after a PostgreSQL database restart. In previous versions the server would simply die after noticing that the database was gone. In the latest version it instead floods the log with messages saying that a database query has failed or that its result is NULL. Here's an example from the zabbix_server log:
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=32272]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_uint where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_uint where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_str where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=44687]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_uint where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_uint where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_str where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_str where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_text where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_text where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history_log where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history_log where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=44766]
      13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=27572]
      13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from history where itemid=27572]

      (Notice the timestamps: all of these entries were logged within the same second.)
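To put a number on the flood, the entries can be grouped by timestamp. The following is a minimal sketch only: the line layout (`pid:date:time [code] message`) is inferred from the excerpt above rather than from Zabbix documentation, and the function name `failures_per_second` is made up for this illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred from the excerpt in this report:
#   <pid>:<YYYYMMDD>:<HHMMSS> [<code>] <message>
LOG_LINE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}) \[(Z\d+)\] (.*)$")


def failures_per_second(lines, code="Z3005"):
    """Count log entries carrying `code`, keyed by (date, time)."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group(4) == code:
            counts[(match.group(2), match.group(3))] += 1
    return counts


# Three lines copied from the log excerpt in this report.
sample = [
    "13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from trends where itemid=32272]",
    "13179:20090818:133528 [Z3005] Query failed: [0] PGRES_FATAL_ERROR: [select min(clock) from trends where itemid=32272]",
    "13179:20090818:133528 [Z3005] Query failed: [0] Result is NULL [select min(clock) from history where itemid=44687]",
]

counts = failures_per_second(sample)
for (date, clock), n in sorted(counts.items()):
    print(date, clock, n)
```

A single one-second bucket holding many failed queries suggests that the server is not pausing or reconnecting between attempts.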

      It continued behaving like this even after the database came back online. Things got back to normal only after the Zabbix server was restarted. I assume the problem is that the server doesn't reconnect to the database. This probably affects only the latest version, because that is the one with PostgreSQL error handling implemented.
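The behaviour I would expect instead is roughly the following: when a query fails because the connection is gone, drop the dead handle, wait a little, open a new connection and retry. This is only a sketch of that idea and not Zabbix's actual code. `FakeDatabase`, `FakeConnection`, `ConnectionLost` and `ReconnectingClient` are hypothetical names that simulate a database restart so the example runs without a real server. A libpq client would presumably check `PQstatus()` for `CONNECTION_BAD` and call `PQreset()` or reconnect at the equivalent point.

```python
import time


class ConnectionLost(Exception):
    """Raised by the simulated driver when the server is unreachable."""


class FakeDatabase:
    """Simulated server: handles opened before a restart stop working."""

    def __init__(self):
        self.up = True
        self.generation = 0

    def restart(self):
        # Every restart invalidates all previously opened connections.
        self.generation += 1

    def connect(self):
        if not self.up:
            raise ConnectionLost("server is down")
        return FakeConnection(self, self.generation)


class FakeConnection:
    def __init__(self, db, generation):
        self._db = db
        self._generation = generation

    def execute(self, query):
        if not self._db.up or self._generation != self._db.generation:
            raise ConnectionLost("connection is stale")
        return f"ok: {query}"


class ReconnectingClient:
    """Keeps one connection and reopens it when a query reports it lost."""

    def __init__(self, connect, max_attempts=5, delay=1.0, sleep=time.sleep):
        self._connect = connect
        self._max_attempts = max_attempts
        self._delay = delay
        self._sleep = sleep
        self._conn = None

    def execute(self, query):
        for attempt in range(1, self._max_attempts + 1):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                return self._conn.execute(query)
            except ConnectionLost:
                # Drop the dead handle so the next attempt reconnects.
                self._conn = None
                if attempt == self._max_attempts:
                    raise
                self._sleep(self._delay)


db = FakeDatabase()
client = ReconnectingClient(db.connect, delay=0, sleep=lambda seconds: None)
first = client.execute("select 1")
db.restart()                         # simulate the PostgreSQL restart
second = client.execute("select 1")  # stale handle dropped, new one opened
print(first, second)
```

The delay between attempts matters: retrying immediately on every failed query is what produces a log flood like the one above, while giving up after `max_attempts` matches the older behaviour of exiting when the database is gone.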

            People

              Assignee: Unassigned
              Reporter: Emir Imamagic (eimamagi)