[ZBX-15119] Zabbix Server PgSQL upgrade to 4.0.1 fails Created: 2018 Nov 01  Updated: 2018 Nov 13  Resolved: 2018 Nov 13

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Server (S)
Affects Version/s: 4.0.1
Fix Version/s: None

Type: Incident report Priority: Blocker
Reporter: Alex P Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: 4.0.1, PostgreSQL, server, upgrade
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian GNU/Linux 9.5 (stretch)

postgresql 11+195.pgdg90+1
postgresql-11 11.0-1.pgdg90+2
postgresql-client-11 11.0-1.pgdg90+2
postgresql-client-common 195.pgdg90+1
postgresql-common 195.pgdg90+1
postgresql-contrib 11+195.pgdg90+1
zabbix-agent 1:4.0.1-1+stretch
zabbix-frontend-php 1:4.0.1-1+stretch
zabbix-get 1:4.0.1-1+stretch
zabbix-sender 1:4.0.1-1+stretch
zabbix-server-pgsql 1:4.0.1-1+stretch



 Description   

Steps to reproduce:

  1. Trying to upgrade from 3.4.14
(Reading database ... 56532 files and directories currently installed.)
Preparing to unpack .../zabbix-server-pgsql_1%3a4.0.1-1+stretch_amd64.deb ...
Unpacking zabbix-server-pgsql (1:4.0.1-1+stretch) over (1:3.4.14-1+stretch) ...
Preparing to unpack .../libgpg-error-l10n_1.32-3~bpo9+2_all.deb ...
Unpacking libgpg-error-l10n (1.32-3~bpo9+2) over (1.32-3~bpo9+1) ...
Preparing to unpack .../libgpg-error0_1.32-3~bpo9+2_amd64.deb ...
Unpacking libgpg-error0:amd64 (1.32-3~bpo9+2) over (1.32-3~bpo9+1) ...
Setting up libgpg-error0:amd64 (1.32-3~bpo9+2) ...
(Reading database ... 56532 files and directories currently installed.)
Preparing to unpack .../zabbix-frontend-php_1%3a4.0.1-1+stretch_all.deb ...
Unpacking zabbix-frontend-php (1:4.0.1-1+stretch) over (1:3.4.14-1+stretch) ...
Preparing to unpack .../zabbix-get_1%3a4.0.1-1+stretch_amd64.deb ...
Unpacking zabbix-get (1:4.0.1-1+stretch) over (1:3.4.14-1+stretch) ...
Preparing to unpack .../zabbix-sender_1%3a4.0.1-1+stretch_amd64.deb ...
Unpacking zabbix-sender (1:4.0.1-1+stretch) over (1:3.4.14-1+stretch) ...
Setting up libgpg-error-l10n (1.32-3~bpo9+2) ...
Setting up zabbix-server-pgsql (1:4.0.1-1+stretch) ...

Configuration file '/etc/zabbix/zabbix_server.conf'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
 ==> Keeping old config file as default.
[/usr/lib/tmpfiles.d/zabbix-server.conf:1] Line references path below legacy directory /var/run/, updating /var/run/zabbix → /run/zabbix; please update the tmpfiles.d/ drop-in file accordingly.
Processing triggers for libc-bin (2.24-11+deb9u3) ...
Setting up zabbix-frontend-php (1:4.0.1-1+stretch) ...
Processing triggers for systemd (239-11~bpo9+1) ...
Setting up zabbix-sender (1:4.0.1-1+stretch) ...
Processing triggers for man-db (2.7.6.1-2) ...
Setting up zabbix-get (1:4.0.1-1+stretch) ...
Reading package lists... Done
Building dependency tree
Reading state information... Done

Result:
The installation itself was fine. When starting the server, it fails to upgrade the DB:

2018-11-01T11:30:50-06:00 zbxserver systemd[1]: Starting Zabbix Server...
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: Starting Zabbix Server. Zabbix 4.0.1 (revision 86073).
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: ****** Enabled features ******
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: SNMP monitoring:           YES
2018-11-01T11:30:50-06:00 zbxserver systemd[1]: Started Zabbix Server.
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: IPMI monitoring:           YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: Web monitoring:            YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: VMware monitoring:         YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: SMTP authentication:       YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: Jabber notifications:      YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: Ez Texting notifications:  YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: ODBC:                      YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: SSH2 support:              YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: IPv6 support:              YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: TLS support:               YES
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: ******************************
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: using configuration file: /etc/zabbix/zabbix_server.conf
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: current database version (mandatory/optional): 03050116/03050116
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: required mandatory version: 04000000
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: starting automatic database upgrade
2018-11-01T11:30:50-06:00 zbxserver postgres[21049]: [20-1] 2018-11-01 17:30:50.321 GMT [21049] zabbix@zabbix ERROR:  null value in column "acknowledged" violates not-null constraint
2018-11-01T11:30:50-06:00 zbxserver postgres[21049]: [20-2] 2018-11-01 17:30:50.321 GMT [21049] zabbix@zabbix DETAIL:  Failing row contains (714, 0, 0, 22051, 1485309494, 392175048, 761, 1485352694, 355722226, null, 0, , null, 0).
2018-11-01T11:30:50-06:00 zbxserver postgres[21049]: [20-3] 2018-11-01 17:30:50.321 GMT [21049] zabbix@zabbix STATEMENT:  update problem set acknowledged=(select acknowledged from events where events.eventid=problem.eventid)
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  null value in column "acknowledged" violates not-null constraint
DETAIL:  Failing row contains (714, 0, 0, 22051, 1485309494, 392175048, 761, 1485352694, 355722226, null, 0, , null, 0).
 [update problem set acknowledged=(select acknowledged from events where events.eventid=problem.eventid)]
2018-11-01T11:30:50-06:00 zbxserver zabbix_server[21047]: database upgrade failed
2018-11-01T11:30:50-06:00 zbxserver systemd[1]: zabbix-server.service: Main process exited, code=exited, status=1/FAILURE
2018-11-01T11:30:50-06:00 zbxserver systemd[1]: zabbix-server.service: Failed with result 'exit-code'.


 Comments   
Comment by Andrei Gushchin (Inactive) [ 2018 Nov 02 ]

Hello Alex,

Thank you for the reporting this.
Could you please show the description of the problem table:

zabbix#\d+
Comment by Alex P [ 2018 Nov 02 ]

here it is:

zabbix=# \d+ problem
                                       Table "public.problem"
    Column     |  Type   | Collation | Nullable |   Default   | Storage | Stats target | Description
---------------+---------+-----------+----------+-------------+---------+--------------+-------------
 eventid       | bigint  |           | not null |             | plain   |              |
 source        | integer |           | not null | 0           | plain   |              |
 object        | integer |           | not null | 0           | plain   |              |
 objectid      | bigint  |           | not null | '0'::bigint | plain   |              |
 clock         | integer |           | not null | 0           | plain   |              |
 ns            | integer |           | not null | 0           | plain   |              |
 r_eventid     | bigint  |           |          |             | plain   |              |
 r_clock       | integer |           | not null | 0           | plain   |              |
 r_ns          | integer |           | not null | 0           | plain   |              |
 correlationid | bigint  |           |          |             | plain   |              |
 userid        | bigint  |           |          |             | plain   |              |
Indexes:
    "problem_pkey" PRIMARY KEY, btree (eventid)
    "problem_1" btree (source, object, objectid)
    "problem_2" btree (r_clock)
    "problem_3" btree (r_eventid)
Referenced by:
    TABLE "problem_tag" CONSTRAINT "c_problem_tag_1" FOREIGN KEY (eventid) REFERENCES problem(eventid) ON DELETE CASCADE

and

zabbix=# \d+ events
                                       Table "public.events"
    Column    |  Type   | Collation | Nullable |   Default   | Storage | Stats target | Description
--------------+---------+-----------+----------+-------------+---------+--------------+-------------
 eventid      | bigint  |           | not null |             | plain   |              |
 source       | integer |           | not null | 0           | plain   |              |
 object       | integer |           | not null | 0           | plain   |              |
 objectid     | bigint  |           | not null | (0)::bigint | plain   |              |
 clock        | integer |           | not null | 0           | plain   |              |
 value        | integer |           | not null | 0           | plain   |              |
 acknowledged | integer |           | not null | 0           | plain   |              |
 ns           | integer |           | not null | 0           | plain   |              |
Indexes:
    "events_pkey" PRIMARY KEY, btree (eventid)
    "events_1" btree (source, object, objectid, clock)
    "events_2" btree (source, object, clock)
Referenced by:
    TABLE "acknowledges" CONSTRAINT "c_acknowledges_2" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE
    TABLE "alerts" CONSTRAINT "c_alerts_2" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE
    TABLE "alerts" CONSTRAINT "c_alerts_5" FOREIGN KEY (p_eventid) REFERENCES events(eventid) ON DELETE CASCADE
    TABLE "event_recovery" CONSTRAINT "c_event_recovery_3" FOREIGN KEY (c_eventid) REFERENCES events(eventid) ON DELETE CASCADE
    TABLE "event_tag" CONSTRAINT "c_event_tag_1" FOREIGN KEY (eventid) REFERENCES events(eventid) ON DELETE CASCADE
Comment by Edgars Melveris [ 2018 Nov 06 ]

Hello Alex, this looks like a broken db problem.

Can you try this command?

select * from events where acknowledged not in ( 0, 1 );
Comment by Alex P [ 2018 Nov 06 ]

Hello Edgars,

Here it is:

zabbix=# select * from events where acknowledged not in ( 0, 1 );
 eventid | source | object | objectid | clock | value | acknowledged | ns
---------+--------+--------+----------+-------+-------+--------------+----
(0 rows)

Comment by Edgars Melveris [ 2018 Nov 06 ]

Sorry, I gave you a wrong query, that one won't find null values.

Please try this one:

select * from events where acknowledged is null;
Comment by Alex P [ 2018 Nov 06 ]

No `null` entries:

zabbix=# select * from events where acknowledged is null;
 eventid | source | object | objectid | clock | value | acknowledged | ns
---------+--------+--------+----------+-------+-------+--------------+----
(0 rows)

Comment by Edgars Melveris [ 2018 Nov 07 ]

Ok, can you now try this one?

select p.eventid from problem p left join events e on e.eventid = p.eventid where e.eventid is null;
Comment by Alex P [ 2018 Nov 07 ]
zabbix=# select count(p.eventid) from problem p left join events e on e.eventid = p.eventid where e.eventid is null;
 count
--------
 195902
(1 row)

Comment by Edgars Melveris [ 2018 Nov 08 ]

Hello Alex!

Looks like something is very broken in your database, you shouldn't have any problems without corresponding events. But you have almost 200k. This is why the upgrade fails, you need to fix that.

As for this bug report - most probably this was done outside of Zabbix, have you done anything to the database manually? If you can't reproduce the steps to create this situation with Zabbix itself, we'll have to close this issue.

Comment by Alex P [ 2018 Nov 13 ]

Edgars,

Nothing was done to the DB manually. Only upgrades via DEB packages. And we've had that one for 3-4 years.

Either way, closing the ticket, as did a fresh install and re-imported hosts, screens, etc.

Generated at Wed Apr 17 02:54:18 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.