ZABBIX BUGS AND ISSUES
  1. ZABBIX BUGS AND ISSUES
  2. ZBX-11203

database upgrade from Zabbix 3.0 to Zabbix 3.2 fails

    Details

    • Type: Incident report Incident report
    • Status: Reopened
    • Priority: Blocker Blocker
    • Resolution: Unresolved
    • Affects Version/s: 3.2.0
    • Fix Version/s: None
    • Component/s: Server (S)
    • Environment:
      centos 7 x86-64

      Description

      During upgrade from 3.0.4 to 3.2.0 I get:

         814:20160914:164040.194 current database version (mandatory/optional): 03010002/03010002
         814:20160914:164040.194 required mandatory version: 03020000
         814:20160914:164040.194 starting automatic database upgrade
         814:20160914:164040.195 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_text drop column id]
         814:20160914:164040.195 database upgrade failed
      

        Issue Links

          Activity

          Hide
          Aleksandrs Saveljevs added a comment - - edited

          Could you please post the structure of table "history_text" before the upgrade? What database engine do you use?

          Show
          Aleksandrs Saveljevs added a comment - - edited Could you please post the structure of table "history_text" before the upgrade? What database engine do you use?
          Hide
          sles added a comment - - edited

          I looked into logs more deeply while restoring database from backup and found that for some reason zabbix server was stopped during upgrade:

           38158:20160914:163545.748 current database version (mandatory/optional): 03000000/03000000
           38158:20160914:163545.748 required mandatory version: 03020000
           38158:20160914:163545.748 starting automatic database upgrade
           38158:20160914:163545.917 completed 1% of database upgrade
           38158:20160914:163546.805 completed 2% of database upgrade
           38158:20160914:163546.903 completed 3% of database upgrade
           38158:20160914:163632.375 Got signal [signal:15(SIGTERM),sender_pid:38837,sender_uid:0,reason:0]. Exiting ...
          

          This is why I got error message then.
          Very strange, all I did is just yum update, didn't stop zabbix server.
          Anyway, if there is any bug it is rpm package related, not zabbix server.

          Could you , please, close this bug report?
          Thank you!

          Show
          sles added a comment - - edited I looked into logs more deeply while restoring database from backup and found that for some reason zabbix server was stopped during upgrade: 38158:20160914:163545.748 current database version (mandatory/optional): 03000000/03000000 38158:20160914:163545.748 required mandatory version: 03020000 38158:20160914:163545.748 starting automatic database upgrade 38158:20160914:163545.917 completed 1% of database upgrade 38158:20160914:163546.805 completed 2% of database upgrade 38158:20160914:163546.903 completed 3% of database upgrade 38158:20160914:163632.375 Got signal [signal:15(SIGTERM),sender_pid:38837,sender_uid:0,reason:0]. Exiting ... This is why I got error message then. Very strange, all I did is just yum update, didn't stop zabbix server. Anyway, if there is any bug it is rpm package related, not zabbix server. Could you , please, close this bug report? Thank you!
          Hide
          Aleksandrs Saveljevs added a comment -

          Thank you! Closing as "Won't fix" then.

          Show
          Aleksandrs Saveljevs added a comment - Thank you! Closing as "Won't fix" then.
          Hide
          Alexander Yesipov added a comment - - edited

          Same for me:

           24288:20160914:154628.225 current database version (mandatory/optional): 03000000/03000000
           24288:20160914:154628.225 required mandatory version: 03020000
           24288:20160914:154628.225 starting automatic database upgrade
           24288:20160914:154628.530 completed 1% of database upgrade
           24288:20160914:154628.741 Got signal [signal:15(SIGTERM),sender_pid:24323,sender_uid:0,reason:0]. Exiting ...
          
          Linux 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 04:13:05 CDT 2016 x86_64 x86_64 x86_64 GNU/Linux
          

          update via yum without manual server stopping

          Show
          Alexander Yesipov added a comment - - edited Same for me: 24288:20160914:154628.225 current database version (mandatory/optional): 03000000/03000000 24288:20160914:154628.225 required mandatory version: 03020000 24288:20160914:154628.225 starting automatic database upgrade 24288:20160914:154628.530 completed 1% of database upgrade 24288:20160914:154628.741 Got signal [signal:15(SIGTERM),sender_pid:24323,sender_uid:0,reason:0]. Exiting ... Linux 3.10.0-327.18.2.el7.x86_64 #1 SMP Thu May 12 04:13:05 CDT 2016 x86_64 x86_64 x86_64 GNU/Linux update via yum without manual server stopping
          Hide
          richlv added a comment -

          if those were the zabbix-provided packages and the server was not accidentally stopped manually (by attempting to restart it, for example), there might indeed be something to fix on the zabbix side - but we can also wait and see whether anybody else faces such a problem

          Show
          richlv added a comment - if those were the zabbix-provided packages and the server was not accidentally stopped manually (by attempting to restart it, for example), there might indeed be something to fix on the zabbix side - but we can also wait and see whether anybody else faces such a problem
          Hide
          Alexander Yesipov added a comment -

          Yep, packages were from zabbix repo. Overall process:
          1) download http://repo.zabbix.com/zabbix/3.2/rhel/7/x86_64/zabbix-release-3.2-1.el7.noarch.rpm
          2) install zabbix-release-3.2 over zabbix-release-3.0
          3) yum clean all && yum update
          4) DB upgrade failed

          Show
          Alexander Yesipov added a comment - Yep, packages were from zabbix repo. Overall process: 1) download http://repo.zabbix.com/zabbix/3.2/rhel/7/x86_64/zabbix-release-3.2-1.el7.noarch.rpm 2) install zabbix-release-3.2 over zabbix-release-3.0 3) yum clean all && yum update 4) DB upgrade failed
          Hide
          Alexander Yesipov added a comment - - edited

          I've got second zabbix3.0 instance which I gonna update tonight. Current history_log structure is:

          mysql> describe history_log;
          +------------+---------------------+------+-----+---------+-------+
          | Field      | Type                | Null | Key | Default | Extra |
          +------------+---------------------+------+-----+---------+-------+
          | id         | bigint(20) unsigned | NO   | PRI | NULL    |       |
          | itemid     | bigint(20) unsigned | NO   | MUL | NULL    |       |
          | clock      | int(11)             | NO   |     | 0       |       |
          | timestamp  | int(11)             | NO   |     | 0       |       |
          | source     | varchar(64)         | NO   |     |         |       |
          | severity   | int(11)             | NO   |     | 0       |       |
          | value      | text                | NO   |     | NULL    |       |
          | logeventid | int(11)             | NO   |     | 0       |       |
          | ns         | int(11)             | NO   |     | 0       |       |
          +------------+---------------------+------+-----+---------+-------+
          

          I will post feedback here.

          Show
          Alexander Yesipov added a comment - - edited I've got second zabbix3.0 instance which I gonna update tonight. Current history_log structure is: mysql> describe history_log; +------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +------------+---------------------+------+-----+---------+-------+ | id | bigint(20) unsigned | NO | PRI | NULL | | | itemid | bigint(20) unsigned | NO | MUL | NULL | | | clock | int(11) | NO | | 0 | | | timestamp | int(11) | NO | | 0 | | | source | varchar(64) | NO | | | | | severity | int(11) | NO | | 0 | | | value | text | NO | | NULL | | | logeventid | int(11) | NO | | 0 | | | ns | int(11) | NO | | 0 | | +------------+---------------------+------+-----+---------+-------+ I will post feedback here.
          Hide
          Oleksiy Zagorskyi added a comment - - edited

          Just a few question appeared in my mind:
          Is it possible that "yum update" causes the server stop a bit later after restart ?
          Does zabbix server create pid file before starting DB upgrade ?

          Note - in both cases posted above (by different users), sender_pid:NNNNN is higher that PID of running main process, so the SIGTERM signal sender has been started later after zabbix server daemon started.
          Added: looks like it's expected if use init script to stop zabbix server, because it forks additional shell (new PID).
          But if use the same shell to just kill main zabbix process, then sender_pid:NNNNN is lower and is the same between start/stop, as expected too.

          We need Kodai Terashima attention here too.

          Show
          Oleksiy Zagorskyi added a comment - - edited Just a few question appeared in my mind: Is it possible that "yum update" causes the server stop a bit later after restart ? Does zabbix server create pid file before starting DB upgrade ? Note - in both cases posted above (by different users), sender_pid:NNNNN is higher that PID of running main process, so the SIGTERM signal sender has been started later after zabbix server daemon started. Added: looks like it's expected if use init script to stop zabbix server, because it forks additional shell (new PID). But if use the same shell to just kill main zabbix process, then sender_pid:NNNNN is lower and is the same between start/stop, as expected too. We need Kodai Terashima attention here too.
          Hide
          Alexander Yesipov added a comment -

          I did upgrade and results are the same. Here some logs:
          zabbix_server.log

            1786:20160914:220927.619 Got signal [signal:15(SIGTERM),sender_pid:25289,sender_uid:0,reason:0]. Exiting ...
            1786:20160914:220929.631 syncing history data...
            1786:20160914:220929.632 syncing history data done
            1786:20160914:220929.632 syncing trends data...
            1786:20160914:220929.849 syncing trends data done
            1786:20160914:220929.850 Zabbix Server stopped. Zabbix 3.0.4 (revision 61185).
           25294:20160914:220929.912 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485).
           25294:20160914:220929.912 ****** Enabled features ******
           25294:20160914:220929.912 SNMP monitoring:           YES
           25294:20160914:220929.912 IPMI monitoring:           YES
           25294:20160914:220929.912 Web monitoring:            YES
           25294:20160914:220929.912 VMware monitoring:         YES
           25294:20160914:220929.912 SMTP authentication:       YES
           25294:20160914:220929.912 Jabber notifications:      YES
           25294:20160914:220929.912 Ez Texting notifications:  YES
           25294:20160914:220929.912 ODBC:                      YES
           25294:20160914:220929.912 SSH2 support:              YES
           25294:20160914:220929.912 IPv6 support:              YES
           25294:20160914:220929.912 TLS support:               YES
           25294:20160914:220929.912 ******************************
           25294:20160914:220929.912 using configuration file: /etc/zabbix/zabbix_server.conf
           25294:20160914:220929.922 current database version (mandatory/optional): 03000000/03000000
           25294:20160914:220929.922 required mandatory version: 03020000
           25294:20160914:220929.922 starting automatic database upgrade
           25294:20160914:220929.956 completed 1% of database upgrade
           25294:20160914:220930.130 Got signal [signal:15(SIGTERM),sender_pid:25329,sender_uid:0,reason:0]. Exiting ...
           25294:20160914:220932.225 syncing history data...
           25294:20160914:220932.225 syncing history data done
           25294:20160914:220932.225 syncing trend data...
           25294:20160914:220932.225 syncing trend data done
           25294:20160914:220932.225 Zabbix Server stopped. Zabbix 3.2.0 (revision 62485).
           25334:20160914:220932.242 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485).
           25334:20160914:220932.242 ****** Enabled features ******
           25334:20160914:220932.242 SNMP monitoring:           YES
           25334:20160914:220932.242 IPMI monitoring:           YES
           25334:20160914:220932.242 Web monitoring:            YES
           25334:20160914:220932.242 VMware monitoring:         YES
           25334:20160914:220932.242 SMTP authentication:       YES
           25334:20160914:220932.242 Jabber notifications:      YES
           25334:20160914:220932.242 Ez Texting notifications:  YES
           25334:20160914:220932.242 ODBC:                      YES
           25334:20160914:220932.242 SSH2 support:              YES
           25334:20160914:220932.242 IPv6 support:              YES
           25334:20160914:220932.242 TLS support:               YES
           25334:20160914:220932.242 ******************************
           25334:20160914:220932.242 using configuration file: /etc/zabbix/zabbix_server.conf
           25334:20160914:220932.247 current database version (mandatory/optional): 03010000/03010000
           25334:20160914:220932.247 required mandatory version: 03020000
           25334:20160914:220932.247 starting automatic database upgrade
           25334:20160914:220932.248 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_log drop column id]
           25334:20160914:220932.248 database upgrade failed
          

          systemd logs

          Sep 14 22:09:27 hostname systemd[1]: Stopping Zabbix Server...
          Sep 14 22:09:29 hostname systemd[1]: Starting Zabbix Server...
          Sep 14 22:09:29 hostname systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (yet?) after start.
          Sep 14 22:09:29 hostname systemd[1]: Started Zabbix Server.
          Sep 14 22:09:30 hostname systemd[1]: Stopping Zabbix Server...
          Sep 14 22:09:32 hostname systemd[1]: Starting Zabbix Server...
          Sep 14 22:09:32 hostname systemd[1]: Started Zabbix Server.
          Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service: main process exited, code=exited, status=1/FAILURE
          Sep 14 22:09:32 hostname kill[25336]: Usage:
          Sep 14 22:09:32 hostname kill[25336]: kill [options] <pid|name> [...]
          Sep 14 22:09:32 hostname kill[25336]: Options:
          Sep 14 22:09:32 hostname kill[25336]: -a, --all              do not restrict the name-to-pid conversion to processes
          Sep 14 22:09:32 hostname kill[25336]: with the same uid as the present process
          Sep 14 22:09:32 hostname kill[25336]: -s, --signal <sig>     send specified signal
          Sep 14 22:09:32 hostname kill[25336]: -q, --queue <sig>      use sigqueue(2) rather than kill(2)
          Sep 14 22:09:32 hostname kill[25336]: -p, --pid              print pids without signaling them
          Sep 14 22:09:32 hostname kill[25336]: -l, --list [=<signal>] list signal names, or convert one to a name
          Sep 14 22:09:32 hostname kill[25336]: -L, --table            list signal names and numbers
          Sep 14 22:09:32 hostname kill[25336]: -h, --help     display this help and exit
          Sep 14 22:09:32 hostname kill[25336]: -V, --version  output version information and exit
          Sep 14 22:09:32 hostname kill[25336]: For more details see kill(1).
          Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service: control process exited, code=exited status=1
          Sep 14 22:09:32 hostname systemd[1]: Unit zabbix-server.service entered failed state.
          Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service failed.
          Sep 14 22:09:42 hostname systemd[1]: zabbix-server.service holdoff time over, scheduling restart.
          
          Show
          Alexander Yesipov added a comment - I did upgrade and results are the same. Here some logs: zabbix_server.log 1786:20160914:220927.619 Got signal [signal:15(SIGTERM),sender_pid:25289,sender_uid:0,reason:0]. Exiting ... 1786:20160914:220929.631 syncing history data... 1786:20160914:220929.632 syncing history data done 1786:20160914:220929.632 syncing trends data... 1786:20160914:220929.849 syncing trends data done 1786:20160914:220929.850 Zabbix Server stopped. Zabbix 3.0.4 (revision 61185). 25294:20160914:220929.912 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485). 25294:20160914:220929.912 ****** Enabled features ****** 25294:20160914:220929.912 SNMP monitoring: YES 25294:20160914:220929.912 IPMI monitoring: YES 25294:20160914:220929.912 Web monitoring: YES 25294:20160914:220929.912 VMware monitoring: YES 25294:20160914:220929.912 SMTP authentication: YES 25294:20160914:220929.912 Jabber notifications: YES 25294:20160914:220929.912 Ez Texting notifications: YES 25294:20160914:220929.912 ODBC: YES 25294:20160914:220929.912 SSH2 support: YES 25294:20160914:220929.912 IPv6 support: YES 25294:20160914:220929.912 TLS support: YES 25294:20160914:220929.912 ****************************** 25294:20160914:220929.912 using configuration file: /etc/zabbix/zabbix_server.conf 25294:20160914:220929.922 current database version (mandatory/optional): 03000000/03000000 25294:20160914:220929.922 required mandatory version: 03020000 25294:20160914:220929.922 starting automatic database upgrade 25294:20160914:220929.956 completed 1% of database upgrade 25294:20160914:220930.130 Got signal [signal:15(SIGTERM),sender_pid:25329,sender_uid:0,reason:0]. Exiting ... 25294:20160914:220932.225 syncing history data... 25294:20160914:220932.225 syncing history data done 25294:20160914:220932.225 syncing trend data... 25294:20160914:220932.225 syncing trend data done 25294:20160914:220932.225 Zabbix Server stopped. Zabbix 3.2.0 (revision 62485). 25334:20160914:220932.242 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485). 25334:20160914:220932.242 ****** Enabled features ****** 25334:20160914:220932.242 SNMP monitoring: YES 25334:20160914:220932.242 IPMI monitoring: YES 25334:20160914:220932.242 Web monitoring: YES 25334:20160914:220932.242 VMware monitoring: YES 25334:20160914:220932.242 SMTP authentication: YES 25334:20160914:220932.242 Jabber notifications: YES 25334:20160914:220932.242 Ez Texting notifications: YES 25334:20160914:220932.242 ODBC: YES 25334:20160914:220932.242 SSH2 support: YES 25334:20160914:220932.242 IPv6 support: YES 25334:20160914:220932.242 TLS support: YES 25334:20160914:220932.242 ****************************** 25334:20160914:220932.242 using configuration file: /etc/zabbix/zabbix_server.conf 25334:20160914:220932.247 current database version (mandatory/optional): 03010000/03010000 25334:20160914:220932.247 required mandatory version: 03020000 25334:20160914:220932.247 starting automatic database upgrade 25334:20160914:220932.248 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_log drop column id] 25334:20160914:220932.248 database upgrade failed systemd logs Sep 14 22:09:27 hostname systemd[1]: Stopping Zabbix Server... Sep 14 22:09:29 hostname systemd[1]: Starting Zabbix Server... Sep 14 22:09:29 hostname systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (yet?) after start. Sep 14 22:09:29 hostname systemd[1]: Started Zabbix Server. Sep 14 22:09:30 hostname systemd[1]: Stopping Zabbix Server... Sep 14 22:09:32 hostname systemd[1]: Starting Zabbix Server... Sep 14 22:09:32 hostname systemd[1]: Started Zabbix Server. Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service: main process exited, code=exited, status=1/FAILURE Sep 14 22:09:32 hostname kill[25336]: Usage: Sep 14 22:09:32 hostname kill[25336]: kill [options] <pid|name> [...] Sep 14 22:09:32 hostname kill[25336]: Options: Sep 14 22:09:32 hostname kill[25336]: -a, --all do not restrict the name-to-pid conversion to processes Sep 14 22:09:32 hostname kill[25336]: with the same uid as the present process Sep 14 22:09:32 hostname kill[25336]: -s, --signal <sig> send specified signal Sep 14 22:09:32 hostname kill[25336]: -q, --queue <sig> use sigqueue(2) rather than kill(2) Sep 14 22:09:32 hostname kill[25336]: -p, --pid print pids without signaling them Sep 14 22:09:32 hostname kill[25336]: -l, --list [=<signal>] list signal names, or convert one to a name Sep 14 22:09:32 hostname kill[25336]: -L, --table list signal names and numbers Sep 14 22:09:32 hostname kill[25336]: -h, --help display this help and exit Sep 14 22:09:32 hostname kill[25336]: -V, --version output version information and exit Sep 14 22:09:32 hostname kill[25336]: For more details see kill(1). Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service: control process exited, code=exited status=1 Sep 14 22:09:32 hostname systemd[1]: Unit zabbix-server.service entered failed state. Sep 14 22:09:32 hostname systemd[1]: zabbix-server.service failed. Sep 14 22:09:42 hostname systemd[1]: zabbix-server.service holdoff time over, scheduling restart.
          Hide
          Horigome Yoshihito added a comment -

          I also have the same symptoms has occurred.

          zabbix log

           32673:20160915:102757.191 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485).
           32673:20160915:102757.191 ****** Enabled features ******
           32673:20160915:102757.191 SNMP monitoring:           YES
           32673:20160915:102757.191 IPMI monitoring:           YES
           32673:20160915:102757.191 Web monitoring:            YES
           32673:20160915:102757.191 VMware monitoring:         YES
           32673:20160915:102757.191 SMTP authentication:       YES
           32673:20160915:102757.191 Jabber notifications:      YES
           32673:20160915:102757.191 Ez Texting notifications:  YES
           32673:20160915:102757.191 ODBC:                      YES
           32673:20160915:102757.191 SSH2 support:              YES
           32673:20160915:102757.191 IPv6 support:              YES
           32673:20160915:102757.191 TLS support:               YES
           32673:20160915:102757.191 ******************************
           32673:20160915:102757.191 using configuration file: /etc/zabbix/zabbix_server.conf
           32673:20160915:102757.201 current database version (mandatory/optional): 03010000/03010000
           32673:20160915:102757.201 required mandatory version: 03020000
           32673:20160915:102757.201 starting automatic database upgrade
           32673:20160915:102757.201 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_log drop column id]
           32673:20160915:102757.202 database upgrade failed
          

          system log

          Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service holdoff time over, scheduling re
          Sep 15 10:28:27 zabbix-kometch systemd[1]: Starting Zabbix Server...
          Sep 15 10:28:27 zabbix-kometch systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (y
          Sep 15 10:28:27 zabbix-kometch systemd[1]: Started Zabbix Server.
          Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service: main process exited, code=exite
          Sep 15 10:28:27 zabbix-kometch kill[32692]: Usage:
          Sep 15 10:28:27 zabbix-kometch kill[32692]: kill [options] <pid|name> [...]
          Sep 15 10:28:27 zabbix-kometch kill[32692]: Options:
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -a, --all              do not restrict the name-to-pi
          Sep 15 10:28:27 zabbix-kometch kill[32692]: with the same uid as the present process
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -s, --signal <sig>     send specified signal
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -q, --queue <sig>      use sigqueue(2) rather than ki
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -p, --pid              print pids without signaling t
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -l, --list [=<signal>] list signal names, or convert 
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -L, --table            list signal names and numbers
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -h, --help     display this help and exit
          Sep 15 10:28:27 zabbix-kometch kill[32692]: -V, --version  output version information and exit
          Sep 15 10:28:27 zabbix-kometch kill[32692]: For more details see kill(1).
          Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service: control process exited, code=ex
          Sep 15 10:28:27 zabbix-kometch systemd[1]: Unit zabbix-server.service entered failed state.
          Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service failed.
          

          It has been confirmed that there is no id field.

          DB scheme

          MariaDB [zabbix]> describe history_log;
          +------------+---------------------+------+-----+---------+-------+
          | Field      | Type                | Null | Key | Default | Extra |
          +------------+---------------------+------+-----+---------+-------+
          | itemid     | bigint(20) unsigned | NO   | MUL | NULL    |       |
          | clock      | int(11)             | NO   |     | 0       |       |
          | timestamp  | int(11)             | NO   |     | 0       |       |
          | source     | varchar(64)         | NO   |     |         |       |
          | severity   | int(11)             | NO   |     | 0       |       |
          | value      | text                | NO   |     | NULL    |       |
          | logeventid | int(11)             | NO   |     | 0       |       |
          | ns         | int(11)             | NO   |     | 0       |       |
          +------------+---------------------+------+-----+---------+-------+
          8 rows in set (0.00 sec)
          
          
          Show
          Horigome Yoshihito added a comment - I also have the same symptoms has occurred. zabbix log 32673:20160915:102757.191 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485). 32673:20160915:102757.191 ****** Enabled features ****** 32673:20160915:102757.191 SNMP monitoring: YES 32673:20160915:102757.191 IPMI monitoring: YES 32673:20160915:102757.191 Web monitoring: YES 32673:20160915:102757.191 VMware monitoring: YES 32673:20160915:102757.191 SMTP authentication: YES 32673:20160915:102757.191 Jabber notifications: YES 32673:20160915:102757.191 Ez Texting notifications: YES 32673:20160915:102757.191 ODBC: YES 32673:20160915:102757.191 SSH2 support: YES 32673:20160915:102757.191 IPv6 support: YES 32673:20160915:102757.191 TLS support: YES 32673:20160915:102757.191 ****************************** 32673:20160915:102757.191 using configuration file: /etc/zabbix/zabbix_server.conf 32673:20160915:102757.201 current database version (mandatory/optional): 03010000/03010000 32673:20160915:102757.201 required mandatory version: 03020000 32673:20160915:102757.201 starting automatic database upgrade 32673:20160915:102757.201 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_log drop column id] 32673:20160915:102757.202 database upgrade failed system log Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service holdoff time over, scheduling re Sep 15 10:28:27 zabbix-kometch systemd[1]: Starting Zabbix Server... Sep 15 10:28:27 zabbix-kometch systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (y Sep 15 10:28:27 zabbix-kometch systemd[1]: Started Zabbix Server. Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service: main process exited, code=exite Sep 15 10:28:27 zabbix-kometch kill[32692]: Usage: Sep 15 10:28:27 zabbix-kometch kill[32692]: kill [options] <pid|name> [...] Sep 15 10:28:27 zabbix-kometch kill[32692]: Options: Sep 15 10:28:27 zabbix-kometch kill[32692]: -a, --all do not restrict the name-to-pi Sep 15 10:28:27 zabbix-kometch kill[32692]: with the same uid as the present process Sep 15 10:28:27 zabbix-kometch kill[32692]: -s, --signal <sig> send specified signal Sep 15 10:28:27 zabbix-kometch kill[32692]: -q, --queue <sig> use sigqueue(2) rather than ki Sep 15 10:28:27 zabbix-kometch kill[32692]: -p, --pid print pids without signaling t Sep 15 10:28:27 zabbix-kometch kill[32692]: -l, --list [=<signal>] list signal names, or convert Sep 15 10:28:27 zabbix-kometch kill[32692]: -L, --table list signal names and numbers Sep 15 10:28:27 zabbix-kometch kill[32692]: -h, --help display this help and exit Sep 15 10:28:27 zabbix-kometch kill[32692]: -V, --version output version information and exit Sep 15 10:28:27 zabbix-kometch kill[32692]: For more details see kill(1). Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service: control process exited, code=ex Sep 15 10:28:27 zabbix-kometch systemd[1]: Unit zabbix-server.service entered failed state. Sep 15 10:28:27 zabbix-kometch systemd[1]: zabbix-server.service failed. It has been confirmed that there is no id field. DB scheme MariaDB [zabbix]> describe history_log; +------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +------------+---------------------+------+-----+---------+-------+ | itemid | bigint(20) unsigned | NO | MUL | NULL | | | clock | int (11) | NO | | 0 | | | timestamp | int (11) | NO | | 0 | | | source | varchar(64) | NO | | | | | severity | int (11) | NO | | 0 | | | value | text | NO | | NULL | | | logeventid | int (11) | NO | | 0 | | | ns | int (11) | NO | | 0 | | +------------+---------------------+------+-----+---------+-------+ 8 rows in set (0.00 sec)
          Hide
          sles added a comment -

          Hello!

          Yesterday I completed upgrade by running zabbix server 3.2.0 manually for first run.
          I guess problem here is in systemd- first start is very long , because of database upgrade and
          systemd is not know that zabbix is started, so systemd restarts zabbix server.
          Don't know how to solve this problem , because I'm still don't completely understand how systemd works.
          For now, as workaround, one can do first start after upgrade manually, i.e from command line
          as zabbix-server bla-bla..
          Thank you!

          Show
          sles added a comment - Hello! Yesterday I completed upgrade by running zabbix server 3.2.0 manually for first run. I guess problem here is in systemd- first start is very long , because of database upgrade and systemd is not know that zabbix is started, so systemd restarts zabbix server. Don't know how to solve this problem , because I'm still don't completely understand how systemd works. For now, as workaround, one can do first start after upgrade manually, i.e from command line as zabbix-server bla-bla.. Thank you!
          Hide
          Glebs Ivanovskis added a comment -

          We should probably protect critical DB upgrade sections by blocking signals like was requested in ZBX-10630 for cache operations.

          Workaround for now:

          • figure out which upgrade step is failing;
          • make sure that it was actually completed (but server is not aware because it was killed before seeing the successful query result);
          • increase dbversion by one manually;
          • cross fingers and start Zabbix once again.
          Show
          Glebs Ivanovskis added a comment - We should probably protect critical DB upgrade sections by blocking signals like was requested in ZBX-10630 for cache operations. Workaround for now: figure out which upgrade step is failing; make sure that it was actually completed (but server is not aware because it was killed before seeing the successful query result); increase dbversion by one manually; cross fingers and start Zabbix once again.
          Hide
          Oleksiy Zagorskyi added a comment -

          systemd may send SIGKILL as well, after an app does not react on sent SIGTERM, so protecting zabbix_server by signals will not help, IMO.
          Need to understand reason why systemd (supposedly) tries to stop (restart?) zabbix server. How does it check that daemon is successfully started?
          Above I mentioned PID file of zabbix server. Question is - when zabbix server creates it. I have no time to investigate this point currently, unfortunately.

          Show
          Oleksiy Zagorskyi added a comment - systemd may send SIGKILL as well, after an app does not react on sent SIGTERM, so protecting zabbix_server by signals will not help, IMO. Need to understand reason why systemd (supposedly) tries to stop (restart?) zabbix server. How does it check that daemon is successfully started? Above I mentioned PID file of zabbix server. Question is - when zabbix server creates it. I have no time to investigate this point currently, unfortunately.
          Hide
          richlv added a comment -

          i think systemd unit file says that there must be more than one zabbix server process. if only one is there, something's not good & it must be shot.
          to be verified by packagers

          Show
          richlv added a comment - i think systemd unit file says that there must be more than one zabbix server process. if only one is there, something's not good & it must be shot. to be verified by packagers
          Hide
          Glebs Ivanovskis added a comment -

          Dear Oleksiy Zagorskyi, signals are not sent uniquely by systemd. User can sabotage upgrade process this way too. I've checked and pid file is created long before DB version check. Currently richlv's version looks more probable.

          Show
          Glebs Ivanovskis added a comment - Dear Oleksiy Zagorskyi , signals are not sent uniquely by systemd. User can sabotage upgrade process this way too. I've checked and pid file is created long before DB version check. Currently richlv 's version looks more probable.
          Hide
          sles added a comment -

          > User can sabotage upgrade process this way too.

          No , really, user can't .
          Only server administrator.
          So, I think, there is no need to block signals...

          Show
          sles added a comment - > User can sabotage upgrade process this way too. No , really, user can't . Only server administrator. So, I think, there is no need to block signals...
          Hide
          Blake Burgess added a comment -

          Ok so got hit with this today.
          workaround for me to get back up and running was a bit of a kludge but basically
          Backup Database
          systemctl disable zabbix-server
          systemctl stop zabbix-server
          jump into the zabbix database
          alter table history_text add id varchar;
          manually launch zabbix_server (i.e. without using systemd/init scripts)
          wait for db upgrade to complete
          kill zabbix_server processes
          systemctl enable zabbix-server
          systemctl start zabbix-server

          Show
          Blake Burgess added a comment - Ok so got hit with this today. workaround for me to get back up and running was a bit of a kludge but basically Backup Database systemctl disable zabbix-server systemctl stop zabbix-server jump into the zabbix database alter table history_text add id varchar; manually launch zabbix_server (i.e. without using systemd/init scripts) wait for db upgrade to complete kill zabbix_server processes systemctl enable zabbix-server systemctl start zabbix-server
          Hide
          Oleksiy Zagorskyi added a comment - - edited

          Blake, disable/enable zabbix-server service was not required. Stop/start it would be enough.

          Show
          Oleksiy Zagorskyi added a comment - - edited Blake, disable/enable zabbix-server service was not required. Stop/start it would be enough.
          Hide
          Oleksiy Zagorskyi added a comment -

          Richlv, in unit file I don't see obvious setting which would require more than one process:

          [Unit]
          Description=Zabbix Server
          After=syslog.target
          After=network.target
          
          [Service]
          Environment="CONFFILE=/etc/zabbix/zabbix_server.conf"
          EnvironmentFile=-/etc/sysconfig/zabbix-server
          Type=forking
          Restart=on-failure
          PIDFile=/run/zabbix/zabbix_server.pid
          KillMode=control-group
          ExecStart=/usr/sbin/zabbix_server -c $CONFFILE
          ExecStop=/bin/kill -SIGTERM $MAINPID
          RestartSec=10s
          
          [Install]
          WantedBy=multi-user.target
          

          But closests thing is Type=forking might have have some affect, should be checked
          https://www.freedesktop.org/software/systemd/man/systemd.service.html

          Show
          Oleksiy Zagorskyi added a comment - Richlv, in unit file I don't see obvious setting which would require more than one process: [Unit] Description=Zabbix Server After=syslog.target After=network.target [Service] Environment= "CONFFILE=/etc/zabbix/zabbix_server.conf" EnvironmentFile=-/etc/sysconfig/zabbix-server Type=forking Restart=on-failure PIDFile=/run/zabbix/zabbix_server.pid KillMode=control-group ExecStart=/usr/sbin/zabbix_server -c $CONFFILE ExecStop=/bin/kill -SIGTERM $MAINPID RestartSec=10s [Install] WantedBy=multi-user.target But closests thing is Type=forking might have have some affect, should be checked https://www.freedesktop.org/software/systemd/man/systemd.service.html
          Hide
          Andris Zeila added a comment -

          I'm not sure if the server was restarted because it appeared as failed to systemd or it was just normal upgrade process. From the posted systemd log it looks like server was simply restarted again:

          Sep 14 22:09:27 hostname systemd[1]: Stopping Zabbix Server...
          Sep 14 22:09:29 hostname systemd[1]: Starting Zabbix Server...
          Sep 14 22:09:29 hostname systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (yet?) after start.
          Sep 14 22:09:29 hostname systemd[1]: Started Zabbix Server.
          Sep 14 22:09:30 hostname systemd[1]: Stopping Zabbix Server...
          Sep 14 22:09:32 hostname systemd[1]: Starting Zabbix Server...
          

          Need to check if zabbix is really restarted during normal upgrade process.

          But the real problem here is killing zabbix during implicit commit statement. Maybe putting such upgrade steps into temporary functions would work, not sure.

          Show
          Andris Zeila added a comment - I'm not sure if the server was restarted because it appeared as failed to systemd or it was just normal upgrade process. From the posted systemd log it looks like server was simply restarted again: Sep 14 22:09:27 hostname systemd[1]: Stopping Zabbix Server... Sep 14 22:09:29 hostname systemd[1]: Starting Zabbix Server... Sep 14 22:09:29 hostname systemd[1]: PID file /run/zabbix/zabbix_server.pid not readable (yet?) after start. Sep 14 22:09:29 hostname systemd[1]: Started Zabbix Server. Sep 14 22:09:30 hostname systemd[1]: Stopping Zabbix Server... Sep 14 22:09:32 hostname systemd[1]: Starting Zabbix Server... Need to check if zabbix is really restarted during normal upgrade process. But the real problem here is killing zabbix during implicit commit statement. Maybe putting such upgrade steps into temporary functions would work, not sure.
          Hide
          Andris Zeila added a comment - - edited

          I tested upgrade with basic configuration and it did not restart zabbix server twice. On the other hand sleeping server for 10 minutes before database upgrade also did not cause systemd to restart it.

          Andris Zeila Tried with larger history_text table (1m records). During yum update the server was restarted and database successfully upgraded.

            4619:20160920:040555.573 Starting Zabbix Server. Zabbix 3.0.4 (revision 61185).
            4619:20160920:040555.574 ****** Enabled features ******
            4619:20160920:040555.574 SNMP monitoring:           YES
            4619:20160920:040555.574 IPMI monitoring:           YES
            4619:20160920:040555.574 Web monitoring:            YES
            4619:20160920:040555.574 VMware monitoring:         YES
            4619:20160920:040555.574 SMTP authentication:       YES
            4619:20160920:040555.574 Jabber notifications:      YES
            4619:20160920:040555.574 Ez Texting notifications:  YES
            4619:20160920:040555.574 ODBC:                      YES
            4619:20160920:040555.574 SSH2 support:              YES
            4619:20160920:040555.574 IPv6 support:              YES
            4619:20160920:040555.574 TLS support:               YES
            4619:20160920:040555.574 ******************************
            4619:20160920:040555.574 using configuration file: /etc/zabbix/zabbix_server.conf
            4619:20160920:040555.596 current database version (mandatory/optional): 03000000/03000000
            4619:20160920:040555.596 required mandatory version: 03000000
            4619:20160920:040555.638 server #0 started [main process]
            4661:20160920:040555.646 server #24 started [escalator #1]
            4663:20160920:040555.647 server #26 started [self-monitoring #1]
            4655:20160920:040555.648 server #18 started [http poller #1]
            4647:20160920:040555.648 server #10 started [trapper #2]
            4649:20160920:040555.649 server #12 started [trapper #4]
            4638:20160920:040555.651 server #2 started [db watchdog #1]
            4640:20160920:040555.651 server #3 started [poller #1]
            4641:20160920:040555.653 server #4 started [poller #2]
            4644:20160920:040555.653 server #7 started [poller #5]
            4651:20160920:040555.655 server #14 started [icmp pinger #1]
            4652:20160920:040555.655 server #15 started [alerter #1]
            4654:20160920:040555.655 server #17 started [timer #1]
            4642:20160920:040555.655 server #5 started [poller #3]
            4643:20160920:040555.657 server #6 started [poller #4]
            4659:20160920:040555.659 server #22 started [history syncer #3]
            4657:20160920:040555.659 server #20 started [history syncer #1]
            4637:20160920:040555.659 server #1 started [configuration syncer #1]
            4660:20160920:040555.662 server #23 started [history syncer #4]
            4662:20160920:040555.662 server #25 started [proxy poller #1]
            4645:20160920:040555.664 server #8 started [unreachable poller #1]
            4646:20160920:040555.666 server #9 started [trapper #1]
            4648:20160920:040555.668 server #11 started [trapper #3]
            4650:20160920:040555.670 server #13 started [trapper #5]
            4653:20160920:040555.672 server #16 started [housekeeper #1]
            4658:20160920:040555.672 server #21 started [history syncer #2]
            4656:20160920:040555.767 server #19 started [discoverer #1]
            4619:20160920:040817.723 Got signal [signal:15(SIGTERM),sender_pid:4819,sender_uid:0,reason:0]. Exiting ...
            4619:20160920:040819.746 syncing history data...
            4619:20160920:040819.746 syncing history data done
            4619:20160920:040819.746 syncing trends data...
            4619:20160920:040819.746 syncing trends data done
            4619:20160920:040819.747 Zabbix Server stopped. Zabbix 3.0.4 (revision 61185).
            4825:20160920:040819.780 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485).
            4825:20160920:040819.780 ****** Enabled features ******
            4825:20160920:040819.780 SNMP monitoring:           YES
            4825:20160920:040819.780 IPMI monitoring:           YES
            4825:20160920:040819.780 Web monitoring:            YES
            4825:20160920:040819.780 VMware monitoring:         YES
            4825:20160920:040819.780 SMTP authentication:       YES
            4825:20160920:040819.780 Jabber notifications:      YES
            4825:20160920:040819.780 Ez Texting notifications:  YES
            4825:20160920:040819.780 ODBC:                      YES
            4825:20160920:040819.780 SSH2 support:              YES
            4825:20160920:040819.780 IPv6 support:              YES
            4825:20160920:040819.780 TLS support:               YES
            4825:20160920:040819.780 ******************************
            4825:20160920:040819.780 using configuration file: /etc/zabbix/zabbix_server.conf
            4825:20160920:040819.793 current database version (mandatory/optional): 03000000/03000000
            4825:20160920:040819.793 required mandatory version: 03020000
            4825:20160920:040819.793 starting automatic database upgrade
            4825:20160920:040819.800 completed 1% of database upgrade
            4825:20160920:040819.807 completed 2% of database upgrade
            4825:20160920:040819.832 completed 3% of database upgrade
            4825:20160920:040827.014 slow query: 7.181938 sec, "alter table history_text drop column id"
            4825:20160920:040827.048 completed 5% of database upgrade
            4825:20160920:040827.097 completed 6% of database upgrade
            4825:20160920:040827.120 completed 7% of database upgrade
            4825:20160920:040827.128 completed 8% of database upgrade
            4825:20160920:040827.143 completed 10% of database upgrade
            4825:20160920:040827.151 completed 11% of database upgrade
            4825:20160920:040827.155 completed 12% of database upgrade
            4825:20160920:040827.174 completed 13% of database upgrade
            4825:20160920:040827.183 completed 15% of database upgrade
            4825:20160920:040827.191 completed 16% of database upgrade
            4825:20160920:040827.199 completed 17% of database upgrade
            4825:20160920:040827.201 completed 18% of database upgrade
            4825:20160920:040827.218 completed 20% of database upgrade
            4825:20160920:040827.227 completed 21% of database upgrade
            4825:20160920:040827.230 completed 22% of database upgrade
            4825:20160920:040827.259 completed 24% of database upgrade
            4825:20160920:040827.302 completed 25% of database upgrade
            4825:20160920:040827.358 completed 26% of database upgrade
            4825:20160920:040827.369 completed 27% of database upgrade
            4825:20160920:040827.377 completed 29% of database upgrade
            4825:20160920:040827.383 completed 30% of database upgrade
            4825:20160920:040827.384 completed 31% of database upgrade
            4825:20160920:040827.389 completed 32% of database upgrade
            4825:20160920:040827.389 Action "Report problems to Zabbix administrators" condition "Trigger value = PROBLEM" will be removed during database upgrade: this type of condition is not supported anymore
            4825:20160920:040827.390 completed 34% of database upgrade
            4825:20160920:040827.402 completed 35% of database upgrade
            4825:20160920:040827.416 completed 36% of database upgrade
            4825:20160920:040827.422 completed 37% of database upgrade
            4825:20160920:040827.429 completed 39% of database upgrade
            4825:20160920:040827.434 completed 40% of database upgrade
            4825:20160920:040827.438 completed 41% of database upgrade
            4825:20160920:040827.446 completed 43% of database upgrade
            4825:20160920:040827.463 completed 44% of database upgrade
            4825:20160920:040827.475 completed 45% of database upgrade
            4825:20160920:040827.480 completed 46% of database upgrade
            4825:20160920:040827.498 completed 48% of database upgrade
            4825:20160920:040827.516 completed 49% of database upgrade
            4825:20160920:040827.528 completed 50% of database upgrade
            4825:20160920:040827.529 completed 51% of database upgrade
            4825:20160920:040827.530 completed 53% of database upgrade
            4825:20160920:040827.536 completed 54% of database upgrade
            4825:20160920:040827.543 completed 55% of database upgrade
            4825:20160920:040827.548 completed 56% of database upgrade
            4825:20160920:040827.565 completed 58% of database upgrade
            4825:20160920:040827.572 completed 59% of database upgrade
            4825:20160920:040827.576 completed 60% of database upgrade
            4825:20160920:040827.593 completed 62% of database upgrade
            4825:20160920:040827.610 completed 63% of database upgrade
            4825:20160920:040827.615 completed 64% of database upgrade
            4825:20160920:040827.631 completed 65% of database upgrade
            4825:20160920:040827.637 completed 67% of database upgrade
            4825:20160920:040827.641 completed 68% of database upgrade
            4825:20160920:040827.646 completed 69% of database upgrade
            4825:20160920:040827.649 completed 70% of database upgrade
            4825:20160920:040827.666 completed 72% of database upgrade
            4825:20160920:040827.675 completed 73% of database upgrade
            4825:20160920:040827.683 completed 74% of database upgrade
            4825:20160920:040827.687 completed 75% of database upgrade
            4825:20160920:040827.693 completed 77% of database upgrade
            4825:20160920:040827.697 completed 78% of database upgrade
            4825:20160920:040827.701 completed 79% of database upgrade
            4825:20160920:040827.703 completed 81% of database upgrade
            4825:20160920:040827.718 completed 82% of database upgrade
            4825:20160920:040827.725 completed 83% of database upgrade
            4825:20160920:040827.726 completed 84% of database upgrade
            4825:20160920:040827.728 completed 86% of database upgrade
            4825:20160920:040827.732 completed 87% of database upgrade
            4825:20160920:040827.739 completed 88% of database upgrade
            4825:20160920:040827.750 completed 89% of database upgrade
            4825:20160920:040827.762 completed 91% of database upgrade
            4825:20160920:040827.768 completed 92% of database upgrade
            4825:20160920:040827.776 completed 93% of database upgrade
            4825:20160920:040827.781 completed 94% of database upgrade
            4825:20160920:040827.786 completed 96% of database upgrade
            4825:20160920:040827.796 completed 97% of database upgrade
            4825:20160920:040827.797 completed 98% of database upgrade
            4825:20160920:040827.797 completed 100% of database upgrade
            4825:20160920:040827.797 database upgrade fully completed
          

          I don't think that server was killed because of slow database update.

          Show
          Andris Zeila added a comment - - edited I tested upgrade with basic configuration and it did not restart zabbix server twice. On the other hand sleeping server for 10 minutes before database upgrade also did not cause systemd to restart it. Andris Zeila Tried with larger history_text table (1m records). During yum update the server was restarted and database successfully upgraded. 4619:20160920:040555.573 Starting Zabbix Server. Zabbix 3.0.4 (revision 61185). 4619:20160920:040555.574 ****** Enabled features ****** 4619:20160920:040555.574 SNMP monitoring: YES 4619:20160920:040555.574 IPMI monitoring: YES 4619:20160920:040555.574 Web monitoring: YES 4619:20160920:040555.574 VMware monitoring: YES 4619:20160920:040555.574 SMTP authentication: YES 4619:20160920:040555.574 Jabber notifications: YES 4619:20160920:040555.574 Ez Texting notifications: YES 4619:20160920:040555.574 ODBC: YES 4619:20160920:040555.574 SSH2 support: YES 4619:20160920:040555.574 IPv6 support: YES 4619:20160920:040555.574 TLS support: YES 4619:20160920:040555.574 ****************************** 4619:20160920:040555.574 using configuration file: /etc/zabbix/zabbix_server.conf 4619:20160920:040555.596 current database version (mandatory/optional): 03000000/03000000 4619:20160920:040555.596 required mandatory version: 03000000 4619:20160920:040555.638 server #0 started [main process] 4661:20160920:040555.646 server #24 started [escalator #1] 4663:20160920:040555.647 server #26 started [self-monitoring #1] 4655:20160920:040555.648 server #18 started [http poller #1] 4647:20160920:040555.648 server #10 started [trapper #2] 4649:20160920:040555.649 server #12 started [trapper #4] 4638:20160920:040555.651 server #2 started [db watchdog #1] 4640:20160920:040555.651 server #3 started [poller #1] 4641:20160920:040555.653 server #4 started [poller #2] 4644:20160920:040555.653 server #7 started [poller #5] 4651:20160920:040555.655 server #14 started [icmp pinger #1] 4652:20160920:040555.655 server #15 started [alerter #1] 4654:20160920:040555.655 server #17 started [timer #1] 4642:20160920:040555.655 server #5 started [poller #3] 4643:20160920:040555.657 server #6 started [poller #4] 4659:20160920:040555.659 server #22 started [history syncer #3] 4657:20160920:040555.659 server #20 started [history syncer #1] 4637:20160920:040555.659 server #1 started [configuration syncer #1] 4660:20160920:040555.662 server #23 started [history syncer #4] 4662:20160920:040555.662 server #25 started [proxy poller #1] 4645:20160920:040555.664 server #8 started [unreachable poller #1] 4646:20160920:040555.666 server #9 started [trapper #1] 4648:20160920:040555.668 server #11 started [trapper #3] 4650:20160920:040555.670 server #13 started [trapper #5] 4653:20160920:040555.672 server #16 started [housekeeper #1] 4658:20160920:040555.672 server #21 started [history syncer #2] 4656:20160920:040555.767 server #19 started [discoverer #1] 4619:20160920:040817.723 Got signal [signal:15(SIGTERM),sender_pid:4819,sender_uid:0,reason:0]. Exiting ... 4619:20160920:040819.746 syncing history data... 4619:20160920:040819.746 syncing history data done 4619:20160920:040819.746 syncing trends data... 4619:20160920:040819.746 syncing trends data done 4619:20160920:040819.747 Zabbix Server stopped. Zabbix 3.0.4 (revision 61185). 4825:20160920:040819.780 Starting Zabbix Server. Zabbix 3.2.0 (revision 62485). 4825:20160920:040819.780 ****** Enabled features ****** 4825:20160920:040819.780 SNMP monitoring: YES 4825:20160920:040819.780 IPMI monitoring: YES 4825:20160920:040819.780 Web monitoring: YES 4825:20160920:040819.780 VMware monitoring: YES 4825:20160920:040819.780 SMTP authentication: YES 4825:20160920:040819.780 Jabber notifications: YES 4825:20160920:040819.780 Ez Texting notifications: YES 4825:20160920:040819.780 ODBC: YES 4825:20160920:040819.780 SSH2 support: YES 4825:20160920:040819.780 IPv6 support: YES 4825:20160920:040819.780 TLS support: YES 4825:20160920:040819.780 ****************************** 4825:20160920:040819.780 using configuration file: /etc/zabbix/zabbix_server.conf 4825:20160920:040819.793 current database version (mandatory/optional): 03000000/03000000 4825:20160920:040819.793 required mandatory version: 03020000 4825:20160920:040819.793 starting automatic database upgrade 4825:20160920:040819.800 completed 1% of database upgrade 4825:20160920:040819.807 completed 2% of database upgrade 4825:20160920:040819.832 completed 3% of database upgrade 4825:20160920:040827.014 slow query: 7.181938 sec, "alter table history_text drop column id" 4825:20160920:040827.048 completed 5% of database upgrade 4825:20160920:040827.097 completed 6% of database upgrade 4825:20160920:040827.120 completed 7% of database upgrade 4825:20160920:040827.128 completed 8% of database upgrade 4825:20160920:040827.143 completed 10% of database upgrade 4825:20160920:040827.151 completed 11% of database upgrade 4825:20160920:040827.155 completed 12% of database upgrade 4825:20160920:040827.174 completed 13% of database upgrade 4825:20160920:040827.183 completed 15% of database upgrade 4825:20160920:040827.191 completed 16% of database upgrade 4825:20160920:040827.199 completed 17% of database upgrade 4825:20160920:040827.201 completed 18% of database upgrade 4825:20160920:040827.218 completed 20% of database upgrade 4825:20160920:040827.227 completed 21% of database upgrade 4825:20160920:040827.230 completed 22% of database upgrade 4825:20160920:040827.259 completed 24% of database upgrade 4825:20160920:040827.302 completed 25% of database upgrade 4825:20160920:040827.358 completed 26% of database upgrade 4825:20160920:040827.369 completed 27% of database upgrade 4825:20160920:040827.377 completed 29% of database upgrade 4825:20160920:040827.383 completed 30% of database upgrade 4825:20160920:040827.384 completed 31% of database upgrade 4825:20160920:040827.389 completed 32% of database upgrade 4825:20160920:040827.389 Action "Report problems to Zabbix administrators" condition "Trigger value = PROBLEM" will be removed during database upgrade: this type of condition is not supported anymore 4825:20160920:040827.390 completed 34% of database upgrade 4825:20160920:040827.402 completed 35% of database upgrade 4825:20160920:040827.416 completed 36% of database upgrade 4825:20160920:040827.422 completed 37% of database upgrade 4825:20160920:040827.429 completed 39% of database upgrade 4825:20160920:040827.434 completed 40% of database upgrade 4825:20160920:040827.438 completed 41% of database upgrade 4825:20160920:040827.446 completed 43% of database upgrade 4825:20160920:040827.463 completed 44% of database upgrade 4825:20160920:040827.475 completed 45% of database upgrade 4825:20160920:040827.480 completed 46% of database upgrade 4825:20160920:040827.498 completed 48% of database upgrade 4825:20160920:040827.516 completed 49% of database upgrade 4825:20160920:040827.528 completed 50% of database upgrade 4825:20160920:040827.529 completed 51% of database upgrade 4825:20160920:040827.530 completed 53% of database upgrade 4825:20160920:040827.536 completed 54% of database upgrade 4825:20160920:040827.543 completed 55% of database upgrade 4825:20160920:040827.548 completed 56% of database upgrade 4825:20160920:040827.565 completed 58% of database upgrade 4825:20160920:040827.572 completed 59% of database upgrade 4825:20160920:040827.576 completed 60% of database upgrade 4825:20160920:040827.593 completed 62% of database upgrade 4825:20160920:040827.610 completed 63% of database upgrade 4825:20160920:040827.615 completed 64% of database upgrade 4825:20160920:040827.631 completed 65% of database upgrade 4825:20160920:040827.637 completed 67% of database upgrade 4825:20160920:040827.641 completed 68% of database upgrade 4825:20160920:040827.646 completed 69% of database upgrade 4825:20160920:040827.649 completed 70% of database upgrade 4825:20160920:040827.666 completed 72% of database upgrade 4825:20160920:040827.675 completed 73% of database upgrade 4825:20160920:040827.683 completed 74% of database upgrade 4825:20160920:040827.687 completed 75% of database upgrade 4825:20160920:040827.693 completed 77% of database upgrade 4825:20160920:040827.697 completed 78% of database upgrade 4825:20160920:040827.701 completed 79% of database upgrade 4825:20160920:040827.703 completed 81% of database upgrade 4825:20160920:040827.718 completed 82% of database upgrade 4825:20160920:040827.725 completed 83% of database upgrade 4825:20160920:040827.726 completed 84% of database upgrade 4825:20160920:040827.728 completed 86% of database upgrade 4825:20160920:040827.732 completed 87% of database upgrade 4825:20160920:040827.739 completed 88% of database upgrade 4825:20160920:040827.750 completed 89% of database upgrade 4825:20160920:040827.762 completed 91% of database upgrade 4825:20160920:040827.768 completed 92% of database upgrade 4825:20160920:040827.776 completed 93% of database upgrade 4825:20160920:040827.781 completed 94% of database upgrade 4825:20160920:040827.786 completed 96% of database upgrade 4825:20160920:040827.796 completed 97% of database upgrade 4825:20160920:040827.797 completed 98% of database upgrade 4825:20160920:040827.797 completed 100% of database upgrade 4825:20160920:040827.797 database upgrade fully completed I don't think that server was killed because of slow database update.
          Hide
          Oleksiy Zagorskyi added a comment - - edited

          Andris, did you do complete 3.0 -> 3.2 upgrade using zabbix packages (yum update..., like people above) ?

          Andris Zeila Yes, as closely I could. I did have empty database (will try with some data later), but I can't find why systemd whould want to kill a running process without enabled watchdog.

          Show
          Oleksiy Zagorskyi added a comment - - edited Andris, did you do complete 3.0 -> 3.2 upgrade using zabbix packages (yum update..., like people above) ? Andris Zeila Yes, as closely I could. I did have empty database (will try with some data later), but I can't find why systemd whould want to kill a running process without enabled watchdog.
          Hide
          Alexander Yesipov added a comment -

          I've tried to stop and disable zabbix-server unit before upgrading, everything went without process killing.

          Show
          Alexander Yesipov added a comment - I've tried to stop and disable zabbix-server unit before upgrading, everything went without process killing.
          Hide
          Vladislavs Sokurenko added a comment -

          Hello

          Could you please be so kind and try this manual below to determine why process was killed ?
          http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html

          You can start with
          grep -i kill /var/log/messages*

          Do you think that your system might be running out of memory ?

          Show
          Vladislavs Sokurenko added a comment - Hello Could you please be so kind and try this manual below to determine why process was killed ? http://www.oracle.com/technetwork/articles/servers-storage-dev/oom-killer-1911807.html You can start with grep -i kill /var/log/messages* Do you think that your system might be running out of memory ?
          Hide
          Kodai Terashima added a comment -

          It seems that the problem is related to systemd, however upgrade succeeded with bit large DB on Ubuntu Xenial. Upgrading process took around 30 seconds, but no problem. I will check it on CentOS as well.

          Which minor version of CentOS and systemd are you using? The information below are useful to reproduce this problem.

          cat /etc/redhat-release
          rpm -qa | grep systemd
          
          Show
          Kodai Terashima added a comment - It seems that the problem is related to systemd, however upgrade succeeded with bit large DB on Ubuntu Xenial. Upgrading process took around 30 seconds, but no problem. I will check it on CentOS as well. Which minor version of CentOS and systemd are you using? The information below are useful to reproduce this problem. cat /etc/redhat-release rpm -qa | grep systemd
          Hide
          Oleksiy Zagorskyi added a comment -

          Vladislavs, it's less likely that OOM killer activity could appear at db upgrade stage (especially before all zabbix processes are started), on systems where zabbix server was previously working successfully.

          Show
          Oleksiy Zagorskyi added a comment - Vladislavs, it's less likely that OOM killer activity could appear at db upgrade stage (especially before all zabbix processes are started), on systems where zabbix server was previously working successfully.
          Hide
          sles added a comment -

          [root@zabbix ~]# cat /etc/redhat-release
          CentOS Linux release 7.2.1511 (Core)
          [root@zabbix ~]#

          and this is not out of memory , just because

          https://support.zabbix.com/browse/ZBX-11203?focusedCommentId=195844&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-195844

          Show
          sles added a comment - [root@zabbix ~] # cat /etc/redhat-release CentOS Linux release 7.2.1511 (Core) [root@zabbix ~] # and this is not out of memory , just because https://support.zabbix.com/browse/ZBX-11203?focusedCommentId=195844&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-195844
          Hide
          sles added a comment -

          >Upgrading process took around 30 seconds

          you have small database, right?

          Show
          sles added a comment - >Upgrading process took around 30 seconds you have small database, right?
          Hide
          Andris Zeila added a comment -

          Actually populating problems table from events during 3.0->3.2 database upgrade could easily use a lot of memory on large installations. But this would happen after history table changes and if Zabbix would be killed at this point the dbversion table would correctly reflect the real database state.

          In the logs posted by Alexander the server was restarted right after starting it first time after upgrade. That's why I'm wondering if it's simply being restarted twice during upgrade process rather than systemd forcing restart because of some startup/running error detection.

          Show
          Andris Zeila added a comment - Actually populating problems table from events during 3.0->3.2 database upgrade could easily use a lot of memory on large installations. But this would happen after history table changes and if Zabbix would be killed at this point the dbversion table would correctly reflect the real database state. In the logs posted by Alexander the server was restarted right after starting it first time after upgrade. That's why I'm wondering if it's simply being restarted twice during upgrade process rather than systemd forcing restart because of some startup/running error detection.
          Hide
          Oleksiy Zagorskyi added a comment -

          sles, you missed result of rpm -qa | grep systemd, requested by Kodai
          Please provide it.

          Show
          Oleksiy Zagorskyi added a comment - sles, you missed result of rpm -qa | grep systemd , requested by Kodai Please provide it.
          Hide
          sles added a comment -

          [root@zabbix ~]# rpm -qa | grep systemd
          systemd-sysv-219-19.el7_2.13.x86_64
          systemd-219-19.el7_2.13.x86_64
          systemd-libs-219-19.el7_2.13.x86_64
          [root@zabbix ~]#

          imho, it is impossible to have different versions if one install updates

          Show
          sles added a comment - [root@zabbix ~] # rpm -qa | grep systemd systemd-sysv-219-19.el7_2.13.x86_64 systemd-219-19.el7_2.13.x86_64 systemd-libs-219-19.el7_2.13.x86_64 [root@zabbix ~] # imho, it is impossible to have different versions if one install updates
          Hide
          richlv added a comment -

          it's definitely not oom-killer. oom-killer won't send you a nice sigterm, you would get sigkilled

          Show
          richlv added a comment - it's definitely not oom-killer. oom-killer won't send you a nice sigterm, you would get sigkilled
          Hide
          Anderson Ayres added a comment -

          Any news about that?
          Still the same problem:

           18671:20161122:124324.208 Starting Zabbix Server. Zabbix 3.2.1 (revision 62890).
           18671:20161122:124324.208 ****** Enabled features ******
           18671:20161122:124324.208 SNMP monitoring:           YES
           18671:20161122:124324.208 IPMI monitoring:           YES
           18671:20161122:124324.208 Web monitoring:            YES
           18671:20161122:124324.209 VMware monitoring:         YES
           18671:20161122:124324.209 SMTP authentication:       YES
           18671:20161122:124324.209 Jabber notifications:      YES
           18671:20161122:124324.209 Ez Texting notifications:  YES
           18671:20161122:124324.209 ODBC:                      YES
           18671:20161122:124324.209 SSH2 support:              YES
           18671:20161122:124324.209 IPv6 support:              YES
           18671:20161122:124324.209 TLS support:               YES
           18671:20161122:124324.209 ******************************
           18671:20161122:124324.210 using configuration file: /etc/zabbix/zabbix_server.conf
           18671:20161122:124324.222 current database version (mandatory/optional): 03010002/03010002
           18671:20161122:124324.222 required mandatory version: 03020000
           18671:20161122:124324.222 starting automatic database upgrade
           18671:20161122:124324.223 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_text drop column id]
           18671:20161122:124324.223 database upgrade failed
          
          Show
          Anderson Ayres added a comment - Any news about that? Still the same problem: 18671:20161122:124324.208 Starting Zabbix Server. Zabbix 3.2.1 (revision 62890). 18671:20161122:124324.208 ****** Enabled features ****** 18671:20161122:124324.208 SNMP monitoring: YES 18671:20161122:124324.208 IPMI monitoring: YES 18671:20161122:124324.208 Web monitoring: YES 18671:20161122:124324.209 VMware monitoring: YES 18671:20161122:124324.209 SMTP authentication: YES 18671:20161122:124324.209 Jabber notifications: YES 18671:20161122:124324.209 Ez Texting notifications: YES 18671:20161122:124324.209 ODBC: YES 18671:20161122:124324.209 SSH2 support: YES 18671:20161122:124324.209 IPv6 support: YES 18671:20161122:124324.209 TLS support: YES 18671:20161122:124324.209 ****************************** 18671:20161122:124324.210 using configuration file: /etc/zabbix/zabbix_server.conf 18671:20161122:124324.222 current database version (mandatory/optional): 03010002/03010002 18671:20161122:124324.222 required mandatory version: 03020000 18671:20161122:124324.222 starting automatic database upgrade 18671:20161122:124324.223 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_text drop column id] 18671:20161122:124324.223 database upgrade failed
          Hide
          Anderson Ayres added a comment - - edited

          Well, for those who got this error:

          # systemctl stop zabbix-server
          # mysql -u user -pPassword
          mysql> use zabbix;
          mysql> alter table history_text add column id BIGINT(20);
          mysql> alter table history_log add column id BIGINT(20);
          # systemctl start zabbix-server
          - Wait for upgrade to complete:
          # tailf /var/log/zabbix/zabbix_server.log
          

          Done.

          Show
          Anderson Ayres added a comment - - edited Well, for those who got this error: # systemctl stop zabbix-server # mysql -u user -pPassword mysql> use zabbix; mysql> alter table history_text add column id BIGINT(20); mysql> alter table history_log add column id BIGINT(20); # systemctl start zabbix-server - Wait for upgrade to complete: # tailf / var /log/zabbix/zabbix_server.log Done.
          Hide
          Glebs Ivanovskis added a comment -

          What I can observe in attached logs:

          • server gets killed at very early stage of upgrade (~0.2 seconds after start in Alexander Yesipov's case);
          • kill logs "Usage" for some reason (for both Alexander Yesipov and Horigome Yoshihito);
          • server stop takes a bit longer in production systems where potentially lots of trends and history data needs to be synced (absolute minimum is 2 seconds because of sleep(2) in main process code, Andris Zeila's stop took just a bit over 2 seconds, Alexander Yesipov's stop takes ~2.2 seconds).

          Based on these observations I have a suspicion that proper stopping of Zabbix server by systemd is a problem. My hypothesis of what's happening there:

          1. Old Zabbix server is running  
          2.   systemd gets a request to restart Zabbix server and sends SIGTERM
          3. Old Zabbix server receives SIGTERM and initiates stopping procedure... which takes a while...  
          4.   systemd checks back and sees that Zabbix is still running and decides to send another SIGTERM
          5. Old Zabbix removes PID file and finally stops  
          6.   still determined to kill Zabbix, systemd tries to read PID file and kill Zabbix by PID, but since there is no more PID file kill fails and logs "Usage"
          7. New Zabbix starts creating PID file and starting DB upgrade procedure  
          8.   systemd sees kill failed and tries once more, now successfully killing new Zabbix server in the middle of upgrade

          Non-atomicity of steps 4. and 6. leads to a race condition. Probably there should be a sleep interval between stopping and starting service to let systemd killing instinct calm down.

          Show
          Glebs Ivanovskis added a comment - What I can observe in attached logs: server gets killed at very early stage of upgrade (~0.2 seconds after start in Alexander Yesipov 's case); kill logs "Usage" for some reason (for both Alexander Yesipov and Horigome Yoshihito ); server stop takes a bit longer in production systems where potentially lots of trends and history data needs to be synced (absolute minimum is 2 seconds because of sleep(2) in main process code, Andris Zeila 's stop took just a bit over 2 seconds, Alexander Yesipov 's stop takes ~2.2 seconds). Based on these observations I have a suspicion that proper stopping of Zabbix server by systemd is a problem. My hypothesis of what's happening there: 1. Old Zabbix server is running   2.   systemd gets a request to restart Zabbix server and sends SIGTERM 3. Old Zabbix server receives SIGTERM and initiates stopping procedure... which takes a while...   4.   systemd checks back and sees that Zabbix is still running and decides to send another SIGTERM 5. Old Zabbix removes PID file and finally stops   6.   still determined to kill Zabbix, systemd tries to read PID file and kill Zabbix by PID, but since there is no more PID file kill fails and logs "Usage" 7. New Zabbix starts creating PID file and starting DB upgrade procedure   8.   systemd sees kill failed and tries once more, now successfully killing new Zabbix server in the middle of upgrade Non-atomicity of steps 4. and 6. leads to a race condition. Probably there should be a sleep interval between stopping and starting service to let systemd killing instinct calm down.
          Hide
          Oleksiy Zagorskyi added a comment - - edited

          It's indeed a real scenario, thank you Gleb !
          What I want to add is that in production zabbix server usually needs much longer than 2-3 seconds to stop - it's 20-60+ seconds or so.
          Usually (I'd say in 70+% cases) in zabbix_server.log in different production we do NOT see successful server stop (it shows interrupted on history or trends sync), which means that init/systemd stopped if after timeout by SIGKILL.
          Here are several ZBXs which touches this "timeout" topic.
          I recall another discussion/ZBX, where systemd tries to start zabbix-server service again if it was stopped not under systemd's control for example it was correctly terminated by SIGTERM to main process.
          Hmm - are we sure that zabbix-server daemon during upgrade is stopped using systemd ?
          Not related, but there is a ZBX-9867 discussion around systemd.

          Maybe for upgrade procedure we could suggest or even require to stop zabbix-server daemon before trying to upgrade to major version?
          Not sure all people will follow this recommendation/rule (as people read doc not so often ), but at least we will do something better.

          Andris Zeila At least for major version upgrade I'd strongly recommend making database backup first (it's also mentioned in server upgrade steps). So that would require manual upgrade procedure.

          Glebs Ivanovskis Quick and dirty fix - sleep() before starting actual upgrade?

          Kodai Terashima I think it's not possible to sleep between stop and start for systemd. I guess it's better not to use pidfile, using Type=simple and starting zabbix process in foreground may be a solution for this problem

          Glebs Ivanovskis Some say that service restart is done not by yum/apt-get but by packages' postinst scripts. Kodai, could you please comment on that?

          Kodai Terashima Yes, package try restart processes during upgrade if it is running. This behavior is general, same as other packages.

          Glebs Ivanovskis Thank you for response! Is it possible to use stop-sleep-start instead of restart in postinst script if timeout between stop and start is not possible in systemd's restart? By the way, I can provide a "retarder" loadable module if you need to test a slowly stopping server without creating a huge configuration.

          Show
          Oleksiy Zagorskyi added a comment - - edited It's indeed a real scenario, thank you Gleb ! What I want to add is that in production zabbix server usually needs much longer than 2-3 seconds to stop - it's 20-60+ seconds or so. Usually (I'd say in 70+% cases) in zabbix_server.log in different production we do NOT see successful server stop (it shows interrupted on history or trends sync), which means that init/systemd stopped if after timeout by SIGKILL. Here are several ZBXs which touches this "timeout" topic. I recall another discussion/ZBX, where systemd tries to start zabbix-server service again if it was stopped not under systemd's control for example it was correctly terminated by SIGTERM to main process. Hmm - are we sure that zabbix-server daemon during upgrade is stopped using systemd ? Not related, but there is a ZBX-9867 discussion around systemd. Maybe for upgrade procedure we could suggest or even require to stop zabbix-server daemon before trying to upgrade to major version? Not sure all people will follow this recommendation/rule (as people read doc not so often ), but at least we will do something better. Andris Zeila At least for major version upgrade I'd strongly recommend making database backup first (it's also mentioned in server upgrade steps). So that would require manual upgrade procedure. Glebs Ivanovskis Quick and dirty fix - sleep() before starting actual upgrade? Kodai Terashima I think it's not possible to sleep between stop and start for systemd. I guess it's better not to use pidfile, using Type=simple and starting zabbix process in foreground may be a solution for this problem Glebs Ivanovskis Some say that service restart is done not by yum / apt-get but by packages' postinst scripts. Kodai, could you please comment on that? Kodai Terashima Yes, package try restart processes during upgrade if it is running. This behavior is general, same as other packages. Glebs Ivanovskis Thank you for response! Is it possible to use stop-sleep-start instead of restart in postinst script if timeout between stop and start is not possible in systemd's restart? By the way, I can provide a "retarder" loadable module if you need to test a slowly stopping server without creating a huge configuration.
          Hide
          sles added a comment -

          > Maybe for upgrade procedure we could suggest or even require to stop zabbix-server daemon before trying to upgrade to major version?

          I guess only real way is to disable automatic upgrades in this case...

          Show
          sles added a comment - > Maybe for upgrade procedure we could suggest or even require to stop zabbix-server daemon before trying to upgrade to major version? I guess only real way is to disable automatic upgrades in this case...
          Hide
          Alex added a comment -

          I have the same problrem:

          32494:20161201:163201.452 [Z3005] query failed: [1206] The total number of locks exceeds the lock table size [insert into event_recovery (eventid,r_eventid) values
          32494:20161201:163201.472 database upgrade failed

          Show
          Alex added a comment - I have the same problrem: 32494:20161201:163201.452 [Z3005] query failed: [1206] The total number of locks exceeds the lock table size [insert into event_recovery (eventid,r_eventid) values 32494:20161201:163201.472 database upgrade failed
          Hide
          Sinisa added a comment - - edited

          I had the problem when upgrading from 3.0.7 to 3.2.3. My database is on CentOs 7 with Postgresql 9.2.
          First I stop server with zabbix 3.0.7 and start 3.2.3.

          17366:20170113:122843.317 Got signal [signal:15(SIGTERM),sender_pid:16561,sender_uid:0,reason:0]. Exiting ...
           17365:20170113:122843.318 Got signal [signal:15(SIGTERM),sender_pid:16561,sender_uid:0,reason:0]. Exiting ...
          17363:20170113:122844.950 syncing history data...
           17363:20170113:122844.994 syncing history data done
           17363:20170113:122844.994 syncing trends data...
           16593:20170113:122908.208 Starting Zabbix Server. Zabbix 3.2.3 (revision 64610).
           16593:20170113:122908.208 ****** Enabled features ******
           16593:20170113:122908.209 SNMP monitoring:           YES
           16593:20170113:122908.209 IPMI monitoring:           YES
           16593:20170113:122908.209 Web monitoring:            YES
           16593:20170113:122908.209 VMware monitoring:         YES
           16593:20170113:122908.209 SMTP authentication:        NO
           16593:20170113:122908.209 Jabber notifications:      YES
           16593:20170113:122908.209 Ez Texting notifications:  YES
           16593:20170113:122908.209 ODBC:                      YES
           16593:20170113:122908.209 SSH2 support:              YES
           16593:20170113:122908.209 IPv6 support:               NO
           16593:20170113:122908.209 TLS support:                NO
           16593:20170113:122908.209 ******************************
           16593:20170113:122908.209 using configuration file: /opt/zabbix-3.2.3/etc/zabbix_server.conf
           16593:20170113:122908.226 current database version (mandatory/optional): 03000000/03000000
           16593:20170113:122908.227 required mandatory version: 03020000
           16593:20170113:122908.227 starting automatic database upgrade
           16593:20170113:122909.043 completed 1% of database upgrade
           16593:20170113:122909.477 completed 2% of database upgrade
           16593:20170113:122909.477 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR:  index "history_text_2" does not exist
           [drop index history_text_2]
           16593:20170113:122909.478 database upgrade failed
          

          After add index history_text_2 on history_log (create index history_text_2 on history_log (itemid);) upgrade successfully finish!

          16708:20170113:130258.051 Starting Zabbix Server. Zabbix 3.2.3 (revision 64610).
           16708:20170113:130258.051 ****** Enabled features ******
           16708:20170113:130258.051 SNMP monitoring:           YES
           16708:20170113:130258.051 IPMI monitoring:           YES
          16708:20170113:130258.051 Web monitoring:            YES
           16708:20170113:130258.051 VMware monitoring:         YES
           16708:20170113:130258.051 SMTP authentication:        NO
           16708:20170113:130258.051 Jabber notifications:      YES
           16708:20170113:130258.051 Ez Texting notifications:  YES
           16708:20170113:130258.051 ODBC:                      YES
           16708:20170113:130258.051 SSH2 support:              YES
           16708:20170113:130258.052 IPv6 support:               NO
           16708:20170113:130258.052 TLS support:                NO
           16708:20170113:130258.052 ******************************
           16708:20170113:130258.052 using configuration file: /opt/zabbix-3.2.3/etc/zabbix_server.conf
           16708:20170113:130258.092 current database version (mandatory/optional): 03010001/03010001
           16708:20170113:130258.092 required mandatory version: 03020000
           16708:20170113:130258.092 starting automatic database upgrade
           16708:20170113:130258.340 completed 1% of database upgrade
           16708:20170113:130258.391 completed 2% of database upgrade
           16708:20170113:130258.710 completed 3% of database upgrade
           16708:20170113:130259.137 completed 5% of database upgrade
          NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "trigger_tag_pkey" for table "trigger_tag"
           16708:20170113:130259.145 completed 6% of database upgrade
           16708:20170113:130259.147 completed 7% of database upgrade
           16708:20170113:130259.165 completed 9% of database upgrade
          NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "event_tag_pkey" for table "event_tag"
           16708:20170113:130259.175 completed 10% of database upgrade
           16708:20170113:130259.177 completed 11% of database upgrade
           16708:20170113:130259.186 completed 12% of database upgrade
           16708:20170113:130259.221 completed 14% of database upgrade
           16708:20170113:130259.309 completed 15% of database upgrade
          NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "problem_pkey" for table "problem"
           16708:20170113:130259.313 completed 16% of database upgrade
          

          Maybe someone help!

          Show
          Sinisa added a comment - - edited I had the problem when upgrading from 3.0.7 to 3.2.3. My database is on CentOs 7 with Postgresql 9.2. First I stop server with zabbix 3.0.7 and start 3.2.3. 17366:20170113:122843.317 Got signal [signal:15(SIGTERM),sender_pid:16561,sender_uid:0,reason:0]. Exiting ... 17365:20170113:122843.318 Got signal [signal:15(SIGTERM),sender_pid:16561,sender_uid:0,reason:0]. Exiting ... 17363:20170113:122844.950 syncing history data... 17363:20170113:122844.994 syncing history data done 17363:20170113:122844.994 syncing trends data... 16593:20170113:122908.208 Starting Zabbix Server. Zabbix 3.2.3 (revision 64610). 16593:20170113:122908.208 ****** Enabled features ****** 16593:20170113:122908.209 SNMP monitoring: YES 16593:20170113:122908.209 IPMI monitoring: YES 16593:20170113:122908.209 Web monitoring: YES 16593:20170113:122908.209 VMware monitoring: YES 16593:20170113:122908.209 SMTP authentication: NO 16593:20170113:122908.209 Jabber notifications: YES 16593:20170113:122908.209 Ez Texting notifications: YES 16593:20170113:122908.209 ODBC: YES 16593:20170113:122908.209 SSH2 support: YES 16593:20170113:122908.209 IPv6 support: NO 16593:20170113:122908.209 TLS support: NO 16593:20170113:122908.209 ****************************** 16593:20170113:122908.209 using configuration file: /opt/zabbix-3.2.3/etc/zabbix_server.conf 16593:20170113:122908.226 current database version (mandatory/optional): 03000000/03000000 16593:20170113:122908.227 required mandatory version: 03020000 16593:20170113:122908.227 starting automatic database upgrade 16593:20170113:122909.043 completed 1% of database upgrade 16593:20170113:122909.477 completed 2% of database upgrade 16593:20170113:122909.477 [Z3005] query failed: [0] PGRES_FATAL_ERROR:ERROR: index "history_text_2" does not exist [drop index history_text_2] 16593:20170113:122909.478 database upgrade failed After add index history_text_2 on history_log (create index history_text_2 on history_log (itemid);) upgrade successfully finish! 16708:20170113:130258.051 Starting Zabbix Server. Zabbix 3.2.3 (revision 64610). 16708:20170113:130258.051 ****** Enabled features ****** 16708:20170113:130258.051 SNMP monitoring: YES 16708:20170113:130258.051 IPMI monitoring: YES 16708:20170113:130258.051 Web monitoring: YES 16708:20170113:130258.051 VMware monitoring: YES 16708:20170113:130258.051 SMTP authentication: NO 16708:20170113:130258.051 Jabber notifications: YES 16708:20170113:130258.051 Ez Texting notifications: YES 16708:20170113:130258.051 ODBC: YES 16708:20170113:130258.051 SSH2 support: YES 16708:20170113:130258.052 IPv6 support: NO 16708:20170113:130258.052 TLS support: NO 16708:20170113:130258.052 ****************************** 16708:20170113:130258.052 using configuration file: /opt/zabbix-3.2.3/etc/zabbix_server.conf 16708:20170113:130258.092 current database version (mandatory/optional): 03010001/03010001 16708:20170113:130258.092 required mandatory version: 03020000 16708:20170113:130258.092 starting automatic database upgrade 16708:20170113:130258.340 completed 1% of database upgrade 16708:20170113:130258.391 completed 2% of database upgrade 16708:20170113:130258.710 completed 3% of database upgrade 16708:20170113:130259.137 completed 5% of database upgrade NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "trigger_tag_pkey" for table "trigger_tag" 16708:20170113:130259.145 completed 6% of database upgrade 16708:20170113:130259.147 completed 7% of database upgrade 16708:20170113:130259.165 completed 9% of database upgrade NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "event_tag_pkey" for table "event_tag" 16708:20170113:130259.175 completed 10% of database upgrade 16708:20170113:130259.177 completed 11% of database upgrade 16708:20170113:130259.186 completed 12% of database upgrade 16708:20170113:130259.221 completed 14% of database upgrade 16708:20170113:130259.309 completed 15% of database upgrade NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "problem_pkey" for table "problem" 16708:20170113:130259.313 completed 16% of database upgrade Maybe someone help!
          Hide
          max kub added a comment -

          Hello, I have a similar issue. Upgrading from 3.0.9 to 3.2.x running on debian jessie.

          I stopped the server befor "apt-get install --only-upgrade zabbix-server-mysql zabbix-frontend-php zabbix-agent".
          The appended "slow query: 3.095132 sec, "insert into problem", persists and pollutes the logfile afterwards...
          Do I just need patience? Yesterday it ran like 35 minutes, until I decided to break it up.

          Btw. first I had this error too. But I could avoid it, by not restart the server while upgrading database:

           18671:20161122:124324.223 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_text drop column id]
           18671:20161122:124324.223 database upgrade failed
          

          I also changed to recommended collation 'utf8_bin' with no luck.

          What is the solution? How to upgrade DB manually? Thanks.

          zabbix-server.log

            3713:20170620:082346.484 IPv6 support:              YES
            3713:20170620:082346.484 TLS support:               YES
            3713:20170620:082346.484 ******************************
            3713:20170620:082346.484 using configuration file: /etc/zabbix/zabbix_server.conf
            3713:20170620:082346.489 current database version (mandatory/optional): 03000000/03000000
            3713:20170620:082346.490 required mandatory version: 03020000
            3713:20170620:082346.490 starting automatic database upgrade
            3713:20170620:082346.861 completed 1% of database upgrade
            3713:20170620:082347.473 completed 2% of database upgrade
            3713:20170620:082347.724 completed 3% of database upgrade
            3713:20170620:082939.847 slow query: 352.122721 sec, "alter table history_text drop column id"
            3713:20170620:082939.882 completed 5% of database upgrade
            3713:20170620:082940.747 completed 6% of database upgrade
            3713:20170620:082941.299 completed 7% of database upgrade
            3713:20170620:082941.913 completed 8% of database upgrade
            3713:20170620:082942.529 completed 10% of database upgrade
            3713:20170620:082942.800 completed 11% of database upgrade
            3713:20170620:082942.966 completed 12% of database upgrade
            3713:20170620:082943.195 completed 13% of database upgrade
            3713:20170620:082943.620 completed 15% of database upgrade
            3713:20170620:082943.893 completed 16% of database upgrade
            3713:20170620:082944.218 completed 17% of database upgrade
            3713:20170620:082944.465 completed 18% of database upgrade
            3713:20170620:082944.933 completed 20% of database upgrade
            3713:20170620:082945.168 completed 21% of database upgrade
            3713:20170620:082945.283 completed 22% of database upgrade
            3713:20170620:082945.534 completed 24% of database upgrade
            3713:20170620:082945.803 completed 25% of database upgrade
            3713:20170620:082946.067 completed 26% of database upgrade
            3713:20170620:083153.385 slow query: 3.095132 sec, "insert into problem (eventid,source,object,objectid) values (16685497,3,4,4780613),(14294956,3,4,2632664),(15805118,3,4,3993602),(15867477,3,4,4049926),(15038907,3,4,3299055),(16946137,3,4,5024993),(14922493,3,4,3194783),(15390572,3,4,3620347),(16958406,3,4,5036151),(14171859,3,4,2516020),(14227407,3,4,2569501),(15733479,3,4,3930744),(16394568,3,4,4504454),(16092718,3,4,4248719),(16759622,3,4,4850282),(15071737,3,4,3328635),(14210347,3,4,2553331),(14016084,3,4,2370426),(16551244,3,4,4651256),(16921719,3,4,5002650),(16847391,3,4,4933320),(15177345,3,4,3425227),(15357457,3,4,3589815),(14045158,3,4,2397444),(15538086,3,4,3756376),(14755681,3,4,3043446),(16967651,3,4,5044542),(16430661,3,4,4538659),(16934494,3,4,5014289),(14452590,3,4,2775284),(15618484,3,4,3830922),(15223677,3,4,3468255),(16716887,3,4,4810222),(13938165,3,4,2297530),(
          
          Show
          max kub added a comment - Hello, I have a similar issue. Upgrading from 3.0.9 to 3.2.x running on debian jessie. I stopped the server befor "apt-get install --only-upgrade zabbix-server-mysql zabbix-frontend-php zabbix-agent". The appended "slow query: 3.095132 sec, "insert into problem", persists and pollutes the logfile afterwards... Do I just need patience? Yesterday it ran like 35 minutes, until I decided to break it up. Btw. first I had this error too. But I could avoid it, by not restart the server while upgrading database: 18671:20161122:124324.223 [Z3005] query failed: [1091] Can't DROP 'id'; check that column/key exists [alter table history_text drop column id] 18671:20161122:124324.223 database upgrade failed I also changed to recommended collation 'utf8_bin' with no luck. What is the solution? How to upgrade DB manually? Thanks. zabbix-server.log 3713:20170620:082346.484 IPv6 support: YES 3713:20170620:082346.484 TLS support: YES 3713:20170620:082346.484 ****************************** 3713:20170620:082346.484 using configuration file: /etc/zabbix/zabbix_server.conf 3713:20170620:082346.489 current database version (mandatory/optional): 03000000/03000000 3713:20170620:082346.490 required mandatory version: 03020000 3713:20170620:082346.490 starting automatic database upgrade 3713:20170620:082346.861 completed 1% of database upgrade 3713:20170620:082347.473 completed 2% of database upgrade 3713:20170620:082347.724 completed 3% of database upgrade 3713:20170620:082939.847 slow query: 352.122721 sec, "alter table history_text drop column id" 3713:20170620:082939.882 completed 5% of database upgrade 3713:20170620:082940.747 completed 6% of database upgrade 3713:20170620:082941.299 completed 7% of database upgrade 3713:20170620:082941.913 completed 8% of database upgrade 3713:20170620:082942.529 completed 10% of database upgrade 3713:20170620:082942.800 completed 11% of database upgrade 3713:20170620:082942.966 completed 12% of database upgrade 3713:20170620:082943.195 completed 13% of database upgrade 3713:20170620:082943.620 completed 15% of database upgrade 3713:20170620:082943.893 completed 16% of database upgrade 3713:20170620:082944.218 completed 17% of database upgrade 3713:20170620:082944.465 completed 18% of database upgrade 3713:20170620:082944.933 completed 20% of database upgrade 3713:20170620:082945.168 completed 21% of database upgrade 3713:20170620:082945.283 completed 22% of database upgrade 3713:20170620:082945.534 completed 24% of database upgrade 3713:20170620:082945.803 completed 25% of database upgrade 3713:20170620:082946.067 completed 26% of database upgrade 3713:20170620:083153.385 slow query: 3.095132 sec, "insert into problem (eventid,source,object,objectid) values (16685497,3,4,4780613),(14294956,3,4,2632664),(15805118,3,4,3993602),(15867477,3,4,4049926),(15038907,3,4,3299055),(16946137,3,4,5024993),(14922493,3,4,3194783),(15390572,3,4,3620347),(16958406,3,4,5036151),(14171859,3,4,2516020),(14227407,3,4,2569501),(15733479,3,4,3930744),(16394568,3,4,4504454),(16092718,3,4,4248719),(16759622,3,4,4850282),(15071737,3,4,3328635),(14210347,3,4,2553331),(14016084,3,4,2370426),(16551244,3,4,4651256),(16921719,3,4,5002650),(16847391,3,4,4933320),(15177345,3,4,3425227),(15357457,3,4,3589815),(14045158,3,4,2397444),(15538086,3,4,3756376),(14755681,3,4,3043446),(16967651,3,4,5044542),(16430661,3,4,4538659),(16934494,3,4,5014289),(14452590,3,4,2775284),(15618484,3,4,3830922),(15223677,3,4,3468255),(16716887,3,4,4810222),(13938165,3,4,2297530),(
          Hide
          max kub added a comment -

          Nevermind. It took 2 hours... "patience.. patience, you must have, young padawan..."

          Show
          max kub added a comment - Nevermind. It took 2 hours... "patience.. patience, you must have, young padawan..."
          Hide
          James Murray added a comment -

          I'm in the middle of a database upgrade. I don't want to stop it, but I am concerned it isn't actually doing anything. According to the logs it has been at 3% coming up on an hour.

          It is a VM running CentOS7.

          I found the size earlier, it is around 300GB stored on SAS. Top shows that mysql is chugging around 50% of 12 assigned cores of a UCS blade.

          Any recommendations? I'm going to give it a few hours, but I'm really hoping to see 4% soon. 1-3 happened very quickly.

          [root@mci2-zabbix-master zabbix]# date
          Mon Jul 17 10:53:33 CDT 2017
          [root@mci2-zabbix-master zabbix]# tailf /var/log/zabbix/zabbix_server.log
            3377:20170717:100433.659 TLS support:               YES
            3377:20170717:100433.659 ******************************
            3377:20170717:100433.659 using configuration file: /etc/zabbix/zabbix_server.conf
            3377:20170717:100433.698 current database version (mandatory/optional): 03000000/03000000
            3377:20170717:100433.698 required mandatory version: 03020000
            3377:20170717:100433.698 starting automatic database upgrade
            3377:20170717:100433.786 completed 1% of database upgrade
            3377:20170717:100433.839 completed 2% of database upgrade
            3377:20170717:100438.502 slow query: 4.662223 sec, "drop index history_text_2 on history_text"
            3377:20170717:100438.504 completed 3% of database upgrade
          
          Show
          James Murray added a comment - I'm in the middle of a database upgrade. I don't want to stop it, but I am concerned it isn't actually doing anything. According to the logs it has been at 3% coming up on an hour. It is a VM running CentOS7. I found the size earlier, it is around 300GB stored on SAS. Top shows that mysql is chugging around 50% of 12 assigned cores of a UCS blade. Any recommendations? I'm going to give it a few hours, but I'm really hoping to see 4% soon. 1-3 happened very quickly. [root@mci2-zabbix-master zabbix]# date Mon Jul 17 10:53:33 CDT 2017 [root@mci2-zabbix-master zabbix]# tailf / var /log/zabbix/zabbix_server.log 3377:20170717:100433.659 TLS support: YES 3377:20170717:100433.659 ****************************** 3377:20170717:100433.659 using configuration file: /etc/zabbix/zabbix_server.conf 3377:20170717:100433.698 current database version (mandatory/optional): 03000000/03000000 3377:20170717:100433.698 required mandatory version: 03020000 3377:20170717:100433.698 starting automatic database upgrade 3377:20170717:100433.786 completed 1% of database upgrade 3377:20170717:100433.839 completed 2% of database upgrade 3377:20170717:100438.502 slow query: 4.662223 sec, "drop index history_text_2 on history_text" 3377:20170717:100438.504 completed 3% of database upgrade
          Hide
          Glebs Ivanovskis added a comment -

          FYI: It is dropping id field of history_text table.

          Show
          Glebs Ivanovskis added a comment - FYI: It is dropping id field of history_text table.
          Hide
          James Murray added a comment - - edited

          Thanks man. I'm not SQL man, but I just took a look at the process and it is moving.

          MariaDB [(none)]> SHOW PROCESSLIST;
          +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+
          | Id   | User   | Host      | db       | Command | Time | State             | Info                                    | Progress |
          +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+
          | 1348 | zabbix | localhost | zabbixdb | Query   | 8192 | copy to tmp table | alter table history_text drop column id |    5.110 |
          | 7995 | root   | localhost | NULL     | Query   |    0 | NULL              | SHOW PROCESSLIST                        |    0.000 |
          +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+
          2 rows in set (0.00 sec)
          
          Show
          James Murray added a comment - - edited Thanks man. I'm not SQL man, but I just took a look at the process and it is moving. MariaDB [(none)]> SHOW PROCESSLIST; +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+ | Id | User | Host | db | Command | Time | State | Info | Progress | +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+ | 1348 | zabbix | localhost | zabbixdb | Query | 8192 | copy to tmp table | alter table history_text drop column id | 5.110 | | 7995 | root | localhost | NULL | Query | 0 | NULL | SHOW PROCESSLIST | 0.000 | +------+--------+-----------+----------+---------+------+-------------------+-----------------------------------------+----------+ 2 rows in set (0.00 sec)
          Hide
          Glebs Ivanovskis added a comment -

          I am no DB expert either, but as far as I know this operation shouldn't take that much time. Do you have custom indexes in Zabbix DB schema?

          Show
          Glebs Ivanovskis added a comment - I am no DB expert either, but as far as I know this operation shouldn't take that much time. Do you have custom indexes in Zabbix DB schema?
          Hide
          James Murray added a comment -

          I don't believe we have anything custom. We are an MSP and we've dumped a lot of proxies into this. I'd imagine we're on the large side of Zabbix instances and it has been collecting for a at least a couple months.

          Show
          James Murray added a comment - I don't believe we have anything custom. We are an MSP and we've dumped a lot of proxies into this. I'd imagine we're on the large side of Zabbix instances and it has been collecting for a at least a couple months.

            People

            • Assignee:
              Unassigned
              Reporter:
              sles
            • Votes:
              4 Vote for this issue
              Watchers:
              20 Start watching this issue

              Dates

              • Created:
                Updated: