-
Incident report
-
Resolution: Unresolved
-
Trivial
-
None
-
5.0.15
Here is a stat of a zabbix upgrade from 4.0 to 5.0 on an installation with 20K NVPS and ~8TB of DB:
26483:20210909:091847.145 slow query: 1055.682584 sec, "alter table items drop column lastlogsize" 26483:20210909:093408.292 slow query: 921.146598 sec, "alter table items drop column state" 26483:20210909:094840.298 slow query: 872.005162 sec, "alter table items drop column mtime" 26483:20210909:100245.589 slow query: 845.290774 sec, "alter table items drop column error" ... 26483:20210909:102709.796 slow query: 813.845949 sec, "alter table items drop column snmp_community" 26483:20210909:103842.398 slow query: 692.601635 sec, "alter table items drop column snmpv3_securityname" 26483:20210909:104935.980 slow query: 653.581455 sec, "alter table items drop column snmpv3_securitylevel" 26483:20210909:105940.098 slow query: 604.117102 sec, "alter table items drop column snmpv3_authpassphrase" 26483:20210909:110908.887 slow query: 568.787719 sec, "alter table items drop column snmpv3_privpassphrase" 26483:20210909:111813.512 slow query: 544.624860 sec, "alter table items drop column snmpv3_authprotocol" 26483:20210909:112703.259 slow query: 529.746740 sec, "alter table items drop column snmpv3_privprotocol" 26483:20210909:113518.192 slow query: 494.931948 sec, "alter table items drop column snmpv3_contextname" 26483:20210909:114314.041 slow query: 475.847986 sec, "alter table items drop column port" ... 26483:20210909:115120.856 slow query: 469.411551 sec, "alter table `items` add `discover` integer default '0' not null"
The items table is ~2GB in size.
Those SQLs to drop columns are grouped in 2 bathes, where go in a row. Why would not group them together to perform at least in 2 SQLs? Or even group all together and execute at the end on patching by singe SQL?
That would save ~2 hours of zabbix server downtime at least.
Just in case, FYI: before performing the upgrade on PROD server, that db was cloned (without history/trends data) to another TEST server, where it took ~10GB (7-events, 2-items, etc) and those SQLs took there much less time:
50174:20210907:131331.066 slow query: 121.276720 sec, "alter table items drop column lastlogsize" 50174:20210907:131454.719 slow query: 83.651428 sec, "alter table items drop column state" 50174:20210907:131620.080 slow query: 85.358175 sec, "alter table items drop column mtime" 50174:20210907:131752.555 slow query: 92.471447 sec, "alter table items drop column error" ... 50174:20210907:132444.921 slow query: 144.348237 sec, "alter table items drop column snmp_community" 50174:20210907:132604.628 slow query: 79.702222 sec, "alter table items drop column snmpv3_securityname" 50174:20210907:132725.016 slow query: 80.386013 sec, "alter table items drop column snmpv3_securitylevel" 50174:20210907:132848.619 slow query: 83.601115 sec, "alter table items drop column snmpv3_authpassphrase" 50174:20210907:133016.665 slow query: 88.040098 sec, "alter table items drop column snmpv3_privpassphrase" 50174:20210907:133144.709 slow query: 88.041629 sec, "alter table items drop column snmpv3_authprotocol" 50174:20210907:133313.969 slow query: 89.258007 sec, "alter table items drop column snmpv3_privprotocol" 50174:20210907:133442.710 slow query: 88.738946 sec, "alter table items drop column snmpv3_contextname" 50174:20210907:133612.105 slow query: 89.392558 sec, "alter table items drop column port" ... 50174:20210907:133754.237 slow query: 99.491908 sec, "alter table `items` add `discover` integer default '0' not null"