[ZBX-7253] unknown column errors with node setup Created: 2013 Oct 30 Updated: 2017 May 30 Resolved: 2015 Feb 13 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 1.8.17, 2.0.9, 2.2.0 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Critical |
Reporter: | Marc Schoechlin | Assignee: | Unassigned |
Resolution: | Won't fix | Votes: | 7 |
Labels: | dm | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
One Master node, one subordinate node, 7 proxies. |
Attachments: | applications_cksums_broken applications_cksums_working dbschema.c | ||||||||||||||||||||
Issue Links: |
|
Description |
We migrated our 1.8.17 setup to 2.0.9. After the migration, the subordinate node started logging hundreds of messages like the following:
9833:20131030:181502.203 [Z3005] query failed: [1054] Unknown column 'macro' in 'field list' [select macro from graphs_items where gitemid=200200000002550]
The hostids and gitemids differ in every log line. Naturally, we executed the database migration script and checked its output for errors. It seems that zabbix-server builds this SQL statement dynamically, because "select macro from graphs_items" cannot be found in the Zabbix source code. The error messages complain about the columns "macro" and "internal"; these columns are not part of any 1.8.17 or 2.0.9 database schema. If you need to analyze this situation in detail, please request additional information. |
Comments |
Comment by richlv [ 2013 Oct 30 ] |
you're not running 2.1 there accidentally - or maybe you have run 2.1 against some of those databases at some point ? |
Comment by Marc Schoechlin [ 2013 Oct 30 ] |
No, I'm really sure that I never downloaded or installed a 2.1 release. |
Comment by Marc Schoechlin [ 2013 Nov 04 ] |
Found another strange message:
cannot find table [hosts_profiles]
This table also does not exist in Zabbix 2.0 installations. It seems that there is 1.8.x code in this release. |
Comment by Marc Schoechlin [ 2013 Nov 04 ] |
I compared the process IDs in the messages with the startup messages. The problems are caused by the following process:
5021:20131104:163024.841 server #33 started node watcher #1
See also http://pastebin.com/pFGamEa7 (available for 24 hours) |
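The PID comparison described above can be automated. The following is only a sketch, assuming the log line layout seen in this ticket ("<pid>:<date>:<time> <message>") and the startup message wording quoted above; the function names are hypothetical and not part of Zabbix.

```python
import re
from collections import Counter

# Assumed layout, taken from the log lines quoted in this ticket:
#   <pid>:<YYYYMMDD>:<HHMMSS.mmm> <message>
LINE_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")
# Assumed startup wording: "server #33 started node watcher #1"
STARTED_RE = re.compile(r"^server #\d+ started (.+?) #\d+$")


def pid_roles(lines):
    """Map each PID to the process type announced in its startup message."""
    roles = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        started = STARTED_RE.match(m.group(4))
        if started:
            roles[m.group(1)] = started.group(1)
    return roles


def errors_by_role(lines):
    """Count [Z3005] messages per process type (hypothetical helper)."""
    lines = list(lines)
    roles = pid_roles(lines)
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and "[Z3005]" in m.group(4):
            pid = m.group(1)
            counts[roles.get(pid, "unknown pid " + pid)] += 1
    return counts
```

Feeding it the lines of zabbix_server.log should show whether all failing queries come from the node watcher or from other process types as well.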
Comment by Marc Schoechlin [ 2013 Nov 05 ] |
I compiled Zabbix with debug info and inspected the situation using gdb:
break db.c:1161
commands 1
print sql
info locals
backtrace
continue
end
break db.c:921
commands 2
print sql
info locals
backtrace
continue
end
break db.c:955
commands 3
print sql
info locals
backtrace
continue
end
I got the following results:
Problem: [Z3005] query failed: [1054] Unknown column 'macro' in 'field list' [select macro from graphs_items where gitemid=200200000002550]
Problem: [Z3005] query failed: [1054] Unknown column 'internal' in 'field list' [select name,internal,name,hsize,vsize,templateid from hosts where hostid=200200000010246]
|
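To see which tables and columns the failing queries refer to, the log can be tallied. This is a rough sketch, assuming the "[Z3005] query failed" message format quoted in this ticket; the helper name is hypothetical.

```python
import re
from collections import Counter

# Assumed message format, copied from the errors quoted in this ticket.
FAILED_RE = re.compile(
    r"\[Z3005\] query failed: \[1054\] Unknown column '([^']+)' in 'field list' "
    r"\[select \S+ from (\w+) where "
)


def summarize_unknown_columns(lines):
    """Count failed queries per (table, unknown column) pair."""
    counts = Counter()
    for line in lines:
        m = FAILED_RE.search(line)
        if m:
            column, table = m.group(1), m.group(2)
            counts[(table, column)] += 1
    return counts
```

A short list of (table, column) pairs that each repeat many times would suggest a few bad records of one kind rather than a generally broken schema.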
Comment by Marc Schoechlin [ 2013 Nov 05 ] |
DMget_config_data seems to be the source of the problem. Where is "tables" filled with data? It seems that the problem originates in the contents of the database. You can send me GDB or SQL commands to get more details. |
Comment by Alexander Vladishev [ 2013 Nov 06 ] |
Thank you for the investigation. Please attach the dbschema.c file. It is located in src/libs/zbxdbhigh/. It seems to be broken. These SQL statements are built from the data in this file. |
Comment by Marc Schoechlin [ 2013 Nov 06 ] |
The requested file. The file is part of the source distribution downloaded at: RELEASE="2.0.9"
|
Comment by richlv [ 2013 Dec 05 ] |
|
Comment by richlv [ 2013 Dec 05 ] |
for the record, attached dbschema matches 2.0 dbschema.c |
Comment by richlv [ 2013 Dec 05 ] |
how was zabbix compiled and installed ? you don't have multiple source archives or binaries by accident ? |
Comment by Marc Schoechlin [ 2013 Dec 07 ] |
As described in comment of "2013 Nov 06 12:36" we ensured that the file is part of the distribution. |
Comment by Oleksii Zagorskyi [ 2013 Dec 29 ] |
I had the following errors and was able to fix them:
11981:20131228:034110.868 NODE 10: Received configuration changes from slave node 30 for node 30 datalen 54
11981:20131228:034208.226 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3001000000000001]
11981:20131228:034208.227 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3001000000000002]
11981:20131228:034208.227 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3001000000000003]
... many lines here with increasing applicationid, 1540 in total ...
11981:20131228:034208.395 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3003000000000961]
11981:20131228:034208.395 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3003000000000962]
11981:20131228:034208.395 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3003000000000963]
11981:20131228:034210.257 NODE 10: sending configuration changes to slave node 30 for node 30 datalen 10
These errors appeared every time the master node received configuration changes from the child node. As we can see above, the problem occurs when master nodeid10 receives configuration changes from child nodeid30. Of course, we are already sure that there are no such obvious errors in the Zabbix code or DB schemas. I have attached part of a table dump taken while the issue was present, so you can investigate it:
# echo 'select * from node_cksum where tablename like "applications" and recordid like "300%"' | mysql zabbix -uzabbix -p > applications_cksums_broken
Then I cleared the related entries in the master node database:
mysql> delete from node_cksum where tablename = "applications" and recordid like "300%";
Query OK, 3248 rows affected (44.19 sec)
(do this with the child nodeid30 stopped !!!)
After that the issue was gone; there was just a slightly larger resync:
# grep -E "configuration changes|Unknown column" zabbix_server.log | tail -n50
12022:20131229:001048.278 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3003000000000962]
12022:20131229:001048.278 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=3003000000000963]
12022:20131229:001050.138 NODE 10: sending configuration changes to slave node 30 for node 30 datalen 10
11954:20131229:001102.985 NODE 10: Received configuration changes from slave node 30 for node 30 datalen 54
11954:20131229:001318.859 NODE 10: sending configuration changes to slave node 30 for node 30 datalen 10
12027:20131229:001332.303 NODE 10: Received configuration changes from slave node 30 for node 30 datalen 354
12027:20131229:001510.907 NODE 10: sending configuration changes to slave node 30 for node 30 datalen 122748
11957:20131229:001536.854 NODE 10: Received configuration changes from slave node 30 for node 30 datalen 10
11957:20131229:001648.072 NODE 10: sending configuration changes to slave node 30 for node 30 datalen 10
Then I took a similar snapshot of the working state, attached:
# echo 'select * from node_cksum where tablename like "applications" and recordid like "300%"' | mysql zabbix -uzabbix -p > applications_cksums_working
If we compare them, we will see that they have noticeably different data structures. It is probably possible to reproduce this by comparing nodes that were upgraded correctly (together) with nodes that were not, but I'm not really sure I need to. Questions? |
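One way to compare the two attached dumps is to reduce every row to a "shape" and count how often each shape occurs. This is a minimal sketch, assuming the dumps are tab-separated mysql batch output with a header line, and assuming (not confirmed in this ticket) that the structural difference shows up as a different number of comma-separated parts in some column. The function name is hypothetical.

```python
import csv
from collections import Counter


def row_shapes(lines):
    """Return (header, Counter of row shapes) for a tab-separated dump.

    A row's shape is a tuple holding, for each column, the number of
    comma-separated parts in that column's value. This is an assumed
    heuristic for spotting structural differences, nothing more.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    shapes = Counter()
    for row in reader:
        shapes[tuple(len(value.split(",")) for value in row)] += 1
    return header, shapes
```

Opening applications_cksums_broken and applications_cksums_working and passing each file object to row_shapes gives two shape counts that can be compared side by side; shapes that appear only in the broken dump are the rows worth inspecting.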
Comment by Thomas Spengler [ 2014 Jan 14 ] |
Same issue. Migration from 2.0.4 -> 2.2.1. First the master was migrated, then the node. |
Comment by Gael Denizot [ 2014 Feb 03 ] |
Same issue here. I upgraded my Zabbix servers and nodes to 2.2.1 and got these messages:
1628:20140203:155439.140 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=1501500000000461]
They appear in the logs approximately every 5 minutes. Is there any solution? |
Comment by Karol Pucynski [ 2014 Mar 14 ] |
I had the same problem with "Unknown column 'type' in 'field list'". Remark: the master database didn't have any data from the child earlier. |
Comment by Fabio [ 2014 Mar 19 ] |
I have 1 Zabbix father node and 7 child nodes. My database is PostgreSQL 9.2.4. |
Comment by Nigel Kukard [ 2014 Apr 02 ] |
Same problem when upgrading from 2.0.8 to 2.2.2 |
Comment by Mohamed Mansoor [ 2014 Jul 30 ] |
Upgraded 2.0.8 ==> 2.2.2 and facing the same problem. First I upgraded the child node, then the master node.
29523:20140730:194542.347 [Z3005] query failed: [1054] Unknown column 'type' in 'field list' [select type from applications where applicationid=200200000000355]
Best regards, |
Comment by Vilem Kebrt [ 2014 Sep 08 ] |
Same problem here, zabbix 2.2.5 on centos 6.5. |
Comment by richlv [ 2015 Feb 13 ] |
given that nodes have been removed since 2.4, this issue is unlikely to be looked into -> closing |