-
Incident report
-
Resolution: Won't fix
-
Blocker
-
None
-
1.9.8 (beta)
-
None
-
RHEL6
I upgraded from 1.9.6 to 1.9.8, updated the database with no errors, and started the server and agent binaries.
Immediately, the server log started recording failures for item checks. The graph for any affected item shows large gaps where no data was collected, and the queue grows to hundreds of items.
Strangely, I never got an error when fetching the same items with zabbix_get.
I rolled back to 1.9.6 and everything is fine again.
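For anyone trying to reproduce the zabbix_get comparison, a minimal sketch of scripting the manual fetch is below. It assumes zabbix_get is on the PATH and that the agent listens on the default port 10050. The helper names (build_zabbix_get_command, fetch_item) are made up for illustration, and the host and key are taken from the log excerpt in this report.

```python
import subprocess


def build_zabbix_get_command(host, key, port=10050):
    """Build the argument list for a single zabbix_get query.

    Assumes the agent listens on the default port 10050.
    """
    return ["zabbix_get", "-s", host, "-p", str(port), "-k", key]


def fetch_item(host, key, port=10050, timeout=10):
    """Run zabbix_get once and return (exit code, stripped stdout).

    Hypothetical helper: requires the zabbix_get binary to be installed.
    """
    result = subprocess.run(
        build_zabbix_get_command(host, key, port),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout.strip()
```

Calling fetch_item("lin020", "vfs.fs.size[/tmp, pfree]") in a loop while the server is logging network errors for the same item would show whether direct fetches really keep succeeding during that window.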
The following is an example of the errors seen in the server log:
13719:20120116:140618.123 Zabbix agent item [vfs.fs.size[/tmp, pfree]] on host [lin020] failed: first network error, wait for 15 seconds
13719:20120116:140629.126 Zabbix agent item [vfs.dev.read[/dev/disk/by-id/dm-name-rootvg-root,sectors]] on host [lin021] failed: first network error, wait for 15 seconds
13721:20120116:140633.910 Zabbix agent item [vfs.fs.size[/var, free]] on host [lin020] failed: another network error, wait for 15 seconds
13721:20120116:140644.912 Zabbix agent item [vfs.dev.read[/dev/disk/by-id/dm-name-rootvg-root,sectors]] on host [lin021] failed: another network error, wait for 15 seconds
13721:20120116:140648.917 resuming Zabbix agent checks on host [lin020]: connection restored
13717:20120116:140649.133 Zabbix agent item [vfs.fs.size[/var, pfree]] on host [lin020] failed: first network error, wait for 15 seconds
13721:20120116:140659.919 Zabbix agent item [vfs.dev.read[/dev/disk/by-id/dm-name-rootvg-root,sectors]] on host [lin021] failed: another network error, wait for 15 seconds
13721:20120116:140704.920 resuming Zabbix agent checks on host [lin020]: connection restored
13718:20120116:140705.141 Zabbix agent item [net.if.total[eth0,dropped]] on host [lin020] failed: first network error, wait for 15 seconds
13721:20120116:140714.923 resuming Zabbix agent checks on host [lin021]: connection restored
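To get a quick picture of which hosts are affected, the failure lines above can be tallied per host. This is only a rough sketch: the regular expression is inferred from the log lines pasted in this report, not from any documented log format, and count_failures is a hypothetical helper name.

```python
import re
from collections import Counter

# Pattern inferred from the log lines in this report (an assumption, not a
# documented format). The item key may itself contain brackets, so it is
# matched greedily up to the literal "] on host [".
FAILURE_RE = re.compile(
    r"^(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6})\.(?P<ms>\d{3}) "
    r"Zabbix agent item \[(?P<key>.+)\] on host \[(?P<host>[^\]]+)\] "
    r"failed: (?P<kind>first|another) network error"
)


def count_failures(lines):
    """Count 'first' and 'another' network errors per host."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.match(line.strip())
        if match:
            counts[(match.group("host"), match.group("kind"))] += 1
    return counts


# Three lines copied from the excerpt above; the last one is not a failure.
SAMPLE = [
    "13719:20120116:140618.123 Zabbix agent item [vfs.fs.size[/tmp, pfree]] on host [lin020] failed: first network error, wait for 15 seconds",
    "13721:20120116:140633.910 Zabbix agent item [vfs.fs.size[/var, free]] on host [lin020] failed: another network error, wait for 15 seconds",
    "13721:20120116:140648.917 resuming Zabbix agent checks on host [lin020]: connection restored",
]

if __name__ == "__main__":
    for (host, kind), total in sorted(count_failures(SAMPLE).items()):
        print(host, kind, total)
```

Running the same function over the full server log from both 1.9.6 and 1.9.8 would make the difference between the two versions easy to compare.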
I turned debugging on and saw entries like these as well:
13593:20120116:131456.257 In substitute_simple_macros() data:'vfs.fs.size[/var, free]'
13593:20120116:131456.257 In substitute_simple_macros() data:EMPTY
13593:20120116:131456.257 In deactivate_host() hostid:10047 itemid:28946 type:0
13593:20120116:131456.257 deactivate_host() errors_from:0 available:1