Incident report
Resolution: Cannot Reproduce
Major
1.8.14
Fedora 17 x86_64, SELinux in permissive mode, kernel 3.5.2-1.fc17.x86_64
The server log file is full of messages like the following:
30907:20120821:225740.503 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: another network error, wait for 15 seconds
30907:20120821:225755.505 resuming Zabbix agent checks on host [localhost]: connection restored
30904:20120821:225822.907 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: first network error, wait for 15 seconds
30907:20120821:225840.602 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: another network error, wait for 15 seconds
30907:20120821:225855.605 resuming Zabbix agent checks on host [localhost]: connection restored
Item collection is unreliable: the graphs show random gaps.
There is nothing relevant in the agent log.
The SMTP server is running and accepts connections:
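To quantify how often the check fails, the server log can be tallied per host and item. The following is a minimal sketch, not part of Zabbix: the function name count_network_errors is hypothetical, and the line layout (pid:YYYYMMDD:HHMMSS.mmm message) is assumed from the excerpt above.

```python
import re
from collections import Counter

# Line layout assumed from the excerpt: pid:YYYYMMDD:HHMMSS.mmm message
LINE_RE = re.compile(r"^(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")
# Matches both "first network error" and "another network error" messages.
FAIL_RE = re.compile(
    r"item \[(.+)\] on host \[([^\]]+)\] failed: (first|another) network error"
)


def count_network_errors(lines):
    """Return a Counter keyed by (host, item key) with the number of failures."""
    counts = Counter()
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if parsed is None:
            continue  # not a log line in the assumed layout
        failure = FAIL_RE.search(parsed.group(4))
        if failure:
            counts[(failure.group(2), failure.group(1))] += 1
    return counts


SAMPLE = [
    "30907:20120821:225740.503 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: another network error, wait for 15 seconds",
    "30907:20120821:225755.505 resuming Zabbix agent checks on host [localhost]: connection restored",
    "30904:20120821:225822.907 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: first network error, wait for 15 seconds",
    "30907:20120821:225840.602 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: another network error, wait for 15 seconds",
    "30907:20120821:225855.605 resuming Zabbix agent checks on host [localhost]: connection restored",
]

if __name__ == "__main__":
    for (host, key), n in count_network_errors(SAMPLE).items():
        print(f"{host} {key}: {n} network errors")
```

For a real log, pass an open file object instead of SAMPLE; the function only iterates over lines.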
$ sudo -u zabbix telnet localhost smtp
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
SELinux is in permissive mode.
Running in debug mode adds more information:
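The telnet test only shows that the TCP connection is accepted; it does not show how quickly the server sends its greeting. The sketch below is a rough stand-in for the check, not the agent's actual code. It assumes that net.tcp.service[smtp] waits for a "220" greeting within a timeout (not verified against the 1.8.14 source). The function name smtp_banner_ok and the 3-second default are assumptions; the default was picked to match the "spent 3.068150 seconds" line in the debug log further down.

```python
import socket


def smtp_banner_ok(host="localhost", port=25, timeout=3.0):
    """Return True if the server sends a greeting starting with 220 in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(1024)  # raises a timeout error if nothing arrives
            ok = banner.startswith(b"220")
            if ok:
                try:
                    sock.sendall(b"QUIT\r\n")  # close the SMTP session politely
                except OSError:
                    pass
            return ok
    except OSError:
        # Covers refused connections, timeouts and resets alike.
        return False


if __name__ == "__main__":
    print("smtp greeting received in time:", smtp_banner_ok())
```

If this returns False while telnet still connects, a slow greeting (for example a server delaying the banner) would be one candidate explanation for the timeouts; that is a hypothesis, not something the logs confirm.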
Server log:
30679:20120821:225421.743 sleeping for 3 seconds
30702:20120821:225421.947 In collect_selfmon_stats()
30702:20120821:225421.948 End of collect_selfmon_stats()
30702:20120821:225421.948 sleeping for 1 seconds
30682:20120821:225422.239 Item [localhost:net.tcp.service[smtp]] error: Get value from agent failed: ZBX_TCP_READ() failed: [4] Interrupted system call
30682:20120821:225422.239 In zabbix_log()
30682:20120821:225422.239 In DCconfig_get_items() hostid:0 key:'zabbix[log]'
30682:20120821:225422.239 End of DCconfig_get_items():0
30682:20120821:225422.239 End of zabbix_log()
30682:20120821:225422.239 End of get_value():NETWORK_ERROR
30682:20120821:225422.239 In deactivate_host() hostid:10017 itemid:79 type:0
30682:20120821:225422.239 deactivate_host() errors_from:0 available:1
30682:20120821:225422.239 query [txnlev:1] [begin;]
30682:20120821:225422.239 query [txnlev:1] [update hosts set errors_from=1345586062,disable_until=1345586077 where hostid=10017]
30682:20120821:225422.240 query [txnlev:1] [commit;]
30682:20120821:225422.306 Zabbix agent item [net.tcp.service[smtp]] on host [localhost] failed: first network error, wait for 15 seconds
30682:20120821:225422.306 In zabbix_log()
30682:20120821:225422.306 In DCconfig_get_items() hostid:0 key:'zabbix[log]'
30682:20120821:225422.306 End of DCconfig_get_items():0
30682:20120821:225422.306 End of zabbix_log()
30682:20120821:225422.306 End of deactivate_host()
30682:20120821:225422.306 End of get_values()
30682:20120821:225422.307 poller #4 spent 3.068150 seconds while updating 1 values
30682:20120821:225422.307 In DCconfig_get_poller_nextcheck() poller_type:0
30682:20120821:225422.307 End of DCconfig_get_poller_nextcheck():1345586064
30682:20120821:225422.307 sleeping for 2 seconds
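The "wait for 15 seconds" message can be cross-checked against the update query in the debug log: disable_until minus errors_from should equal the announced delay. A small sketch follows; the helper name disable_window is hypothetical, and the query text is copied from the log above.

```python
import re

# Field names taken from the "update hosts set ..." query in the debug log.
UPDATE_RE = re.compile(
    r"errors_from=(\d+),disable_until=(\d+) where hostid=(\d+)"
)


def disable_window(query_line):
    """Return (hostid, seconds the host stays disabled), or None if no match."""
    m = UPDATE_RE.search(query_line)
    if m is None:
        return None
    errors_from, disable_until, hostid = map(int, m.groups())
    return hostid, disable_until - errors_from


LINE = (
    "30682:20120821:225422.239 query [txnlev:1] [update hosts set "
    "errors_from=1345586062,disable_until=1345586077 where hostid=10017]"
)

if __name__ == "__main__":
    print(disable_window(LINE))  # (10017, 15)
```

The 15-second result matches the delay reported in the "first network error" message, so the gaps in the graphs line up with the periods during which the host is disabled after each failed check.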
Agent log:
29584:20120821:225421.275 Processing request.
29584:20120821:225421.275 Requested [proc.num[,,run]]
29584:20120821:225421.295 Sending back [1]
29585:20120821:225421.615 In send_buffer() host:'127.0.0.1' port:10051 values:0/100
29585:20120821:225421.615 End of send_buffer():SUCCEED
29585:20120821:225421.615 Sleeping for 1 second(s)
29581:20120821:225421.713 In update_cpustats()
29581:20120821:225421.713 End of update_cpustats()
29583:20120821:225422.239 TCP expect network error: ZBX_TCP_READ() failed: [4] Interrupted system call
29583:20120821:225422.239 Sending back [0]
29583:20120821:225422.239 Got signal [signal:13(SIGPIPE),sender_pid:29583]. Ignoring ...
29583:20120821:225422.239 Process listener error: ZBX_TCP_WRITE() failed: [32] Broken pipe
29585:20120821:225422.615 In send_buffer() host:'127.0.0.1' port:10051 values:0/100
29585:20120821:225422.615 End of send_buffer():SUCCEED
29585:20120821:225422.615 Sleeping for 1 second(s)
29581:20120821:225422.713 In update_cpustats()
29581:20120821:225422.713 End of update_cpustats()
29585:20120821:225423.615 In send_buffer() host:'127.0.0.1' port:10051 values:0/100
29585:20120821:225423.615 End of send_buffer():SUCCEED
29585:20120821:225423.615 Sleeping for 1 second(s)
29581:20120821:225423.714 In update_cpustats()
29581:20120821:225423.714 End of update_cpustats()
29585:20120821:225424.615 In send_buffer() host:'127.0.0.1' port:10051 values:0/100
29585:20120821:225424.615 End of send_buffer():SUCCEED