[ZBX-4232] Unclear log message "first network error" Created: 2011 Oct 13  Updated: 2017 May 30  Resolved: 2012 Jan 29

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 1.8.8
Fix Version/s: 1.8.9, 1.9.7 (beta)

Type: Incident report Priority: Trivial
Reporter: Attilla de Groot Assignee: dimir
Resolution: Fixed Votes: 0
Labels: logging, usability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian


Issue Links:
Duplicate
is duplicated by ZBX-2091 Zabbix server network error, says it ... Closed
is duplicated by ZBX-9501 "first network error" Closed

 Description   

Currently I'm getting the following message for a *nix host:

21081:20111013:102745.243 Zabbix Host [xxx]: first network error, wait for 15 seconds

Probably not a big issue, just some item that can't be retrieved. However, in the interface everything looks OK and there are no unsupported items. I'd like to resolve this, but I don't know which item is causing the issue. Please include the item name in the log message.



 Comments   
Comment by dimir [ 2011 Oct 13 ]

Fixed in development branch svn://svn.zabbix.com/branches/dev/ZBX-4232 .

Comment by Oleksii Zagorskyi [ 2011 Oct 13 ]

Maybe the messages:
6641:20111007:091459.168 Disabling Zabbix host [Zabbix server]
6641:20111007:091959.201 Enabling Zabbix host [Zabbix server]
could be improved too?
It's not clear what it means for a host to be "Disabling" or "Enabling". Sometimes this confuses me.

I recently ran into this problem.

Comment by dimir [ 2011 Oct 13 ]

What's your suggestion? "Disabling checks on Zabbix host [Zabbix server], no response while processing item [system.uptime]"?

Comment by dimir [ 2011 Oct 13 ]

<Rich> dimir, if i recall correctly, several items can lead to disabling a host
<Rich> so giving item there might be a bit misleading

A couple more suggestions.

<dimir> "Disabling checks on Zabbix Host [Zabbix server] until it becomes available"?
<Rich> "Temporarily disabling host [Zabbix server]" ?

Comment by dimir [ 2011 Oct 13 ]

I vote for

"Temporarily disabling checks on Zabbix host [Zabbix server]: host unavailable"

Any objections?

Comment by richlv [ 2011 Oct 13 ]

"Zabbix" in "Zabbix host" seems a bit redundant. either only "host", or "Zabbix agent host" (assuming it's for passive agents only ?)

Comment by dimir [ 2011 Oct 13 ]

"Zabbix" in "Zabbix host" is a host type. We have it everywhere so we know what "type of host" (item) is disabled. So this is dynamic. Another example:

"Temporarily disabling checks on SNMP host [Zabbix server]: host unavailable"

Comment by Oleksii Zagorskyi [ 2011 Oct 13 ]

I don't know exactly how the disabling works for different item types, but if it really depends on the type, then I suggest something similar to:

"Temporarily disabling SNMP checks on host [Zabbix server]: host unavailable"
"Temporarily disabling Zabbix agent checks on host [Zabbix server]: host unavailable"
"Temporarily disabling ICMP checks on host [Zabbix server]: host unavailable"

Comment by dimir [ 2011 Oct 13 ]

Looks great. So, here is how it looks now:

25990:20111013:165818.574 SNMP item [snmp.uptime] on host [Zabbix server] failed: first network error, wait for 15 seconds
25995:20111013:165839.704 SNMP item [snmp.uptime] on host [Zabbix server] failed: another network error, wait for 15 seconds
25995:20111013:165900.736 SNMP item [snmp.uptime] on host [Zabbix server] failed: another network error, wait for 15 seconds
25995:20111013:165921.756 temporarily disabling SNMP checks on host [Zabbix server]: host unavailable
25995:20111013:170021.761 enabling SNMP checks on host [Zabbix server]

And if connection is restored before host is disabled:

26559:20111013:170913.022 SNMP item [snmp.uptime] on host [Zabbix server] failed: first network error, wait for 15 seconds
26561:20111013:170928.999 SNMP checks on host [Zabbix server]: connection restored

Same with Zabbix agent (many items):

26558:20111013:171008.264 Zabbix agent item [system.cpu.load[,avg15]] on host [Zabbix server] failed: first network error, wait for 15 seconds
26559:20111013:171008.264 Zabbix agent item [vfs.fs.size[/tmp,free]] on host [Zabbix server] failed: another network error, wait for 15 seconds
26561:20111013:171023.006 Zabbix agent item [vfs.fs.size[/tmp,free]] on host [Zabbix server] failed: another network error, wait for 15 seconds
26561:20111013:171038.008 Zabbix agent item [vfs.fs.inode[/opt,free]] on host [Zabbix server] failed: another network error, wait for 15 seconds
26561:20111013:171053.011 Zabbix agent item [net.tcp.service[pop]] on host [Zabbix server] failed: another network error, wait for 15 seconds
26561:20111013:171108.091 temporarily disabling Zabbix agent checks on host [Zabbix server]: host unavailable
26561:20111013:171208.101 enabling Zabbix agent checks on host [Zabbix server]

How's that?
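
For anyone who wants to pull the failing item out of these lines automatically, here is a minimal Python sketch. The regular expression is inferred only from the sample lines above, not from the Zabbix source, and the names `NETWORK_ERROR_RE`, `NetworkError` and `parse_network_error` are made up for this illustration.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the sample log lines in this comment (an assumption,
# not the format definition used by Zabbix itself).
NETWORK_ERROR_RE = re.compile(
    r"^\s*(?P<pid>\d+):(?P<date>\d{8}):(?P<time>\d{6}\.\d{3}) "
    r"(?P<check_type>.+?) item \[(?P<key>.+)\] on host \[(?P<host>.+?)\] failed: "
    r"(?P<kind>first|another) network error, wait for (?P<wait>\d+) seconds\s*$"
)


class NetworkError(NamedTuple):
    pid: int
    check_type: str    # e.g. "SNMP" or "Zabbix agent"
    key: str           # item key, may itself contain brackets
    host: str
    kind: str          # "first" or "another"
    wait_seconds: int


def parse_network_error(line: str) -> Optional[NetworkError]:
    """Return the parsed fields, or None if the line is not a network error."""
    m = NETWORK_ERROR_RE.match(line)
    if m is None:
        return None
    return NetworkError(
        pid=int(m["pid"]),
        check_type=m["check_type"],
        key=m["key"],
        host=m["host"],
        kind=m["kind"],
        wait_seconds=int(m["wait"]),
    )


sample = (
    "26558:20111013:171008.264 Zabbix agent item [system.cpu.load[,avg15]] "
    "on host [Zabbix server] failed: first network error, wait for 15 seconds"
)
print(parse_network_error(sample))
```

The item key group is greedy so that keys with their own brackets, such as `system.cpu.load[,avg15]`, are captured whole. Lines in the old format ("Zabbix Host [xxx]: first network error, ...") do not match and give `None`.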

Comment by Attilla de Groot [ 2011 Oct 13 ]

For me this is great!

Comment by Oleksii Zagorskyi [ 2011 Oct 13 ]

Perfect!

Comment by richlv [ 2011 Oct 14 ]

awesome, except

"SNMP checks on host [Zabbix server]: connection restored"

i think you the error message. how about "resuming..." ?

Comment by dimir [ 2011 Oct 14 ]

Well, I could not think of anything to add there but I guess "resuming" could be it.

Comment by dimir [ 2011 Oct 14 ]

Fixed in pre-1.8.9 r22410, pre-1.9.7 r22412 .

Comment by Javier Barroso [ 2013 Sep 12 ]

Hello,

We had 55 minutes without monitoring on one of our servers (proxy02). We are using 1.8.10, which should include the fix for this issue.

Do you know why this could be happening?
$ grep proxy02 zabbix_server.log
11339:20130912:110557.170 Zabbix agent item [kern.sockets.orphan] on host [proxy02] failed: first network error, wait for 15 seconds
11342:20130912:115932.286 resuming Zabbix agent checks on host [proxy02]: connection restored

Thank you
PS: We will try to upgrade our Zabbix; I know this is a big "DO NOT LOOK AT THIS COMMENT" ...
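
To measure a gap like the one above directly from the log, the two timestamps can be subtracted. This is a minimal Python sketch that assumes the `PID:YYYYMMDD:HHMMSS.mmm` prefix seen in the lines quoted in this issue. The helper names `log_timestamp` and `gap_between` are hypothetical.

```python
from datetime import datetime, timedelta


def log_timestamp(line: str) -> datetime:
    """Extract the timestamp from a line starting with PID:YYYYMMDD:HHMMSS.mmm."""
    _pid, date, rest = line.strip().split(":", 2)
    time_part = rest.split(" ", 1)[0]
    return datetime.strptime(f"{date}:{time_part}", "%Y%m%d:%H%M%S.%f")


def gap_between(first_line: str, second_line: str) -> timedelta:
    """Time elapsed between two log lines."""
    return log_timestamp(second_line) - log_timestamp(first_line)


first = (
    "11339:20130912:110557.170 Zabbix agent item [kern.sockets.orphan] "
    "on host [proxy02] failed: first network error, wait for 15 seconds"
)
restored = (
    "11342:20130912:115932.286 resuming Zabbix agent checks on host "
    "[proxy02]: connection restored"
)
print(gap_between(first, restored))
```

For the two lines quoted above, the difference is 53 minutes 35.116 seconds, which roughly matches the reported 55 minutes.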
