[ZBX-11726] icmp ping detects duplicate answer as ping loss Created: 2017 Jan 23  Updated: 2024 Apr 10  Resolved: 2017 Feb 27

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: None
Fix Version/s: None

Type: Incident report Priority: Trivial
Reporter: Marko Sandholm Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: fping, icmpping
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian Wheezy


Attachments: PNG File Template_ICMPLoss.PNG    
Issue Links:
Duplicate
Team: Team A
Sprint: Sprint 1, Sprint 2
Story Points: 4

 Description   

The destination host is a load balancer in front of a set of hosts. Pinging the load balancer causes the hosts behind it to respond as well, so multiple replies are received, and the duplicate answers are misinterpreted as packet loss even though there is none.

Global script ping is configured as: /bin/ping -c 3 {HOST.CONN} 2>&1

Below is an example response from a live host (IP masked):

PING xxx.xx.xxx.xxx (xxx.xx.xxx.xxx) 56(84) bytes of data.
64 bytes from xxx.xx.xxx.xxx: icmp_req=1 ttl=117 time=4.26 ms
64 bytes from xxx.xx.xxx.xxx: icmp_req=1 ttl=117 time=4.28 ms (DUP!)
64 bytes from xxx.xx.xxx.xxx: icmp_req=2 ttl=117 time=4.36 ms
64 bytes from xxx.xx.xxx.xxx: icmp_req=2 ttl=117 time=4.38 ms (DUP!)
64 bytes from xxx.xx.xxx.xxx: icmp_req=3 ttl=117 time=4.32 ms

--- xxx.xx.xxx.xxx ping statistics ---
3 packets transmitted, 3 received, +2 duplicates, 0% packet loss, time 2001ms
rtt min/avg/max/mdev = 4.260/4.324/4.388/0.064 ms

See comments below for correct source data.



 Comments   
Comment by Oleksii Zagorskyi [ 2017 Jan 23 ]

The previous comment was incorrect and should be ignored. Here is the fixed version.
The result comes from the "ping" command-line utility. It does not depend on any Zabbix component.
This report should be closed.

Comment by Marko Sandholm [ 2017 Jan 23 ]

Yes, the results come from the ping command, but the result is parsed incorrectly by Zabbix server, which causes the ping loss trigger to fire even though there is no ping loss.

Comment by Marko Sandholm [ 2017 Jan 23 ]

I forgot to mention that this issue is related to the Simple check and its item icmppingloss. The ping output was included only to show how it looks from the Global script side.

Comment by Oleksii Zagorskyi [ 2017 Jan 23 ]

When creating the issue you mentioned only "Global script" as a feature.

If you are talking about triggers, it should be related to simple checks, which are performed by another tool: fping.
Just in case - their bug tracker is https://github.com/schweikert/fping/issues

Comment by Oleksii Zagorskyi [ 2017 Jan 23 ]

Please show those examples using fping on the Zabbix server command line. There is no need to use global scripts.

Comment by Marko Sandholm [ 2017 Jan 23 ]

This is from server using fping:

fping -c 3 xxx.xx.xxx.xxx
xxx.xx.xxx.xxx : [0], 96 bytes, 4.35 ms (4.35 avg, 0% loss)
xxx.xx.xxx.xxx : [0], 96 bytes, 4.41 ms (4.38 avg, 200% return)
xxx.xx.xxx.xxx : [1], 96 bytes, 4.38 ms (4.38 avg, 150% return)

xxx.xx.xxx.xxx : xmt/rcv/%return = 2/3/150%, min/avg/max = 4.35/4.38/4.41

Comment by Oleksii Zagorskyi [ 2017 Jan 23 ]

Please also show the exact item key and the values it collects.

Comment by Marko Sandholm [ 2017 Jan 23 ]

Simple check icmploss item from template.

Comment by dimir [ 2017 Jan 23 ]

Could you please also attach some of the latest values?

Comment by Marko Sandholm [ 2017 Jan 24 ]

Yes, here:

2017-01-24 08:54:03	33.3333
2017-01-24 08:53:03	33.3333
2017-01-24 08:52:03	33.3333
2017-01-24 08:51:03	33.3333
2017-01-24 08:50:03	33.3333
2017-01-24 08:49:03	33.3333
2017-01-24 08:48:03	33.3333
2017-01-24 08:47:03	33.3333
2017-01-24 08:46:03	33.3333
2017-01-24 08:45:03	33.3333

Comment by Viktors Tjarve [ 2017 Feb 17 ]

Hi Marko,
It seems that this case is quite specific, but an issue exists. To understand how to handle this in Zabbix, could you please increase the server DebugLevel to 4 and attach the log, or the part of it from the icmp pinger process? What I'm mainly interested in is how much output, and what exactly, Zabbix is getting from fping in your case. It should look similar to this:

 22400:20170217:092350.288 read line [xxx.xx.xxx.xxx : [0], 96 bytes, 4.35 ms (4.35 avg, 0% loss)]
 22400:20170217:092350.288 read line [xxx.xx.xxx.xxx : [0], 96 bytes, 4.41 ms (4.38 avg, 200% return)]
 22400:20170217:092350.288 read line [xxx.xx.xxx.xxx : [1], 96 bytes, 4.38 ms (4.38 avg, 150% return)]
 22400:20170217:092350.288 read line []
 22400:20170217:092350.288 read line [xxx.xx.xxx.xxx : 4.35 4.38 4.41]
 22400:20170217:092350.288 End of process_ping()
 22400:20170217:092350.288 End of do_ping():SUCCEED
 22400:20170217:092350.288 In process_values()
 22400:20170217:092350.288 host [xxx.xx.xxx.xxx] cnt=3 rcv=2 min=0.004350 max=0.004380 sum=0.008730
 22400:20170217:092350.288 In process_value()
 22400:20170217:092350.288 End of process_value()
 22400:20170217:092350.288 End of process_values()
 22400:20170217:092350.289 End of process_pinger_hosts()
 22400:20170217:092350.289 In DCconfig_get_poller_nextcheck() poller_type:3
 22400:20170217:092350.289 End of DCconfig_get_poller_nextcheck():1487316232
 22400:20170217:092350.289 __zbx_zbx_setproctitle() title:'icmp pinger #1 [got 1 values in 2.597363 sec, idle 2 sec]'

Thanks in advance.

Comment by Marko Sandholm [ 2017 Feb 17 ]

Hi,
is this enough:

 24983:20170217:101203.465 read line [aaa.bb.ccc.ccc : [0], 96 bytes, 4.35 ms (4.35 avg, 0% loss)]
 24983:20170217:101203.465 read line [aaa.bb.ccc.ccc : [0], 96 bytes, 4.37 ms (4.36 avg, 200% return)]
 24983:20170217:101204.465 read line [aaa.bb.ccc.ccc : [1], 96 bytes, 4.42 ms (4.38 avg, 150% return)]
 24983:20170217:101204.465 read line [aaa.bb.ccc.ccc : [1], 96 bytes, 4.69 ms (4.45 avg, 200% return)]
 24983:20170217:101205.520 read line [aaa.bb.ccc.ccc : 4.35 4.42]
 24983:20170217:101205.524 host [aaa.bb.ccc.ccc] cnt=3 rcv=2 min=0.004350 max=0.004420 sum=0.008770
 25025:20170217:101303.146 In add_icmpping_item() addr:'aaa.bb.ccc.ccc' count:3 interval:0 size:0 timeout:0
 25025:20170217:101303.146 In add_icmpping_item() addr:'aaa.bb.ccc.ccc' count:3 interval:0 size:0 timeout:0
 25025:20170217:101303.146 In add_icmpping_item() addr:'aaa.bb.ccc.ccc' count:3 interval:0 size:0 timeout:0
 25025:20170217:101303.147 In add_pinger_host() addr:'aaa.bb.ccc.ccc'
 25025:20170217:101303.147 In add_pinger_host() addr:'aaa.bb.ccc.ccc'
 25025:20170217:101303.147 In add_pinger_host() addr:'aaa.bb.ccc.ccc'
 25025:20170217:101303.147     aaa.bb.ccc.ccc
 25025:20170217:101303.285 read line [aaa.bb.ccc.ccc: [0], 96 bytes, 4.22 ms (4.22 avg, 0% loss)]
 25025:20170217:101303.286 read line [aaa.bb.ccc.ccc: [0], 96 bytes, 4.38 ms (4.30 avg, 200% return)]
 25025:20170217:101304.291 read line [aaa.bb.ccc.ccc: [1], 96 bytes, 4.51 ms (4.37 avg, 150% return)]
 25025:20170217:101304.291 read line [aaa.bb.ccc.ccc: [1], 96 bytes, 4.53 ms (4.41 avg, 200% return)]
 25025:20170217:101305.343 read line [aaa.bb.ccc.ccc: 4.22 4.51]
 25025:20170217:101305.348 host [aaa.bb.ccc.ccc] cnt=3 rcv=2 min=0.004220 max=0.004510 sum=0.008730

viktors.tjarve Thanks. This should do for now.
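
For readers following the log above, here is a minimal Python sketch of how the reported 33.3333 can fall out of the fping summary line. It assumes a parser that counts one received reply per value on the summary line and compares that with the requested count (cnt=3 in the log). The function name `loss_from_summary` and the handling of a "-" token as an unanswered request are assumptions made for illustration; this is not Zabbix's actual parsing code.

```python
def loss_from_summary(line: str, requested: int) -> float:
    """Percent loss implied by one per-target summary line.

    Hypothetical helper for illustration; not Zabbix's actual parser.
    """
    # Everything after the last ':' is the list of round-trip times.
    _, _, times = line.rpartition(":")
    # A '-' token is treated as an unanswered request (assumption).
    received = sum(1 for token in times.split() if token != "-")
    return 100.0 * (requested - received) / requested


# Summary line and request count taken from the debug log above.
print(round(loss_from_summary("aaa.bb.ccc.ccc : 4.35 4.42", 3), 4))
```

With only two values on the summary line for three requested packets, the sketch yields 33.3333, the same figure as the collected item values, even though every reply that was sent did arrive.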

Comment by Viktors Tjarve [ 2017 Feb 21 ]

Hi Marko,
The behavior of fping seems very unusual. In some versions there are known bugs. Which version of fping are you using?

Comment by Marko Sandholm [ 2017 Feb 21 ]

Hi,

version seems to be 3.2-1.

:~$ fping -v
fping: Version 3.2
fping: comments to [email protected]

viktors.tjarve Please upgrade to the latest fping version. That will solve this problem. In version 3.2 duplicates are mishandled; the behavior is different in versions 3.13 and 3.15.

viktors.tjarve Marko, did you try upgrading fping and did that solve the problem?
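
A quick way to check an installation against this advice is to parse the `fping -v` output shown above and compare versions numerically. The sketch below is only an illustration: `parse_fping_version` and `MIN_VERSION` are hypothetical names, and the 3.13 threshold is an assumption based on this thread (3.2 mishandles duplicates, 3.13 and 3.15 behave differently); the exact release where the behavior changed is not stated here.

```python
import re
from typing import Tuple

# Assumption: 3.13 is the oldest version this thread reports as behaving
# correctly; the release that actually changed the behavior is not stated.
MIN_VERSION = (3, 13)


def parse_fping_version(output: str) -> Tuple[int, ...]:
    """Extract the version from `fping -v` output as a tuple of integers."""
    match = re.search(r"Version\s+(\d+(?:\.\d+)*)", output)
    if match is None:
        raise ValueError("no fping version found in output")
    return tuple(int(part) for part in match.group(1).split("."))


installed = parse_fping_version("fping: Version 3.2")
print(installed, installed >= MIN_VERSION)
```

Comparing integer tuples matters here: as plain strings "3.2" sorts after "3.13", which would wrongly suggest that 3.2 is the newer release.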

Comment by Viktors Tjarve [ 2017 Feb 21 ]

(1) [D]
Add a note to the documentation about the recommended version of fping.

viktors.tjarve RESOLVED.
Added information about this bug to known issues in documentation 2.0-3.4 and added links to it from the ICMP ping section in 'Simple checks'.

martins-v Reviewed, thanks. I simplified the link formatting a bit. CLOSED.

Comment by Marko Sandholm [ 2017 Feb 22 ]

Hi,
I installed fping 3.13 from source and the ping check now works correctly.

viktors.tjarve Good to know. Thanks.

Generated at Fri Apr 26 21:37:52 EEST 2024 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.