[ZBX-25824] Unreachable poller ignores update interval for host with flapping availability Created: 2025 Jan 03  Updated: 2025 Mar 18  Resolved: 2025 Feb 04

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 7.0.6, 7.2.1
Fix Version/s: 7.0.9rc1, 7.2.3rc1, 7.4.0alpha1

Type: Problem report Priority: Blocker
Reporter: Edgar Akhmetshin Assignee: Dmitrijs Goloscapovs
Resolution: Fixed Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: 40h
Original Estimate: Not Specified
Environment:

RHEL 9.5


Attachments: Text File 0001-.-ZBX-25824-patched.patch     PNG File image-2025-01-12-16-29-01-949.png    
Issue Links:
Duplicate
Team: Team A
Sprint: S24-W50/51/52/1, S25-W2/3, DOC S25-W2/3
Story Points: 2

 Description   

Steps to reproduce:

  1. Monitor a large device (for example, Extreme Networks) with an SNMP template.
  2. Use the old SNMP approach, with discovery[] rules configured with an update interval of 1h.
  3. Set Timeout to 30 (Server) and 29 (Proxy).
  4. Start monitoring the device through the Proxy and get the first connection errors.
  5. See the unreachable pollers' load go high; no metrics at all can be gathered from the device, since the device is flooded with discovery[] requests.
  6. Check the tcpdump traffic and the logs: Zabbix starts overloading the device with discovery[] walks every 1 minute. The logs list the discovery OIDs every 1 minute.
  7. Disabling the discovery rules offloads the unreachable pollers immediately, and the device starts answering requests for common items.

Result:

1714:20250102:131040.025 SNMP agent item ".1.3.6.1.4.1.2272.1.101.1.1.2.1.3.[1]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1737:20250102:131046.043 SNMP agent item "1.3.6.1.4.1.2272.1.4.8.1.1.2.[1]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1735:20250102:131047.080 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1777:20250102:131102.645 resuming SNMP agent checks on host "KNOWHERE": connection restored
1773:20250102:131126.792 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1775:20250102:131141.652 resuming SNMP agent checks on host "KNOWHERE": connection restored
1741:20250102:131205.705 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1763:20250102:131238.697 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1711:20250102:131240.722 resuming SNMP agent checks on host "KNOWHERE": connection restored
1797:20250102:131316.077 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1748:20250102:131343.402 SNMP agent item ".1.3.6.1.4.1.2272.1.101.1.1.2.1.3.[5]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1719:20250102:131349.421 SNMP agent item ".1.3.6.1.4.1.2272.1.101.1.1.2.1.3.[9]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1764:20250102:131404.735 resuming SNMP agent checks on host "KNOWHERE": connection restored
1730:20250102:131428.539 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1749:20250102:131443.780 resuming SNMP agent checks on host "KNOWHERE": connection restored
1780:20250102:131507.888 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1794:20250102:131540.125 SNMP agent item ".1.3.6.1.4.1.2272.1.101.1.1.2.1.3.[9]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1760:20250102:131555.637 resuming SNMP agent checks on host "KNOWHERE": connection restored
1740:20250102:131619.612 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1800:20250102:131634.632 resuming SNMP agent checks on host "KNOWHERE": connection restored
1755:20250102:131658.803 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1756:20250102:131731.782 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1713:20250102:131746.636 resuming SNMP agent checks on host "KNOWHERE": connection restored
1748:20250102:131810.314 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1747:20250102:131840.284 SNMP agent item "1.3.6.1.4.1.2272.1.4.8.1.1.2.[1]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1710:20250102:131843.309 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1786:20250102:131844.853 resuming SNMP agent checks on host "KNOWHERE": connection restored
1768:20250102:131916.676 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
1765:20250102:131943.156 SNMP agent item "1.3.6.1.4.1.2272.1.4.8.1.1.2.[1]" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1713:20250102:131949.724 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: another network error, wait for 15 seconds
1753:20250102:132004.725 resuming SNMP agent checks on host "KNOWHERE": connection restored
1722:20250102:132028.392 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds
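
The retry cadence in the log above can be checked with a short script. This is only a minimal sketch, not part of the original report: it assumes the log line layout `<pid>:<YYYYMMDD>:<HHMMSS.mmm> <message>` seen in the excerpt, and the helper names `parse_line` and `first_error_gaps` are hypothetical. It measures the time between consecutive "first network error" entries, which should be close to the 1h update interval if discovery followed it.

```python
import re
from datetime import datetime

# Assumed log prefix, taken from the excerpt: <pid>:<YYYYMMDD>:<HHMMSS.mmm> <message>
LINE_RE = re.compile(r"^\s*(\d+):(\d{8}):(\d{6}\.\d{3}) (.*)$")


def parse_line(line):
    """Return (timestamp, message) for a log line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    _pid, date, time, message = m.groups()
    ts = datetime.strptime(date + time, "%Y%m%d%H%M%S.%f")
    return ts, message


def first_error_gaps(lines):
    """Seconds between consecutive 'first network error' entries."""
    stamps = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and "first network error" in parsed[1]:
            stamps.append(parsed[0])
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# Three lines copied from the log excerpt above.
sample = [
    '1773:20250102:131126.792 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds',
    '1741:20250102:131205.705 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds',
    '1797:20250102:131316.077 SNMP agent item "1.3.6.1.4.1.2272.1.4.8" on host "KNOWHERE" failed: first network error, wait for 15 seconds',
]

gaps = first_error_gaps(sample)
print(gaps)
```

For these sample lines the gaps are well under two minutes, far below the configured 1h interval.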

Expected:
Discovery polling follows the update interval, and the target device is not overloaded with endless requests every 1 minute, even if data retrieval is flapping due to device slowness or timeouts.

 

Such behaviour was not observed with LTS 6.0.37, which was used before the upgrade. May be related to ZBX-22864.



 Comments   
Comment by Anton [ 2025 Jan 12 ]

I have the same issue. I added a new template where discovery uses walk[oid1...]. In my case, it was resolved by increasing "Max repetition count".

Comment by Vladislavs Sokurenko [ 2025 Jan 14 ]

Is it an option to replace discovery with walk and see whether the issue persists, or whether more retries need to be allowed for walk?

Comment by Dmitrijs Goloscapovs [ 2025 Jan 17 ]

Available in versions:

Comment by Andrii Fediuk [ 2025 Jan 28 ]

Updated documentation:

  • SNMP agent: 7.0, 7.2, 7.4 (Note on UnavailableDelay for reducing request frequency and partial data issues).
Generated at Sun Apr 20 20:47:42 EEST 2025 using Jira 9.12.4#9120004-sha1:625303b708afdb767e17cb2838290c41888e9ff0.