Type: Incident report
Resolution: Duplicate
Priority: Major
Fix Version/s: None
Affects Version/s: 2.4.1
Component/s: Proxy (P)
Labels: None
Environment:

RHEL 6.6.

Uname -a: Linux zabbix5-net-proxy2.doit.missouri.edu 2.6.32-504.1.3.el6.x86_64 #1 SMP Fri Oct 31 11:37:10 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Memory: 4 GB real, 2.5 GB swap.

Net-SNMP 5.5-50

Zabbix proxy package: zabbix-proxy-2.4.1-1.el6.x86_64

500 SNMP pollers


We have a number of network switches assigned to proxies. Each switch has 1000-4000 Items, depending on its number of physical ports. In one test case, the switch has 1800 Items.

We saw that the queues for the proxies were growing long, with tens of thousands of Items more than 10 minutes overdue. The proxy log contained many messages like this one:

Nov 26 07:42:32 zabbix5-net-proxy2 zabbix_proxy[2758]: SNMP agent item "mib-2.ifoutdiscards.["10115"]" on host "c2960s202-AlphaEpsilonPi-1" failed: first network error, wait for 1 seconds

SNMP tests from the command line and packet captures confirmed that SNMP queries work. There are no access issues, and the queried OIDs do exist. The same Host appears in the logs throughout the day, but the specific OID named in the error changes.
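The pattern described above (one Host recurring, with a different Item each time) can be checked by tallying the log. The following is only a rough sketch: the regular expression is modeled on the single log line quoted in this report, and the names `LINE_RE` and `summarize_failures` are made up for illustration. Other message variants may need a different pattern.

```python
import re
from collections import defaultdict

# Pattern modeled on the one log line quoted in this report (an assumption;
# other proxy log formats may differ). The item key itself contains quotes,
# so it is matched greedily up to the '" on host "' separator.
LINE_RE = re.compile(
    r'SNMP agent item "(?P<item>.+)" on host "(?P<host>[^"]+)" '
    r'failed: (?P<reason>.+)$'
)

def summarize_failures(lines):
    """Map each host to the set of item keys reported as failed."""
    failures = defaultdict(set)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            failures[match.group("host")].add(match.group("item"))
    return dict(failures)

# The sample is the log line from this report, copied verbatim.
sample = [
    'Nov 26 07:42:32 zabbix5-net-proxy2 zabbix_proxy[2758]: SNMP agent item '
    '"mib-2.ifoutdiscards.["10115"]" on host "c2960s202-AlphaEpsilonPi-1" '
    'failed: first network error, wait for 1 seconds',
]

summary = summarize_failures(sample)
for host, items in summary.items():
    print(host, len(items), sorted(items))
```

Fed the full proxy log instead of `sample`, a Host with many distinct Item keys would match the behavior described here: the Host stays the same while the failing OID varies.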

When bulk requests were disabled on the host, the log messages went away.

I performed a packet capture on another monitored host that had similar log messages and where bulk requests were left enabled. None of the SNMP queries sent requested more than one OID per packet. In other words, bulk requests were enabled but were not actually being sent.

Another difference is that with bulk requests enabled, all of the SNMP requests to the Host were sent at the same time. With bulk requests disabled, the SNMP requests for the various Items were spread out over the entire polling period. (Most of the SNMP Items are queried every 300 seconds for all Hosts.)
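To put rough numbers on that difference, here is a back-of-the-envelope sketch for the 1800-Item test switch. It assumes one SNMP request per Item (consistent with the capture, where each packet carried a single OID) and that every Item uses the 300-second interval, which is only true for most of them. The function name is hypothetical.

```python
def average_request_rate(item_count, interval_seconds):
    """Requests per second if polls are spread evenly over the interval."""
    return item_count / interval_seconds

items = 1800      # Items on the test switch, from this report
interval = 300    # polling interval in seconds, from this report

spread_rate = average_request_rate(items, interval)
print(f"spread over the interval: {spread_rate:.1f} requests/s")
print(f"sent at the same time: up to {items} requests in one burst")
```

Under these assumptions the spread-out case averages 6 requests per second to the switch, while the same-time case can present the switch with up to 1800 single-OID requests at once.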

Duplicates: ZBX-8538 "allow single retry on libnetsnmp level - it will give positive effect" (Closed)

Assignee: Unassigned
Reporter: Justin McNutt
Votes: 0
Watchers: 1

Created: 2014 Nov 26 15:49
Updated: 2017 May 30 17:56
Resolved: 2014 Nov 26 16:18
