[ZBX-4164] SNMPv3 stops working sometimes Created: 2011 Sep 22 Updated: 2017 May 30 Resolved: 2013 Sep 23 |
|
Status: | Closed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 1.8.6 |
Fix Version/s: | None |
Type: | Incident report | Priority: | Major |
Reporter: | Michael Schwartzkopff | Assignee: | Unassigned |
Resolution: | Won't fix | Votes: | 0 |
Labels: | snmpv3 | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
SLES11.1 |
Description |
From time to time SNMPv3 requests (items) just stop delivering data. snmpget from the command line for the same OID works. I looked a little deeper into the packets exchanged between the Zabbix server and the network node. It seems that the Zabbix server gets confused about the time since the last boot of the device and thus only gets SNMPv3 "usmStatsNotInTimeWindows.0" errors.

Packets on the line:

1) { USM B=0 T=0 U= } { ScopedPDU E= C= { GetRequest(14) R=1978855259 } } }
No boot count and time are set (B=0, T=0). This packet is exchanged to get the EngineID from the node.

2) y.y.y.y:161 > x.x.x.x:60674: { SNMPv3 { F= } { USM B=0 T=0 U= } { ScopedPDU E= 0x800x000x000x090x030x000x050x730xB70x9D0x40 C= { Report(33) R=1978855259 .1.3.6.1.6.3.15.1.1.4.0=914879 } } }

3) x.x.x.x:60674 > y.y.y.y:161: { SNMPv3 { F=ar } { USM B=19 T=3528723 U=xxxx } { ScopedPDU E= 0x800x000x000x090x030x000x050x730xB70x9D0x40 C= { GetRequest(34) R=1978855258 .1.3.6.1.2.1.2.2.1.2.436207616 } } }
The Zabbix server asks the node for the OID in question. Please note that Zabbix suddenly "knows" the number of boots and the time since the last boot of the node (B=19, T=3528723).

4) y.y.y.y:161 > x.x.x.x:60674: { SNMPv3 { F=a } { USM B=19 T=153362 U=xxxx } { ScopedPDU E= 0x800x000x000x090x030x000x050x730xB70x9D0x40 C= { Report(33) R=1978855258 .1.3.6.1.6.3.15.1.1.2.0=144804 } } }
The node reports the usmStatsNotInTimeWindows.0 error (OID 1.3.6.1.6.3.15.1.1.2.0) and reports its REAL time since the last boot: T=153362.

5) x.x.x.x:60674 > y.y.y.y:161: { SNMPv3 { F=ar } { USM B=19 T=3528724 U=xxxx } { ScopedPDU E= 0x800x000x000x090x030x000x050x730xB70x9D0x40 C= { GetRequest(34) R=1978855258 .1.3.6.1.2.1.2.2.1.2.436207616 } } }

6) y.y.y.y:161 > x.x.x.x:60674: { SNMPv3 { F=a }{ USM B=19 T=153363 U=oper } { ScopedPDU E= 0x800x000x000x090x030x000x050x730xB70x9D0x40 C= { Report(33) R=1978855258 .1.3.6.1.6.3.15.1.1.2.0=144805 } } } |
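The rejection in packets 4 and 6 matches the timeliness rule of RFC 3414 (section 3.2): an authoritative engine discards an authenticated message when the boots counter differs from its own, or when the time value is more than 150 seconds away from its own. A minimal Python sketch of that rule, applied to the values from the trace, might look like this. The function and constant names are illustrative and are not net-snmp or Zabbix APIs.

```python
# Sketch of the timeliness check an authoritative SNMPv3 engine (the agent)
# applies to an authenticated request, following RFC 3414 section 3.2.
# All names here are illustrative assumptions, not a real library API.

TIME_WINDOW_SECONDS = 150      # window size defined by RFC 3414
MAX_ENGINE_BOOTS = 2147483647  # at this boots value the engine rejects everything


def in_time_window(local_boots, local_time, msg_boots, msg_time):
    """Return True if the boots/time values carried by a message are acceptable."""
    if local_boots == MAX_ENGINE_BOOTS:
        return False
    if msg_boots != local_boots:
        return False
    return abs(msg_time - local_time) <= TIME_WINDOW_SECONDS


# Packets 3 and 4 from the trace: the server sent B=19 T=3528723,
# while the node's own values were B=19 T=153362.
stale = in_time_window(19, 153362, 19, 3528723)

# A request carrying the node's own values would have been accepted.
fresh = in_time_window(19, 153362, 19, 153362)

print(stale, fresh)
```

With the trace values the two time stamps are more than three million seconds apart, so the request falls far outside the window even though the boots counter matches.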
Comments |
Comment by Aleksandrs Saveljevs [ 2011 Sep 23 ] |
A quick thought: this might or might not be similar to |
Comment by Michael Schwartzkopff [ 2011 Sep 23 ] |
Hi, it really seems to be the same problem as described in I looked, but we do NOT have any duplicated snmpEngineIDs in our net. Two other indicators show me that the real cause cannot be duplicated EngineIDs: 1) After a reboot of the device, Zabbix got data again. 2) If you take a close look at the 3rd packet in the trace, Zabbix / net-snmp sends the *wrong* snmpEngineTime to the agent. Zabbix/net-snmp does not even bother to correct its snmpEngineTime after the device has reported the correct value. Michael |
Comment by richlv [ 2011 Sep 23 ] |
"After a reboot of the device Zabbix got data again." maybe some other device stops delivering the data, though ? |
Comment by Michael Schwartzkopff [ 2011 Sep 23 ] |
Hi, as written in my first comment, snmpget to the device always worked. This fact excludes such a trivial explanation. |
Comment by richlv [ 2011 Sep 23 ] |
as far as i'm aware, snmpget is not a valid test for such a problem, as it only tests a single device at a time. snmpget also worked flawlessly in |
Comment by Michael Schwartzkopff [ 2011 Sep 23 ] |
Yes. That is why it cannot be a problem of the SNMP agent. It must be a problem on the master side being located in Zabbix or net-snmp. |
Comment by richlv [ 2011 Sep 23 ] |
to clarify, if engine ids match, that is a problem on the agent side, but only if such agents are queried by the same client (or management station). snmpget just does not expose the problem. |
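One way to see why a shared engine ID only hurts a long-running manager: under RFC 3414 a non-authoritative engine may raise its stored boots/time values for an engine ID, but never lowers them. The sketch below is a simplified Python model of such a per-engine-ID store, not net-snmp's actual code. The class name and the existence of a second device with a larger uptime are assumptions made for illustration, and the passage of time is ignored.

```python
# Toy model of a manager-side store of (boots, time) per engine ID.
# Simplified from the update rule in RFC 3414 section 3.2; this is NOT
# net-snmp code, and the two-device scenario below is hypothetical.


class EngineTimeCache:
    def __init__(self):
        self._entries = {}  # engine_id -> (boots, latest_received_time)

    def update(self, engine_id, boots, engine_time):
        """Accept new values only if they move forward, never backward."""
        known = self._entries.get(engine_id)
        if (
            known is None
            or boots > known[0]
            or (boots == known[0] and engine_time > known[1])
        ):
            self._entries[engine_id] = (boots, engine_time)

    def lookup(self, engine_id):
        return self._entries[engine_id]


SHARED_ID = "8000000903000573B79D40"  # engine ID seen in the trace

# Long-running manager: learns the values of a hypothetical device A first.
manager = EngineTimeCache()
manager.update(SHARED_ID, 19, 3528723)

# Device B (same engine ID) reports a smaller time: the update is ignored.
manager.update(SHARED_ID, 19, 153362)
after_report = manager.lookup(SHARED_ID)

# Device B reboots, its boots counter increases, and the store follows.
manager.update(SHARED_ID, 20, 5)
after_reboot = manager.lookup(SHARED_ID)

# A one-shot client such as snmpget starts with an empty store every time.
one_shot = EngineTimeCache()
one_shot.update(SHARED_ID, 19, 153362)
one_shot_view = one_shot.lookup(SHARED_ID)

print(after_report, after_reboot, one_shot_view)
```

In this model the manager keeps sending the larger time value until the boots counter changes, while a fresh process always adopts whatever the queried device reports.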
Comment by Michael Schwartzkopff [ 2011 Sep 23 ] |
Questions:
Perhaps this is not a Zabbix problem, but related to net-snmp. Please see also my posts on the mailing list there. |
Comment by richlv [ 2011 Sep 23 ] |
not sure about the difference - it could be lib-net-snmp doing something. i was just pointing out that snmpget working does not exclude engineid being the source of the problem |
Comment by Michael Schwartzkopff [ 2012 Mar 08 ] |
Please see my discussion with on the net-snmp mailing list: |
Comment by Eric Gearhart [ 2012 Apr 16 ] |
I think I am being bitten by an issue that is closely related to what Michael is reporting... I'm running Zabbix 2.0.0rc2, and all my hosts are SNMPv3 hosts. Periodically Zabbix simply "quits working" when doing its SNMP item polls, but snmpgets/snmpwalks work perfectly. At the company I work at, it's getting to the point where this is pushing us to abandon Zabbix as a possible monitoring solution and use Cacti+thold instead. |
Comment by Michael Schwartzkopff [ 2012 Apr 17 ] |
Please see my discussion on the mailing list. It seems to be a Zabbix issue. Sorry that the company is not able to debug this issue further. We killed SNMPv3 because of this problem. Cacti is not a good alternative to Zabbix. Perhaps have a look at opennms.org. Greetings, Michael. |
Comment by Michael Schwartzkopff [ 2013 Sep 23 ] |
Root cause of the problem: Duplicate engineID of the host. See also: Solution: Add the line "engineIDType 3" to the snmpd.conf of net-snmp and restart the agent. The agent will calculate a new, RFC-conformant engineID and Zabbix will resume working. |
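To check a network for this condition, the snmpEngineID.0 value (OID 1.3.6.1.6.3.10.2.1.1.0) of every monitored host can be collected and grouped. A minimal Python sketch of the grouping step follows, assuming the IDs have already been gathered as hex strings. The host names and engine IDs are made up, and the function names are illustrative.

```python
from collections import defaultdict

# Sketch: group hosts by engine ID and report IDs used by more than one host.
# Host names and engine IDs below are invented sample data.


def normalize(engine_id):
    """Strip spaces and an optional 0x prefix, and lowercase the hex string."""
    text = engine_id.replace(" ", "").lower()
    return text[2:] if text.startswith("0x") else text


def find_duplicate_engine_ids(engine_ids_by_host):
    """Return {engine_id: [hosts]} for every engine ID shared by 2+ hosts."""
    hosts_by_engine_id = defaultdict(list)
    for host, engine_id in engine_ids_by_host.items():
        hosts_by_engine_id[normalize(engine_id)].append(host)
    return {
        engine_id: sorted(hosts)
        for engine_id, hosts in hosts_by_engine_id.items()
        if len(hosts) > 1
    }


sample = {
    "host-a": "0x80001F8880AABBCCDD",
    "host-b": "80 00 1f 88 80 aa bb cc dd",  # same ID, different notation
    "host-c": "0x80001F888011223344",
}

duplicates = find_duplicate_engine_ids(sample)
print(duplicates)
```

Any host listed in the result shares its engine ID with at least one other host and is a candidate for the reconfiguration described above.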
Comment by Michael Schwartzkopff [ 2013 Sep 23 ] |
It is not a Zabbix issue. It can be solved by reconfiguration of the SNMP agent on the monitored host. See the solution in the comment. |
Comment by Michael Schwartzkopff [ 2013 Sep 23 ] |
Case closed. |
Comment by Michael Schwartzkopff [ 2014 Dec 06 ] |
Update in Hi, the problem is documented in When I restart the Zabbix server, the items get collected again. This is proof that the fault is located within the Zabbix server. Michael. |