[ZBX-19775] [bulk processing] recurring tooBig error-status Created: 2021 Aug 04 Updated: 2022 Oct 08 |
|
Status: | Confirmed |
Project: | ZABBIX BUGS AND ISSUES |
Component/s: | Server (S) |
Affects Version/s: | 5.4.2 |
Fix Version/s: | None |
Type: | Problem report | Priority: | Major |
Reporter: | thomas | Assignee: | Zabbix Development Team |
Resolution: | Unresolved | Votes: | 3 |
Labels: | SNMP, bulk | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified | ||
Environment: |
Zabbix installation "All In One" VM |
Attachments: |
Issue Links: |
|
Description |
I'm new to Zabbix, but my first impression is that you do a very good job. Bulk processing is a great feature!
Steps to reproduce: Difficult to reproduce. I'm polling a Cisco device with "Use bulk requests" enabled, and every hour I get a log message on the Cisco device like this one:
2021 Aug 3 01:00:16 switch %SNMPD-3-ERROR: SNMP log error : SNMP Operation (GET) failed. Reason:1 reqId (904207711) errno (2) error index (0)
It seems to be related to the bulk processing mechanism, which tries to determine the maximum number of SNMP items Zabbix can retrieve from a device in one SNMP request. For my device, max_succeed=59 seems to work most of the time, but every hour one specific get-request, related to the network discovery rule, fails.
Result:
2021 Aug 3 01:00:16 switch %SNMPD-3-ERROR: SNMP log error : SNMP Operation (GET) failed. Reason:1 reqId (904207711) errno (2) error index (0)
Expected: |
Comments |
Comment by thomas [ 2021 Aug 10 ] |
Hello, I did another test (see attached picture "Wireshark I/O Graphe - snmp.variable_bindings") and found the following:
(1) The Cisco host is added to monitoring with the usual template at around 16:12:43.
(2) At 17:00, all regular items are polled and discovery processing occurs. From then on, discovered items and regular items are polled at regular intervals according to the configuration, and max_items is increased by a factor of 3/2 at the end of each successful poll.
(3) The first failure occurs at 17:09:43 (SNMP error-status = tooBig following execution of zbx_snmp_process_standard, polling 63 items). At this point, max_succeed = 42 and min_fail = 63.
(4) max_items is then increased again at the end of each successful poll, but only one at a time.
(5) At 18:00:43 a new failure occurs (SNMP error-status = tooBig following execution of zbx_snmp_process_standard, polling 59 items). At this point, max_succeed = 58 and min_fail = 59.
(6) max_items is no longer increased, because DCconfig_get_suggested_snmp_vars_nolock returns MAX(dc_snmp->max_succeed + 1 - 2, dc_snmp->min_fail - 1), which is 58 (58+1-2=57 < 59-1=58).
(7) At 19:00:43 a new failure occurs (SNMP error-status = tooBig following execution of zbx_snmp_process_standard, polling 58 items). At this point, max_succeed = 58 and min_fail = 58.
(8) max_items is still not increased, because DCconfig_get_suggested_snmp_vars_nolock returns MAX(dc_snmp->max_succeed + 1 - 2, dc_snmp->min_fail - 1), which is 57 (58+1-2=57 == 58-1=57).
I think that if max_succeed is greater than min_fail, it should be lowered to min_fail. Alternatively, max_items should be configurable in the GUI. |
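To make the arithmetic in steps (3) to (8) easier to check, here is a small Python model of the behaviour as I understand it from the capture. It is only a sketch and not the Zabbix C source: the function names suggested_vars and record_failure, the use of None for "no failure seen yet", and the lower_max_succeed switch are all hypothetical. The last two lines look at a hypothetical later failure at 57 items, which did not occur in my test, to show what lowering max_succeed would change.

```python
from typing import Optional, Tuple


def suggested_vars(max_succeed: int, min_fail: Optional[int]) -> int:
    """Model of the request-size suggestion described in steps (2) to (8).

    min_fail is None while no request has failed yet (a convention of this
    sketch, not of Zabbix).
    """
    if min_fail is None:
        # Growth phase: scale the last successful size by 3/2.
        return max(max_succeed * 3 // 2, max_succeed + 1)
    num = max_succeed + 1  # after the first failure, probe one item at a time
    if num < min_fail:
        return num
    # Plateau: the expression quoted in steps (6) and (8).
    return max(max_succeed + 1 - 2, min_fail - 1)


def record_failure(max_succeed: int, min_fail: Optional[int], attempted: int,
                   lower_max_succeed: bool = False) -> Tuple[int, int]:
    """Update both bounds after a tooBig response for `attempted` items."""
    new_min_fail = attempted if min_fail is None else min(min_fail, attempted)
    if lower_max_succeed and max_succeed > new_min_fail:
        # Proposed change: never keep max_succeed above min_fail.
        max_succeed = new_min_fail
    return max_succeed, new_min_fail


print(suggested_vars(42, None))  # 63, the size that fails in step (3)
print(suggested_vars(42, 63))    # 43, one-by-one growth of step (4)
print(suggested_vars(58, 59))    # 58, step (6)
print(suggested_vars(58, 58))    # 57, step (8)

# Hypothetical: a 57-item request fails later on.
print(suggested_vars(*record_failure(58, 58, 57)))                          # 57
print(suggested_vars(*record_failure(58, 58, 57, lower_max_succeed=True)))  # 56
```

In this model, without the change the suggestion stays at 57 even after 57 has failed, because max_succeed remains 58. With the change, the suggestion drops to 56.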
Comment by thomas [ 2021 Aug 11 ] |
Side note: the Cisco device's log file is spammed only when the severity level for the snmpd process is greater than or equal to 3. |