[ZBX-9666] Incorrect dynamic index cache in case of different community names only Created: 2015 Jun 29  Updated: 2017 May 30  Resolved: 2015 Oct 08

Status: Closed
Project: ZABBIX BUGS AND ISSUES
Component/s: Proxy (P), Server (S)
Affects Version/s: 2.4.5
Fix Version/s: 2.2.11rc1, 2.4.7rc1, 3.0.0alpha3

Type: Incident report Priority: Critical
Reporter: Alexey Pustovalov Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: snmpdynamicindex
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate

 Description   

Zabbix keeps an snmpwalk cache for dynamic SNMP indexes. Currently Zabbix compares only the IP address, port and OID to decide whether a cached entry refers to the same data. When items differ only in SNMP context or community string, the cache does not work correctly.

For example, the SNMP OID for JVM heap free is:
.1.3.6.1.4.1.140.625.340.1.25
If we want that value for the admin server, we send the request to:
IP=1.1.1.1, PORT=16021, COMMUNITY=public
If we want that value for the node1 server, we send the request to the same admin agent:
IP=1.1.1.1, PORT=16021, COMMUNITY=public@node1
If we want that value for the node2 server, we send the request to the same agent yet again:
IP=1.1.1.1, PORT=16021, COMMUNITY=public@node2
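
The collision described above can be illustrated with a minimal sketch. It is a hypothetical model of the cache key, not Zabbix's actual data structure: the function names and the index values 1, 2 and 3 are made up for the example, while the IP, port, OID and community strings are the ones from this report.

```python
# Hypothetical model of a dynamic-index cache key; not Zabbix's real code.
OID = ".1.3.6.1.4.1.140.625.340.1.25"


def narrow_key(ip, port, community, oid):
    """Key as described in the report: the community string is ignored."""
    return (ip, port, oid)


def wide_key(ip, port, community, oid):
    """Key that also distinguishes the community string."""
    return (ip, port, community, oid)


def fill(cache, make_key):
    # Store a made-up index per community to see which entries survive.
    targets = [("public", 1), ("public@node1", 2), ("public@node2", 3)]
    for community, index in targets:
        cache[make_key("1.1.1.1", 16021, community, OID)] = index
    return cache


narrow = fill({}, narrow_key)
wide = fill({}, wide_key)
print(len(narrow))  # 1: all three targets overwrite the same entry
print(len(wide))    # 3: one entry per community string
```

With the narrow key, whichever target was polled last owns the single entry, so the other two targets read an index that was never resolved for them.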



 Comments   
Comment by Igors Homjakovs (Inactive) [ 2015 Jul 07 ]

Fixed in svn://svn.zabbix.com/branches/dev/ZBX-9666

Comment by Sandis Neilands (Inactive) [ 2015 Jul 20 ]

(1) In SNMPv3 using just the context name in addition to IP address and port is not enough. One user can get a cached index that belongs to another user regardless of the configured access control in the SNMP agent.

In addition to contextName we should use:

  • snmpEngineID;
  • securityName (user name).

<dimir> We currently do not store snmpEngineID, we do securityName though. RESOLVED in r55173

<sandis.neilands> CLOSED.

Comment by Sandis Neilands (Inactive) [ 2015 Jul 20 ]

(2) We should also use the version of the SNMP protocol used to access this particular OID as part of the key. It is feasible for a single SNMP agent to process messages from all three versions of SNMP.

<dimir> This should be addressed in a separate issue: ZBXNEXT-2932. CLOSED

Comment by Sandis Neilands (Inactive) [ 2015 Jul 20 ]

(3) According to RFC 1901 (SNMPv2c) section 3.(3), with SNMPv2c and SNMPv3 the agent can send a response from any of its transport addresses. In that case, using the IP address and port for the OID cache is wrong. However, Zabbix does not seem to implement this requirement; please recheck.

  • Check if Zabbix supports receiving a response from a different IP address/port;
  • If not, create a new ZBX for either the server or the documentation.

<dimir> Created a separate feature request: ZBXNEXT-2933. CLOSED

Comment by Sandis Neilands (Inactive) [ 2015 Jul 22 ]

Test scenarios

#1: unnecessary cache rebuilding (performance issue)

Configuration

  1. In zabbix_server.conf configure a single poller process. Restart the server.
  2. Set "Refresh unsupported items (in sec)" setting in the admin interface to 10 seconds.
  3. Configure SNMP proxy with two agents (see below).
  4. Create a new host in Zabbix server, configure SNMP interface to the SNMP proxy.
  5. Add the following items to the newly created host.
item #1
Name: agent_1_top_memory_usage
Type: SNMPv2 agent
Key: agent_1_top_memory_usage
SNMP OID: HOST-RESOURCES-MIB::hrSWRunPerfMem["index","HOST-RESOURCES-MIB::hrSWRunPath", "top"]
SNMP community: community_agent_1
Type of information: Numeric
Data type: Decimal
Update interval (in sec): 10
item #2
Name: agent_2_top_memory_usage
Type: SNMPv2 agent
Key: agent_2_top_memory_usage
SNMP OID: HOST-RESOURCES-MIB::hrSWRunPerfMem["index","HOST-RESOURCES-MIB::hrSWRunPath", "top"]
SNMP community: community_agent_2
Type of information: Numeric
Data type: Decimal
Update interval (in sec): 10
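
For readers unfamiliar with the dynamic index syntax used in the SNMP OID fields above, here is a rough sketch of how such an OID can be split and resolved. It assumes an exact, case-sensitive comparison of the walked value with the search string; the helper names and the walk result are hypothetical and do not come from Zabbix.

```python
import re

# Matches OID["index","<base OID>","<string>"]; spaces after commas are allowed.
_DYNAMIC_RE = re.compile(
    r'^(?P<oid>[^\[]+)\["index"\s*,\s*"(?P<base>[^"]+)"\s*,\s*"(?P<value>[^"]*)"\]$'
)


def parse_dynamic_oid(text):
    """Split a dynamic-index OID into (oid, base_oid, search_string)."""
    match = _DYNAMIC_RE.match(text)
    if match is None:
        raise ValueError("not a dynamic-index OID: " + text)
    return match.group("oid"), match.group("base"), match.group("value")


def resolve(text, walk_result):
    """Return the concrete OID, or None if the string is not found.

    walk_result maps index -> value, as obtained by walking the base OID.
    """
    oid, _base, value = parse_dynamic_oid(text)
    for index, found in walk_result.items():
        if found == value:  # assumption: exact, case-sensitive match
            return f"{oid}.{index}"
    return None


item_oid = ('HOST-RESOURCES-MIB::hrSWRunPerfMem'
            '["index","HOST-RESOURCES-MIB::hrSWRunPath", "top"]')
fake_walk = {101: "sshd", 202: "top"}  # made-up walk of hrSWRunPath
print(resolve(item_oid, fake_walk))
```

The index found this way (202 in the made-up data) is what the cache stores, which is why it must not be shared between agents reached through different community strings.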

Test scenario

  1. Run program "top" in each agent's host. Leave it running throughout the test.
  2. Monitor the traffic towards your SNMP proxy with tcpdump or Wireshark.

Expected results: Zabbix server walks the HOST-RESOURCES-MIB::hrSWRunPath tree for each community string only once and then just rechecks the index for the value "top" (gets it from the SNMP proxy) for each community.

Actual results: Zabbix server walks the HOST-RESOURCES-MIB::hrSWRunPath tree each time before getting the HOST-RESOURCES-MIB::hrSWRunPerfMem value for the particular agent (selected with the community string). I.e., it rebuilds the cache every time it polls one of the configured dynamic items.
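
The difference between the expected and actual results can be reproduced with a small simulation. This is a simplified model under stated assumptions, not Zabbix code: each agent is assumed to keep "top" at a different (made-up) index, a cached index is rechecked against the agent, and a failed recheck triggers a full walk that overwrites the cache entry.

```python
def count_walks(polls, key_includes_community):
    """Count full tree walks for a sequence of polled community strings."""
    # Made-up indexes at which each agent reports the "top" process.
    true_index = {"community_agent_1": 11, "community_agent_2": 22}
    cache = {}
    walks = 0
    for community in polls:
        # Without the community in the key, both items share one entry.
        key = community if key_includes_community else "shared"
        if cache.get(key) != true_index[community]:
            # Recheck failed (or nothing cached): walk and store the index.
            walks += 1
            cache[key] = true_index[community]
    return walks


# Both items have the same update interval, so polls alternate.
polls = ["community_agent_1", "community_agent_2"] * 5
print(count_walks(polls, key_includes_community=False))  # 10: a walk on every poll
print(count_walks(polls, key_includes_community=True))   # 2: one walk per community
```

In this model the shared entry always holds the other agent's index, so every poll falls back to a full walk, which matches the traffic pattern described in the actual results.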

Variation

  • Use different SNMP versions (but same for all items).
  • Use different SNMP versions (different for all items).
  • Close the "top" program in one or both of the agents, then restart it.

Setting up SNMP proxy with Net-SNMP

The following section describes setting up an SNMP proxy and configuring community strings as host selectors.

For reference see Net-SNMP wiki page on setting up SNMP proxy.

Environment

      +------------+           +------------+
      |            |           |            |
      |  agent #1  +-----+-----+  agent #2  |
      |            |     |     |            |
      +------------+     |     +------------+
                         |                   
+---------+        +-----+-----+             
|         |        |           |             
|   NMS   +--------+   proxy   |             
|         |        |           |             
+---------+        +-----------+             
  • All nodes running on separate machines (proxy and agents on virtual machines).
  • Net-SNMP installed on all nodes. Note that on Debian-derived Linux distributions one has to install the IANA and IETF MIBs separately from the Net-SNMP package, using the snmp-mibs-downloader package or other means.
  • Zabbix server is running on the NMS.

SNMP configuration

Agent #1

/etc/snmp/snmpd.conf
rocommunity  public

syslocation  snmp_agent_1
syscontact  [email protected]

Agent #2

/etc/snmp/snmpd.conf
rocommunity  public

syslocation  snmp_agent_2
syscontact  [email protected]

Proxy

/etc/snmp/snmpd.conf
syslocation snmp_proxy
syscontact  [email protected]

view system_view included .1.3

com2sec proxy_user default public

group proxy_group v1 proxy_user
group proxy_group v2c proxy_user

access proxy_group "" any noauth exact system_view none none

com2sec -Cn context_agent_1 proxy_user default community_agent_1
com2sec -Cn context_agent_2 proxy_user default community_agent_2

access proxy_group context_agent any noauth prefix system_view none none

# Change the IP addresses to the IP addresses of your agents!
proxy -Cn context_agent_1 -v 2c -c public 192.168.0.2 .1.3
proxy -Cn context_agent_2 -v 2c -c public 192.168.0.3 .1.3

Testing the SNMP proxy

  • Restart snmpd on the agents and the proxy. On Debian-derived distributions:
    $ sudo /etc/init.d/snmpd restart
    
  • Use snmpwalk command with IP address of proxy and with community string "community_agent_1" to access agent #1.
    # Change the IP address to the IP address of your proxy!
    snmpwalk -v2c -c community_agent_1 192.168.0.4 .1.3
    
  • Use snmpwalk command with IP address of proxy and with community string "community_agent_2" to access agent #2.
    # Change the IP address to the IP address of your proxy!
    snmpwalk -v2c -c community_agent_2 192.168.0.4 .1.3
    

As you can see the output is different. This solution will also work with SNMPv1.
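
The manual check above could also be scripted. The sketch below is only an illustration: the function names are hypothetical, it assumes the Net-SNMP command-line tools are installed, and it compares just the sysLocation subtree (.1.3.6.1.2.1.1.6), because a full walk of .1.3 contains values such as uptime that change between any two runs.

```python
import subprocess

SYS_LOCATION = ".1.3.6.1.2.1.1.6"  # sysLocation, set via "syslocation" in snmpd.conf


def snmpwalk_cmd(community, host, oid=SYS_LOCATION):
    """Build the same snmpwalk command line as used in the manual test."""
    return ["snmpwalk", "-v2c", "-c", community, host, oid]


def walk(community, host, oid=SYS_LOCATION):
    """Run snmpwalk and return its output lines (needs Net-SNMP installed)."""
    result = subprocess.run(
        snmpwalk_cmd(community, host, oid),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def proxy_separates_agents(host):
    """True if the two community strings yield different sysLocation values."""
    first = walk("community_agent_1", host)
    second = walk("community_agent_2", host)
    return first != second
```

Calling proxy_separates_agents("192.168.0.4") with the IP address of your proxy should return True if the proxy routes each community string to a different agent.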

Comment by Aleksandrs Saveljevs [ 2015 Jul 27 ]

(4) The following warnings are not related to this development, but it would be nice to fix them in 2.2 and up, so adding a comment here:

setproctitle.c:40:26: warning: expression which evaluates to zero treated as a null pointer constant of type
      'char *' [-Wnon-literal-null-conversion]
static char     *empty_str = '\0';
                             ^~~~
1 warning generated.
expression.c:4326:3: warning: variable 'key_type' is used uninitialized whenever switch default is taken
      [-Wsometimes-uninitialized]
                default:
                ^~~~~~~
expression.c:4330:37: note: uninitialized use occurs here
        ret = replace_key_params_dyn(data, key_type, replace_key_param, &replace_key_param_data, error...
                                           ^~~~~~~~
expression.c:4309:17: note: initialize the variable 'key_type' to silence this warning
        int                             key_type, ret;
                                                ^
                                                 = 0
1 warning generated.

The first is clearly a bug; the second occurs if we are compiling with -DNDEBUG. There, assert(0) should be replaced with something like THIS_SHOULD_NEVER_HAPPEN; exit(EXIT_FAILURE).

<dimir> RESOLVED in r55143,r55150

<sandis.neilands> CLOSED. Rolled back setproctitle empty string fix in 55508 (should be investigated and tested in a separate ZBX)
Reasons:

  • SVN history is missing for the relevant changes;
  • something else might have been intended;
  • platform dependent code - there might be some weird reason why it is this way.

See ZBX-9863.

<sandis.neilands> This correction fixed Coverity defect CID 118964. Variable key_type could be (theoretically) used without initialization if the control went to the "default" clause of the switch. Thanks!

Comment by dimir [ 2015 Aug 26 ]

Somebody should review changes in r55144, r55173.

<sandis.neilands> CLOSED.

Comment by dimir [ 2015 Sep 14 ]

sandis.neilands, your comments in C-code look great, thanks!

Comment by dimir [ 2015 Sep 14 ]

Fixed in pre-2.2.11 (r55533), pre-2.4.7 (r55534), pre-3.0.0 (r55536).
