-
Type:
Problem report
-
Resolution: Unresolved
-
Priority:
Trivial
-
None
-
Affects Version/s: 7.0.25
-
Component/s: Server (S)
-
None
-
Environment:
- **Zabbix Server:** 7.0.25 LTS, packaged deb on Debian Trixie (kernel 6.12.74)
- **DB:** PostgreSQL 17 + TimescaleDB
- **SNMP hosts:** ~30 monitored devices (HP/Aruba switches, Ubiquiti UniFi APs, HP iLO BMCs, Kyocera printers, HW-Group temp sensors, Aten/Supermicro BMC)
- **All on the same management VLAN** (10.10.x.x), polled from a single Zabbix server (no proxies)
- **Templates:** `Generic by SNMP`, `HP Enterprise Switch by SNMP`, `Supermicro Aten by SNMP` — all stock vendor-supplied
- **Relevant config:**
```
StartSNMPPollers = 5 # bug present
StartPollers = 10
StartPreprocessors = 8
Timeout = 10
```
-
- Steps to Reproduce
1. Configure a Zabbix 7.0.25 server with *5 or more SNMP pollers* (`StartSNMPPollers=5+`).
2. Add ≥ 20 SNMP-monitored hosts in the same subnet, each with the `system.name[sysName.0]` item from `Generic by SNMP` (or any `cpqHe*` / vendor-specific SNMP item from vendor templates).
3. Tested with both `bulk=1` (default) and `bulk=0` on the SNMP interfaces — no difference.
4. Tested with both `useip=1` (fixed IP) and `useip=0` (DNS resolution) — no difference.
5. Allow the pollers to run for 5+ minutes.
-
- Expected Behaviour
Each host's SNMP item is populated with the SNMP response from *that host's* IP/DNS endpoint.
-
- Actual Behaviour
SNMP responses are intermittently routed to the *wrong host's items*. Concrete observations:
| Polled host (technical) | Item `system.name[sysName.0]` value | Expected |
| --- | --- | --- |
| `AP05` | `AP15` | `AP05` |
| `AP04` | `AP16` | `AP04` |
| `AP02` | `AP14` | `AP02` |
| `AP01` | `AP12` | `AP01` |
| `PLL01` | `PWEA2` | `PLL01` |
| `SWBZ09` | `SWSW01` | `SWBZ09` |
| `SWBZ05` | `SWBZ06` | `SWBZ05` |
The "incorrect" sysName values are always those of *other hosts in the same poll batch*, never random / garbage strings. This rules out network-level corruption and points to a response-to-host mapping race in the async poller layer.
Visible effect: the stock template trigger `"System name has changed"` fires repeatedly and incorrectly.
Manual `snmpget -v2c -c <community> <ip> .1.3.6.1.2.1.1.5.0` from the Zabbix server host always returns the *correct* sysName for the queried IP — the underlying SNMP agents and the network are fine.
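The claim that wrong values are always other hosts' names, never garbage, can be checked mechanically. The following is a minimal sketch, not part of the original investigation: `classify_mismatches` is a hypothetical helper, the `inventory` set is reconstructed only from the names that appear in the table above, and it assumes each device's sysName equals its technical host name (as the "Expected" column implies).

```python
def classify_mismatches(observed, inventory):
    """Split wrong readings into cross-host swaps and unknown values.

    observed  -- dict: polled host name -> sysName value stored for it
    inventory -- set of sysName values known to exist in the fleet
    """
    swapped, unknown = [], []
    for host, value in observed.items():
        if value == host:
            continue  # reading is correct, nothing to report
        if value in inventory:
            swapped.append((host, value))  # value belongs to another host
        else:
            unknown.append((host, value))  # value matches no known host
    return swapped, unknown


# Rows copied from the table above: polled host -> stored sysName.
observed = {
    "AP05": "AP15",
    "AP04": "AP16",
    "AP02": "AP14",
    "AP01": "AP12",
    "PLL01": "PWEA2",
    "SWBZ09": "SWSW01",
    "SWBZ05": "SWBZ06",
}

# Assumption: inventory limited to names mentioned in this report.
inventory = {
    "AP01", "AP02", "AP04", "AP05", "AP12", "AP14", "AP15", "AP16",
    "PLL01", "PWEA2", "SWBZ05", "SWBZ06", "SWBZ09", "SWSW01",
}

swapped, unknown = classify_mismatches(observed, inventory)
```

With the table data, every mismatch lands in `swapped` and `unknown` stays empty. In a real check the inventory would come from the full host list rather than from the mismatches themselves.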
-
- Workaround
Setting `StartSNMPPollers=1` in `zabbix_server.conf` and restarting the server *completely eliminates* the cross-contamination. All SNMP items immediately start showing correct values from their own hosts. Trade-off: a single poller is slower at clearing the queue but works correctly.
We tried (without success):
- `bulk=0` on all SNMP interfaces
- Switching `useip=0` (DNS mode) on all interfaces
- `config_cache_reload`, `snmp_cache_reload`, full server restart
- Reducing item update intervals
Only `StartSNMPPollers=1` resolved the issue.
-
- Hypothesis
The async SNMP poller infrastructure dispatches multiple in-flight SNMP requests in parallel (one per poller worker), and the response-to-item mapping appears to confuse responses when:
- multiple in-flight requests are outstanding simultaneously, AND
- response packets arrive in close timing windows (likely on the same UDP socket or a shared completion queue).
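To make the hypothesis concrete, here is a purely illustrative simulation of the suspected failure class. It is not Zabbix code and makes no claim about how the server actually matches replies. It assumes, hypothetically, that two workers number their requests independently and that replies are matched on the SNMP request-id alone. The IP addresses, item labels, and function names are invented for the sketch.

```python
from typing import NamedTuple


class Response(NamedTuple):
    source_ip: str   # address the UDP reply arrived from
    request_id: int  # SNMP request-id echoed back by the agent
    value: str       # payload, e.g. the sysName.0 string


def route_by_request_id(pending, responses):
    """Fragile routing: match a reply on request-id alone."""
    by_id = {}
    for (_ip, request_id), item in pending.items():
        by_id[request_id] = item  # colliding ids silently overwrite each other
    return [(by_id[r.request_id], r.value)
            for r in responses if r.request_id in by_id]


def route_by_peer_and_id(pending, responses):
    """Safer routing: match a reply on (source address, request-id)."""
    return [(pending[(r.source_ip, r.request_id)], r.value)
            for r in responses
            if (r.source_ip, r.request_id) in pending]


# Hypothetical scenario: two workers both issue request-id 1.
pending = {
    ("10.10.0.15", 1): "AP15:system.name",
    ("10.10.0.5", 1): "AP05:system.name",
}
responses = [
    Response("10.10.0.5", 1, "AP05"),
    Response("10.10.0.15", 1, "AP15"),
]

fragile = route_by_request_id(pending, responses)
safe = route_by_peer_and_id(pending, responses)
```

In the fragile variant both replies are delivered to `AP05:system.name`, so AP05's item ends up holding `AP15`, the same pattern as the first row of the table. Keying on the peer address as well keeps each reply with its own item.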
-
- Evidence / Logs
(Collected on 2026-05-07. Excerpts redacted — IPs / hostnames available on request.)
- Item history shows the exact moment when responses got mismapped (timestamps correlate with poller process activity).
- After `StartSNMPPollers=1` + server restart, *all subsequent polls* (verified for AP01-08, SWBZ09 etc.) show correct sysName values.
- No firewall / network changes occurred at the boundaries of the affected time window.
- The reproducer is reliable: with `StartSNMPPollers=5`, contamination resumes within ~5 minutes; after restarting with 1 poller, it never recurs.
-
- Severity (Suggested)
*Major* — silent data corruption (items receive values from the wrong source), causing false-positive alerts on stock-template triggers (e.g. `"System name has changed"`, `"Disk media type changed"`, `"Sensor name changed"`). Affects any monitoring setup with mid-to-large SNMP fleets.
-
- Submitter Notes
- The issue did NOT exist on Zabbix 6.x with traditional synchronous SNMP pollers.
- The issue ONLY appeared after enabling 5 parallel async SNMP pollers in 7.0.25.
- We have not tested 7.0.26+ — the bug may already be fixed upstream. Filing for visibility.